Alert! Disk already exists.
hjallisnorra
19 Posts
October 5, 2017, 2:29 pm
When creating a new iSCSI disk, anything I do ends up with "Alert! Disk already exists."
Logs from the node:
hjallisnorra
19 Posts
October 5, 2017, 3:21 pm
Oh, and can anyone help me resolve this without reinstalling the cluster? This is probably on account of me deleting a disk or two using the GUI.
Last edited on October 5, 2017, 3:22 pm by hjallisnorra · #2
admin
2,930 Posts
October 6, 2017, 9:28 am
Hello,
Could you supply a bit more info on the steps so we can reproduce this? Was the UI used to create the disks and to try to remove them, or did you use any CLI commands? Any additional detail on what you did will help us.
If you start a fresh browser session and go to the disks list, which disks do you see and what is their status?
Can you add/stop or delete disks other than 00001?
What is the output of
rbd ls --cluster CLUSTER_NAME
If the problem is deleting a disk, you can as a last resort do it via a ceph command:
rbd rm image-00001 --cluster CLUSTER_NAME
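For reference, a minimal check-and-cleanup sequence along those lines, assuming the cluster is named demo and the stale image is image-00001 (substitute your own cluster name and image), would look something like:
# list all RBD images to see whether a leftover image is still present
rbd ls --cluster demo
# inspect the suspect image before touching it
rbd info image-00001 --cluster demo
# last resort: remove the stale image so its name can be reused
rbd rm image-00001 --cluster demo
Only remove an image this way if you are sure no iSCSI client is still attached to it.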
Last edited on October 6, 2017, 9:29 am by admin · #3
hjallisnorra
19 Posts
October 6, 2017, 9:40 am
The cluster was created with 3 machines, each with 24 2 TB disks, using the default install from the ISO. Then I created one 30 TB iSCSI disk, test-disk-one, and mounted it in XenServer, moved a running VM to test-disk-one, let the VM run there for 3 hours, then moved it back to local storage on the XenServer. After that I removed test-disk-one from the XenServer pool and deleted test-disk-one in PetaSAN using the web GUI.
Later I went to create a 5 TB iSCSI disk, and that was when the error popped up.
Yesterday I installed the cluster again from the ISO and will try to recreate this, that is, create one 30 TB disk, then delete it and create a new disk.
I will let you know if I have issues.
Thanks.
davlaw
35 Posts
November 28, 2017, 7:44 pm
Looks like I have fallen into this hole as well. PetaSAN 1.4.0.
Based on the logs it looks like a metadata issue, but I'm not sure which disk it is looking for.
I'm still working in a VM environment, so nothing is really at stake, just learning.
At this point no iSCSI disks can be added, no matter what name I give them, and the log just fills up with errors. Both management consoles have the same issue.
**** I have learned you have to give this beast some time **** Changes are far from instantaneous.
Oh wait, does "acquire path 00003/1" mean iSCSI disk 3, LUN 1? If so, how do I clear out the old iSCSI data?
Just got another one:
Could not acquire path 00004/4 (if that last number is a LUN, it makes no sense)
MetadataException: Cannot get metadata.
28/11/2017 14:35:24 WARNING PetaSAN Could not complete __process, there are too many exceptions.
28/11/2017 14:35:25 ERROR Could not acquire path 00003/1
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/PetaSAN/backend/iscsi_service.py", line 402, in __acquire_path
all_image_meta = ceph_api.read_image_metadata(image_name)
File "/usr/lib/python2.7/dist-packages/PetaSAN/core/ceph/api.py", line 179, in read_image_metadata
raise MetadataException("Cannot get metadata.")
MetadataException: Cannot get metadata.
28/11/2017 14:35:25 ERROR Error during __process.
28/11/2017 14:35:25 ERROR Cannot get metadata.
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/PetaSAN/backend/iscsi_service.py", line 90, in start
self.__process()
File "/usr/lib/python2.7/dist-packages/PetaSAN/backend/iscsi_service.py", line 116, in __process
while self.__do_process() != True:
File "/usr/lib/python2.7/dist-packages/PetaSAN/backend/iscsi_service.py", line 183, in __do_process
self.__acquire_path(str(path), self.__paths_consul_unlocked_firstborn.get(path))
File "/usr/lib/python2.7/dist-packages/PetaSAN/backend/iscsi_service.py", line 472, in __acquire_path
raise e
MetadataException: Cannot get metadata.
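For anyone hitting the same trace: the failure is raised from read_image_metadata(), so a quick way to see which images exist and what metadata (if any) they still carry is to query RBD directly. This is only a sketch, assuming the cluster is named demo and that PetaSAN keeps these settings as RBD image metadata, which the traceback suggests:
# list the images the iSCSI service may be trying to acquire paths for
rbd ls --cluster demo
# dump the key/value metadata attached to the image behind disk 00003
rbd image-meta list image-00003 --cluster demo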
Last edited on November 28, 2017, 7:47 pm by davlaw · #5
davlaw
35 Posts
November 28, 2017, 8:30 pm
A shutdown of all 3 nodes seems to have done it. Still unclear why. Some disks that were added before the shutdown have appeared as not used. Hope this will clear it up...
admin
2,930 Posts
November 28, 2017, 9:29 pm
This is not normal. What is the health of the cluster as displayed on the dashboard? Does it show OK?
If there is a serious issue, such as a network problem or Ceph disks not being able to sync/peer together, then the cluster health will not show OK and the cluster may become inaccessible or unresponsive, which can lead to metadata read issues among other things.
It could also be a configuration issue. If you just installed it, maybe re-install it and double-check your connections.
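A quick way to confirm this from a node console, assuming the cluster is named demo, is to ask Ceph directly:
# overall cluster state; health should report HEALTH_OK
ceph status --cluster demo
# if health is not OK, list the specific warnings and errors
ceph health detail --cluster demo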
Last edited on November 28, 2017, 9:31 pm by admin · #7
davlaw
35 Posts
November 29, 2017, 11:53 am
It's currently in a test environment, Proxmox, with 5 vmbr bridges. So all backplane traffic should be internal, but it's not clear what that "black box" may contain (the internal networking of Proxmox).
Health was "Normal", but I was unable to add an iSCSI config; it gave a "name in use" error. After a reboot several disks/OSDs would not come back online, which is the reason for the next question ->
*** A couple of things came to mind last evening, GlusterFS. Does it keep track of disk UUIDs so that a disk cannot be reused if it is marked bad (I have not looked at that yet)? I should be more careful that disks that get marked down/bad are fully deleted and new ones created, hopefully with new UUIDs. I have an older GPFS SAN here that is really picky about that. Or is Ceph dealing with that? ***
Trying to keep in mind what a resource-starved setup I have: already over-allocated on disk, out of room on the Proxmox storage, memory starved, etc. It's sometimes difficult to really analyze an issue when my test environment is so flaky. 1 TB allocated does not fit well in 600 MB of disk space 😉
Tx!
admin
2,930 Posts
November 29, 2017, 12:32 pm
Hi,
The OSDs not coming up on reboot is not a good sign. Since this is a test cluster, I suggest you re-install; it takes only a few minutes. After installation, make sure the nodes can ping each other on the management, backend 1 and backend 2 networks without too much latency. My feeling is you either have a network config issue or you are too low on RAM.
Regarding Gluster, it does not have a direct effect on what you see. It does not take part in storage and no OSD depends on it. It is only used to create a shared configuration filesystem on the system disk, not on the storage disks.
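A rough connectivity check along those lines, using placeholder addresses for the second node on each subnet (substitute your real node IPs), would be something like:
# from node 1: check reachability and latency to node 2 on each network
ping -c 5 10.0.0.12   # management (example address)
ping -c 5 10.0.1.12   # backend 1 (example address)
ping -c 5 10.0.2.12   # backend 2 (example address)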