Alert! Disk already exists.
hjallisnorra
19 Posts
October 5, 2017, 2:29 pm
When creating a new iSCSI disk, anything I do ends up with "Alert! Disk already exists."
Logs from the node:
hjallisnorra
19 Posts
October 5, 2017, 3:21 pm
Oh, and can anyone help me resolve this without reinstalling the cluster? This is probably on account of me deleting a disk or two using the GUI.
Last edited on October 5, 2017, 3:22 pm by hjallisnorra · #2
admin
2,930 Posts
October 6, 2017, 9:28 am
Hello,
Could you supply a bit more info on the steps so we can reproduce this? Was the UI used to create the disks and to try to remove them, or did you use any CLI commands? Any additional detail on what you did will help us.
If you start a fresh browser session and go to the disks list, which disks do you see and what is their status?
Can you add/stop or delete disks other than 00001?
What is the output of
rbd ls --cluster CLUSTER_NAME
If the problem is deleting a disk, you can as a last resort do it via a ceph command:
rbd rm image-00001 --cluster CLUSTER_NAME
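For reference, a minimal check-and-cleanup sequence along those lines, assuming the cluster is named demo and the stale image is image-00001 (substitute your own cluster name and image), would look something like:
# list all RBD images to see whether a leftover image is still present
rbd ls --cluster demo
# inspect the suspect image before touching it
rbd info image-00001 --cluster demo
# last resort: remove the stale image so its name can be reused
rbd rm image-00001 --cluster demo
Only remove an image this way if you are sure no iSCSI client is still attached to it.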
Last edited on October 6, 2017, 9:29 am by admin · #3
hjallisnorra
19 Posts
October 6, 2017, 9:40 am
The cluster was created with 3 machines, each with 24 2 TB disks, using the default install from the ISO. Then I created one 30 TB iSCSI disk, test-disk-one, and mounted it in XenServer, moved a running VM to test-disk-one, let the VM run there for 3 hours, then moved it back to local storage on the XenServer. After that I removed test-disk-one from the XenServer pool and deleted test-disk-one in PetaSAN using the web GUI.
Later I went to create a 5 TB iSCSI disk, and that was when the error popped up.
Yesterday I installed the cluster again from the ISO and will try to recreate this, that is, create one 30 TB disk, then delete it and create a new disk.
I will let you know if I have issues.
Thanks.
davlaw
35 Posts
November 28, 2017, 7:44 pm
Looks like I have fallen into this hole as well. PetaSAN 1.4.0.
Based on the logs it looks like a metadata issue, but I'm not sure which disk it is looking for.
I'm still working in a VM environment, so nothing is really at stake, just learning.
At this point no iSCSI disks can be added, no matter what name I give them, and the log just fills up with errors. Both management consoles have the same issue.
**** I have learned you have to give this beast some time **** Changes are far from instantaneous.
Oh wait, does "acquire path 00003/1" mean iSCSI disk 3, LUN 1? If so, how do I clear out the old iSCSI data?
Just got another one:
Could not acquire path 00004/4 (if that last number is a LUN, it makes no sense)
MetadataException: Cannot get metadata.
28/11/2017 14:35:24 WARNING PetaSAN Could not complete __process, there are too many exceptions.
28/11/2017 14:35:25 ERROR Could not acquire path 00003/1
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/PetaSAN/backend/iscsi_service.py", line 402, in __acquire_path
all_image_meta = ceph_api.read_image_metadata(image_name)
File "/usr/lib/python2.7/dist-packages/PetaSAN/core/ceph/api.py", line 179, in read_image_metadata
raise MetadataException("Cannot get metadata.")
MetadataException: Cannot get metadata.
28/11/2017 14:35:25 ERROR Error during __process.
28/11/2017 14:35:25 ERROR Cannot get metadata.
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/PetaSAN/backend/iscsi_service.py", line 90, in start
self.__process()
File "/usr/lib/python2.7/dist-packages/PetaSAN/backend/iscsi_service.py", line 116, in __process
while self.__do_process() != True:
File "/usr/lib/python2.7/dist-packages/PetaSAN/backend/iscsi_service.py", line 183, in __do_process
self.__acquire_path(str(path), self.__paths_consul_unlocked_firstborn.get(path))
File "/usr/lib/python2.7/dist-packages/PetaSAN/backend/iscsi_service.py", line 472, in __acquire_path
raise e
MetadataException: Cannot get metadata.
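For anyone hitting the same trace: the failure is raised from read_image_metadata(), so a quick way to see which images exist and what metadata (if any) they still carry is to query RBD directly. This is only a sketch, assuming the cluster is named demo and that PetaSAN keeps these settings as RBD image metadata, which the traceback suggests:
# list the images the iSCSI service may be trying to acquire paths for
rbd ls --cluster demo
# dump the key/value metadata attached to the image behind disk 00003
rbd image-meta list image-00003 --cluster demo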
Last edited on November 28, 2017, 7:47 pm by davlaw · #5
davlaw
35 Posts
November 28, 2017, 8:30 pm
A shutdown of all 3 nodes seems to have done it. Still unclear why. Some disks that were added before the shutdown have appeared as not used. Hope this will clear it up...
admin
2,930 Posts
November 28, 2017, 9:29 pm
This is not normal. What is the health of the cluster as displayed on the dashboard? Does it show OK?
If there is a serious issue, such as a network problem or Ceph disks not being able to sync/peer together, then the cluster health will not show OK and the cluster may become inaccessible or unresponsive, which can lead to metadata read issues among other things.
It could also be a configuration issue. If you just installed it, maybe re-install it and double-check your connections.
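A quick way to confirm this from a node console, assuming the cluster is named demo, is to ask Ceph directly:
# overall cluster state; health should report HEALTH_OK
ceph status --cluster demo
# if health is not OK, list the specific warnings and errors
ceph health detail --cluster demo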
Last edited on November 28, 2017, 9:31 pm by admin · #7
davlaw
35 Posts
November 29, 2017, 11:53 am
It's currently in a test environment, Proxmox, with 5 vmbr bridges. So all backplane traffic should be internal, but it's not clear what that "black box" may contain (the internal networking of Proxmox).
Health was "Normal", but I was unable to add an iSCSI config; it gave a "name in use" error. After a reboot several disks/OSDs would not come back online, which is the reason for the next question ->
*** A couple of things came to mind last evening, GlusterFS. Does it keep track of disk UUIDs so that a disk cannot be reused if it is marked bad (I have not looked at that yet)? I should be more careful that disks that get marked down/bad are fully deleted and new ones created, hopefully with new UUIDs. I have an older GPFS SAN here that is really picky about that. Or is Ceph dealing with that? ***
Trying to keep in mind what a resource-starved setup I have: already over-allocated on disk, out of room on the Proxmox storage, memory starved, etc. It's sometimes difficult to really analyze an issue when my test environment is so flaky. 1 TB allocated does not fit well in 600 MB of disk space 😉
Tx!
admin
2,930 Posts
November 29, 2017, 12:32 pm
Hi,
The OSDs not coming up on reboot is not a good sign. Since this is a test cluster, I suggest you re-install; it takes only a few minutes. After installation, make sure the nodes can ping each other on the management, backend 1 and backend 2 networks without too much latency. My feeling is you either have a network config issue or you are too low on RAM.
Regarding Gluster, it does not have a direct effect on what you see. It does not take part in storage and no OSD depends on it. It is only used to create a shared configuration filesystem on the system disk, not on the storage disks.
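A rough connectivity check along those lines, using placeholder addresses for the second node on each subnet (substitute your real node IPs), would be something like:
# from node 1: check reachability and latency to node 2 on each network
ping -c 5 10.0.0.12   # management (example address)
ping -c 5 10.0.1.12   # backend 1 (example address)
ping -c 5 10.0.2.12   # backend 2 (example address)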