
Alert! Disk already exists.

When creating a new iSCSI disk

 

Anything I do ends up with "Alert! Disk already exists."

Logs from the node:

 

Oh, and can anyone help me resolve this without reinstalling the cluster? This is probably on account of me deleting a disk or two using the GUI.

Hello

Can you supply a bit more info on the steps so we can reproduce this? Was the UI used to create the disks and to try to remove them, or did you use any CLI commands? Any additional info on the steps you took will help us.

If you start a fresh browser session and go to the disks list, what disks do you see and what is their status?

Can you add, stop, or delete disks other than 00001?

What is the output of

rbd ls --cluster CLUSTER_NAME

If the problem is deleting a disk, you can as a last resort do it via the Ceph command:

rbd rm image-00001 --cluster CLUSTER_NAME
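
For example, assuming CLUSTER_NAME is the cluster name you chose at deployment and image-00001 is the leftover image, the check-and-remove sequence might look like this (rbd info is only there so you can inspect the image before removing anything):

# list the rbd images still present in the pool
rbd ls --cluster CLUSTER_NAME

# inspect the suspect image first
rbd info image-00001 --cluster CLUSTER_NAME

# remove it only once you are sure it is the stale one
rbd rm image-00001 --cluster CLUSTER_NAME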

The cluster was created with 3 machines, each with 24 x 2 TB disks, using the default install from the ISO. Then I created one 30 TB iSCSI disk, test-disk-one, and mounted it in XenServer. I moved a running VM to test-disk-one, the VM ran there for 3 hours, and then I moved it back to local storage on the XenServer. Then I removed test-disk-one from the XenServer pool and deleted test-disk-one from PetaSAN using the web GUI.

Later, when I went to create a 5 TB iSCSI disk, that was when the error popped up.

Yesterday I installed the cluster again from the ISO and will try to recreate this, that is, create one 30 TB disk, delete it, then create a new disk.

I will let you know if I have issues.

Thanks.

Looks like I have fallen into this hole as well. PetaSAN 1.4.0.

Based on the logs it looks like a metadata issue, but I'm not sure which disk it's looking for.

I'm still working in a VM environment, so nothing is really at stake; I'm just learning.

At this point no iSCSI disks can be added, no matter what name I give them, and the log just fills up with errors. Both management consoles have the same issue.

 

**** I have learned you have to give this beast some time **** Changes are far from instantaneous.

Oh wait, does "acquire path 00003/1" mean iSCSI disk 3, LUN 1? If so, how do I clear the old iSCSI data???

Just got another one:

Could not acquire path 00004/4 (so I guess the LUN number idea makes no sense).

 

MetadataException: Cannot get metadata.
28/11/2017 14:35:24 WARNING PetaSAN Could not complete __process, there are too many exceptions.
28/11/2017 14:35:25 ERROR Could not acquire path 00003/1
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/PetaSAN/backend/iscsi_service.py", line 402, in __acquire_path
all_image_meta = ceph_api.read_image_metadata(image_name)
File "/usr/lib/python2.7/dist-packages/PetaSAN/core/ceph/api.py", line 179, in read_image_metadata
raise MetadataException("Cannot get metadata.")
MetadataException: Cannot get metadata.
28/11/2017 14:35:25 ERROR Error during __process.
28/11/2017 14:35:25 ERROR Cannot get metadata.
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/PetaSAN/backend/iscsi_service.py", line 90, in start
self.__process()
File "/usr/lib/python2.7/dist-packages/PetaSAN/backend/iscsi_service.py", line 116, in __process
while self.__do_process() != True:
File "/usr/lib/python2.7/dist-packages/PetaSAN/backend/iscsi_service.py", line 183, in __do_process
self.__acquire_path(str(path), self.__paths_consul_unlocked_firstborn.get(path))
File "/usr/lib/python2.7/dist-packages/PetaSAN/backend/iscsi_service.py", line 472, in __acquire_path
raise e
MetadataException: Cannot get metadata.
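
If the path number maps to the rbd image name the way the earlier reply suggests (00003 -> image-00003), I suppose I could at least check whether those images and their metadata still exist, with something like this (my own cluster name substituted for CLUSTER_NAME):

# does the image still exist and look sane?
rbd info image-00003 --cluster CLUSTER_NAME

# is there any metadata left on it?
rbd image-meta list image-00003 --cluster CLUSTER_NAME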

 

A shutdown of all 3 nodes seems to do it. Still unclear why. Some disks that were added before the shutdown have appeared as not used. Hope this will clear it up...

This is not normal. What is the health of the cluster as displayed in the dashboard? Does it show OK?

If there is a serious issue, like a network problem or Ceph disks not being able to sync/peer together, then the cluster health will not show OK and the cluster may not be accessible/responsive, which may lead to metadata read issues, among other things.
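
For example, something like this from a node shell should show the Ceph state directly (using the cluster name you chose at deployment):

# overall cluster state, should report HEALTH_OK
ceph status --cluster CLUSTER_NAME

# more detail if it is not OK
ceph health detail --cluster CLUSTER_NAME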

It could also be a configuration issue; if you just installed it, maybe re-install and double-check your connections.

It's currently within a test environment, Proxmox, with 5 vmbr bridges. So all backplane traffic should be internal, but it's not clear what that "black box" may contain (the internal networking of Proxmox).

Health was "Normal", but I was unable to add an iSCSI config; it gave a "name in use" error. After a reboot several disks/OSDs would not come back online, which is the reason for the next question ->

*** A couple of things came to mind last evening, regarding GlusterFS. Does it keep track of disk UUIDs so that a disk cannot be reused if it is marked bad? (I have not looked at that yet.) I am being more careful that disks that get marked down/bad are fully deleted and new ones created, hopefully with new UUIDs. I have an older GPFS SAN here that is really picky about that. Or is Ceph dealing with that? ***

I'm trying to keep in mind what a resource-starved setup I have: already over-allocated disks, ran out of room on the Proxmox storage, memory starved, etc. It is sometimes difficult to really analyze an issue when my test environment is so flaky. 1 TB allocated does not fit well in 600 MB of disk space 😉

Tx!

Hi,

The OSDs not coming up on reboot is not a good sign. Since this is a test cluster I suggest you re-install; it takes only a few minutes. After installation make sure the nodes can ping each other on the management, backend 1, and backend 2 networks without too much latency. My feeling is you either have a network config issue or you are too low on RAM.

Regarding Gluster, it does not have a direct effect on what you see. It does not take part in storage and no OSDs depend on it. It is only used to create a shared configuration filesystem on the system disk, not on the storage disks.
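
If you want to confirm that on a node, something like the following should show only the shared configuration mount and none of the OSD disks (the exact mount point may differ between versions):

# list any gluster mounts present on the node
mount | grep -i gluster
df -h | grep -i gluster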