
Unable to remove iscsi disk


Running an existing cluster but wanted to see how to add another disk onto it. Went through the steps and it added it to the "disk list". While it said it was starting, I pushed the stop button. Now it has been saying it's stopping for a few hours. Wanted to see what was going on, so I ran ceph --cluster irq.2018-05-30.xxx.com.conf (yes, I named it badly, but it's a name) and got this back:

2018-05-30 15:03:58.363312 7fdab50a7700 -1 Errors while parsing config file!
2018-05-30 15:03:58.363317 7fdab50a7700 -1 parse_file: cannot open /etc/ceph/irq.2018-05-30.xxx.com.conf: (2) No such file or directory
2018-05-30 15:03:58.363328 7fdab50a7700 -1 parse_file: cannot open ~/.ceph/irq.2018-05-30.xxx.com.conf: (2) No such file or directory
2018-05-30 15:03:58.363329 7fdab50a7700 -1 parse_file: cannot open irq.2018-05-30.xxx.com.conf: (2) No such file or directory
Error initializing cluster client: ObjectNotFound('error calling conf_read_file',)

Went to the directory /etc/ceph and noticed that there is no .conf for this name, but the main one is still there. So it looks like I stopped it at a bad time and messed it up while it was still being built. How do I correct this? I would like to just blow the disk away.
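For context, ceph turns the --cluster name into a conf path, so the errors above just mean there is no /etc/ceph/irq.2018-05-30.xxx.com.conf. A quick way to see which cluster names do have conf files, a small sketch assuming the default /etc/ceph layout:

# list the conf files ceph can resolve by cluster name
ls -l /etc/ceph/*.conf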

Was the running cluster health OK before you added the disk ?

Did you have running iSCSI disks before you added the disk ? Were they running OK ?

You noticed the problem when you added a new iSCSI disk or when you added a new physical disk/OSD  ?

What is the output of

ceph status --cluster CLUSTER_NAME

If the above fails with "parse_file: cannot open /etc/ceph/..", can you run it from the other nodes ?

Is the conf file missing on all nodes ?

Did you name your cluster " irq.2018-05-30.xxx.com"  ? If so was the cluster working and the problems happened just after you added the disk ?
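To check the last two questions on all three nodes at once, a rough sketch assuming SSH access from one node (the hostnames below are placeholders):

# look for the cluster conf file on every node (replace the placeholder names)
for n in node1 node2 node3; do
    echo "== $n =="
    ssh root@"$n" 'ls -l /etc/ceph/*.conf'
done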

Was the running cluster health OK before you added the disk ?  Yes, it was running with no errors.

Did you have running iSCSI disks before you added the disk ? Were they running OK ?  Yes, the original disk was operational with no errors

You noticed the problem when you added a new iSCSI disk or when you added a new physical disk/OSD ?  An iSCSI disk was added to the cluster, nothing else. It was a small 4 GB disk, just to see how it worked.

What is the output of

ceph status --cluster CLUSTER_NAME

ceph status --cluster irq.2018-05-30.tec.com
2018-05-31 11:32:21.355048 7fbbf1c5b700 -1 Errors while parsing config file!
2018-05-31 11:32:21.355053 7fbbf1c5b700 -1 parse_file: cannot open /etc/ceph/irq.2018-05-30.tec.com.conf: (2) No such file or directory
2018-05-31 11:32:21.355064 7fbbf1c5b700 -1 parse_file: cannot open ~/.ceph/irq.2018-05-30.tec.com.conf: (2) No such file or directory
2018-05-31 11:32:21.355065 7fbbf1c5b700 -1 parse_file: cannot open irq.2018-05-30.tec.com.conf: (2) No such file or directory
Error initializing cluster client: ObjectNotFound('error calling conf_read_file',)

If the above fails with "parse_file: cannot open /etc/ceph/..", can you run it from the other nodes ?  All nodes output the same failures.

Is the conf file missing on all nodes ?  Yes, none of the 3 nodes has the file.

Did you name your cluster "irq.2018-05-30.xxx.com" ? If so was the cluster working and the problems happened just after you added the disk ?  The xxx was used to hide the domain; it is actually "tec.com". The IQN is iqn.2018-05.com.tec:00001:00002

Hi there,

The missing ceph conf file, or a cluster name that does not match the config file, is the problem. I do not know how this could have happened, but it seems it occurred some time before you tried to add the test disk; I cannot think how the two could be related.

To double check if you do have another ceph conf file, can you please post the output of

# any existing conf file at all ?
ls /etc/ceph
ls -all /etc/ceph
# what PetaSAN thinks the cluster name is
cat /opt/petasan/config/cluster_info.json
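A couple of optional extras, a small sketch: on PetaSAN nodes /etc/ceph is usually a symlink into /opt/petasan/config, and jq (if it happens to be installed) can pull just the cluster name out of the json; otherwise the cat above is enough:

# show where /etc/ceph itself points
ls -ld /etc/ceph

# print only the "name" field from the PetaSAN cluster config (needs jq)
jq -r .name /opt/petasan/config/cluster_info.json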

ls /etc/ceph
XenStorage.client.admin.keyring  XenStorage.conf

 

ls -all /etc/ceph
lrwxrwxrwx 1 root root 28 May 29 10:15 /etc/ceph -> /opt/petasan/config/etc/ceph
root@PS-Node-1:/opt/petasan/config/etc/ceph# cat /opt/petasan/config/cluster_info.json
{
    "backend_1_base_ip": "10.0.4.0",
    "backend_1_eth_name": "eth0",
    "backend_1_mask": "255.255.255.0",
    "backend_2_base_ip": "10.0.5.0",
    "backend_2_eth_name": "eth1",
    "backend_2_mask": "255.255.255.0",
    "bonds": [],
    "eth_count": 2,
    "iscsi_1_eth_name": "eth0",
    "iscsi_2_eth_name": "eth1",
    "jumbo_frames": [],
    "management_eth_name": "eth0",
    "management_nodes": [
        {
            "backend_1_ip": "10.0.4.30",
            "backend_2_ip": "10.0.5.30",
            "is_iscsi": true,
            "is_management": true,
            "is_storage": true,
            "management_ip": "172.16.14.30",
            "name": "PS-Node-1"
        },
        {
            "backend_1_ip": "10.0.4.31",
            "backend_2_ip": "10.0.5.31",
            "is_iscsi": true,
            "is_management": true,
            "is_storage": true,
            "management_ip": "172.16.14.31",
            "name": "PS-Node-2"
        },
        {
            "backend_1_ip": "10.0.4.32",
            "backend_2_ip": "10.0.5.32",
            "is_iscsi": true,
            "is_management": true,
            "is_storage": true,
            "management_ip": "172.16.14.32",
            "name": "PS-Node-3"
        }
    ],
    "name": "XenStorage"
}

So the cluster name is "XenStorage" rather than "irq.2018-05-30.xxx.com", and the conf file is there.

We need to know the status of the cluster

ceph status --cluster  XenStorage

rbd ls  --cluster  XenStorage

 

XenStorage is working fine, just the other one is not working/showing up.
ceph status --cluster XenStorage

  cluster:
    id:     b473ff79-febb-48c9-ba18-50b02b6ecb86
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum PS-Node-1,PS-Node-2,PS-Node-3
    mgr: PS-Node-1(active), standbys: PS-Node-2, PS-Node-3
    osd: 3 osds: 3 up, 3 in

  data:
    pools:   1 pools, 256 pgs
    objects: 245k objects, 976 GB
    usage:   1950 GB used, 9224 GB / 11174 GB avail
    pgs:     256 active+clean

  io:
    client:   1194 B/s rd, 74720 B/s wr, 1 op/s rd, 16 op/s wr

rbd ls --cluster XenStorage
image-00001
image-00002
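Since the images do show up, one extra check that can help with a disk stuck in "stopping" is whether any client still holds a watcher on its backing image; a hedged sketch, assuming disk 00002 maps to image-00002 as the naming suggests:

# list watchers on the backing image of disk 00002
rbd status image-00002 --cluster XenStorage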

XenStorage is working fine, just the other one is not working/showing up.

What other one ? Do you have 2 clusters ?

Is disk 00001 working ? 00002 ?

 

My apologies for any terminology differences. I have 3 servers running as a "physical cluster". All 3 are linked together and running. A cluster called "XenStorage" was created and is running fine. I decided to add another storage called irq.2018-05-30.tec.com, as I had more space available and wanted to see how it would react to adding more storage. When it was created it "started", and since I didn't want it to start just yet I pushed the "stop" button. When I did, it just said "stopping" and hung there. Evidently, from the commands you sent, it does see two clusters but is missing the config for the second one, likely because I "stopped" it before it was finished. So the first disk, 00001, is working and 00002 is not. Now the question is how to get rid of the second one. I do have data on the XenStorage at this time. Sorry for any misunderstanding.

No problem. Just to clear things up: you have 1 cluster called XenStorage, this is the name of the ceph cluster, and there is always 1 config file for ceph in /etc/ceph.

iSCSI disks are different: you can have hundreds of them within your single cluster, and they do not have a config file. Disks fill up storage provided by the cluster; a cluster with no disks will not use any storage by itself.
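To see both sides of that mapping, a rough sketch (the consul prefix below matches the delete command that follows, but double check it on your own nodes):

# the iSCSI disks are backed by rbd images inside the one cluster
rbd ls --cluster XenStorage

# PetaSAN tracks disk state in consul under this prefix
consul kv get -recurse PetaSAN/Disks/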

To force stop disk 2:

consul kv delete -recurse PetaSAN/Disks/00002
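Afterwards, a hedged way to confirm the disk's consul keys are gone; and only if you are certain the test disk's data is expendable (and the UI does not now offer a cleaner delete), the backing image could be removed as well:

# should return nothing once the disk's consul keys are deleted
consul kv get -recurse PetaSAN/Disks/00002

# optional: remove the backing image, only if the 4 GB test disk is truly unwanted
rbd rm image-00002 --cluster XenStorage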
