
Can't stop iSCSI disks


I'm on another fresh install of 2.0.0 on my 4-node cluster. This time I have 4 OSDs and 1 journal per node. Each node has 2 bonded 1Gb NICs for management, 2 bonded 10Gb for iSCSI1 and Backend1, and 2 bonded 10Gb for iSCSI2 and Backend2.

Added a 20GB iSCSI disk first, with an IQN control list and 2 automatic paths.

Then added a 50TB iSCSI disk, also with an IQN list and 8 automatic paths.

Both start just fine, but if I attempt to stop the 50TB disk, it never stops. It also seems to repeatedly stop and start the 20GB disk in the process. Attempting to stop both results in both sitting in the "stopping" stage indefinitely, until I reboot all nodes and manually start both disks again. Even then, I still cannot stop the 50TB disk.

Let me know if this is reproducible on your side, or what I can provide for logs. The only thing not ideal about the setup so far that I can think of is my choice of the 50-200 disk profile during initial setup, but I plan to add many more disks very soon to match that.

I had let this sit for about 2 hours before I posted this, and it just finally stopped both disks.

However, when I started them both back up, the 50TB disk that's supposed to have 8 paths only has 2 actually assigned. I will wait a bit longer to see if this changes, and then I'll try stopping and starting the disk again.

While waiting for the 50TB disk to gain all its paths, I was benchmarking the 20GB disk with its 2 paths. This seems to have caused one of the nodes to go down. Now, upon bringing it back up, 25% of my PGs are stuck unclean, with state unknown. The cluster has been inactive in this state for about 25 minutes now and has not recovered.

 

Any recommendation on how to proceed with PG recovery? I can't seem to find much in the Ceph docs about PGs in the "unknown" state. Is it possible that choosing 50-200 disks as the initial cluster profile caused this? If so, what manual commands would be used to adjust it after the fact?

And what about the paths not being assigned? Do all configured iSCSI disks have to have the same number of paths?

Thanks!

Yes, the 50-200 selection could very well be the issue, as per my post at

http://www.petasan.org/forums/?view=thread&id=268&part=1#postid-1418

This will put a lot of stress on the system; my guess is it is the same reason you were having issues stopping disks. As stated there, it is always better for the initial disk count to be not less than 1/10 of the target count. It is possible to increase the PG count in production (with the rebalancing drawback) but not to decrease it.
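For reference, increasing the PG count later is done with the standard Ceph pool settings, along these lines (2048 is only an illustrative target; pgp_num must be raised to match, and the change will trigger rebalancing):

ceph osd pool set rbd pg_num 2048 --cluster CLUSTER_NAME
ceph osd pool set rbd pgp_num 2048 --cluster CLUSTER_NAME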

Since I understand this is test data, you can either re-install or run the following commands, which will delete all data and recreate the pool:

ceph osd pool delete rbd rbd --yes-i-really-really-mean-it --cluster CLUSTER_NAME

ceph osd pool create rbd 1024 1024 --cluster CLUSTER_NAME
ceph osd pool set rbd size 3 --cluster CLUSTER_NAME
ceph osd pool set rbd min_size 2 --cluster CLUSTER_NAME
ceph osd pool application enable rbd rbd --cluster CLUSTER_NAME
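As a quick sanity check afterwards (same pool and cluster names assumed), you can read the settings back:

ceph osd pool get rbd pg_num --cluster CLUSTER_NAME
ceph osd pool get rbd size --cluster CLUSTER_NAME
ceph osd pool get rbd min_size --cluster CLUSTER_NAME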

Please let me know if this also solves the stopping issues.

I ended up blowing away the install again and starting from scratch with the 15-50 disk profile, because of what you said and other things I found in the Ceph forums about being able to increase PGs later. That felt like a better option than too many PGs per OSD, since it seemed like 25% of my PGs wouldn't recover earlier after some testing.

Anyway, the cluster came up fine. I created a 20GB and a 50TB disk again, same as before, with 2 paths and 8 paths respectively.

The path issue remains. The cluster will not assign all 8 paths configured for the 50TB disk; it only assigns 2.

Testing the stopping issue now. It has so far been in the stopping state for ~10 minutes.

Will update again soon.

The disk stopped. It took about 20 minutes total, but it worked.

The path issue is still occurring, and it is somewhat of a show stopper. Going to try a few things and check back.

Let me know if you have any info or need any logs.

Hi

This is strange. Unfortunately we are off for the Easter holidays till Tuesday. You can email me the log /opt/petasan/log/PetaSAN.log from an iSCSI node to contact-us @ petasan.org and I will try to have a quick look in case it is something obvious; otherwise it will have to wait a couple of days till we are back.

If you can also try to identify any factors that lead to this, it will help. Generally the start and stop of a disk should not be affected by disk size; does it work with small disks? The only thing I can think of now that would lead to start/stop issues is network trouble on Backend 1, which is where the Consul servers communicate. Can you double check that all nodes ping well on this subnet, with no IP conflicts, etc.?
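As a rough sketch of what I mean, from each node (assuming, purely for illustration, that 10.0.2.0/24 is your Backend 1 subnet and bond1 is the interface carrying it; adjust to your actual addressing):

ping -c 3 10.0.2.11                  # repeat for each node's Backend 1 address
arping -D -c 3 -I bond1 10.0.2.11    # duplicate address detection for the node's own Backend 1 IP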

I think this path issue may be a larger bug. I'm not sure what to look for, or where, in order to substantiate that, but I just had one node go down for the second time after messing with the stop/start and path assignment issues.

node1 goes offline, I get no ping back, and I cannot get any console screen from IPMI. I have to power it off and back on. This was a rock solid node before testing PetaSAN, so I'm doubtful it's hardware related.

When the node comes back online, the PGs never balance back out and remain unclean. I also lose more storage according to the dashboard than I should for only 4 OSDs having gone down. Storage started at 48TB; after the node was back up, the dashboard reported 10.99TB available cluster-wide (there are four 3TB disks per node). All OSDs report being back up, but the PGs remain unclean. Here's the output of ceph status:

root@bd-ceph-sd1:~# ceph --cluster BD-Ceph-Cl1
ceph> status
  cluster:
    id:     7d8d0520-17bc-4aba-9c43-9588d3b3d5b0
    health: HEALTH_WARN
            Reduced data availability: 795 pgs inactive
            Degraded data redundancy: 795 pgs unclean

  services:
    mon: 3 daemons, quorum bd-ceph-sd1,bd-ceph-sd2,bd-ceph-sd3
    mgr: bd-ceph-sd1(active)
    osd: 16 osds: 16 up, 16 in

  data:
    pools:   1 pools, 1024 pgs
    objects: 0 objects, 0 bytes
    usage:   86063 MB used, 11173 GB / 11257 GB avail
    pgs:     77.637% pgs unknown
             795 unknown
             229 active+clean

 

As I typed this and collected info, I came up with the theory that having 6 disks per node with only 4 assigned as OSDs may be contributing to my troubles here. I'm going to have someone pop the two extra disks out of each node, and I'm going to start over. This is to make sure that when the system boots, the two extra disks aren't coming up in place of the actual OSDs.
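For reference, a generic way to see which disks actually carry OSD data on a node (nothing PetaSAN-specific here; the OSD data partitions are normally mounted under /var/lib/ceph/osd/<cluster>-<id>):

lsblk -o NAME,SIZE,TYPE,MOUNTPOINT
df -h | grep /var/lib/ceph/osd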

Will report back on what I can reproduce after this.

 

I have actually tested the networking, but once I re-install the cluster I will verify it again.

No problem, I will keep this thread updated and forward anything of substance once I have a full picture of what's going on here.

 

Enjoy your holiday!

The drop in available storage from 48TB to 10.x TB is strange; it is as if 3 nodes are not working. One thing: it takes more than 1 node to report other OSDs as down (to prevent OSDs from reporting each other down and causing flapping), so it may be that the 16 OSDs are not all really up.
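You could cross-check what Ceph reports per OSD with the standard status commands (cluster name taken from your output above):

ceph osd tree --cluster BD-Ceph-Cl1
ceph osd stat --cluster BD-Ceph-Cl1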

If node1 went offline by itself, it is probably due to our fencing action: if a node does not report to the Consul cluster via heartbeats, we kill it. The start/stop problem is also likely Consul connection related. If Ceph is also acting strangely, I would also be suspicious of hardware.
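To see whether the Consul cluster itself looks healthy, you could also check membership from any node (assuming the consul binary is on the PATH; otherwise use its full path):

consul members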
