iSCSI stopped working on all cluster

Hi,

We have a major issue with one cluster. VMware is unable to reach the iSCSI targets on PetaSAN. We can ping them from the VMware hosts, but the iSCSI adapter does not recognize any paths.

This happened a few hours after replacing all management nodes.

Any idea what could be happening?

 

I see this in the logs:

 

Jan 12 06:12:52 CEPH-22 iscsi_service.py[2858]: ARPING 192.168.75.222 from 19
Jan 12 06:12:52 CEPH-22 iscsi_service.py[2858]: Sent 5 probes (5 broadcast(s)
Jan 12 06:12:52 CEPH-22 iscsi_service.py[2858]: Received 0 response(s)
Jan 12 06:12:58 CEPH-22 iscsi_service.py[2858]: ARPING 192.168.75.221 from 19
Jan 12 06:12:58 CEPH-22 iscsi_service.py[2858]: Sent 5 probes (5 broadcast(s)
Jan 12 06:12:58 CEPH-22 iscsi_service.py[2858]: Received 0 response(s)
Jan 12 06:13:01 CEPH-22 iscsi_service.py[2858]: ARPING 192.168.75.222 from 19
Jan 12 06:13:01 CEPH-22 iscsi_service.py[2858]: Sent 5 probes (5 broadcast(s)
Jan 12 06:13:01 CEPH-22 iscsi_service.py[2858]: Received 0 response(s)
Jan 12 06:13:07 CEPH-22 iscsi_service.py[2858]: ARPING 192.168.75.221 from 19
Jan 12 06:13:07 CEPH-22 iscsi_service.py[2858]: Sent 5 probes (5 broadcast(s)
Jan 12 06:13:07 CEPH-22 iscsi_service.py[2858]: Received 0 response(s)
Jan 12 06:13:10 CEPH-22 iscsi_service.py[2858]: ARPING 192.168.75.222 from 192.168.75.222 eth6.75
Jan 12 06:13:10 CEPH-22 iscsi_service.py[2858]: Sent 5 probes (5 broadcast(s))
Jan 12 06:13:10 CEPH-22 iscsi_service.py[2858]: Received 0 response(s)
Jan 12 06:13:16 CEPH-22 iscsi_service.py[2858]: ARPING 192.168.75.221 from 192.168.75.221 eth6.75
Jan 12 06:13:16 CEPH-22 iscsi_service.py[2858]: Sent 5 probes (5 broadcast(s))
Jan 12 06:13:16 CEPH-22 iscsi_service.py[2858]: Received 0 response(s)
Jan 12 06:13:19 CEPH-22 iscsi_service.py[2858]: ARPING 192.168.75.222 from 192.168.75.222 eth6.75
Jan 12 06:13:19 CEPH-22 iscsi_service.py[2858]: Sent 5 probes (5 broadcast(s))
Jan 12 06:13:19 CEPH-22 iscsi_service.py[2858]: Received 0 response(s)
Jan 12 06:13:25 CEPH-22 iscsi_service.py[2858]: ARPING 192.168.75.221 from 192.168.75.221 eth6.75
Jan 12 06:13:25 CEPH-22 iscsi_service.py[2858]: Sent 5 probes (5 broadcast(s))
Jan 12 06:13:25 CEPH-22 iscsi_service.py[2858]: Received 0 response(s)

Also, without us having changed any authentication details:

Jan 12 06:39:37 CEPH-21 kernel: [ 3598.928143] iSCSI Login negotiation failed.
Jan 12 06:39:38 CEPH-21 kernel: [ 3599.309772] iSCSI Login negotiation failed.
Jan 12 07:17:20 CEPH-21 kernel: [ 5861.972669] iSCSI Login negotiation failed.
Jan 12 07:17:21 CEPH-21 kernel: [ 5862.741594] iSCSI Login negotiation failed.
Jan 12 07:17:21 CEPH-21 kernel: [ 5862.818374] iSCSI Login negotiation failed.
Jan 12 07:17:24 CEPH-21 kernel: [ 5865.969506] iSCSI Login negotiation failed.
Jan 12 07:17:26 CEPH-21 kernel: [ 5867.468908] iSCSI Login negotiation failed.
Jan 12 07:17:28 CEPH-21 kernel: [ 5869.072307] iSCSI Login negotiation failed.
Jan 12 07:17:28 CEPH-21 kernel: [ 5869.586667] iSCSI Login negotiation failed.
Jan 12 07:17:28 CEPH-21 kernel: [ 5869.630393] iSCSI Login negotiation failed.
Jan 12 07:17:31 CEPH-21 kernel: [ 5872.151799] iSCSI Login negotiation failed.
Jan 12 07:17:31 CEPH-21 kernel: [ 5873.053264] iSCSI Login negotiation failed.
Jan 12 07:17:34 CEPH-21 kernel: [ 5875.406205] iSCSI Login negotiation failed.
Jan 12 07:17:35 CEPH-21 kernel: [ 5876.165934] iSCSI Login negotiation failed.
Jan 12 07:17:37 CEPH-21 kernel: [ 5878.436576] iSCSI Login negotiation failed.
Jan 12 07:17:38 CEPH-21 kernel: [ 5879.195870] iSCSI Login negotiation failed.
Jan 12 07:17:40 CEPH-21 kernel: [ 5881.464997] iSCSI Login negotiation failed.
Jan 12 07:17:41 CEPH-21 kernel: [ 5882.225988] iSCSI Login negotiation failed.
Jan 12 07:17:43 CEPH-21 kernel: [ 5884.497367] iSCSI Login negotiation failed.
Jan 12 07:17:44 CEPH-21 kernel: [ 5885.256594] iSCSI Login negotiation failed.
Jan 12 07:17:46 CEPH-21 kernel: [ 5887.533059] iSCSI Login negotiation failed.
Jan 12 07:17:47 CEPH-21 kernel: [ 5888.295518] iSCSI Login negotiation failed.
Jan 12 07:17:49 CEPH-21 kernel: [ 5890.572062] iSCSI Login negotiation failed.
Jan 12 07:17:50 CEPH-21 kernel: [ 5891.332238] iSCSI Login negotiation failed.
Jan 12 07:17:52 CEPH-21 kernel: [ 5893.606044] iSCSI Login negotiation failed.
Jan 12 07:17:53 CEPH-21 kernel: [ 5894.365974] iSCSI Login negotiation failed.
Jan 12 07:17:55 CEPH-21 kernel: [ 5896.642836] iSCSI Login negotiation failed.
Jan 12 07:17:56 CEPH-21 kernel: [ 5897.401589] iSCSI Login negotiation failed.
Jan 12 07:17:58 CEPH-21 kernel: [ 5899.672852] iSCSI Login negotiation failed.
Jan 12 07:17:59 CEPH-21 kernel: [ 5900.435180] iSCSI Login negotiation failed.
Jan 12 07:18:01 CEPH-21 kernel: [ 5902.704956] iSCSI Login negotiation failed.
Jan 12 07:18:02 CEPH-21 kernel: [ 5903.463004] iSCSI Login negotiation failed.
Jan 12 07:18:04 CEPH-21 kernel: [ 5905.739536] iSCSI Login negotiation failed.
Jan 12 07:18:05 CEPH-21 kernel: [ 5906.498612] iSCSI Login negotiation failed.
Jan 12 07:18:07 CEPH-21 kernel: [ 5908.778225] iSCSI Login negotiation failed.
Jan 12 07:18:08 CEPH-21 kernel: [ 5909.539265] iSCSI Login negotiation failed.
Jan 12 07:18:10 CEPH-21 kernel: [ 5911.812193] iSCSI Login negotiation failed.
Jan 12 07:18:11 CEPH-21 kernel: [ 5912.572627] iSCSI Login negotiation failed.
Jan 12 07:18:13 CEPH-21 kernel: [ 5914.851804] iSCSI Login negotiation failed.
Jan 12 07:18:14 CEPH-21 kernel: [ 5915.610926] iSCSI Login negotiation failed.
Jan 12 07:18:16 CEPH-21 kernel: [ 5917.890604] iSCSI Login negotiation failed.
Jan 12 07:18:17 CEPH-21 kernel: [ 5918.653059] iSCSI Login negotiation failed.
Jan 12 07:18:19 CEPH-21 kernel: [ 5920.928162] iSCSI Login negotiation failed.
Jan 12 07:18:19 CEPH-21 kernel: [ 5920.958762] iSCSI Login negotiation failed.
Jan 12 07:18:19 CEPH-21 kernel: [ 5920.962447] iSCSI Login negotiation failed.
Jan 12 07:18:20 CEPH-21 kernel: [ 5921.688569] iSCSI Login negotiation failed.
Jan 12 07:18:22 CEPH-21 kernel: [ 5923.959686] iSCSI Login negotiation failed.

There is also this about GlusterFS:

 

12/01/2020 07:35:51 INFO GlusterFS mount attempt
12/01/2020 07:35:21 INFO GlusterFS mount attempt
12/01/2020 07:34:51 INFO GlusterFS mount attempt
12/01/2020 07:34:21 INFO GlusterFS mount attempt
12/01/2020 07:33:51 INFO GlusterFS mount attempt
12/01/2020 07:33:21 INFO GlusterFS mount attempt

Looks like the iSCSI configuration is gone from the files?

We have tried restarting the petasan-iscsi service, but the problem persists.

Any help appreciated!

Hi Wailer,

Can you provide some additional details about your setup? Which version of PetaSAN, and how many nodes/OSDs/journals?

I am using 2.4.0, with 3 nodes and 15 OSDs per node.

It looks like detaching and reattaching the disk solved the issue. Any idea how this could happen? It's kinda scary...

Thanks!

Glad it is working now.

Did you "replace" all nodes, or did you mean you "upgraded" them via the installer/online?

If you did a "replace", can you give more detail on how that was done: all at once or one by one, and what was the cluster state while this was happening? Also, why did you need to replace them all?

Well, in fact, we replaced the management nodes and OSDs one by one because we were changing the system disk to a different size.

We set the noout flag for the whole process, and the cluster was healthy at each change.

Thanks!

 

There is a step missing in the instructions: when you replace a monitor node that also has storage, you must set the weight of each OSD on that node to zero and let the cluster move all the data off it before you set the noout flag and replace the node. This is important; otherwise it can take a long time for the node to actually become healthy.
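For anyone scripting this, here is a rough sketch of that order of operations in Python. It is not an official PetaSAN procedure. It assumes "weight" means the CRUSH weight (set with `ceph osd crush reweight`), that the `ceph` CLI is available on the node running it, and that `ceph status --format json` reports PG states under `pgmap.pgs_by_state`. The function names and the OSD ids are made up for illustration.

```python
import json
import subprocess
import time


def drain_commands(osd_ids):
    """Build the commands that set each OSD's CRUSH weight to zero."""
    return [["ceph", "osd", "crush", "reweight", f"osd.{i}", "0"] for i in osd_ids]


def all_pgs_active_clean(pg_states):
    """True when every placement group is reported as active+clean.

    pg_states is assumed to be the pgmap.pgs_by_state list from
    `ceph status --format json`.
    """
    return bool(pg_states) and all(s["state_name"] == "active+clean" for s in pg_states)


def prepare_node_for_replacement(osd_ids, run=subprocess.run, poll_seconds=30):
    """Drain the node's OSDs, wait for data to move, then set noout."""
    # Step 1: weight every OSD on the node to zero so data migrates away.
    for cmd in drain_commands(osd_ids):
        run(cmd, check=True)

    # Step 2: wait until the cluster has finished moving the data.
    while True:
        result = run(["ceph", "status", "--format", "json"],
                     check=True, capture_output=True, text=True)
        states = json.loads(result.stdout)["pgmap"]["pgs_by_state"]
        if all_pgs_active_clean(states):
            break
        time.sleep(poll_seconds)

    # Step 3: only now set noout, just before taking the node down.
    run(["ceph", "osd", "set", "noout"], check=True)


# Example (hypothetical OSD ids for the node being replaced):
# prepare_node_for_replacement([30, 31, 32])
```

Once the replaced node is back and healthy, `ceph osd unset noout` clears the flag again. The `run` parameter is only there so the sequence can be dry-run with a stub before it touches a real cluster.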