Windows iSCSI initiator randomly freezes during VM snapshot
wailer
75 Posts
July 13, 2020, 10:38 am
Hi,
We have been narrowing down this issue for several days. We have one VM that uses a PetaSAN disk through the Windows software iSCSI initiator. The problem appears when a VMware snapshot is created, which momentarily "freezes" the VM. After this, the iSCSI initiator stops responding, and the only way to recover is to cold reboot the VM.
On the PetaSAN nodes we see these logs:
Jul 11 14:09:23 CEPH-11 kernel: [5965880.208375] Unable to recover from DataOut timeout while in ERL=0, closing iSCSI connection for I_T Nexus iqn.1991-05.com.microsoft:veeam-isp,i,0x400001370007,iqn.2016-05.com.petasan:00005,t,0x02
Jul 11 14:10:42 CEPH-11 kernel: [5965959.312575] Unable to recover from DataOut timeout while in ERL=0, closing iSCSI connection for I_T Nexus iqn.1991-05.com.microsoft:veeam-isp,i,0x400001370007,iqn.2016-05.com.petasan:00005,t,0x02
Jul 11 14:11:04 CEPH-11 kernel: [5965981.328617] Unable to recover from DataOut timeout while in ERL=0, closing iSCSI connection for I_T Nexus iqn.1991-05.com.microsoft:veeam-isp,i,0x400001370007,iqn.2016-05.com.petasan:00005,t,0x02
Jul 11 14:26:18 CEPH-11 kernel: [5966895.250820] Unable to recover from DataOut timeout while in ERL=0, closing iSCSI connection for I_T Nexus iqn.1991-05.com.microsoft:veeam-isp,i,0x400001370007,iqn.2016-05.com.petasan:00005,t,0x02
Jul 11 14:26:21 CEPH-11 kernel: [5966898.322829] Unable to recover from DataOut timeout while in ERL=0, closing iSCSI connection for I_T Nexus iqn.1991-05.com.microsoft:veeam-isp,i,0x400001370007,iqn.2016-05.com.petasan:00005,t,0x02
Jul 11 14:27:02 CEPH-11 kernel: [5966939.538927] Unable to recover from DataOut timeout while in ERL=0, closing iSCSI connection for I_T Nexus iqn.1991-05.com.microsoft:veeam-isp,i,0x400001370007,iqn.2016-05.com.petasan:00005,t,0x02
Jul 11 14:28:02 CEPH-11 kernel: [5966999.699073] Unable to recover from DataOut timeout while in ERL=0, closing iSCSI connection for I_T Nexus iqn.1991-05.com.microsoft:veeam-isp,i,0x400001370007,iqn.2016-05.com.petasan:00005,t,0x02
Jul 11 14:28:43 CEPH-11 kernel: [5967040.659173] Unable to recover from DataOut timeout while in ERL=0, closing iSCSI connection for I_T Nexus iqn.1991-05.com.microsoft:veeam-isp,i,0x400001370007,iqn.2016-05.com.petasan:00005,t,0x02
Jul 11 14:29:06 CEPH-11 kernel: [5967063.443225] Unable to recover from DataOut timeout while in ERL=0, closing iSCSI connection for I_T Nexus iqn.1991-05.com.microsoft:veeam-isp,i,0x400001370007,iqn.2016-05.com.petasan:00005,t,0x02
Jul 11 14:29:47 CEPH-11 kernel: [5967104.403326] Unable to recover from DataOut timeout while in ERL=0, closing iSCSI connection for I_T Nexus iqn.1991-05.com.microsoft:veeam-isp,i,0x400001370007,iqn.2016-05.com.petasan:00005,t,0x02
Jul 11 14:30:28 CEPH-11 kernel: [5967145.363425] Unable to recover from DataOut timeout while in ERL=0, closing iSCSI connection for I_T Nexus iqn.1991-05.com.microsoft:veeam-isp,i,0x400001370007,iqn.2016-05.com.petasan:00005,t,0x02
Jul 11 14:30:50 CEPH-11 kernel: [5967167.379476] Unable to recover from DataOut timeout while in ERL=0, closing iSCSI connection for I_T Nexus iqn.1991-05.com.microsoft:veeam-isp,i,0x400001370007,iqn.2016-05.com.petasan:00005,t,0x02
Jul 11 14:31:31 CEPH-11 kernel: [5967208.595578] Unable to recover from DataOut timeout while in ERL=0, closing iSCSI connection for I_T Nexus iqn.1991-05.com.microsoft:veeam-isp,i,0x400001370007,iqn.2016-05.com.petasan:00005,t,0x02
Jul 11 14:32:12 CEPH-11 kernel: [5967249.555679] Unable to recover from DataOut timeout while in ERL=0, closing iSCSI connection for I_T Nexus iqn.1991-05.com.microsoft:veeam-isp,i,0x400001370007,iqn.2016-05.com.petasan:00005,t,0x02
Jul 11 14:32:53 CEPH-11 kernel: [5967290.771772] Unable to recover from DataOut timeout while in ERL=0, closing iSCSI connection for I_T Nexus iqn.1991-05.com.microsoft:veeam-isp,i,0x400001370007,iqn.2016-05.com.petasan:00005,t,0x02
Jul 11 14:33:15 CEPH-11 kernel: [5967312.787823] Unable to recover from DataOut timeout while in ERL=0, closing iSCSI connection for I_T Nexus iqn.1991-05.com.microsoft:veeam-isp,i,0x400001370007,iqn.2016-05.com.petasan:00005,t,0x02
Jul 11 14:34:16 CEPH-11 kernel: [5967372.947964] Unable to recover from DataOut timeout while in ERL=0, closing iSCSI connection for I_T Nexus iqn.1991-05.com.microsoft:veeam-isp,i,0x400001370007,iqn.2016-05.com.petasan:00005,t,0x02
Jul 11 14:34:57 CEPH-11 kernel: [5967413.908064] Unable to recover from DataOut timeout while in ERL=0, closing iSCSI connection for I_T Nexus iqn.1991-05.com.microsoft:veeam-isp,i,0x400001370007,iqn.2016-05.com.petasan:00005,t,0x02
Jul 11 14:35:38 CEPH-11 kernel: [5967454.868163] Unable to recover from DataOut timeout while in ERL=0, closing iSCSI connection for I_T Nexus iqn.1991-05.com.microsoft:veeam-isp,i,0x400001370007,iqn.2016-05.com.petasan:00005,t,0x02
Jul 11 14:39:10 CEPH-11 kernel: [5967667.092669] Unable to recover from DataOut timeout while in ERL=0, closing iSCSI connection for I_T Nexus iqn.1991-05.com.microsoft:veeam-isp,i,0x400001370007,iqn.2016-05.com.petasan:00005,t,0x02
Jul 11 14:39:51 CEPH-11 kernel: [5967708.308771] Unable to recover from DataOut timeout while in ERL=0, closing iSCSI connection for I_T Nexus iqn.1991-05.com.microsoft:veeam-isp,i,0x400001370007,iqn.2016-05.com.petasan:00005,t,0x02
Jul 11 14:39:54 CEPH-11 kernel: [5967711.380774] Unable to recover from DataOut timeout while in ERL=0, closing iSCSI connection for I_T Nexus iqn.1991-05.com.microsoft:veeam-isp,i,0x400001370007,iqn.2016-05.com.petasan:00005,t,0x02
Jul 11 14:39:57 CEPH-11 kernel: [5967714.452782] Unable to recover from DataOut timeout while in ERL=0, closing iSCSI connection for I_T Nexus iqn.1991-05.com.microsoft:veeam-isp,i,0x400001370007,iqn.2016-05.com.petasan:00005,t,0x02
Jul 11 14:41:35 CEPH-11 kernel: [5967812.501018] Unable to recover from DataOut timeout while in ERL=0, closing iSCSI connection for I_T Nexus iqn.1991-05.com.microsoft:veeam-isp,i,0x400001370007,iqn.2016-05.com.petasan:00005,t,0x02
Jul 11 14:41:57 CEPH-11 kernel: [5967834.517069] Unable to recover from DataOut timeout while in ERL=0, closing iSCSI connection for I_T Nexus iqn.1991-05.com.microsoft:veeam-isp,i,0x400001370007,iqn.2016-05.com.petasan:00005,t,0x02
Jul 11 14:43:35 CEPH-11 kernel: [5967932.565302] Unable to recover from DataOut timeout while in ERL=0, closing iSCSI connection for I_T Nexus iqn.1991-05.com.microsoft:veeam-isp,i,0x400001370007,iqn.2016-05.com.petasan:00005,t,0x02
Jul 11 14:43:57 CEPH-11 kernel: [5967954.581355] Unable to recover from DataOut timeout while in ERL=0, closing iSCSI connection for I_T Nexus iqn.1991-05.com.microsoft:veeam-isp,i,0x400001370007,iqn.2016-05.com.petasan:00005,t,0x02
Jul 11 14:44:00 CEPH-11 kernel: [5967957.653362] Unable to recover from DataOut timeout while in ERL=0, closing iSCSI connection for I_T Nexus iqn.1991-05.com.microsoft:veeam-isp,i,0x400001370007,iqn.2016-05.com.petasan:00005,t,0x02
Jul 11 14:44:03 CEPH-11 kernel: [5967960.725369] Unable to recover from DataOut timeout while in ERL=0, closing iSCSI connection for I_T Nexus iqn.1991-05.com.microsoft:veeam-isp,i,0x400001370007,iqn.2016-05.com.petasan:00005,t,0x02
Jul 11 14:44:25 CEPH-11 kernel: [5967982.741426] Unable to recover from DataOut timeout while in ERL=0, closing iSCSI connection for I_T Nexus iqn.1991-05.com.microsoft:veeam-isp,i,0x400001370007,iqn.2016-05.com.petasan:00005,t,0x02
Jul 11 14:45:06 CEPH-11 kernel: [5968023.701521] Unable to recover from DataOut timeout while in ERL=0, closing iSCSI connection for I_T Nexus iqn.1991-05.com.microsoft:veeam-isp,i,0x400001370007,iqn.2016-05.com.petasan:00005,t,0x02
Jul 11 14:45:09 CEPH-11 kernel: [5968026.773529] Unable to recover from DataOut timeout while in ERL=0, closing iSCSI connection for I_T Nexus iqn.1991-05.com.microsoft:veeam-isp,i,0x400001370007,iqn.2016-05.com.petasan:00005,t,0x02
Any hints on this?
Thanks!
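To see how often the target closes the connection, and for which initiator, kernel messages like the ones above can be tallied with a small shell function. This is only a sketch: the function name is hypothetical, and the log file path in the usage comment is an assumption that may differ on your nodes.

```shell
# Hypothetical helper: reads kernel log text on stdin and prints one line per
# initiator IQN in the form "<count> <initiator-iqn>", counting only the
# "DataOut timeout" connection-close messages.
summarize_dataout_timeouts() {
    # Keep only the initiator IQN that follows "I_T Nexus " on matching lines.
    sed -n '/DataOut timeout/s/.*I_T Nexus \(iqn\.[^,]*\),i,.*/\1/p' \
      | sort \
      | uniq -c \
      | awk '{print $1, $2}'
}

# Example usage (the log path is an assumption; adjust it for your system):
#   summarize_dataout_timeouts < /var/log/kern.log
```

A count that keeps growing after the snapshot has finished would suggest that the initiator keeps reconnecting and stalling again rather than failing once.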
Last edited on July 13, 2020, 10:38 am by wailer · #1
admin
2,930 Posts
July 13, 2020, 10:49 am
For snapshots we need to flush all outstanding/in-flight writes before we can proceed; this may be what you observe as a temporary freeze. Can you check whether this is load related, meaning: does it work if you are not doing heavy writing? How long a freeze do you observe? Is this a custom snapshot or the built-in PetaSAN replication?
wailer
75 Posts
July 13, 2020, 7:40 pm
Well, the freeze is in fact permanent: once this happens the target never recovers, and we have to reboot the Windows VM to allow the initiator to reconnect.
About load: the cluster is scrubbing, but write load is really low at that time.
It's a VMware snapshot, which affects a VM that runs the Windows iSCSI client.
Thanks!
admin
2,930 Posts
July 13, 2020, 8:05 pm
Can you look at the disk % util charts and see if it was high?
You can also try lowering the iSCSI performance tuning:
cp /opt/petasan/config/tuning/templates/Generic\ Entry\ Level\ Hardware/lio_tunings /opt/petasan/config/tuning/current/
and see if this helps. You will need to restart the iSCSI nodes (or move paths away and then back) so that new paths use the new tunings.
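Because the copy above overwrites the active lio_tunings file, it may be worth saving the current one first so the change can be rolled back. The following is a minimal sketch, not a PetaSAN tool: the function name and the backup directory are hypothetical, and only the template and current paths come from the command above.

```shell
# Hypothetical helper: save the active lio_tunings (if present) into a backup
# directory, then copy the template's lio_tunings into the current directory.
# Usage: apply_lio_template <template_dir> <current_dir> <backup_dir>
apply_lio_template() {
    template_dir=$1
    current_dir=$2
    backup_dir=$3

    # Refuse to continue if the template file is missing.
    [ -f "$template_dir/lio_tunings" ] || return 1

    # Back up the existing file, preserving its mode and timestamps.
    if [ -f "$current_dir/lio_tunings" ]; then
        mkdir -p "$backup_dir" || return 1
        cp -p "$current_dir/lio_tunings" "$backup_dir/lio_tunings" || return 1
    fi

    cp "$template_dir/lio_tunings" "$current_dir/"
}

# Example usage (the backup directory is an assumption; pick any safe location):
#   apply_lio_template \
#     "/opt/petasan/config/tuning/templates/Generic Entry Level Hardware" \
#     /opt/petasan/config/tuning/current \
#     /root/lio_tunings_backup
```

To roll back, copy the saved file back into the current directory and restart the iSCSI nodes or move the paths again, as described above.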
wailer
75 Posts
July 14, 2020, 11:17 pm
Hi,
I checked the load at the moment the issue happened, and load on all nodes was evenly low.
Would you recommend trying the entry-level tuning template anyway?
Thanks!
Jul 11 14:45:09 CEPH-11 kernel: [5968026.773529] Unable to recover from DataOut timeout while in ERL=0, closing iSCSI connection for I_T Nexus iqn.1991-05.com.microsoft:veeam-isp,i,0x400001370007,iqn.2016-05.com.petasan:00005,t,0x02
Any hints on this?
Thanks!
admin
2,930 Posts
Quote from admin on July 13, 2020, 10:49 am
For snapshots, we need to flush all outstanding/in-flight writes before we can proceed; this may be what you observe as a temporary freeze. Can you check whether this is load related, meaning does it work when you are not doing heavy writing? How long a freeze do you observe? Is this a custom snapshot or the built-in PetaSAN replication?
wailer
75 Posts
Quote from wailer on July 13, 2020, 7:40 pm
Well, the freeze is in fact permanent. Once this happens the target never recovers, and we have to reboot the Windows VM to let the initiator reconnect.
About load: the cluster is scrubbing, but write load is really low at that time.
It's a VMware snapshot, which affects a VM that runs the Windows iSCSI client.
Thanks!
admin
2,930 Posts
Quote from admin on July 13, 2020, 8:05 pm
Can you look at the disk % util charts and see if it was high?
You can also try lowering the iSCSI performance tuning:
cp /opt/petasan/config/tuning/templates/Generic\ Entry\ Level\ Hardware/lio_tunings /opt/petasan/config/tuning/current/
and see if this helps. You will need to restart the iSCSI nodes (or move paths away and then back) so that new paths use the new tunings.
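Since the cp above overwrites the tuning file currently in use, it may be worth keeping a copy first so the change can be reverted. The following is only a sketch: apply_lio_template is a hypothetical helper (not part of PetaSAN), the backup directory is an arbitrary choice, and it assumes the active file in the current directory is named lio_tunings, as the cp destination implies.

```shell
#!/bin/sh
# Sketch only: back up the active LIO tuning file, then copy a template over it.
# apply_lio_template is a hypothetical helper, not a PetaSAN command.
apply_lio_template() {
    template_dir=$1   # directory that holds the template lio_tunings
    current_dir=$2    # directory that holds the active lio_tunings
    backup_dir=$3     # where to keep the previous file (arbitrary location)
    stamp=$(date +%Y%m%d-%H%M%S)

    # Stop early if the template does not provide lio_tunings.
    if [ ! -f "$template_dir/lio_tunings" ]; then
        echo "no lio_tunings in $template_dir" >&2
        return 1
    fi

    # Keep a timestamped copy of the file currently in use, if there is one.
    if [ -f "$current_dir/lio_tunings" ]; then
        mkdir -p "$backup_dir" || return 1
        cp -p "$current_dir/lio_tunings" "$backup_dir/lio_tunings.bak-$stamp" || return 1
    fi

    # Same copy as the command in the post above.
    cp "$template_dir/lio_tunings" "$current_dir/"
}
```

For example: apply_lio_template "/opt/petasan/config/tuning/templates/Generic Entry Level Hardware" /opt/petasan/config/tuning/current /root/lio_tunings_backup (the last path is just an example). To revert, copy the backup back to lio_tunings in the current directory. Either way, the iSCSI nodes still need a restart, or the paths need to be moved away and back, as described above.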
wailer
75 Posts
Quote from wailer on July 14, 2020, 11:17 pm
Hi,
I checked the load at the moment the issue happened, and the load on all nodes is evenly low.
Would you recommend trying the entry-level tuning template anyway?
Thanks!