iSCSI multi-client access to disk
protocol6v
85 Posts
May 17, 2018, 3:12 pm
Hi there,
I'm trying to configure a Hyper-V cluster, but it seems only one host has access to the iSCSI disk at any given time. Is there a setting I need to change on PetaSAN somewhere to allow multiple hosts to access each iSCSI disk/LUN at once? I know on our old EQL SANs, this was a checkbox somewhere.
Thanks!
protocol6v
85 Posts
May 17, 2018, 3:54 pm
This appears to be an issue with persistent reservations on the iSCSI LUN. Here's the error Microsoft presents:
Failure issuing call to Persistent Reservation REGISTER AND IGNORE EXISTING on Test Disk 1 from node BD-E7k-HV-CN1.testing.net when the disk has no existing registration. It is expected to succeed. The device is not ready.
Test Disk 1 does not provide Persistent Reservations support for the mechanisms used by failover clusters. Some storage devices require specific firmware versions or settings to function properly with failover clusters. Please contact your storage administrator or storage vendor to check the configuration of the storage to allow it to function properly with failover clusters
Not seeing anything in PetaSAN logs. Here's some interesting output from dmesg:
[14352.489656] PR register with aptpl unset. Treating as aptpl=1
[14352.492348] PR register with aptpl unset. Treating as aptpl=1
[14352.495207] PR register with aptpl unset. Treating as aptpl=1
[14498.661779] PR register with aptpl unset. Treating as aptpl=1
[14498.661783] PR info not present, initializing
[14498.668197] PR register with aptpl unset. Treating as aptpl=1
[14498.686007] PR register with aptpl unset. Treating as aptpl=1
[14499.411722] PR register with aptpl unset. Treating as aptpl=1
[14499.431857] PR register with aptpl unset. Treating as aptpl=1
[14499.435080] PR register with aptpl unset. Treating as aptpl=1
[14499.446180] SPC-3 PR: Attempted RESERVE from iqn.2018-03.net.testing.internal:bd-e7k-hv-cn1,i,0x3430303030313337,iqn.2018-05.net.testing.internal:00001,t,0x1 while reservation already held by iqn.2018-03.net.testing.internal:bd-e7k-hv-cn2,i,0x3430303030313337,iqn.2018-05.net.testing.internal:00001,t,0x2, returning RESERVATION_CONFLICT
[14499.446456] SPC-3 PR: Attempted RESERVE from iqn.2018-03.net.testing.internal:bd-e7k-hv-cn1,i,0x3430303030313337,iqn.2018-05.net.testing.internal:00001,t,0x1 while reservation already held by iqn.2018-03.net.testing.internal:bd-e7k-hv-cn2,i,0x3430303030313337,iqn.2018-05.net.testing.internal:00001,t,0x2, returning RESERVATION_CONFLICT
[14499.474500] PR register with aptpl unset. Treating as aptpl=1
[14499.482617] SPC-3 PR: Attempted RESERVE from iqn.2018-03.net.testing.internal:bd-e7k-hv-cn2,i,0x3430303030313337,iqn.2018-05.net.testing.internal:00001,t,0x1 while reservation already held by iqn.2018-03.net.testing.internal:bd-e7k-hv-cn1,i,0x3430303030313337,iqn.2018-05.net.testing.internal:00001,t,0x1, returning RESERVATION_CONFLICT
[14499.490455] PR register with aptpl unset. Treating as aptpl=1
Let me know what other info might help here.
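To make the conflict pattern in the dmesg output above easier to read, here is a minimal Python sketch that pulls out which initiator attempted the RESERVE and which one already held the reservation. The regular expression is written only against the message format quoted in this post; the function names `summarize_conflicts` and `short_name` are made up for illustration and are not part of PetaSAN or any kernel tooling.

```python
import re
from collections import Counter

# Matches the "SPC-3 PR: Attempted RESERVE" lines as they appear in the
# dmesg excerpt in this thread. Other kernels may word the message differently.
CONFLICT_RE = re.compile(
    r"SPC-3 PR: Attempted RESERVE from (?P<attempt>\S+?),i,\S+ "
    r"while reservation already held by (?P<holder>\S+?),i,\S+ "
    r"returning RESERVATION_CONFLICT"
)


def short_name(iqn):
    """Keep only the part after the last colon, e.g. 'bd-e7k-hv-cn1'."""
    return iqn.rsplit(":", 1)[-1]


def summarize_conflicts(dmesg_text):
    """Count (attempting initiator, holding initiator) pairs."""
    pairs = Counter()
    for line in dmesg_text.splitlines():
        match = CONFLICT_RE.search(line)
        if match:
            key = (short_name(match["attempt"]), short_name(match["holder"]))
            pairs[key] += 1
    return pairs


# Lines copied from the dmesg output above.
SAMPLE = """\
[14499.435080] PR register with aptpl unset. Treating as aptpl=1
[14499.446180] SPC-3 PR: Attempted RESERVE from iqn.2018-03.net.testing.internal:bd-e7k-hv-cn1,i,0x3430303030313337,iqn.2018-05.net.testing.internal:00001,t,0x1 while reservation already held by iqn.2018-03.net.testing.internal:bd-e7k-hv-cn2,i,0x3430303030313337,iqn.2018-05.net.testing.internal:00001,t,0x2, returning RESERVATION_CONFLICT
[14499.446456] SPC-3 PR: Attempted RESERVE from iqn.2018-03.net.testing.internal:bd-e7k-hv-cn1,i,0x3430303030313337,iqn.2018-05.net.testing.internal:00001,t,0x1 while reservation already held by iqn.2018-03.net.testing.internal:bd-e7k-hv-cn2,i,0x3430303030313337,iqn.2018-05.net.testing.internal:00001,t,0x2, returning RESERVATION_CONFLICT
[14499.482617] SPC-3 PR: Attempted RESERVE from iqn.2018-03.net.testing.internal:bd-e7k-hv-cn2,i,0x3430303030313337,iqn.2018-05.net.testing.internal:00001,t,0x1 while reservation already held by iqn.2018-03.net.testing.internal:bd-e7k-hv-cn1,i,0x3430303030313337,iqn.2018-05.net.testing.internal:00001,t,0x1, returning RESERVATION_CONFLICT
"""

if __name__ == "__main__":
    for (attempt, holder), count in summarize_conflicts(SAMPLE).items():
        print(f"{attempt} blocked by {holder}: {count} time(s)")
```

Run against the full dmesg output, a summary like this shows at a glance whether the holder stays the same or alternates between the two Hyper-V nodes.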
admin
2,930 Posts
May 17, 2018, 7:28 pm
We do support Persistent Reservations and pass both the Win 2012 R2 and Win 2016 storage and PR tests.
From the error, it appears PetaSAN thinks there is an existing reservation on the disk but the Windows cluster does not. It could be due to the Windows cluster being completely shut down and restarted, though in our tests, if we shut down the entire Windows cluster, it will "pre-empt" any existing reservation and start over without issue.
It is possible to manually clear the reservation yourself as per https://docs.microsoft.com/en-us/powershell/module/failoverclusters/clear-clusterdiskreservation?view=win10-ps
I would be interested to know which steps you took to get this error.
Last edited on May 17, 2018, 7:38 pm by admin · #3
protocol6v
85 Posts
May 18, 2018, 10:54 am
I attempted to clear the reservation using the clear-clusterdiskreservation command, but this did not help.
I have not created the cluster yet. This is just when trying to "Validate Configuration" prior to actually creating the cluster.
It seems like PetaSAN is not clearing the reservation quickly enough for the test to succeed. If you look at the errors, the third one states the reservation is held by a different node than the first two errors, and this changes when I rerun the validation test.
I went ahead and tried creating the cluster anyway. This did not go well. When trying to fail over the PetaSAN iSCSI disk to another cluster node, the disk role will not come online and is not accessible. Also, if you browse to the disk via Explorer, none of the data written to the disk from one node is visible from the other. I tested this, and it works just fine with other iSCSI disks on the same Hyper-V cluster (EqualLogic, FreeNAS).
Tried destroying the cluster and starting over with the same results.
admin
2,930 Posts
May 18, 2018, 12:57 pm
- Are you using Win2012 R2 or Win2016? How many nodes?
- If you create all new iSCSI disks and run the validation tests on these new disks, do they fail? If so, can you show a screenshot of the Windows report detail? Also show the dmesg logs on the PetaSAN iSCSI target nodes.
- Can you install this kernel on the nodes and see if it solves this:
https://drive.google.com/open?id=1LxizxXz4WKsIkcXUigQ-kv9aUVYo2oVZ
dpkg -i linux-image-4.4.38-petasan_amd64.deb
reboot
Last edited on May 18, 2018, 1:01 pm by admin · #5
protocol6v
85 Posts
May 21, 2018, 2:15 pm
Thanks for the response, sorry for the delay.
This is with Server 2016, and I am now just using two nodes to test with. Both fresh installs of Windows, with all cumulative updates installed.
Tried deleting all PetaSAN iSCSI disks first, and creating two new. Same issue, here's the screenshot of the error:
And here's some output from dmesg on one of the nodes:
[ 478.200940] PR register with aptpl unset. Treating as aptpl=1
[ 478.200980] SPC-3 PR: Attempted RESERVE from iqn.2018-03.net.testing.internal:bd-e7k-hv-cn2,i,0x3430303030313337,iqn.2018-05.net.testing.internal:00001,t,0x2 while reservation already held by iqn.2018-03.net.testing.internal:bd-e7k-hv-cn1,i,0x3430303030313337,iqn.2018-05.net.testing.internal:00001,t,0x2, returning RESERVATION_CONFLICT
[ 478.205458] PR register with aptpl unset. Treating as aptpl=1
[ 478.210698] PR register with aptpl unset. Treating as aptpl=1
[ 478.220384] PR register with aptpl unset. Treating as aptpl=1
[ 478.234342] PR register with aptpl unset. Treating as aptpl=1
[ 478.234493] PR info too large for encoding: 8673
[ 478.234494] failed to encode PR xattr: -22
[ 478.234494] atomic PR info update failed: -22
[ 478.786503] PR register with aptpl unset. Treating as aptpl=1
[ 478.803247] PR register with aptpl unset. Treating as aptpl=1
[ 478.817486] PR register with aptpl unset. Treating as aptpl=1
[ 478.827918] PR register with aptpl unset. Treating as aptpl=1
[ 478.828083] PR info too large for encoding: 8673
[ 478.828084] failed to encode PR xattr: -22
[ 478.828085] atomic PR info update failed: -22
[ 479.317857] PR register with aptpl unset. Treating as aptpl=1
[ 479.327779] PR register with aptpl unset. Treating as aptpl=1
[ 479.338941] PR register with aptpl unset. Treating as aptpl=1
[ 479.349230] PR register with aptpl unset. Treating as aptpl=1
[ 479.349478] PR info too large for encoding: 8673
[ 479.349480] failed to encode PR xattr: -22
[ 479.349480] atomic PR info update failed: -22
[ 479.849099] PR register with aptpl unset. Treating as aptpl=1
[ 479.859554] PR register with aptpl unset. Treating as aptpl=1
[ 479.870360] PR register with aptpl unset. Treating as aptpl=1
[ 479.880972] PR register with aptpl unset. Treating as aptpl=1
[ 479.881165] PR info too large for encoding: 8673
[ 479.881166] failed to encode PR xattr: -22
[ 479.881166] atomic PR info update failed: -22
[ 480.380570] PR register with aptpl unset. Treating as aptpl=1
[ 480.390800] PR register with aptpl unset. Treating as aptpl=1
[ 480.400858] PR register with aptpl unset. Treating as aptpl=1
[ 480.410973] PR register with aptpl unset. Treating as aptpl=1
[ 480.411120] PR info too large for encoding: 8673
[ 480.411121] failed to encode PR xattr: -22
[ 480.411122] atomic PR info update failed: -22
[ 480.431682] PR register with aptpl unset. Treating as aptpl=1
[ 480.441739] PR register with aptpl unset. Treating as aptpl=1
[ 480.450811] PR register with aptpl unset. Treating as aptpl=1
[ 480.455146] PR register with aptpl unset. Treating as aptpl=1
[ 480.464050] PR register with aptpl unset. Treating as aptpl=1
[ 480.500401] PR register with aptpl unset. Treating as aptpl=1
[ 480.508086] PR register with aptpl unset. Treating as aptpl=1
[ 480.512111] PR register with aptpl unset. Treating as aptpl=1
[ 480.517052] PR register with aptpl unset. Treating as aptpl=1
[ 480.526109] PR register with aptpl unset. Treating as aptpl=1
[ 480.541659] PR register with aptpl unset. Treating as aptpl=1
[ 480.553608] PR register with aptpl unset. Treating as aptpl=1
[ 480.579284] PR register with aptpl unset. Treating as aptpl=1
[ 480.592389] SPC-3 PR: Attempted RESERVE from iqn.2018-03.net.testing.internal:bd-e7k-hv-cn2,i,0x3430303030313337,iqn.2018-05.net.testing.internal:00001,t,0x2 while reservation already held by iqn.2018-03.net.testing.internal:bd-e7k-hv-cn1,i,0x3430303030313337,iqn.2018-05.net.testing.internal:00001,t,0x2, returning RESERVATION_CONFLICT
[ 480.622925] PR register with aptpl unset. Treating as aptpl=1
[ 480.636298] SPC-3 PR: Attempted RESERVE from iqn.2018-03.net.testing.internal:bd-e7k-hv-cn1,i,0x3430303030313337,iqn.2018-05.net.testing.internal:00001,t,0x2 while reservation already held by iqn.2018-03.net.testing.internal:bd-e7k-hv-cn2,i,0x3430303030313337,iqn.2018-05.net.testing.internal:00001,t,0x2, returning RESERVATION_CONFLICT
[ 480.640564] PR register with aptpl unset. Treating as aptpl=1
[ 480.693412] PR register with aptpl unset. Treating as aptpl=1
[ 480.727340] PR register with aptpl unset. Treating as aptpl=1
[ 480.739020] PR register with aptpl unset. Treating as aptpl=1
[ 480.751663] PR register with aptpl unset. Treating as aptpl=1
[ 480.772253] PR register with aptpl unset. Treating as aptpl=1
[ 480.807250] PR register with aptpl unset. Treating as aptpl=1
[ 480.820105] PR register with aptpl unset. Treating as aptpl=1
[ 480.832997] PR register with aptpl unset. Treating as aptpl=1
[ 480.855080] PR register with aptpl unset. Treating as aptpl=1
[ 480.886481] PR register with aptpl unset. Treating as aptpl=1
[ 480.898644] PR register with aptpl unset. Treating as aptpl=1
There is quite a bit more of the same content in dmesg.
The issue persists with the 4.4.38 kernel you posted. (The dmesg above is from that kernel)
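Since the output above repeats the same few messages many times, a small tally can help when comparing runs or kernels. The following is a rough Python sketch, not an official diagnostic: the pattern names and the `tally` and `errno_name` helpers are invented for illustration, and the patterns are based only on the messages quoted in this post. It also decodes the `-22` return value, which corresponds to `EINVAL` (invalid argument) in the standard errno table. This only summarizes what the log says and does not by itself establish a cause.

```python
import errno
import re
from collections import Counter

# Message classes seen in the dmesg excerpt above. Names are illustrative.
PATTERNS = {
    "register": re.compile(r"PR register with aptpl unset"),
    "too_large": re.compile(r"PR info too large for encoding: (\d+)"),
    "encode_failed": re.compile(r"failed to encode PR xattr: (-?\d+)"),
    "update_failed": re.compile(r"atomic PR info update failed: (-?\d+)"),
    "conflict": re.compile(r"RESERVATION_CONFLICT"),
}


def tally(dmesg_text):
    """Return (counts per message class, reported sizes, reported error codes)."""
    counts = Counter()
    sizes = set()
    codes = set()
    for line in dmesg_text.splitlines():
        for name, pattern in PATTERNS.items():
            match = pattern.search(line)
            if not match:
                continue
            counts[name] += 1
            if name == "too_large":
                sizes.add(int(match.group(1)))
            elif name in ("encode_failed", "update_failed"):
                codes.add(int(match.group(1)))
    return counts, sizes, codes


def errno_name(code):
    """Map a kernel-style negative return code to its errno symbol."""
    return errno.errorcode.get(abs(code), "unknown")


# Lines copied from the dmesg output above.
SAMPLE = """\
[ 478.234342] PR register with aptpl unset. Treating as aptpl=1
[ 478.234493] PR info too large for encoding: 8673
[ 478.234494] failed to encode PR xattr: -22
[ 478.234494] atomic PR info update failed: -22
[ 478.827918] PR register with aptpl unset. Treating as aptpl=1
[ 478.828083] PR info too large for encoding: 8673
[ 478.828084] failed to encode PR xattr: -22
[ 478.828085] atomic PR info update failed: -22
"""

if __name__ == "__main__":
    counts, sizes, codes = tally(SAMPLE)
    print(dict(counts))
    print("reported sizes:", sorted(sizes))
    print("error codes:", {code: errno_name(code) for code in codes})
```

Feeding the complete dmesg output into `tally` would show how often the "too large" failures occur relative to the registrations and reservation conflicts.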
Last edited on May 21, 2018, 2:18 pm by protocol6v · #6
admin
2,930 Posts
May 21, 2018, 2:50 pm
Thanks very much for the error detail.
We will look into this. We test a lot with 2012 and 2016, but maybe one of the recent updates causes this. If we cannot reproduce it, I will send you a newer kernel with lots of debug logging, which will help us solve this. It will take a couple of days, and I will get back to you.
Again, thanks for all the effort on this. 🙂
protocol6v
85 Posts
May 21, 2018, 2:55 pm
Thanks, look forward to seeing what you find!
protocol6v
85 Posts
May 29, 2018, 11:38 am
Just checking in to see if you were able to reproduce this? Thanks!
admin
2,930 Posts
May 29, 2018, 12:19 pm
We did several tests but could not reproduce it. We are using Windows Server 2016 version 1607, build 14393.0, release date 10/12/2016, with all updates applied. We are testing by running the Windows cluster validation suite of tests. We also tried various configurations: 2 nodes to 2 nodes, 2 nodes to 1 node, sharing paths or using different paths, etc.
Can you check the build and version number of Windows and let us know? Also, can you give more detail on your configuration: is it 2 Hyper-V nodes talking to 2 (same/different) paths, each on a different node? If possible, can you also give us the node names and initiator IQN names so we can match your environment exactly?
We are currently building a new kernel with extra logging for you to test. I will post the link when it is done.
Last edited on May 29, 2018, 12:22 pm by admin · #10
iSCSI multi-client access to disk
protocol6v
85 Posts
Quote from protocol6v on May 17, 2018, 3:12 pmHi there,
Trying to configure a Hyper-V cluster, but having trouble where it seems only one host has access to the iSCSI disk at any given time. Is there a setting I need to change on PetaSAN somewhere to allow multiple hosts to access each iSCSI disk/lun at once? I know on our old EQL SANs, this was a checkbox somewhere.
Thanks!
Hi there,
Trying to configure a Hyper-V cluster, but having trouble where it seems only one host has access to the iSCSI disk at any given time. Is there a setting I need to change on PetaSAN somewhere to allow multiple hosts to access each iSCSI disk/lun at once? I know on our old EQL SANs, this was a checkbox somewhere.
Thanks!
protocol6v
85 Posts
Quote from protocol6v on May 17, 2018, 3:54 pmThis appears to be an issue with perisitent reservations to the iSCSI LUN. Here's the error Microsoft presents:
Failure issuing call to Persistent Reservation REGISTER AND IGNORE EXISTING on Test Disk 1 from node BD-E7k-HV-CN1.testing.net when the disk has no existing registration. It is expected to succeed. The device is not ready.
Test Disk 1 does not provide Persistent Reservations support for the mechanisms used by failover clusters. Some storage devices require specific firmware versions or settings to function properly with failover clusters. Please contact your storage administrator or storage vendor to check the configuration of the storage to allow it to function properly with failover clusters
Not seeing anything in PetaSAN logs. Here's some interesting output from dmesg:
[14352.489656] PR register with aptpl unset. Treating as aptpl=1
[14352.492348] PR register with aptpl unset. Treating as aptpl=1
[14352.495207] PR register with aptpl unset. Treating as aptpl=1
[14498.661779] PR register with aptpl unset. Treating as aptpl=1
[14498.661783] PR info not present, initializing
[14498.668197] PR register with aptpl unset. Treating as aptpl=1
[14498.686007] PR register with aptpl unset. Treating as aptpl=1
[14499.411722] PR register with aptpl unset. Treating as aptpl=1
[14499.431857] PR register with aptpl unset. Treating as aptpl=1
[14499.435080] PR register with aptpl unset. Treating as aptpl=1
[14499.446180] SPC-3 PR: Attempted RESERVE from iqn.2018-03.net.testing.internal:bd-e7k-hv-cn1,i,0x3430303030313337,iqn.2018-05.net.testing.internal:00001,t,0x1 while reservation already held by iqn.2018-03.net.testing.internal:bd-e7k-hv-cn2,i,0x3430303030313337,iqn.2018-05.net.testing.internal:00001,t,0x2, returning RESERVATION_CONFLICT
[14499.446456] SPC-3 PR: Attempted RESERVE from iqn.2018-03.net.testing.internal:bd-e7k-hv-cn1,i,0x3430303030313337,iqn.2018-05.net.testing.internal:00001,t,0x1 while reservation already held by iqn.2018-03.net.testing.internal:bd-e7k-hv-cn2,i,0x3430303030313337,iqn.2018-05.net.testing.internal:00001,t,0x2, returning RESERVATION_CONFLICT
[14499.474500] PR register with aptpl unset. Treating as aptpl=1
[14499.482617] SPC-3 PR: Attempted RESERVE from iqn.2018-03.net.testing.internal:bd-e7k-hv-cn2,i,0x3430303030313337,iqn.2018-05.net.testing.internal:00001,t,0x1 while reservation already held by iqn.2018-03.net.testing.internal:bd-e7k-hv-cn1,i,0x3430303030313337,iqn.2018-05.net.testing.internal:00001,t,0x1, returning RESERVATION_CONFLICT
[14499.490455] PR register with aptpl unset. Treating as aptpl=1Let me know what other info might help here.
This appears to be an issue with perisitent reservations to the iSCSI LUN. Here's the error Microsoft presents:
Failure issuing call to Persistent Reservation REGISTER AND IGNORE EXISTING on Test Disk 1 from node BD-E7k-HV-CN1.testing.net when the disk has no existing registration. It is expected to succeed. The device is not ready.
Test Disk 1 does not provide Persistent Reservations support for the mechanisms used by failover clusters. Some storage devices require specific firmware versions or settings to function properly with failover clusters. Please contact your storage administrator or storage vendor to check the configuration of the storage to allow it to function properly with failover clusters
Not seeing anything in PetaSAN logs. Here's some interesting output from dmesg:
[14352.489656] PR register with aptpl unset. Treating as aptpl=1
[14352.492348] PR register with aptpl unset. Treating as aptpl=1
[14352.495207] PR register with aptpl unset. Treating as aptpl=1
[14498.661779] PR register with aptpl unset. Treating as aptpl=1
[14498.661783] PR info not present, initializing
[14498.668197] PR register with aptpl unset. Treating as aptpl=1
[14498.686007] PR register with aptpl unset. Treating as aptpl=1
[14499.411722] PR register with aptpl unset. Treating as aptpl=1
[14499.431857] PR register with aptpl unset. Treating as aptpl=1
[14499.435080] PR register with aptpl unset. Treating as aptpl=1
[14499.446180] SPC-3 PR: Attempted RESERVE from iqn.2018-03.net.testing.internal:bd-e7k-hv-cn1,i,0x3430303030313337,iqn.2018-05.net.testing.internal:00001,t,0x1 while reservation already held by iqn.2018-03.net.testing.internal:bd-e7k-hv-cn2,i,0x3430303030313337,iqn.2018-05.net.testing.internal:00001,t,0x2, returning RESERVATION_CONFLICT
[14499.446456] SPC-3 PR: Attempted RESERVE from iqn.2018-03.net.testing.internal:bd-e7k-hv-cn1,i,0x3430303030313337,iqn.2018-05.net.testing.internal:00001,t,0x1 while reservation already held by iqn.2018-03.net.testing.internal:bd-e7k-hv-cn2,i,0x3430303030313337,iqn.2018-05.net.testing.internal:00001,t,0x2, returning RESERVATION_CONFLICT
[14499.474500] PR register with aptpl unset. Treating as aptpl=1
[14499.482617] SPC-3 PR: Attempted RESERVE from iqn.2018-03.net.testing.internal:bd-e7k-hv-cn2,i,0x3430303030313337,iqn.2018-05.net.testing.internal:00001,t,0x1 while reservation already held by iqn.2018-03.net.testing.internal:bd-e7k-hv-cn1,i,0x3430303030313337,iqn.2018-05.net.testing.internal:00001,t,0x1, returning RESERVATION_CONFLICT
[14499.490455] PR register with aptpl unset. Treating as aptpl=1
Let me know what other info might help here.
admin
2,930 Posts
Quote from admin on May 17, 2018, 7:28 pmWe do support Persistent Reservations and do pass both the Win 2012 R2 and Win 2016 storage and pr tests.
From the error it appears PetaSAN thinks there is an existing reservation on the disk but the Windows cluster does not. Maybe it could have been due to the Windows cluster being completely shutdown and restarted, though in our tests if we shutdown the entire Windows cluster in will "pre-empt" any existing reservation and starts over without issue.
It is possible to manually clear the reservation yourself as per https://docs.microsoft.com/en-us/powershell/module/failoverclusters/clear-clusterdiskreservation?view=win10-ps
I would be interested to know what steps you did to get this error.
We do support Persistent Reservations and do pass both the Win 2012 R2 and Win 2016 storage and pr tests.
From the error it appears PetaSAN thinks there is an existing reservation on the disk but the Windows cluster does not. Maybe it could have been due to the Windows cluster being completely shutdown and restarted, though in our tests if we shutdown the entire Windows cluster in will "pre-empt" any existing reservation and starts over without issue.
It is possible to manually clear the reservation yourself as per https://docs.microsoft.com/en-us/powershell/module/failoverclusters/clear-clusterdiskreservation?view=win10-ps
I would be interested to know what steps you did to get this error.
protocol6v
85 Posts
Quote from protocol6v on May 18, 2018, 10:54 am
I attempted to clear the reservation using the clear-clusterdiskreservation command, but this did not help.
I have not created the cluster yet. This happens when running "Validate Configuration" prior to actually creating the cluster.
It seems like PetaSAN is not clearing the reservation quickly enough for the test to succeed. If you look at the errors, the third one states the reservation is held by a different node than the first two errors, and this changes when I rerun the validation test.
I went ahead and tried creating the cluster anyway. This did not go well. When trying to fail over the PetaSAN iSCSI disk to another cluster node, the disk role will not come online and is not accessible. Also, if you browse to the disk via Explorer, none of the data written to the disk from one node is visible from the other. I tested this, and it works just fine with other iSCSI disks on the same Hyper-V cluster (EqualLogic, FreeNAS).
I tried destroying the cluster and starting over, with the same results.
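To check which node holds the reservation on each run, the conflict lines in dmesg can be parsed rather than read by eye. The following is a rough Python sketch. The regular expression is an assumption based only on the "Attempted RESERVE" lines quoted in this thread, and the function name `conflict_pairs` is made up for illustration.

```python
import re
from typing import List, Tuple

# Assumed layout, taken from the log lines in this thread:
# "... Attempted RESERVE from <iqn>,i,... while reservation already held by <iqn>,i,..."
CONFLICT_RE = re.compile(
    r"Attempted RESERVE from (?P<requester>iqn\.[^,\s]+),i,"
    r".*? while reservation already held by (?P<holder>iqn\.[^,\s]+),i,"
)


def conflict_pairs(dmesg_text: str) -> List[Tuple[str, str]]:
    """Return (requesting initiator, current holder) for each conflict line."""
    pairs = []
    for line in dmesg_text.splitlines():
        match = CONFLICT_RE.search(line)
        if match:
            pairs.append((match.group("requester"), match.group("holder")))
    return pairs


# One conflict line as posted earlier in the thread, split for readability.
SAMPLE = (
    "[14499.482617] SPC-3 PR: Attempted RESERVE from "
    "iqn.2018-03.net.testing.internal:bd-e7k-hv-cn2,i,0x3430303030313337,"
    "iqn.2018-05.net.testing.internal:00001,t,0x1 while reservation already held by "
    "iqn.2018-03.net.testing.internal:bd-e7k-hv-cn1,i,0x3430303030313337,"
    "iqn.2018-05.net.testing.internal:00001,t,0x1, returning RESERVATION_CONFLICT"
)

if __name__ == "__main__":
    for requester, holder in conflict_pairs(SAMPLE):
        print(f"{requester} was refused; reservation held by {holder}")
```

Running this over the full dmesg output from two validation runs would show whether the holder really alternates between the nodes.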
admin
2,930 Posts
Quote from admin on May 18, 2018, 12:57 pm
- Are you using Win2012 R2 or Win2016? How many nodes?
- If you create all new iSCSI disks and run the validation tests on these new disks, does it still fail? If so, can you show a screenshot of the Windows report detail? Please also show the dmesg logs on the PetaSAN iSCSI target nodes.
- Can you install this kernel on the nodes and see if it solves this:
https://drive.google.com/open?id=1LxizxXz4WKsIkcXUigQ-kv9aUVYo2oVZ
dpkg -i linux-image-4.4.38-petasan_amd64.deb
reboot
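After the reboot it is worth confirming that each node actually booted the new kernel before rerunning the validation. Here is a small Python sketch. The expected version string "4.4.38" is taken from the package file name above, and the assumption that the running release starts with it is not confirmed in this thread.

```python
import platform

# Assumption: the release string reported by the kernel starts with the
# version embedded in the package file name (4.4.38).
EXPECTED_VERSION = "4.4.38"


def running_expected_kernel(release: str, expected: str = EXPECTED_VERSION) -> bool:
    """True if the release is exactly the expected version or has a suffix after it."""
    return release == expected or release.startswith(expected + "-")


if __name__ == "__main__":
    release = platform.release()  # same value that `uname -r` reports
    status = "ok" if running_expected_kernel(release) else "unexpected kernel"
    print(f"{release}: {status}")
```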
protocol6v
85 Posts
Quote from protocol6v on May 21, 2018, 2:15 pm
Thanks for the response, sorry for the delay.
This is with Server 2016, and I am now just using two nodes to test with. Both fresh installs of Windows, with all cumulative updates installed.
I tried deleting all PetaSAN iSCSI disks first and creating two new ones. Same issue, here's the screenshot of the error:
And here's some output from dmesg on one of the nodes:
[ 478.200940] PR register with aptpl unset. Treating as aptpl=1
[ 478.200980] SPC-3 PR: Attempted RESERVE from iqn.2018-03.net.testing.internal:bd-e7k-hv-cn2,i,0x3430303030313337,iqn.2018-05.net.testing.internal:00001,t,0x2 while reservation already held by iqn.2018-03.net.testing.internal:bd-e7k-hv-cn1,i,0x3430303030313337,iqn.2018-05.net.testing.internal:00001,t,0x2, returning RESERVATION_CONFLICT
[ 478.205458] PR register with aptpl unset. Treating as aptpl=1
[ 478.210698] PR register with aptpl unset. Treating as aptpl=1
[ 478.220384] PR register with aptpl unset. Treating as aptpl=1
[ 478.234342] PR register with aptpl unset. Treating as aptpl=1
[ 478.234493] PR info too large for encoding: 8673
[ 478.234494] failed to encode PR xattr: -22
[ 478.234494] atomic PR info update failed: -22
[ 478.786503] PR register with aptpl unset. Treating as aptpl=1
[ 478.803247] PR register with aptpl unset. Treating as aptpl=1
[ 478.817486] PR register with aptpl unset. Treating as aptpl=1
[ 478.827918] PR register with aptpl unset. Treating as aptpl=1
[ 478.828083] PR info too large for encoding: 8673
[ 478.828084] failed to encode PR xattr: -22
[ 478.828085] atomic PR info update failed: -22
[ 479.317857] PR register with aptpl unset. Treating as aptpl=1
[ 479.327779] PR register with aptpl unset. Treating as aptpl=1
[ 479.338941] PR register with aptpl unset. Treating as aptpl=1
[ 479.349230] PR register with aptpl unset. Treating as aptpl=1
[ 479.349478] PR info too large for encoding: 8673
[ 479.349480] failed to encode PR xattr: -22
[ 479.349480] atomic PR info update failed: -22
[ 479.849099] PR register with aptpl unset. Treating as aptpl=1
[ 479.859554] PR register with aptpl unset. Treating as aptpl=1
[ 479.870360] PR register with aptpl unset. Treating as aptpl=1
[ 479.880972] PR register with aptpl unset. Treating as aptpl=1
[ 479.881165] PR info too large for encoding: 8673
[ 479.881166] failed to encode PR xattr: -22
[ 479.881166] atomic PR info update failed: -22
[ 480.380570] PR register with aptpl unset. Treating as aptpl=1
[ 480.390800] PR register with aptpl unset. Treating as aptpl=1
[ 480.400858] PR register with aptpl unset. Treating as aptpl=1
[ 480.410973] PR register with aptpl unset. Treating as aptpl=1
[ 480.411120] PR info too large for encoding: 8673
[ 480.411121] failed to encode PR xattr: -22
[ 480.411122] atomic PR info update failed: -22
[ 480.431682] PR register with aptpl unset. Treating as aptpl=1
[ 480.441739] PR register with aptpl unset. Treating as aptpl=1
[ 480.450811] PR register with aptpl unset. Treating as aptpl=1
[ 480.455146] PR register with aptpl unset. Treating as aptpl=1
[ 480.464050] PR register with aptpl unset. Treating as aptpl=1
[ 480.500401] PR register with aptpl unset. Treating as aptpl=1
[ 480.508086] PR register with aptpl unset. Treating as aptpl=1
[ 480.512111] PR register with aptpl unset. Treating as aptpl=1
[ 480.517052] PR register with aptpl unset. Treating as aptpl=1
[ 480.526109] PR register with aptpl unset. Treating as aptpl=1
[ 480.541659] PR register with aptpl unset. Treating as aptpl=1
[ 480.553608] PR register with aptpl unset. Treating as aptpl=1
[ 480.579284] PR register with aptpl unset. Treating as aptpl=1
[ 480.592389] SPC-3 PR: Attempted RESERVE from iqn.2018-03.net.testing.internal:bd-e7k-hv-cn2,i,0x3430303030313337,iqn.2018-05.net.testing.internal:00001,t,0x2 while reservation already held by iqn.2018-03.net.testing.internal:bd-e7k-hv-cn1,i,0x3430303030313337,iqn.2018-05.net.testing.internal:00001,t,0x2, returning RESERVATION_CONFLICT
[ 480.622925] PR register with aptpl unset. Treating as aptpl=1
[ 480.636298] SPC-3 PR: Attempted RESERVE from iqn.2018-03.net.testing.internal:bd-e7k-hv-cn1,i,0x3430303030313337,iqn.2018-05.net.testing.internal:00001,t,0x2 while reservation already held by iqn.2018-03.net.testing.internal:bd-e7k-hv-cn2,i,0x3430303030313337,iqn.2018-05.net.testing.internal:00001,t,0x2, returning RESERVATION_CONFLICT
[ 480.640564] PR register with aptpl unset. Treating as aptpl=1
[ 480.693412] PR register with aptpl unset. Treating as aptpl=1
[ 480.727340] PR register with aptpl unset. Treating as aptpl=1
[ 480.739020] PR register with aptpl unset. Treating as aptpl=1
[ 480.751663] PR register with aptpl unset. Treating as aptpl=1
[ 480.772253] PR register with aptpl unset. Treating as aptpl=1
[ 480.807250] PR register with aptpl unset. Treating as aptpl=1
[ 480.820105] PR register with aptpl unset. Treating as aptpl=1
[ 480.832997] PR register with aptpl unset. Treating as aptpl=1
[ 480.855080] PR register with aptpl unset. Treating as aptpl=1
[ 480.886481] PR register with aptpl unset. Treating as aptpl=1
[ 480.898644] PR register with aptpl unset. Treating as aptpl=1
There is quite a bit more of the same content in dmesg.
The issue persists with the 4.4.38 kernel you posted. (The dmesg above is from that kernel)
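Since the log is long and repetitive, a summary by message type makes the pattern easier to see: each "PR info too large for encoding" line is followed by the xattr and atomic update failures. A minimal Python sketch follows. The labels in `PATTERNS` are made up for illustration, and the substrings are copied from the dmesg output above.

```python
from collections import Counter

# Substrings copied from the dmesg output in this thread; labels are hypothetical.
PATTERNS = {
    "register_aptpl_unset": "PR register with aptpl unset",
    "info_too_large": "PR info too large for encoding",
    "xattr_encode_failed": "failed to encode PR xattr",
    "atomic_update_failed": "atomic PR info update failed",
    "reservation_conflict": "returning RESERVATION_CONFLICT",
}


def summarize(dmesg_text: str) -> Counter:
    """Count how many lines contain each known PR-related message."""
    counts: Counter = Counter()
    for line in dmesg_text.splitlines():
        for label, needle in PATTERNS.items():
            if needle in line:
                counts[label] += 1
    return counts


# A few lines from the log above, used as sample input.
SAMPLE = """\
[ 478.234342] PR register with aptpl unset. Treating as aptpl=1
[ 478.234493] PR info too large for encoding: 8673
[ 478.234494] failed to encode PR xattr: -22
[ 478.234494] atomic PR info update failed: -22
[ 478.786503] PR register with aptpl unset. Treating as aptpl=1
"""

if __name__ == "__main__":
    for label, count in summarize(SAMPLE).most_common():
        print(f"{label}: {count}")
```

The same function can be pointed at the complete dmesg text saved from each iSCSI target node to compare how often each failure occurs per node.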
admin
2,930 Posts
Quote from admin on May 21, 2018, 2:50 pm
Thanks very much for the error detail.
We will look into this. We test a lot with 2012 and 2016, but maybe one of the recent updates causes this. If we cannot reproduce it, I will send you a newer kernel with lots of debug logging, which will help us solve this. It will take a couple of days, and I will get back to you.
Again, thanks for all the effort on this. 🙂
protocol6v
85 Posts
Quote from protocol6v on May 21, 2018, 2:55 pm
Thanks, look forward to seeing what you find!
protocol6v
85 Posts
Quote from protocol6v on May 29, 2018, 11:38 am
Just checking in to see if you were able to reproduce this? Thanks!
admin
2,930 Posts
Quote from admin on May 29, 2018, 12:19 pm
We did several tests but could not reproduce it. We are using Windows Server 2016 version 1607 build 14393.0, release date 10/12/2016, with all updates installed. We are testing by running the Windows cluster validation suite of tests. We also tried various configurations: 2 nodes to 2 nodes, 2 nodes to 1 node, sharing paths or using different paths, etc.
Can you check the build and version number of Windows and let us know? Also, can you give more detail on your configuration: is it 2 Hyper-V nodes talking to 2 (same/different) paths, each on a different node? If possible, can you also give us the node names and initiator IQN names so we can match your environment exactly?
We are currently building a new kernel with extra logging for you to test; I will post the link when it is done.