Minor typo / Debug Level / Error building Consul cluster
therm
121 Posts
June 27, 2017, 1:40 pm
Hi,
in PetaSAN.log it says
"sarting ceph-disk zap /dev/sdb"
you mean
"starting ceph-disk zap /dev/sdb"
right?
BTW: How can I see more information in PetaSAN.log during cluster building? I tried to change
/opt/petasan/config/etc/consul.d/server/config.json
"log_level": "DEBUG"
But this does not help.
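For what it's worth, the log_level in that file only affects the Consul agent's own output (which normally lands in syslog), not PetaSAN.log, and the agent must be restarted to pick the change up. A minimal sketch; the helper function and scratch path are my own invention, and the service name is an assumption:

```shell
# Hypothetical helper: flip Consul's log_level in a config.json.
# The real file is /opt/petasan/config/etc/consul.d/server/config.json.
set_consul_log_level() {    # usage: set_consul_log_level FILE LEVEL
    sed -i "s/\"log_level\" *: *\"[A-Za-z]*\"/\"log_level\": \"$2\"/" "$1"
}

# Demo against a scratch copy:
printf '{"log_level": "INFO"}\n' > /tmp/consul-config.json
set_consul_log_level /tmp/consul-config.json DEBUG
cat /tmp/consul-config.json
# Editing alone does nothing until the agent restarts, e.g.:
# systemctl restart consul    # service name is an assumption -- check your node
```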
At the moment the cluster refuses to build, with these messages:
Error building cluster, please check detail below.
Error List
Error building Consul cluster
How can I debug this error? It is a bare-metal installation:
3 Nodes:
64 GB RAM, 24 HDDs, 2 system disks, 1x quad-port 1 Gbit (management), 2x dual-port 10 Gbit (the other networks)
Any advice?
Regards,
Dennis
Last edited on June 27, 2017, 1:44 pm · #1
admin
2,930 Posts
June 27, 2017, 2:35 pm
Hi,
Thanks for spotting the typo 🙂
The error you see means there was an issue building the Consul cluster (which is used for health-checking nodes and distributing resources). It is built early on, before the Ceph cluster. In the vast majority of cases this happens due to network issues; in other cases it happens if the deployment was aborted from the UI and then re-deployed (even though we try to handle this).
I recommend you re-install all 3 nodes fresh from the installer, and just before the last step on your third node (before step 6, i.e. just before building the cluster) make sure the nodes can ping each other on all subnets. Consul uses backend 1, but it is better to test all of them, and look for any errors or delays. You can use ssh or the node console menu to ping. Also double-check your IPs and subnet masks.
Let me know if it works; if not, I can trace the logs with you.
Last edited on June 27, 2017, 2:37 pm · #2
therm
121 Posts
June 27, 2017, 2:42 pm
Yes, you are right. It seems node2 is isolated from the other two.
The root cause seems to be that the NICs are recognized in a different order:
All nodes have their backend networks on eth6 + eth7, but on node2 the NICs come up in a different order, so the right NICs there would be eth4 + eth5. I will play with udev rules.
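For the record, the udev approach is roughly the following; the MACs below are made up, so take the real ones from `ip link` on node2:

```shell
# Emit one persistent-net udev rule line binding a MAC to an interface name.
make_rule() {    # usage: make_rule MAC NAME
    printf 'SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="%s", NAME="%s"\n' "$1" "$2"
}

# Generate rules for the two backend NICs, then reboot the node:
{
    make_rule "aa:bb:cc:dd:ee:06" "eth6"    # backend 1
    make_rule "aa:bb:cc:dd:ee:07" "eth7"    # backend 2
}    # >> /etc/udev/rules.d/70-persistent-net.rules  (uncomment to write for real)
```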
Thanks,
Dennis
admin
2,930 Posts
June 27, 2017, 2:58 pm
Excellent, you identified the issue 🙂
Are all nodes using the same NIC hardware, and are the cards plugged in the same slot order? We did encounter an issue like this ourselves, and we will support manually re-ordering the NICs in our next release: http://www.petasan.org/next-release/
For now it may be simpler just to swap your cables.
Last edited on June 27, 2017, 2:59 pm · #4
therm
121 Posts
June 29, 2017, 4:38 am
In the end there was a cabling problem too. So you were right in your assumption that this was network related.
Thank you!
Dennis
philip.shannon
37 Posts
June 29, 2017, 7:00 pm
Hit the same issue. In VMware it's always:
network adapter 1 = management
network adapter 2 = iscsi 1
network adapter 3 = iscsi 2
network adapter 4 = back end 1
network adapter 5 = back end 2
first time installing this worked in the petasan install:
management interface eth0
iscsi 1 interface eth1
iscsi 2 interface eth2
backend 1 interface eth3
backend 2 interface eth4
So the first time through the cluster was created, but in the end I never got the ESXi hosts to connect to the PetaSAN disks via the iSCSI networks, so I blew them away and started over.
3 or 4 times now I have started over from scratch and could not join the cluster with the 3rd node, same error you got. While troubleshooting I had to use MAC addresses to learn that the networks are all scrambled up, e.g. eth2 is connected to the iSCSI 1 network instead of backend 1. And all 3 nodes' networks are scrambled in a different way.
Is there a network config file in the PetaSAN system where I could fix this issue? Thanks
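One way to dump the name-to-MAC table on each node for that comparison; the sysfs paths are standard Linux, nothing PetaSAN-specific is assumed:

```shell
# Print each interface with its MAC address; run on every node and compare
# against the adapter order shown in the VM's settings in vSphere.
fmt_nic() { printf '%-6s %s\n' "$1" "$2"; }    # args: name, MAC

for dev in /sys/class/net/*; do
    if [ -f "$dev/address" ]; then
        fmt_nic "$(basename "$dev")" "$(cat "$dev/address")"
    fi
done
```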
Last edited on June 29, 2017, 7:02 pm · #6
admin
2,930 Posts
June 29, 2017, 8:07 pm
We did hit something similar ourselves, though it may not be as bad; we are testing with ESXi 6.0. I will try to confirm this with our tester, but speaking from memory: on the page where you create your VM and add interfaces there is a limit of 4 interfaces, so to add a fifth you have to go back later to Edit Settings. So far so good, and all NICs match the PetaSAN ethernet naming order (you can check this in the PetaSAN installer network config page). But when we add the 5th NIC in Edit Settings, VMware inserts it as eth1 (second in order) instead of eth4 and shifts all the others down. In our case this was consistent across all nodes, and knowing this we built/rebuilt the PetaSAN cluster many times without issues.
Maybe 6.5 is different and shuffles them without any order, or maybe you hit something like the issue described above and can work around it. Also make sure you add the networks and the NICs to the VMs in the same order across all VMs. And if you have the exact same issue as we did, you can just use 4 interfaces and map your management subnet and iSCSI 1 together.
I hope you can find out the order VMware 6.5 uses; if not, you may have to add a udev rule to force the ethernet names/order to specific MAC addresses as per:
https://unix.stackexchange.com/questions/91085/udev-renaming-my-network-interface
You can use the node console shell/ssh/WinSCP to add this file to all 3 nodes.
In v 1.4 ( http://www.petasan.org/next-release/ ) we have a feature to force eth naming by MAC address, mostly to support servers with different hardware NIC layouts, but it could help in these cases too. Currently the naming is based on a PCI slot number scan. If you really need this feature early, I can try to send you a beta once it is done.
Last edited on June 29, 2017, 8:15 pm · #7
philip.shannon
37 Posts
June 29, 2017, 8:32 pm
On further review I think I'm seeing exactly the same thing. I wish I could post screenshots here; I'll try to explain it in text. This is how I was configuring each VM:
management interface eth0
iscsi 1 interface eth1
iscsi 2 interface eth2
backend 1 interface eth3
backend 2 interface eth4
After studying the MAC addresses via ifconfig and comparing them to the VM settings in VMware, this is the change I made to fix the mix-up:
management interface eth0
iscsi 2 interface eth1
backend 1 interface eth2
backend 2 interface eth3
iscsi 1 interface eth4
Still having issues with the VMs communicating on these networks, but at least the correct eth# are matched to the correct networks now. Will keep plugging away. Thank you
admin
2,930 Posts
June 30, 2017, 9:07 am
Some of the things I recommend:
Try to give the VMs enough CPU/RAM resources, as close to our recommendation guide as you can.
For the disk SCSI controller choose VMware Paravirtual
For the net controller choose VMXNET3
For the number of OSDs, I recommend you start low with 1 or 2 OSDs per storage node, then build your cluster. You can increase your OSDs at runtime in steps, after testing your configuration under workload. Note that each OSD requires CPU and RAM resources.
When mapping OSDs to regular vmdks on a local store, place each vmdk on a separate physical disk; it is not recommended to put several OSD vmdks on the same local drive.
If you have a heavy load consider increasing your queue depth as in:
https://blogs.vmware.com/apps/2015/07/queues-queues-queues-2.html
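On the guest side, the per-disk knob this maps to is exposed in sysfs. A hedged sketch: the device name sdb is only an example, and whether raising the depth actually helps also depends on the host/adapter queues the article covers:

```shell
# Path to a SCSI disk's current queue depth inside the Linux guest.
qd_path() { printf '/sys/block/%s/device/queue_depth' "$1"; }

# cat "$(qd_path sdb)"          # inspect the current depth
# echo 64 > "$(qd_path sdb)"    # raise it (needs root; not persistent across reboot)
```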
We are in the process of testing RDM and PCI passthrough; the latter will probably give the best performance, but our testing is going slowly since we are focused on the v 1.4 tests.
Good luck,
Last edited on June 30, 2017, 9:10 am · #9
philip.shannon
37 Posts
June 30, 2017, 3:17 pm
Got all the network issues sorted out. What was biting me was that the ESXi hosts' virtual switch didn't have a physical network adapter attached; I finally saw the error in vCenter, looking at the host's virtual switches Configure tab. Now everything is working (including the VLAN tagging/trunk ports) and the cluster has been created; next step is to run some tests. Thank you!
Last edited on June 30, 2017, 3:18 pm · #10