
Testing PetaSAN under ESXi VMs

Greetings,

I am currently playing a bit with PetaSAN, but I am facing a strange problem: I have no connectivity to the iSCSI IPs in the VMs. Here is a brief description of the network config:

Node                       1             2             3
Management                 192.x.x.41    192.x.x.42    192.x.x.43
iSCSI 2 assigned IPs       -             10.75.0.100   -
iSCSI 1 assigned IPs       -             -             10.74.0.100
Backend 1 assigned IPs     10.74.1.1     10.74.1.2     10.74.1.3
Backend 2 assigned IPs     10.75.1.1     10.75.1.2     10.75.1.3

The cluster builds successfully; all nodes answer pings on the backend and management interfaces, but not on the iSCSI ones.

Might it be something in the ESXi networking that is preventing these interfaces from communicating? The iSCSI software adapter cannot connect either... any ideas?

Thanks!

It should work if set up correctly.
I am not sure if this is related, but one issue we faced (we are using ESX 6) is that when we create the PetaSAN VMs, we can only add 4 networks when first creating the VM; the 5th network has to be added afterwards in the VM's edit settings. There seems to be a bug in ESX: the first 4 networks are added in consecutive order and show up in PetaSAN as eth0, eth1, eth2, eth3, but after adding the fifth network, ESX inserts it as eth1 (to be accurate, it inserts it in its PCI slot) and shifts the others, so in PetaSAN the fifth network becomes eth1 and the second becomes eth2. We found this NIC re-ordering consistent across all ESX hosts. The best way to be sure is to take note of the MAC addresses and check the order/naming as reported by the PetaSAN installer; also, in version 1.4 you can re-order/rename NICs via the node console menu.

Excluding this hiccup, things should work well.
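To check the NIC ordering yourself, a minimal sketch like the following (plain Linux commands, nothing PetaSAN-specific, run from a node's shell) prints each interface name next to its MAC address, which you can then match against the vNIC MACs shown in the VM's settings in the vSphere client:

```shell
#!/bin/sh
# Print each network interface alongside its MAC address so the ordering
# can be compared with the vNIC MACs listed in the ESXi VM settings.
for nic in /sys/class/net/*; do
  printf '%s %s\n' "$(basename "$nic")" "$(cat "$nic/address")"
done
```

If the order differs from what ESXi shows, that is the re-ordering described above.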

That was exactly the problem I was having; after reordering the interfaces, everything is working as intended.

Thanks!

Hi again,

After successfully creating the cluster and adding it as a datastore in vSphere, I am facing a new issue. As soon as there is some sustained write load, the CPU load on one of the nodes collapses... Ceph health starts throwing slow-request warnings and OSDs go down and up...

Using atop, I saw it start to lock up with IO latency; after that the whole system just hangs... All drives (journal and OSD) are SSDs. The 3 VMs have 8 GB of RAM and 2 cores each, with all roles active.

Is there anything more I could check to see whether it's just a hardware performance issue, or something I could tune somehow?

Thanks!

First, we do not officially support this setup as of yet, but here are some pointers.
Also, how many OSDs do you have, and how much free RAM does atop report?
I recommend you give the VMs more RAM if you can. Then start with 1 or 2 OSDs in each VM and perform load tests; if successful, add a disk at a time to the VM and manually create an OSD from it via the UI. Your virtual disks should be on separate physical disks; do not place 2 virtual disks on the same physical disk.
To perform a load test, you can use the PetaSAN cluster benchmark from the web UI. This tests the underlying Ceph storage without iSCSI; we need to make sure Ceph is working correctly before testing iSCSI.
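The same kind of check can also be done from the command line with standard Ceph tools. A rough sketch (the pool name `rbd` and the 30-second duration are just examples; run from any node):

```shell
# Check overall cluster health and OSD status before benchmarking
ceph status
ceph osd tree

# Write benchmark directly against Ceph, bypassing iSCSI
# (--no-cleanup keeps the objects so a read test can follow)
rados bench -p rbd 30 write --no-cleanup

# Sequential read benchmark over the objects written above
rados bench -p rbd 30 seq

# Remove the benchmark objects afterwards
rados -p rbd cleanup
```

If slow requests or OSD flapping appear during the write phase here, the problem is in the Ceph layer rather than iSCSI.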

The above may be acceptable under certain workloads, but if you need better IO performance, use a dedicated HBA/controller for PetaSAN if you can and use PCI passthrough to assign it to the PetaSAN VM. There are several videos online showing how to do this with FreeNAS.
Otherwise, you may need to adjust the IO work queues, as described in:

https://blogs.vmware.com/vsphere/2012/07/troubleshooting-storage-performance-in-vsphere-part-5-storage-queues.html

https://blogs.vmware.com/apps/2015/07/queues-queues-queues-2.html

https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2053145

https://pubs.vmware.com/vsphere-50/index.jsp?topic=%2Fcom.vmware.vsphere.troubleshooting.doc_50%2FGUID-BB3DF1B1-7A51-4D43-8094-1F01D7CD3D67.html
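As a rough illustration of the queue adjustments those articles cover, the commands below are run on the ESXi host itself. The device identifier is a placeholder and the values are examples, not recommendations; check the links above for what is appropriate for your storage:

```shell
# Show the current queue settings for a datastore device
# (replace the naa.* identifier with your actual device)
esxcli storage core device list -d naa.xxxxxxxxxxxxxxxx

# Raise the number of outstanding IOs allowed when multiple
# VMs compete for the device (example value)
esxcli storage core device set -d naa.xxxxxxxxxxxxxxxx -O 64

# Raise the software iSCSI adapter LUN queue depth
# (example value; requires a host reboot to take effect)
esxcli system module parameters set -m iscsi_vmk -p iscsivmk_LunQDepth=128
```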

Once Ceph is working, we can test iSCSI, which will require some more resources. Technically we have a recommendation guide for this, but at a minimum you need an extra 4 GB of RAM.