
PetaSAN best settings for 500 to 1000 VMs (Hyper-V)

Hi,

I have some questions and also need your advice on getting the best performance and high IOPS using PetaSAN.

Currently, we have 500 working VMs running in a production environment on 10 Dell PE R610 servers. Each host server is domain-joined, and none of the VMs are domain-joined.

We plan to replace our old PE R610s (10 servers) with PE R630 v3 or v4 (10 servers), and we plan to use a SAN because we are facing uptime, HA and failover issues.

Currently, our PE R610s are using the H700 with RAID 10 (6 SAS HDDs).

We are interested in using a SAN so that HA and failover fully work. We are deciding between StarWind and PetaSAN.

Due to the huge licensing cost of Windows Server plus StarWind, we are unable to use StarWind; the price would go sky-high. We would like to use PetaSAN at the production level.

We need some advice on the SAN servers.
1) How many servers are needed? We plan to purchase additional PE R630s; is that possible?

2) What is the recommended NIC setup? How many NICs are needed?

3) We plan to get 10 TB (this might increase in the future) with high IOPS and the best possible performance.

Please advise.

Thanks

Getting high IOPS could be an open-ended question; it is also very hardware dependent. I would recommend you benchmark your hardware, preferably in a small lab cluster, and preferably testing different disk types if you can. Try to tune your servers by adding more disks per host until both your disk and CPU usage saturate at the same time; the benchmark page and stats charts should be your guide. Once tuned, get an understanding of how many IOPS and how much throughput you get per host as a building block, so you know how many hosts to add as your IOPS workload increases.
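
Turning that per-host building block into a node count is just a division. A minimal sketch in Python, where every figure is a placeholder assumption to be replaced with your own benchmark results:

import math

iops_per_host = 20000     # measured on one tuned host in your lab (placeholder)
target_iops = 150000      # expected aggregate workload from your VMs (placeholder)
min_hosts = 3             # PetaSAN minimum cluster size

hosts_needed = max(min_hosts, math.ceil(target_iops / iops_per_host))
print(f"hosts needed: {hosts_needed}")   # 8 with the placeholder numbers above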

For high IOPS you should use SSDs. It is best to use models that are known to work well with Ceph; otherwise make sure they support PLP and have a decent FUA/sync write speed, which you can measure from the node console. You can also mix SSD and HDD in separate pools and use the HDDs for backups or non-IOPS-demanding apps. If you really have to use HDDs, make sure you use a journal; for write-intensive apps you could use an SSD write cache device, and for reads you could increase the OSD cache from the default 4 GB.
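
To give a feel for what such a sync-write test does, here is a minimal Python sketch of the same idea (it is not the PetaSAN console tool, just an illustration): it writes 4 KiB blocks with O_SYNC so every write must reach stable media, which is where drives without PLP slow down dramatically. The path and block count are placeholder assumptions; point it at a scratch file on the SSD under test, never at a device or file holding data.

import os, time

path = "/mnt/ssd_under_test/syncprobe.bin"   # hypothetical scratch location
block = b"\0" * 4096                         # 4 KiB per write
count = 2000

fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_SYNC, 0o600)
start = time.monotonic()
for _ in range(count):
    os.write(fd, block)                      # O_SYNC: each write must hit stable storage
elapsed = time.monotonic() - start
os.close(fd)
os.remove(path)

print(f"{count / elapsed:.0f} sync write IOPS, {elapsed / count * 1000:.2f} ms average latency")

A PLP-backed data center SSD will typically report thousands of these per second; a consumer drive often drops to a few hundred or less.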

For CPU, the more cores the better for IOPS, and the higher the frequency the better for latency. However, this should be tuned together with the disk type/speed so that both saturate together; otherwise you will be paying for an unused resource.

For the network: use 2x 10G LACP-bonded for the backend and 2x 10G separate for iSCSI 1/2 MPIO. Higher is better, but typically only go higher if your CPU and disks are already top rated.

These are pointers to get you going; we can always help in more detail via our support services, of course.

Hi admin,

Thank you for the reply.

About the network, the minimum requirement is 10 GbE, isn't it? Also, what about SFP+ and QSFP+? Does PetaSAN support those?

About the storage, we might go for SATA 3 SSDs (Samsung, Dell or Intel). This is a bit confusing; I read in the PetaSAN documents that the requirement is 3 servers and above, isn't it? Here are my questions:

1) Can I use a RAID card such as the H730?
2) Currently we have more than 6 TB in total across the 10 servers, but we have decided to get more capacity. In this case, how many SSDs are required? And what is the RAID setup?

Thanks

Yes, 10G is the minimum for the network; any Ethernet card supported by Linux will work.

About the storage, we might go for SATA 3 SSDs (Samsung, Dell or Intel). This is a bit confusing; I read in the PetaSAN documents that the requirement is 3 servers and above, isn't it?

It is not clear what you find confusing; the minimum of 3 servers is correct.

1) Can I use a RAID card such as the H730?

We do not recommend using any RAID. In some cases, for slow HDDs, you could use single-volume RAID 0 to make use of the controller's write-back cache, but do not do this since you are using SSDs.

2) Currently we have more than 6 TB in total across the 10 servers, but we have decided to get more capacity. In this case, how many SSDs are required? And what is the RAID setup?

As above, do not use RAID. Your total raw capacity needs to factor in the 3x replication. The count of SSDs per host for tuned IOPS should be determined by having CPU and disk % usage saturate equally, as posted earlier. You can use the highest capacity of the same SSD model, but do not trade off disk count for higher capacity, and avoid mixing different disk models/sizes in the same pool.

Hi admin,

For the networking part, can we use SFP+ or QSFP+ with PetaSAN? That would mean running on fiber optic networking. Is it possible?

For the storage, does that mean if I use 3.9 TB drives, I need to use 3.9 TB on all 3 nodes? And all 3 servers should be the same spec, shouldn't they?

Thanks


We support Ethernet; any interface which has a Linux Ethernet driver should work.

I am not sure if you will use 3 nodes or 10 as in the original post, but assuming you want to store 10 TB as seen by your client apps, internally you will need 10 x 3 = 30 TB to account for replication. These 30 TB can be distributed across 3, 10 or any other number of nodes.
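
As a rough worked example of that arithmetic (the figures are the ones from this thread; replace them with your own target), raw capacity is usable capacity multiplied by the replication factor, spread across however many nodes you run:

usable_tb = 10       # capacity your client apps should see
replication = 3      # replica count
nodes = 3            # could equally be 10 or any other node count

raw_tb = usable_tb * replication
per_node_tb = raw_tb / nodes

print(f"raw capacity needed: {raw_tb} TB")                 # 30 TB
print(f"per node if spread evenly: {per_node_tb:.1f} TB")  # 10.0 TB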

You do not strictly have to use the same hardware spec for your storage nodes, but it is recommended; if you have a slow node, or a node with more disks than the others, it will slow your entire cluster.

Hi,

Currently we have 10 servers, each running RAID 10; the total disk space for each server is shown here: http://prntscr.com/yzla8e

So the total disk space for the 10 servers is 27 TB, and now we need to expand both the servers and the disks. For now, we are having problems with HA and failover, so a SAN is a must.

We plan to use PetaSAN for our lab test, and once everything is OK, we will purchase new hardware and implement PetaSAN in our production environment.

We are still working out the hardware requirements.

1) On the new version of PetaSAN, can we use a fiber connection (10 Gb fiber SFP+) on the 3 PetaSAN nodes?

2) If we have 3 new servers that will run as the PetaSAN nodes (petasan-1, petasan-2, petasan-3), what total disk capacity do we need? Is it possible to scale out the space later?

3) Does PetaSAN support NVMe drives attached directly to PCIe Gen3 or Gen4?

4) You mentioned that RAID is not recommended. Can we use a RAID controller card with RAID 0?

Thanks!

Hi,

Not sure if you have figured out what you need to do, but here is my take:

You will need at least 3 servers to start a cluster. Yes, SFP+/QSFP+ cards will work, but they must have Linux drivers in Debian/Ubuntu. The actual port type does not matter as long as the card has Ethernet drivers.

You need 30 TB of usable disk space, meaning you need 90 TB of total drive capacity to account for 3x replication within the cluster. You could use 2x replication to start and change it to 3x as your cluster grows, but you will have problems if you lose one node of three on 2x replication.

PetaSAN runs on a modified Ubuntu Linux, so if your network card has Linux drivers and presents as an Ethernet interface, then yes, it will work. It does not matter if the physical medium is twisted pair, fiber, coax or a string between cans, as long as the Ethernet protocol is used. Just for fun, we used a Cambium ePMP 1000 AP as the network switch and placed 3 nodes on SMs, just to see if we could put nodes around our extended network. Yes, you can, but it is not ideal for performance or cluster reliability: too much latency!

As it stands, you will need 8x 4 TB drives and 2x 1 TB SSDs (for journaling), with 64 GB RAM and 12 CPU cores (not threads, so a 12-core Xeon or Opteron) per node. If using spinning disks, make sure they are not SMR. For the best IOPS possible, use all SSDs and SAS controllers. NVMe drives would be best, but at the cost of more expensive server hardware to support them.

DO NOT USE RAID! All drives must be presented to the OS as individual drives, and it is best not to use single-disk RAID 0 off RAID controllers. This creates a dependency on a particular controller should it fail: there is no guarantee that the RAID 0 data on that drive will be readable by another card that is not the exact same model and firmware revision. Try to get IT-mode controllers or controllers that pass through unconfigured drives. We use LSI SAS9210 cards, as these can be put into IT mode with a simple firmware upgrade.

Depending on how many IOPS you need, you may need more nodes in your cluster. Each spinning disk usually delivers around 100 IOPS, so 24 drives would be near 2,400 IOPS (other factors contribute to and degrade this number). The network bandwidth required is simply the sustained read speed of one drive multiplied by the number of drives in one node of the cluster. Most drives provide up to 200 Mbps sustained read, so if you have 8 drives per node you would need 1.6 Gbps: a minimum of two active/active LACP-bonded 1 Gbps network connections per iSCSI network, or one 10 Gbps link.
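
A quick back-of-envelope version of those two estimates in Python, with all figures taken from this post as placeholder assumptions:

drives_per_node = 8
nodes = 3
iops_per_hdd = 100            # rough figure for one spinning disk
sustained_read_mbps = 200     # per-drive sustained read, in megabits per second as above

total_iops = drives_per_node * nodes * iops_per_hdd
bandwidth_per_node_gbps = drives_per_node * sustained_read_mbps / 1000

print(f"approximate cluster IOPS: {total_iops}")                     # 2400
print(f"network needed per node:  {bandwidth_per_node_gbps} Gbps")   # 1.6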

Data replication allows for node failures: 3x replication allows two nodes in a 5-node cluster to be lost and still have your data available. Never set replication below 2, and divide your total installed drive capacity by the replication factor to figure out usable cluster capacity. Adding nodes can either expand your available storage or allow for more replication and thus increase your fault tolerance, but not both at the same time.

Example: you have a 3-node cluster with 2x replication. You can lose one node and still have all of your data (two nodes can still agree on the data content and prove it is correct); this holds true no matter how many nodes you have with 2x replication, since you do not know which two servers hold a given piece of data. Also, to expand total storage space you only need to add two drives. Now say you want to increase your fault tolerance to two node failures: you need to move to 3x replication, but to keep the same amount of storage available you now need 6 nodes (assuming all storage drives are the same size). This could also be achieved by replacing one OSD at a time with larger drives (be careful of SMR drives!) and then migrating to 3x replication. Adding nodes also increases your available IOPS and lowers the per-drive workload, which increases drive service life. Like RAID 6 systems, the more drives in the array (nodes in a cluster) the more fault tolerance you can achieve, but you also have to increase the number of replicas stored to gain node fault tolerance.
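
The same division, usable = raw / replication, applied to a hypothetical drive layout (the sizes and counts here are placeholders; swap in your own):

def usable_tb(nodes, drives_per_node, drive_tb, replication):
    return nodes * drives_per_node * drive_tb / replication

print(usable_tb(nodes=3, drives_per_node=8, drive_tb=4, replication=2))  # 48.0 TB usable
print(usable_tb(nodes=3, drives_per_node=8, drive_tb=4, replication=3))  # 32.0 TB usable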

We found that for production, a 6-node cluster with 3x replication is a good starting point. At 8 nodes, 4x replication starts to make sense, and we consider 4x a must at 12 nodes in one cluster, as it gives us the ability to lose 3 nodes (which, though unlikely, is still possible) and still keep our SAN running.

Adding a node is as simple as build, install, join to cluster. Wait for the node to join the cluster, then activate the new OSDs (storage drives) one at a time. Setting the backfill rate low on the maintenance tab will ensure your cluster does not lose too much performance while the new node fills with its data.