
Hardware Advice Please

Dear all experts, I initially want to set up 50 TB of available storage for my university, extendable up to 200-300 TB. As per the PetaSAN hardware recommendation guide, I will purchase 3 server nodes. Can someone please recommend server node specifications for the above setup?

Get 10 Gbps (or faster) NICs, 4 ports per node.

Try to use all SSD disks. Try to get enterprise grade; some SSDs perform much better with sync writes in Ceph than others. See the list at:

https://www.sebastien-han.fr/blog/2014/10/10/ceph-how-to-test-if-your-ssd-is-suitable-as-a-journal-device/

You can also perform these tests from the PetaSAN node console menu.
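For reference, the core of that test is a small-block synchronous write with fio. Here is a minimal sketch of invoking it, assuming fio is installed; /dev/sdX is a placeholder for a blank candidate SSD, and the exact parameters are illustrative rather than the article's verbatim command:

```python
import subprocess

# Sync-write test along the lines of the linked article.
# WARNING: this writes to the raw device and destroys its contents --
# /dev/sdX is a placeholder for a blank test disk only.
cmd = [
    "fio",
    "--filename=/dev/sdX",
    "--direct=1",      # bypass the page cache
    "--sync=1",        # synchronous writes, like a Ceph journal issues
    "--rw=write",
    "--bs=4k",
    "--numjobs=1",
    "--iodepth=1",
    "--runtime=60",
    "--time_based",
    "--name=journal-test",
]
subprocess.run(cmd, check=True)
```

A drive that sustains high IOPS under this workload is a good journal candidate; many consumer SSDs collapse to a few hundred IOPS here.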

If you cannot get all SSDs, then get a few SSDs to serve as journals for the HDDs, at a ratio of 1 SSD per 4 HDDs. Get a controller with write-back cache for your HDDs to increase IOPS and reduce latency, and try to use many HDDs per node to help boost performance, for example 16 or more. For SSD-only setups you should not exceed 10 per node. The PetaSAN benchmark can be used to help tune your system. The journal math is sketched below.
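A quick sketch of that 1:4 ratio (the per-node disk count here is just an example):

```python
import math

# 1 journal SSD per 4 HDDs, as suggested above; 16 HDDs is an example value.
hdds_per_node = 16
journal_ssds = math.ceil(hdds_per_node / 4)
print(f"{hdds_per_node} HDDs -> {journal_ssds} journal SSDs per node")  # -> 4
```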

When computing capacity, you need to multiply by the number of replicas. So with 3x replication, if you want 200 TB net you need 600 TB of raw disk capacity.
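Worked out explicitly (values from the example above):

```python
replicas = 3   # 3x replication
net_tb = 200   # desired usable capacity
raw_tb = net_tb * replicas  # raw capacity = net capacity * replica count
print(f"{net_tb} TB net at {replicas}x replication -> {raw_tb} TB raw")  # 600 TB
```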

Admin's suggestion is good, but in most cases overkill.

Here is my experience with the requirements. This is not empirical, and different CPUs and mainboards change the numbers; these are solely my opinions and should not be taken as gospel, but this setup has been working without any issues so far.

You need to know what the intended connection load is going to be. If you are setting up a Nextcloud/ownCloud file store, then standard 5400 rpm HDDs are more than fast enough, considering that you are accessing several drives at once and are not limited to SAS/SATA bus speeds. If you are looking at VM storage, you need 7200 rpm drives at minimum because of their lower seek times; SSDs are still overkill. SSDs shine when you are doing lots of reads but few writes. Dedicated journal drives do become required when you are reaching 300 TB across 8 or more nodes; they speed up data recovery in the event of a disk failure.

Do not use RAID cards unless you can set them into IT mode. Ceph needs to access the drives raw. RAID0 configs work, but you must shut down the node to swap a drive, then add the replacement as a single-drive RAID0 and tell PetaSAN to use it, which limits hot-swap considerably. Write-back cache is not a requirement, but it is a good thing to consider having.

Number of nodes: with more drives per node you will require more memory and CPU cores, and if a node also performs iSCSI and monitor functions it needs even more resources. Consider having three nodes for monitor and iSCSI duty and adding nodes for storage. With this model you can upgrade the monitor/iSCSI nodes as the load changes and simply add storage nodes as your data requirements grow. It is easier to build them separately now than after you have a cluster running. Storage nodes will need 1 GB of RAM per HDD plus 2 GB of RAM per 10 Gb port, and 1 CPU core per 2 HDDs. Monitor/iSCSI nodes require 2 GB per 10 Gb port plus 1 GB per iSCSI connection (this is over the actuals, but it allows for traffic spikes and data reconstruction with minimal issues), plus 4 CPU cores for the monitor software, 1 CPU core per 10 Gb port, and 1 CPU core per iSCSI connection. These rules of thumb are sketched as a calculator below.
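A back-of-envelope calculator for those rules of thumb; the per-HDD, per-port, and per-connection figures come from the post, while the inputs in the example calls are made up:

```python
import math

def storage_node(hdds, ports_10g):
    ram_gb = hdds * 1 + ports_10g * 2   # 1 GB per HDD + 2 GB per 10Gb port
    cores = math.ceil(hdds / 2)         # 1 core per 2 HDDs
    return ram_gb, cores

def monitor_iscsi_node(ports_10g, iscsi_conns):
    ram_gb = ports_10g * 2 + iscsi_conns * 1  # 2 GB per port + 1 GB per connection
    cores = 4 + ports_10g + iscsi_conns       # 4 for the monitor + 1 per port + 1 per connection
    return ram_gb, cores

print(storage_node(hdds=16, ports_10g=2))              # -> (20, 8)
print(monitor_iscsi_node(ports_10g=2, iscsi_conns=8))  # -> (12, 14)
```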

Network requirements are fairly simple: get two 2-port 10 Gbps NICs, bond the ports in pairs across the two cards so a NIC failure does not take a link down, and you are done. You require four "networks", but they are logical, not physical; you can use VLANs to separate the traffic, but the IP structure must be four distinct subnets.
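As an illustration of that layout (the names, VLAN IDs, and addresses here are invented examples, not PetaSAN defaults):

```python
# Hypothetical plan: four logical networks as VLANs over the bonded ports,
# each on its own subnet. All values are examples only.
networks = {
    "management": {"vlan": 101, "subnet": "10.10.1.0/24"},
    "iscsi-1":    {"vlan": 102, "subnet": "10.10.2.0/24"},
    "iscsi-2":    {"vlan": 103, "subnet": "10.10.3.0/24"},
    "backend":    {"vlan": 104, "subnet": "10.10.4.0/24"},
}
for name, net in networks.items():
    print(f"{name}: vlan {net['vlan']} -> {net['subnet']}")
```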

Now, we are doing some things differently, as we are using InfiniBand for the node-to-node and iSCSI networks. We are using 2U servers like the HP DL380 G5p with two LSI 9208 controllers in IT mode and a Mellanox ConnectX-2 VPI card. We bond the onboard 1 Gbps NICs and use them for management access; all other traffic goes through the InfiniBand network. Our nodes have two 8-core CPUs and a minimum of 32 GB RAM, and storage nodes have up to 16 drives per node. All monitor and iSCSI work is performed by the monitor nodes, which are three HP BL460 G7 blades, currently spread across two bladecenters, with plans to add a third bladecenter and move the third monitor node to it. We add storage nodes as needed with little to no performance hit noticeable to our servers.

And no, InfiniBand is not supported. It must be set up separately, before the cluster is created and before additional nodes are added. We use it because we already had an InfiniBand network in place prior to switching to PetaSAN, and forklifting to 40 Gbps network cards was not an economical solution. Upgrades are slow and tedious, but the performance is worth it in our opinion.