Boot Disk Selection

We're in the process of building our first production cluster using three Dell R630 servers. Our standard for boot media is the mirrored SD card module (RAID 1); however, from what we have read in these forums, this is not recommended.

Each server is equipped with eight Samsung 883 grade SSDs for the cluster storage. For the boot media, we are considering two 120GB SSDs in RAID 1 per server. Most of our budget for this project went towards the enterprise SSDs and server hardware. Would it be acceptable to run consumer grade SSDs, such as Samsung or SanDisk, for the boot media? We understand that consumer grade drives lack PLP (power loss protection) and the NAND write endurance of enterprise drives.

This cluster will be in a datacenter with dual PDUs and backup generator power. We will also have remote support hands that can replace a failed boot drive as needed.

 

We ran a test with consumer SSDs for boot back in the PS2.3 days. In our experience they do not last as long as enterprise SSDs because they lack sufficient over-provisioning in the storage media. Consumer SSDs do not have enough spare storage blocks to replace the blocks that wear out under server write loads, so the drives tend to fail earlier.
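To put a number on the spare-capacity point (usually called over-provisioning), here is a minimal sketch of how the percentage is commonly worked out: spare capacity divided by user-visible capacity. The function name and the capacities in the usage note are hypothetical placeholders, not figures for any particular drive.

```python
def over_provisioning_pct(raw_gb: float, usable_gb: float) -> float:
    """Return spare capacity as a percentage of user-visible capacity.

    raw_gb    -- total NAND on the drive (assumed known from the vendor)
    usable_gb -- capacity exposed to the host
    """
    if usable_gb <= 0 or raw_gb < usable_gb:
        raise ValueError("need raw_gb >= usable_gb > 0")
    return (raw_gb - usable_gb) / usable_gb * 100.0
```

For example, a hypothetical drive with 128 GB of NAND exposing 120 GB works out to about 6.7% spare, while the same NAND exposing only 100 GB gives 28%. Leaving part of a consumer drive unpartitioned is one way people try to approximate the second case, though that is no guarantee of enterprise-class endurance.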

Our test method was not rigorous, but the setup was:

Master drive: 120GB WD Black spinning disk

Cloned using dd to a 240GB Samsung EVO SSD

Used an LSI RAID controller (don't remember which one) to mirror the cloned SSD to a second SSD

Built the 3-node cluster with 6x 1TB WD drives per node and cloned our SQL VM to it, running as a replication slave of the master

The SSDs started showing issues within two months and are now basically dead.

Boot drives get a lot of writes for logging and statistics, and it is the writes that kill SSDs. Since the write load is lower than what a pair of 10k SAS spinning drives can handle, we have determined that there is no real need to use SSDs as boot drives.
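For anyone who wants to sanity-check a candidate boot SSD against its write load, a rough back-of-the-envelope sketch follows. It assumes you know the drive's rated endurance in TBW from the datasheet and have measured average daily writes on a boot drive; both numbers in the usage note are made-up placeholders, and the function name is hypothetical.

```python
def years_until_rated_endurance(rated_tbw: float, daily_writes_gb: float) -> float:
    """Estimate years until the rated endurance (terabytes written) is used up.

    rated_tbw       -- vendor endurance rating in TB written (assumed from datasheet)
    daily_writes_gb -- measured average host writes per day, in GB
    """
    if rated_tbw <= 0 or daily_writes_gb <= 0:
        raise ValueError("rated_tbw and daily_writes_gb must be positive")
    daily_writes_tb = daily_writes_gb / 1000.0   # GB -> TB (decimal units)
    days = rated_tbw / daily_writes_tb
    return days / 365.0
```

With a hypothetical 150 TBW rating and 50 GB of writes per day, this gives roughly 8.2 years. Treat it as an upper bound at best: small, frequent log writes can wear a drive faster than the rating suggests, which fits with our drives failing far sooner than any datasheet figure would predict.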

We still use SSDs for cache and journals in a 4:1 ratio against storage OSDs. Consumer grade SSDs are fine for this application, since replacing a journal drive does not ruin the cluster (although it will slow it down until it catches up), whereas replacing a boot drive can force a full node replacement and recovery.
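As a quick illustration of the 4:1 ratio, reading it as four storage OSDs per journal SSD, the number of journal SSDs per node can be sketched like this. The function name is hypothetical, and rounding up is an assumption so that no SSD serves more than four OSDs.

```python
import math

def journal_ssds_needed(osd_count: int, osds_per_journal: int = 4) -> int:
    """Return how many journal SSDs a node needs at the given OSD:journal ratio."""
    if osd_count < 0 or osds_per_journal <= 0:
        raise ValueError("osd_count must be >= 0 and osds_per_journal > 0")
    # Round up so that no journal SSD serves more than osds_per_journal OSDs.
    return math.ceil(osd_count / osds_per_journal)
```

For the 6-drive nodes in the test setup above this gives 2 journal SSDs per node, and for a node with 8 storage drives it also gives 2.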