Support for SSD Affinity in a mixed media pool

Great post from ceph-users about assigning primary affinity to SSDs.  Basically you build a mixed pool whose primary OSDs are SSDs, while the 2nd and 3rd replica writes go to (SSD-journaled or WAL'd) HDDs.  Reads always hit the SSDs.
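Primary affinity itself is just a per-OSD value between 0 and 1, set with the standard `ceph osd primary-affinity` command. Here's a minimal sketch of scripting it; the HDD OSD IDs below are placeholders you'd swap for your own, and older releases may also need `mon osd allow primary affinity = true` in ceph.conf before values other than 1.0 take effect:

```python
import subprocess

# Hypothetical OSD IDs for the HDD OSDs -- replace with your own layout.
hdd_osd_ids = [18, 19, 20, 21, 22, 23]

# Primary affinity 0 tells Ceph to avoid picking these OSDs as the primary
# of a placement group whenever possible, so reads land on the SSD OSDs
# while the HDDs still hold the 2nd and 3rd replicas.
for osd_id in hdd_osd_ids:
    subprocess.run(
        ["ceph", "osd", "primary-affinity", f"osd.{osd_id}", "0"],
        check=True,
    )
```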

(edit: adding illustration for clarity)

So in your pool you have 18 x 1TB SSDs (spread across 3 hosts with a RAID1 boot/OS volume, 6 SSDs each).  Then, for the 2nd and 3rd replicas that are "mostly write" (except in the event of a primary SSD failure, when they serve the rebuild), perhaps 3 more hosts, each with 6 x 2TB HDDs plus two decent SSDs in RAID1 for OS + WAL/RocksDB (so your writes are ACK'd right away as if the array were all flash).

You would get 18TB of flash storage for far less than half the cost of doing the entire pool in SSD.  (Plus the added theoretical write durability HDDs bring for your 2nd and 3rd replicas.)
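Rough numbers behind that claim, with made-up per-TB prices purely to show the arithmetic (none of these figures are from the post):

```python
# Back-of-the-envelope cost comparison for 18 TB of client-visible,
# 3x-replicated storage. Prices are illustrative placeholders only.
ssd_cost_per_tb = 400.0   # assumed $/TB for enterprise SSD
hdd_cost_per_tb = 40.0    # assumed $/TB for 7.2k HDD

usable_tb = 18            # 18 x 1 TB SSDs hold the primary copy

all_flash = 3 * usable_tb * ssd_cost_per_tb          # 3 replicas on SSD
mixed     = (usable_tb * ssd_cost_per_tb             # 1 replica on SSD
             + 2 * usable_tb * hdd_cost_per_tb)      # 2 replicas on HDD

print(f"all-flash: ${all_flash:,.0f}  mixed: ${mixed:,.0f}  "
      f"ratio: {mixed / all_flash:.0%}")
```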

This is different from cache tiering, as all your data is on SSD (unless you lose an SSD, in which case it will backfill from HDD).

Explained in this Ceph-users post: http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-June/018487.html

It would be fantastic to have this kind of functionality accessible in the PetaSAN GUI, especially since I'm assuming many people's use case here is VMware/other hypervisor storage.  This makes using SSDs almost affordable since you're really only paying for the SSDs once instead of 3x.

You'd still need to put an SSD or two in each HDD host for WAL/RocksDB, but that's pretty insignificant cost-wise.
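For what it's worth, on BlueStore the DB/WAL device is chosen when the OSD is created, so the HDD OSDs can point at partitions carved out of that RAID1 SSD pair. A rough sketch assuming ceph-volume (Luminous and later), with purely illustrative device paths:

```python
import subprocess

# Illustrative device paths only -- substitute your real HDDs and the SSD
# partitions set aside for RocksDB/WAL.
hdds = ["/dev/sdb", "/dev/sdc", "/dev/sdd"]
db_parts = ["/dev/sde1", "/dev/sde2", "/dev/sde3"]

# One BlueStore OSD per HDD, with its RocksDB (and, by default, its WAL)
# on an SSD partition so small writes are acknowledged from flash.
for data_dev, db_dev in zip(hdds, db_parts):
    subprocess.run(
        ["ceph-volume", "lvm", "create", "--bluestore",
         "--data", data_dev, "--block.db", db_dev],
        check=True,
    )
```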

Thanks for the suggestion, it does look interesting. It is not directly part of pool and CRUSH map customization, but it is related, so I will see if we can also add this.

It would be very nice indeed if we could have this in PetaSAN.