SSD Journal sizing, IOPS calc

Looking for some info on SSDs for journals. I know I've seen your recommendation of 1 SSD to 4 HDDs; I'm just wondering whether this is based on consumer SSD throughput or something else.

That is, I'm trying to spec my cluster so that I get good write throughput at minimal cost when I add OSDs. Right now, with high-end consumer MLC SSDs, I'm looking at spending $400-500 on an SSD every time I add 4 disks. Spread among four nodes, this adds up quickly, and it also restricts expansion because I lose a fifth slot for every four disks deployed.

If I step this up to NVMe or PCIe SSDs such as the Intel P3700 or others, is it safe to journal for more than 4 disks and still expect decent write speeds?

Thanks!

I've seen your recommendation of 1 SSD to 4 HDDs, just wondering if this is based on consumer SSD throughput, or something else?

It is based on the write throughput ratio of enterprise SSD to HDD. Note that you need a 20 GB journal partition on the SSD for each HDD, so you do not need a large disk.

If I step this up to NVMe or PCIe SSDs such as Intel P3700 or others, is it safe to journal for more than 4 disks and still expect decent write speeds?

The ratio is 1 NVMe to 12-18 HDDs. This ratio is quoted by Red Hat for both Ceph Jewel (PetaSAN 1.5) and Ceph Luminous (PetaSAN 2.0). We believe it should be lowered for Luminous, but there is no real data to support this, as it depends on the type of workload. As above, you need 20 GB of journal space per OSD.

Lastly, unless you plan to use an all-SSD solution, we recommend you try both v1.5 and v2.0, and if you have a battery-backed controller, enable it.
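
To make the sizing arithmetic concrete, here is a rough Python sketch using only the figures quoted above (20 GB of journal per HDD, 1:4 for SSD, 1:12-18 for NVMe). The function names are made up for illustration, and device sizes are treated as nominal GB, so real usable space will be a little lower.

```python
# Journal space per HDD-backed OSD, as quoted in the reply above.
JOURNAL_GB_PER_OSD = 20


def journal_space_needed(num_hdds, journal_gb=JOURNAL_GB_PER_OSD):
    """Total journal space (GB) needed on one journal device."""
    return num_hdds * journal_gb


def max_hdds_by_capacity(device_gb, journal_gb=JOURNAL_GB_PER_OSD):
    """How many journals fit on a device by capacity alone (nominal GB)."""
    return device_gb // journal_gb


# 1 SSD : 4 HDDs
print(journal_space_needed(4))
# 1 NVMe : 12-18 HDDs
print(journal_space_needed(12), journal_space_needed(18))
# Capacity ceiling for a 256 GB device
print(max_hdds_by_capacity(256))
```

Capacity is rarely the limiting factor here: even 18 journals need only 360 GB. The ratios come from write throughput, so a device that is large enough can still be too slow.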

That is what I wanted to hear. I'm going to shoot for 8-9 disks per 256 GB NVMe.

Thanks!

In what scenario would it make sense to increase the journal partition size? I'm looking at NVMe drives, and it looks like it may be more cost-effective to go with 512 GB models (still using the 1:8 rule) than with lower-speed 256 GB ones. Would I benefit in any way from doubling the journal partition size to 40 GB, so as not to waste half the SSD?

Thanks!

Yes, you can increase it to 40 GB. You can define this value on the tuning page while creating the cluster, or manually in the conf file on each node after cluster creation.

It will have some benefit, but not a very significant one.
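
For a feel of the numbers in this case, the sketch below compares 20 GB and 40 GB journals for 8 HDDs on a nominal 512 GB NVMe. It also converts the size to megabytes, on the assumption that the conf value in question is Ceph FileStore's `osd journal size` option, which is given in MB; whether the PetaSAN tuning page uses the same unit is not stated in this thread. The helper names are hypothetical.

```python
def journal_usage(device_gb, num_hdds, journal_gb):
    """Return (GB used by journals, fraction of the device used)."""
    used_gb = num_hdds * journal_gb
    return used_gb, used_gb / device_gb


def journal_size_mb(journal_gb):
    """Journal size in MB (assumption: the conf option is expressed in MB)."""
    return journal_gb * 1024


for size_gb in (20, 40):
    used_gb, fraction = journal_usage(512, 8, size_gb)
    print(f"{size_gb} GB journals: {used_gb} GB used, "
          f"{fraction:.1%} of the device, "
          f"conf value {journal_size_mb(size_gb)} MB")
```

Even at 40 GB per journal, 8 HDDs leave a good part of a 512 GB device unused, so the larger drive is justified by its speed rather than by the journal space.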