
Caching using SSDs or NVRAM cards


In the download section, missing caching support is listed as an issue, but from my point of view it is a missing feature.

Adding a caching feature would be great. So what is the roadmap for this feature?

I suppose the demand for having several tiers will rise afterwards 🙂

Regards

Juergen

We will soon add support for caching OSDs using SSDs/NVRAM. This is different from cache tiering, where caching is done at the pool level rather than the device level. This seems to be the current recommendation from Red Hat; they no longer offer cache tiering in their commercial version.

How soon is soon? PetaSAN sounds very interesting, but I have some slow SATA disks and really need SSD caching before it is worth it for me to give it a try.

It will be in either v1.4 or v1.5, so approximately 1.5 or 3 months from now.

We are still prioritizing the 1.4 feature set.

Sounds good. Thanks for the quick reply.

Quote from admin on May 28, 2017, 2:24 pm

We will soon add support for caching OSDs using SSDs/NVRAM. This is different from cache tiering, where caching is done at the pool level rather than the device level. This seems to be the current recommendation from Red Hat; they no longer offer cache tiering in their commercial version.

Hi, I am reasonably familiar with cache tiering, but can you please elaborate on what you mean by "caching OSDs"? (Or point me at the appropriate Ceph docs.)

Thanks, Jim

I suppose the place where the journaling information of the OSD daemons gets written can be defined, so that it can be directed to a selectable drive such as an SSD or NVMe. From my point of view, this has two drawbacks:

- when this SSD/NVMe crashes, the data of all OSDs that refer to it is lost.

- only reasonable with NVMe, because an SSD's throughput is ~500 MB/s; if you have more than 5 SATA drives (each ~100 MB/s), it is better to store the journal on the SATA drives themselves, otherwise the SSD becomes a bottleneck.
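To make the bandwidth argument above concrete, here is a rough back-of-the-envelope model. The ~500 MB/s and ~100 MB/s figures come from the post; the NVMe figure is an illustrative assumption for comparison:

```python
# Rough throughput model for an external journal device. The per-device
# figures are illustrative assumptions from the discussion, not measurements.
SSD_MBPS = 500    # sequential write throughput of one SATA SSD
NVME_MBPS = 2000  # assumed throughput of a typical NVMe device
HDD_MBPS = 100    # sequential write throughput of one SATA HDD

def journal_is_bottleneck(num_hdds, journal_mbps, hdd_mbps=HDD_MBPS):
    """Every write hits the journal first, so the journal device must
    absorb the combined ingest of all OSDs it serves."""
    return num_hdds * hdd_mbps > journal_mbps

print(journal_is_bottleneck(5, SSD_MBPS))   # 5 x 100 = 500 -> SSD keeps up
print(journal_is_bottleneck(6, SSD_MBPS))   # 6 x 100 = 600 > 500 -> SSD limits throughput
print(journal_is_bottleneck(6, NVME_MBPS))  # NVMe still has headroom
```

This matches the poster's cutoff: at more than 5 SATA drives per SATA SSD, the journal device caps aggregate write bandwidth.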

Why is cache tiering not recommended? (http://docs.ceph.com/docs/master/rados/operations/cache-tiering/)


Regards Juergen


According to

https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/2.0/html/release_notes/deprecated_functionality

Cache tiering is now deprecated

The RADOS-level cache tiering feature has been deprecated. The feature does not provide any performance improvements for most Ceph workloads and introduces stability issues to the cluster when activated.

Alternative cache options will be provided by using the dm-cache feature in future versions of Red Hat Ceph Storage.

Our plan is to support journal caching for existing OSDs that use the Filestore engine, as well as caching of the WAL and RocksDB for Ceph Luminous, which uses the new Bluestore storage engine; its idea is to bypass the journal and achieve transactions via RocksDB. But Luminous is already quite delayed, and maybe we will not include it in 1.5. For existing clusters that are upgraded, it will be possible to have older OSDs using Filestore coexist with newer OSDs using Bluestore; we will also allow the older OSDs to use external SSDs for the journal. At least, that is what we plan.

For Bluestore, apart from the WAL and RocksDB, in the future there could be a block-level cache on top of the OSD data using dm-cache/bcache/flashcache. But this is a future thing we are following, so we can integrate it in PetaSAN once it is supported/qualified in Ceph.

Note that putting journals on SSDs does improve performance for spinning disks: collocating the journal on the same device reduces the disk's sequential write bandwidth by half, and its random IOPS by up to 3 times (you need 3 seeks per write operation).
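The colocation penalty described above can be sketched as a simple model (the factors of 2 and 3 come from the post; the raw disk figures are illustrative assumptions):

```python
# Sketch of the Filestore colocation penalty: with the journal on the same
# spindle, every byte is written twice (journal, then data), and each write
# op costs roughly 3 seeks. Raw disk figures are illustrative assumptions.
def effective_write_mbps(disk_mbps, external_journal):
    # Colocated journal: the disk absorbs every byte twice -> half bandwidth.
    return disk_mbps if external_journal else disk_mbps / 2

def effective_write_iops(disk_iops, external_journal):
    # Colocated journal: ~3 seeks per write op -> roughly a third of raw IOPS.
    return disk_iops if external_journal else disk_iops / 3

print(effective_write_mbps(150, external_journal=False))  # 75.0
print(effective_write_iops(120, external_journal=False))  # 40.0
print(effective_write_mbps(150, external_journal=True))   # 150
```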

OK, so in layman's terms we are talking about a write journal and metadata on SSD. I'm planning some hardware for deployment later in the year, by which time I hope the new SSD caching capability will be included in PetaSAN. I'm going to plan on 1% SSD for now, unless anyone else has a better suggestion?

Thanks, Jim

For journals, the common ratio for SSD:HDD is 1:4-5; for NVMe:HDD it is 1:16-20.
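Combining these ratios with Jim's 1%-of-raw-capacity rule of thumb, a quick sizing sketch might look like this (the conservative end of each ratio range is assumed):

```python
# Journal-device sizing sketch using the ratios above (SSD:HDD ~ 1:4-5,
# NVMe:HDD ~ 1:16-20) and the 1%-of-raw-capacity rule of thumb from the
# thread. Ratios and percentages are planning assumptions, not guarantees.
import math

def journal_devices_needed(num_hdds, kind):
    ratio = {"ssd": 4, "nvme": 16}[kind]  # conservative end of each range
    return math.ceil(num_hdds / ratio)

def ssd_capacity_gb(raw_tb, percent=1.0):
    # SSD capacity as a percentage of raw cluster capacity.
    return raw_tb * 1000 * percent / 100

print(journal_devices_needed(12, "ssd"))   # 3 SSDs for 12 HDDs
print(journal_devices_needed(12, "nvme"))  # 1 NVMe for 12 HDDs
print(ssd_capacity_gb(48))                 # 480.0 GB for 48 TB raw
```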
