Journal point of failure?
tomtheinfguy
3 Posts
April 26, 2018, 9:42 am
Hi everyone,
I am in the process of planning a major overhaul of our virtualisation platform, including replacing our traditional HP SAN storage with something that is most cost effective when scaling out and that avoids vendor lock in. I have known about Ceph for a long time and have come across PetaSAN.
I have one key question with regards to Journals...
A single SSD journal on a node with, say, 8+ OSDs seems like a big single point of failure to me. Is it still a good idea, considering the point below?
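To make the layout I'm worried about concrete, here is a rough sketch of how a single SSD typically ends up shared by several BlueStore OSDs on one node (device names and sizes are just examples, not our actual hardware):

    # One NVMe SSD carved into DB partitions, one per HDD-backed OSD.
    # If /dev/nvme0n1 dies, every OSD keeping its DB on it goes down with it.
    sgdisk -n 0:0:+60G /dev/nvme0n1
    sgdisk -n 0:0:+60G /dev/nvme0n1
    ceph-volume lvm create --bluestore --data /dev/sdb --block.db /dev/nvme0n1p1
    ceph-volume lvm create --bluestore --data /dev/sdc --block.db /dev/nvme0n1p2
    # ...and so on for the remaining HDDs in the node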
With the new BlueStore engine, I understand the double-write mechanism has been done away with, and Ceph is claiming roughly 2x write performance even compared to the previous FileStore generation with a journal.
My workload is purely a virtualisation workload of various types, nothing very intensive to my knowledge. If I can get away with it, my ideal configuration is just big spinning disks: keep it cheap, keep performance consistent and keep it simple.
Guess I may just have to test the various configurations in the flesh!
Thanks
Tom
Last edited on April 26, 2018, 10:08 am by tomtheinfguy
admin
2,930 Posts
April 26, 2018, 11:09 am
Ceph is designed to handle failures itself and heal, so it is OK for a couple of disks on a single node (or a complete node) to go down. Ceph makes sure it does not keep more than one copy of the data on a single node.
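To illustrate the point about copies, the default replicated CRUSH rule places each replica on a different host, so a decompiled CRUSH map will typically contain something along these lines (exact ids and bucket names vary per cluster):

    rule replicated_rule {
        id 0
        type replicated
        min_size 1
        max_size 10
        step take default
        step chooseleaf firstn 0 type host    # each replica on a different host
        step emit
    }

With 3 copies across 3 or more nodes, losing a journal SSD (and with it all the OSDs behind it) looks like losing one host: the other two copies stay available and Ceph backfills the missing copy once the disks are replaced.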
If you use BlueStore, it is best to use all flash; if you use HDDs, you need a controller with write-back cache as well as SSDs for the WAL/DB (journal). This is because each I/O operation in BlueStore requires many supporting I/Os (DB accesses), and on spinning disks this adds a lot of latency. This is particularly true for VM workloads, which involve a lot of small random block sizes; for streaming or backup applications you can probably get away with pure spinning disks.
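If you do want to test pure spinning disks against an SSD WAL/DB setup yourself, a small random-write run against a mapped rbd image is a reasonable way to approximate a VM workload (a sketch only; /dev/rbd0 here is just an example of a mapped test image):

    # Small random writes, similar in shape to a busy VM disk.
    fio --name=vm-sim --filename=/dev/rbd0 --ioengine=libaio --direct=1 \
        --rw=randwrite --bs=4k --iodepth=16 --numjobs=4 \
        --time_based --runtime=60 --group_reporting

Comparing the completion latency between the two configurations usually shows the gap more clearly than throughput alone.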