Hardware compatibility - expected performance (Bluestore/Filestore) for production use
wailer
75 Posts
April 25, 2018, 11:38 am
Hi,
We are planning to deploy a PetaSAN cluster for production use in our datacenter, but we are not confident about what level of performance we can expect. Is there any reference hardware setup we can look at so there are no surprises about its performance? We are currently using an HP MSA SAN with RAID10 and 2x1Gbit iSCSI links.
For instance, we plan to buy:
3 x HP DL380 Gen10 storage nodes with 12 x SAS HDDs + 3 x HP 480GB SSDs (to start with)
The RAID controller would support JBOD and RAID with cache
128GB RAM
2 x Xeon 8 core CPU
The monitors will be VMs spread across our vSphere cluster to ensure HA.
For network we are planning to use:
Backend: 2x10G links
iSCSI: 2x10G links
Management: 4x1G links
In terms of pricing, we are in the same price range as a new SAN array, around $40K. Keeping that in mind, what are your thoughts? We like the Ceph approach in terms of scalability, but is it really worth trying? Would we get better performance, or at least the same as our old SAN with 2x1Gbit links?
Thanks,
Last edited on April 25, 2018, 11:46 am by wailer · #1
admin
2,930 Posts
April 25, 2018, 10:49 pm
Would we get better performance or at least the same as our old SAN with 2x1Gbit links?
You should.
Is there any reference hardware setup we can look at to have no surprises about its performance?
There are various performance charts for Ceph online; the results vary quite a lot based on hardware, and some hardware vendors have Ceph-tuned reference implementations. The iSCSI layer in PetaSAN does lower the numbers: by up to 50% at small 4k block sizes and 25% at 64k block sizes, while at 4M block sizes it incurs no overhead. For VMware VMs you would expect to see about 75% of native Ceph RADOS performance via iSCSI.
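If you want to check these ratios on your own hardware before committing, you can compare native Ceph performance against the same workload over the iSCSI path. This is a rough sketch only; the pool name "rbd" and the client-side device path are placeholders for your setup:
rados bench -p rbd 60 write -b 4096 -t 16 --no-cleanup   # native RADOS, 4k objects, 16 threads, 60 seconds
rados bench -p rbd 60 rand -t 16                         # random reads against the objects just written
rados -p rbd cleanup                                     # remove the benchmark objects
fio --name=iscsi-4k --filename=/dev/sdX --rw=randwrite --bs=4k --iodepth=16 --ioengine=libaio --direct=1 --runtime=60 --time_based   # same block size from a client against the iSCSI LUN (destructive, use a test disk)
The gap between the two sets of numbers is roughly the iSCSI overhead on your gear.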
For the best IOPS per dollar in Ceph, it is best to use all flash. The Bluestore engine is being tuned for such hardware, and the majority of new Ceph installations in the near future will be all flash. HDDs will give good throughput with Ceph at larger block sizes, but to get decent VM-storage performance from HDDs in Ceph you need controllers with battery-backed write-back cache plus SSDs as journal/WAL devices for the HDDs (roughly 1 SSD per 4 HDDs). You may also consider increasing the HDD count per node beyond 12 to increase performance.
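With HDDs that means one journal/WAL partition per HDD-backed OSD, with each SSD shared by roughly 4 HDDs. PetaSAN's own disk management normally sets this up for you; purely to illustrate the Ceph-level layout with plain Ceph tooling (device names are placeholders):
ceph-volume lvm create --bluestore --data /dev/sdb --block.db /dev/sdg1   # HDD OSD, WAL+DB on SSD partition 1
ceph-volume lvm create --bluestore --data /dev/sdc --block.db /dev/sdg2   # HDD OSD, WAL+DB on SSD partition 2
# ...repeat per HDD, keeping no more than about 4 HDDs per journal SSD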
Last edited on April 25, 2018, 10:59 pm by admin · #2
wailer
75 Posts
April 26, 2018, 7:59 am
Thanks for your hints.
About all-flash deployments: in case we go for Bluestore, would it be enough to use an SSD drive (Samsung PM863, 500 MB/s read / 400 MB/s write) for the WAL, or would putting in an NVMe device make a big difference?
admin
2,930 Posts
April 26, 2018, 11:14 am
Most current deployments use pure SSDs with the WAL/DB collocated, and that works very well. Red Hat (the owners of Ceph) does, however, recommend one NVMe device per 4 SSDs for WAL/DB, so it is better, but it may not be worth the effort: on most hardware your cluster will be CPU saturated at small block sizes (e.g. virtualization) and network saturated at large block sizes (e.g. streaming/backup).
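An easy way to see which limit you hit first is to watch CPU, network and disk on an OSD node while a benchmark or real load is running. A minimal sketch using the standard sysstat tools (not PetaSAN-specific):
mpstat -P ALL 2   # per-core CPU usage, small-block/virtualization loads tend to saturate this first
sar -n DEV 2      # per-NIC throughput, large-block streaming/backup loads tend to saturate this first
iostat -x 2       # per-disk utilization and latency
If the cores or the NICs are already pegged with plain SSDs, moving the WAL/DB to NVMe will not buy you much.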