Performance expectations?
marko
5 Posts
March 4, 2018, 2:05 am
First to the PetaSAN devs: this is fantastic! It's exactly what I've been looking for for so long as a XenServer user who can't easily access Ceph via RBD directly.
I'm looking to build a PetaSAN cluster and was wondering roughly what kind of performance I could expect with a small cluster as follows:
3 nodes, dual E5640 CPUs (4-6 core), 32GB DDR3 ECC, 8 x 4TB HGST 7200RPM SATA NAS-grade drives, dual 10 gigabit ethernet, two of the nodes will share the iSCSI target service.
I've got a Dell Equallogic with 24 x 1TB 7200 RPM SAS drives (in RAID10) as my current iSCSI target.
With Bluestore, but without any external SSD write-ahead journaling, would I get fairly close to the same performance as the Equallogic? Blow away the Equallogic? Or nowhere near?
I'm just wondering whether I need way more heads and an SSD/NVMe WAL before it performs reasonably, or whether my bottlenecks would be elsewhere (e.g. 32GB of RAM may not be enough).
I had tried Ceph with Proxmox a few years ago (probably pre-Jewel) with just 3 drives x 3 nodes on gigabit connections, and it wasn't too bad running a few VMs, though none of them were really heavy-duty IOPS hogs either.
Thanks in advance!!
admin
2,930 Posts
March 4, 2018, 10:09 am
A traditional SAN will give better latency than a distributed SDS system like Ceph: the IO goes straight to the controller and then to disk over a SAS cable, and replication is done in hardware. The data path in Ceph is much more complex: you have software processes (OSD daemons) that send messages across the wire, and in addition Bluestore uses a transactional DB to store IO metadata, which requires several IO operations itself. The traditional SAN will beat a scale-out SDS solution on IO latency for sure. Equallogic uses shared SAS ports to share storage, so it works the same as a traditional SAN; you will get better latency and better IOPS per disk with such setups. Another thing: under RAID a single IO operation/thread can get more than a single disk's performance, while in Ceph a single IO thread gets at most one disk's performance for reads and roughly a third of that for writes (with 3x replication).
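As a rough illustration of that rule of thumb, here is a minimal back-of-envelope sketch; the per-disk IOPS figure is an assumed ballpark for a 7200 RPM HDD, not a measurement, so plug in your own benchmark numbers:

```python
# Back-of-envelope ceilings for the proposed 3-node, 8-HDD-per-node cluster,
# following the single-thread rule of thumb above. HDD_IOPS is an assumed
# ballpark for one 7200 RPM SATA drive -- substitute measured values.

HDD_IOPS = 150          # assumed small-block IOPS of one 7200 RPM HDD
REPLICA_COUNT = 3       # Ceph default replication size
NODES = 3
OSDS_PER_NODE = 8

# A single client IO thread is capped at roughly one disk for reads,
# and one disk divided by the replica count for writes.
single_thread_read_iops = HDD_IOPS
single_thread_write_iops = HDD_IOPS / REPLICA_COUNT

# Aggregate ceiling across many parallel clients (ignores CPU, network
# and Bluestore metadata overhead, so treat it as an upper bound only).
total_osds = NODES * OSDS_PER_NODE
aggregate_read_iops = total_osds * HDD_IOPS
aggregate_write_iops = total_osds * HDD_IOPS / REPLICA_COUNT

print(f"single-thread read ceiling : ~{single_thread_read_iops:.0f} IOPS")
print(f"single-thread write ceiling: ~{single_thread_write_iops:.0f} IOPS")
print(f"aggregate read ceiling     : ~{aggregate_read_iops:.0f} IOPS")
print(f"aggregate write ceiling    : ~{aggregate_write_iops:.0f} IOPS")
```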
Ceph, however, allows you to scale out without limit, giving you unbounded total cluster performance and, in most cases, better performance per dollar. It handles a large number of clients/VMs very well, but if you have a few high-performance applications it will not be as good. A well-tuned Ceph cluster should meet most demands; the benchmark page in PetaSAN will give you a lot of info on what your cluster is capable of and which resources need to be tuned. Ceph is also self-healing and handles recovery very well without affecting client IO the way a RAID rebuild does.
Bluestore is perfect when using all SSDs. The latency issues described above may give you low IOPS results with plain spinning HDDs; it will make a big difference if you use a RAID controller with battery-backed write-back cache and set up your disks as single-disk RAID0 volumes. Using an external SSD for the WAL/DB will help, but not as much as write-back cache. With JBOD pass-through HDDs + SSD journals you may find that Filestore (PetaSAN v1.5) gives better IOPS performance in some cases. If you run the PetaSAN IOPS benchmarks on your proposed setup, I would expect them to show your disks saturating as the bottleneck while your CPU/RAM/network are not. In that case you should beef up the disks with write-back cache (if possible), SSD journals, and more HDDs.
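If you want a quick second opinion on disk saturation outside the PetaSAN UI, here is a minimal sketch to run on an OSD node while a benchmark is in flight; the device names are examples only, so adjust them for your hardware:

```python
#!/usr/bin/env python3
# Minimal disk-utilisation sampler: reads /proc/diskstats periodically and
# reports the percentage of each interval a device spent busy with IO
# (similar to iostat %util). Device names are example assumptions.
import time

DEVICES = ["sdb", "sdc", "sdd"]   # assumed OSD data disks, adjust as needed
INTERVAL = 5.0                    # seconds between samples

def io_ticks():
    """Return {device: milliseconds spent doing IO} from /proc/diskstats."""
    ticks = {}
    with open("/proc/diskstats") as f:
        for line in f:
            fields = line.split()
            name = fields[2]
            if name in DEVICES:
                ticks[name] = int(fields[12])   # 13th field: ms spent doing IO
    return ticks

before = io_ticks()
while True:
    time.sleep(INTERVAL)
    after = io_ticks()
    for dev in DEVICES:
        busy_ms = after.get(dev, 0) - before.get(dev, 0)
        util = 100.0 * busy_ms / (INTERVAL * 1000.0)
        print(f"{dev}: ~{util:.0f}% busy")
    before = after
```

If the OSD disks sit near 100% busy while CPU and network stay low, the disks are the bottleneck, which matches the expectation above.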
If your workload is all large block sizes, such as over 1M (which I doubt for VM loads), latency will not be a large factor and you may find plain HDDs good enough.
marko
5 Posts
March 4, 2018, 9:11 pm
Thank you for the very informative response.
So what I'm hearing is that if I went ahead and deployed PetaSAN 2.0 on my hardware above, using a traditional RAID controller with the drives set up as single-disk RAID0 volumes to take advantage of the BBWC, I would get reasonable performance for 'average' workloads. I intend to keep my EQL arrays around anyway, so I could simply use them for the busier workloads.
Or, should I want to go all-PetaSAN, a pure SSD configuration would still beat a 24-drive Equallogic in RAID10 with 7200 RPM SAS drives for the higher-IO VMs. Does PetaSAN allow me to create multiple pools in the same cluster and assign them to different iSCSI targets, or would I need to create two parallel PetaSAN installs, one for SSD and one for HDD?
Thanks again for your work on this, I have a feeling this is going to become a very popular product... 🙂
admin
2,930 Posts
March 5, 2018, 9:04 am
If you can have an all-SSD configuration, you will not want anything else. The next version (2.1) will have support for mixed pools; currently you would need two different installs.
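For background, upstream Ceph (Luminous and later) already separates pools by device class with CRUSH rules; a minimal sketch of the commands involved, wrapped in Python, might look like the following. The pool names, PG counts and rule names are illustrative assumptions, and PetaSAN's own tooling may manage this differently:

```python
#!/usr/bin/env python3
# Sketch only: separating SSD and HDD pools via CRUSH device classes in
# upstream Ceph (Luminous+). Names and PG counts are placeholders; this is
# background on the mechanism, not a PetaSAN procedure.
import subprocess

def ceph(*args):
    """Run a ceph CLI command and raise if it fails."""
    subprocess.run(["ceph", *args], check=True)

# One CRUSH rule per device class, replicating across hosts.
ceph("osd", "crush", "rule", "create-replicated", "ssd_rule", "default", "host", "ssd")
ceph("osd", "crush", "rule", "create-replicated", "hdd_rule", "default", "host", "hdd")

# One pool per tier, each bound to its rule (PG counts are placeholders).
ceph("osd", "pool", "create", "fast_pool", "128", "128", "replicated", "ssd_rule")
ceph("osd", "pool", "create", "slow_pool", "256", "256", "replicated", "hdd_rule")
```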
BonsaiJoe
53 Posts
March 5, 2018, 4:44 pm
This sounds great! Do you have any release date for 2.1? 😉
marko
5 Posts
March 6, 2018, 7:26 pm
Support for all SSD and all HDD pools sounds great. I'd like to be able to manually tier stuff depending on the workload.
One last question: is it possible to mix and match iSCSI and direct RBD consumption in PetaSAN? Basically, I have some XenServer, some VMware, and some Proxmox (KVM). I'd like to have Proxmox talk RBD directly to the cluster, while giving XenServer and VMware their own iSCSI LUNs.
admin
2,930 Posts
March 6, 2018, 8:35 pm
Yes, you can mix. Proxmox RBD images will show up in the PetaSAN UI as "detached" images, which means they do not have any iSCSI metadata assigned. If you ever need to serve these images via iSCSI, you would "attach" and "start" them. The reverse is also true: you can detach an existing iSCSI disk created in PetaSAN and have Proxmox use it via direct RBD.
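For reference, direct RBD access from a client can be as simple as the sketch below, which uses the standard python-rados and python-rbd bindings to list the images in a pool. The pool name and ceph.conf path are assumptions, and PetaSAN's attached/detached iSCSI state is kept in its own metadata, not queried here:

```python
#!/usr/bin/env python3
# Minimal direct-RBD client sketch using python-rados / python-rbd
# (packages python3-rados and python3-rbd). It only lists the RBD images
# in a pool; pool name and conf path are assumed values.
import rados
import rbd

POOL = "rbd"                      # assumed pool name
CONF = "/etc/ceph/ceph.conf"      # assumed cluster config path

cluster = rados.Rados(conffile=CONF)
cluster.connect()
ioctx = cluster.open_ioctx(POOL)
try:
    for name in rbd.RBD().list(ioctx):
        image = rbd.Image(ioctx, name)
        try:
            size_gib = image.size() / (1024 ** 3)
            print(f"{name}: {size_gib:.1f} GiB")
        finally:
            image.close()
finally:
    ioctx.close()
    cluster.shutdown()
```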