Very slow on HDD. I will close my project
Pages: 1 2
Pavel
10 Posts
May 9, 2021, 7:58 pm
Hi guys!
I have been running PetaSAN for about a year. The system kept growing and I expected performance to grow with it. That did not happen.
A cluster of three nodes, each with 15 SAS 7200 rpm 8 TB drives and replica size 2, delivers ~300 IOPS!! (Windows iSCSI MPIO clients).
20 HDDs or 40 makes no difference. 10G network, 128 GB RAM and 10 cores per node.
Of course, maybe I'm doing something wrong, but I have asked on specialized channels (Red Hat Ceph) and this is the situation for everyone.
Maybe we should admit that this solution does not work on HDDs?
PS: I want to note that I have had no problems with fault tolerance, but that is not enough. I use vSAN in other projects and it is ten times faster for the same sequential reads and writes.
admin
2,930 Posts
May 10, 2021, 2:25 pm
Have you done any benchmark from the UI before? If so, what results did you get for IOPS / throughput? If not, can you run any benchmarks now in production?
For IOPS workloads like virtualisation you should go all-flash; you can get 100K IOPS with 3 nodes. At the very least use some SSD helpers for journal and cache. A 256 GB SSD journal can serve 5 HDDs, at least doubling their IOPS: a pure HDD OSD performs many IOPS for metadata (object size, offset, crc, modified time, etc.) during data I/O, and moving those IOPS to the SSD is really needed.
The only workloads suitable for pure HDD without any SSD helpers are low-IOPS, large-block-size workloads such as backups / video stream recording / S3.
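For reference, if running the UI benchmark in production is a problem, a rough single-thread write IOPS check can be done directly with the rados Python bindings on one of the nodes. This is only a minimal sketch, assuming python3-rados is available and that the pool name (here 'rbd', adjust to your own) can take a few thousand small test objects:

```python
# Rough single-thread 4K write IOPS check using the librados Python bindings.
# Assumes /etc/ceph/ceph.conf is readable and the pool named below exists.
import time
import rados

POOL = 'rbd'          # placeholder pool name, change to a pool on your cluster
COUNT = 1000          # number of 4K test objects to write

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
ioctx = cluster.open_ioctx(POOL)

payload = b'\0' * 4096
start = time.time()
for i in range(COUNT):
    ioctx.write_full('iops_test_%d' % i, payload)
elapsed = time.time() - start
print('~%.0f write IOPS (single thread)' % (COUNT / elapsed))

# remove the test objects again
for i in range(COUNT):
    ioctx.remove_object('iops_test_%d' % i)
ioctx.close()
cluster.shutdown()
```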
Last edited on May 10, 2021, 2:28 pm by admin · #2
Pavel
10 Posts
May 11, 2021, 10:54 am
1-minute, 8-thread benchmark; the client is the 1st node of the 3.
IOPS test: 272 write, 12876 read IOPS.
Utilisation: Mem 99%, CPU 35-60%, OSDs 21-30% write, 9-11% read.
Throughput test: 234 MB/s write, 1031 MB/s read.
Utilisation: Mem 99%, CPU 17-24%, OSDs 22-37% write, 15-22% read.
Cluster under light load.
My workload is a video archive (simply storing large files) plus backups, on separate pools. Storage is 50% full.
Disk utilisation under load is only 15-20% (70% while scrubbing). OSD latency is 50-100 ms.
admin
2,930 Posts
May 11, 2021, 2:00 pm
8 threads: 272 write / 12876 read IOPS -> 1 thread: ~34 write / ~1600 read IOPS.
8 threads: 234 MB/s write / 1031 MB/s read -> 1 thread: ~30 MB/s write / ~128 MB/s read.
The performance per thread looks OK for a pure HDD setup (note writes are lower due to replication).
8 threads are not enough for 40 disks; this is why the disks are not busy, and you will get the same performance even with 20 disks.
You need to put several concurrent operations / threads on each HDD to get the most out of it.
If you test with 128 threads you will get much higher total performance.
If, on the other hand, your workload does not have concurrency / queue depth, then performance is limited by latency, which determines single-thread performance. In that case adding more OSD drives will not help.
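To see the concurrency effect for yourself, here is a minimal sketch (not PetaSAN-specific; the file path is a placeholder) that issues random 4K reads against a large test file on the iSCSI disk with different thread counts, so you can watch aggregate IOPS grow with queue depth on a latency-bound backend:

```python
# Sketch: aggregate random-read IOPS vs. thread count (queue depth effect).
# TEST_FILE is a placeholder: use a large file on the iSCSI disk, ideally
# bigger than client RAM so the page cache does not inflate the numbers.
import os
import random
import time
from concurrent.futures import ThreadPoolExecutor

TEST_FILE = '/path/to/large/test/file'   # placeholder path
BLOCK = 4096
DURATION = 10                            # seconds per thread count

def worker(size):
    fd = os.open(TEST_FILE, os.O_RDONLY)
    ops = 0
    deadline = time.time() + DURATION
    try:
        while time.time() < deadline:
            os.pread(fd, BLOCK, random.randrange(0, size - BLOCK))
            ops += 1
    finally:
        os.close(fd)
    return ops

size = os.path.getsize(TEST_FILE)
for threads in (1, 8, 32, 128):
    with ThreadPoolExecutor(max_workers=threads) as pool:
        total = sum(pool.map(worker, [size] * threads))
    print('%3d threads: ~%.0f read IOPS' % (threads, total / DURATION))
```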
To lower latency, at least add journals: a 512 GB SSD will serve 5 HDDs and will lower latency by at least 2x.
For large-block-size, low-concurrency workloads you can also consider using RAID 0 to speed up your OSDs.
In your case it also helps to change MaxTransferLength in the registry for the Windows iSCSI initiator:
default is 0x00040000 hex (262144 bytes)
change to 0x00400000 hex (4194304 bytes)
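The value can be changed with regedit or scripted. The sketch below uses Python's winreg module; the registry path and the "0000" instance are assumptions on my side, so check which instance under that class key belongs to the Microsoft iSCSI Initiator on your client before applying it:

```python
# Sketch: set the Microsoft iSCSI Initiator MaxTransferLength to 4 MiB.
# Run as Administrator on the Windows client; a reboot is usually required.
# The "0000" instance is a placeholder: pick the subkey whose DriverDesc
# matches the Microsoft iSCSI Initiator on your machine.
import winreg

KEY = (r'SYSTEM\CurrentControlSet\Control\Class'
       r'\{4D36E97B-E325-11CE-BFC1-08002BE10318}\0000\Parameters')

with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, KEY, 0,
                    winreg.KEY_SET_VALUE) as key:
    winreg.SetValueEx(key, 'MaxTransferLength', 0,
                      winreg.REG_DWORD, 0x00400000)  # 4194304 bytes

print('MaxTransferLength set to 0x00400000 (4194304 bytes)')
```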
Last edited on May 11, 2021, 3:08 pm by admin · #4
admin
2,930 Posts
May 11, 2021, 11:36 pm
In addition, in your case you may want to create a RAID volume on the Windows clients to increase the queue depth.
Pavel
10 Posts
June 3, 2021, 8:08 pm
I have started evacuating the data; there is no difference in speed whether I read from two clients or from one.
50 MB/s, 200 IOPS average, scrubbing off.
admin
2,930 Posts
June 4, 2021, 1:12 pm
Is this 50 MB/s in total for all clients, or per client?
As per earlier: have you tried an SSD journal? If you are using a large block size, did you increase the registry setting?
AlbertHakvoort
21 Posts
June 30, 2021, 10:44 am
I have the same experience with a hard-disk-only setup (3 nodes with 4 x 12 TB).
admin
2,930 Posts
June 30, 2021, 3:14 pm
Did any of the previous replies help?
AlbertHakvoort
21 Posts
July 1, 2021, 12:13 pm
I've added SSDs for journal and caching; the speed increased slightly but is still too slow.