About PetaSAN, Ceph and iSCSI
shadowlin
67 Posts
January 8, 2018, 9:39 am
I have been playing with different fio iodepth and numjobs settings to find the maximum throughput.
Should we consider latency when trying to find the maximum throughput? When I increased iodepth and numjobs, throughput increased but latency also increased.
admin
2,933 Posts
January 8, 2018, 10:12 am
You are correct. IOPS is the inverse of latency when the io depth is 1, but at higher depths they are not directly related, which is why both are needed to describe performance.
You can maximize IOPS by adding "more" disks / cpu resources. The only way to improve latency is to use faster "types" of cpu (faster frequencies, not core count) / disk / network, which is generally more expensive.
Generally, with many concurrent users, IOPS is the most common measure. You would care about latency when you have specific requirements placed on applications such as databases.
Note: v1.5 has a lot of improvements in IOPS and latency performance.
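The depth-1 relation described above can be sketched with Little's Law (a standard queueing identity, not something stated in the thread): sustained IOPS ≈ outstanding IOs / average per-IO latency.

```python
# Little's Law sketch: IOPS = outstanding IOs / average per-IO latency.
# At io depth 1, IOPS is simply the inverse of latency; at higher depths,
# IOPS and latency can rise together, so neither alone describes performance.

def iops(queue_depth: int, avg_latency_s: float) -> float:
    """Steady-state IOPS for a given queue depth and average IO latency."""
    return queue_depth / avg_latency_s

# Depth 1 with 1 ms latency: 1000 IOPS, exactly the inverse of latency.
print(iops(1, 0.001))    # 1000.0
# Depth 32 with latency grown to 4 ms: IOPS still climbs to 8000.
print(iops(32, 0.004))   # 8000.0
```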
Last edited on January 8, 2018, 10:17 am by admin · #22
shadowlin
67 Posts
January 8, 2018, 10:33 am
I want to benchmark my cluster's maximum throughput with fio. At the beginning I found that IOPS doubled when I doubled the numjobs or iodepth setting, but latency also doubled.
Beyond a certain point, doubling numjobs or iodepth gains only a few more IOPS but adds a lot of latency.
How do I know when to stop increasing numjobs and iodepth? Should I keep increasing iodepth, numjobs and clients until there is no noticeable IOPS gain, even though latency would be very high?
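One hedged way to answer the "when to stop" question is to sweep the queue depth and stop at the knee of the IOPS curve, where doubling the depth no longer yields a near-proportional gain. The sweep numbers and the 1.5x threshold below are illustrative assumptions, not measurements from this thread.

```python
# Hypothetical (depth, measured IOPS) pairs from a benchmark sweep.
sweep = [(1, 1000), (2, 1950), (4, 3700), (8, 4100), (16, 4200)]

def knee(points, min_gain=1.5):
    """Return the last depth whose IOPS gain over the previous step is at
    least min_gain; beyond it, extra depth mostly adds latency."""
    best = points[0][0]
    for (d0, i0), (d1, i1) in zip(points, points[1:]):
        if i1 / i0 >= min_gain:
            best = d1
        else:
            break
    return best

print(knee(sweep))  # 4: past depth 4, doubling the depth barely moves IOPS
```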
admin
2,933 Posts
January 8, 2018, 1:01 pm
It really depends on what you want to achieve. For some, the goal is to show the highest IOPS possible for marketing purposes. The correct approach is to measure how well your system handles your workload. So what is your expected workload io depth? Block size? Read/write mix percentage? Do you have a latency requirement you do not want to exceed?
Last edited on January 8, 2018, 1:01 pm by admin · #24
shadowlin
67 Posts
January 25, 2018, 6:36 am
I have been testing my cluster for some time and have some results.
The cluster has 142 OSDs (10TB HDDs each), and the pool size is 2.
fio parameter:
[4m]
description="rbd 4m-seq-write"
direct=1
ioengine=libaio
directory=/mnt/rbd_benchmark/fio_benchmark/4m/
numjobs=58
iodepth=4
group_reporting
rw=write
bs=4M
size=20G
5 clients (each with a 10G NIC) were used in the test.
The 4M write test result is:
All clients: (groupid=0, jobs=5): err= 0: pid=0: Mon Jan 15 23:27:57 2018
write: io=6227.8GB, bw=3335.1M/s, iops=795, runt=1866853msec
slat (usec): min=94, max=7059.2K, avg=231179.41, stdev=376712.11
clat (msec): min=154, max=40593, avg=1120.65, stdev=1722.16
lat (msec): min=154, max=41450, avg=1351.83, stdev=1804.07
bw (MB /s): min= 0, max= 0, per=0.41%, avg= inf, stdev= inf
lat (msec) : 250=2.25%, 500=26.95%, 750=23.59%, 1000=13.37%
The throughput looks good, but the latency is high.
I am not sure whether the latency is within a reasonable range, because I can't find a similar-sized cluster to compare against.
admin
2,933 Posts
January 25, 2018, 2:07 pm
Well, it is good you like the throughput values.
Some thoughts:
The 3.3 GB/s write is net to the client and is the max your existing cluster is capable of. Internally, raw writes will be 6.6 GB/s due to the 2 replicas, and if you have collocated journals, raw disk writes will be about 13 GB/s.
Latency is in line with your throughput and io depths. Your total io queue depth is 5 clients x 58 jobs x 4 io depth = 1160. Each io is competing for the 3.33 GB/s, so on average each gets about 2.88 MB/s; since the io block size is 4M, your average latency will be 4 / 2.88 ≈ 1.39 sec, which is what you see.
I would recommend decreasing the io depths so that each io gets more of the bandwidth, which lowers latency. Lowering the max io depth will slightly decrease the max throughput, but it should not be affected that much. You should decide on some goals, such as the minimum throughput each client should see and the maximum latency to tolerate; this will determine how much io depth you can allow for your current setup. I would also recommend testing latency at small block sizes such as 32k; at that level, latency is not directly determined by throughput.
To achieve higher performance, observe your load charts (cpu/net/disk utilization) and see what bottleneck exists to determine what hardware needs to be augmented: if disk utilization is high you need more disks; if cpu is high you need more nodes, etc.
How many iSCSI service nodes are you using? The current io depth per disk per node is 128; you can change it in:
/opt/petasan/config/tuning/current/lio_tunings
"queue_depth": 128
"default_cmdsn_depth": 64
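The back-of-envelope numbers in this post (replica amplification and the queue-depth latency estimate) can be reproduced in a few lines; this is just the post's arithmetic restated, with rounding made explicit.

```python
# Replica amplification: 2 replicas double network writes, and collocated
# journals double raw disk writes again.
net_write_gb_s = 3.3
raw_replica_gb_s = net_write_gb_s * 2        # 6.6 GB/s across the OSDs
raw_disk_gb_s = raw_replica_gb_s * 2         # ~13 GB/s with collocated journals

# Queue-depth latency estimate: each outstanding IO shares the cluster
# bandwidth, so latency ~= block size / per-IO bandwidth.
total_depth = 5 * 58 * 4                     # clients x jobs x iodepth
per_io_mb_s = 3335.1 / total_depth           # MB/s each outstanding IO gets
est_latency_s = 4 / per_io_mb_s              # seconds to move one 4M block

print(total_depth)                           # 1160
print(round(per_io_mb_s, 2))                 # 2.88
print(round(est_latency_s, 2))               # 1.39, close to the measured 1.35 s
```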
Last edited on January 25, 2018, 2:17 pm by admin · #26
shadowlin
67 Posts
January 26, 2018, 9:17 am
Thanks for the info; it is really helpful.
The test was done with rbd only, as a baseline. I will do more tests with iSCSI next.