
EC RBD over iscsi performance


Most of our focus is on random small-block performance, since it is the most common pattern in virtualized workloads.

The numbers depend a lot on hardware, but in the ballpark for small random writes with spinning HDDs: the journal gives approx 2x, and a cache controller another 2.5-5x. Latency with pure HDDs is 15-25 ms; with both journal and cache we get below 5 ms, with higher-end systems reaching 2 ms. In contrast, all flash (enterprise SSDs) is 0.3-0.5 ms for reads and 1.2-2 ms for writes. These are replicated pools.
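As a reference, here is a minimal sketch of how a small-block latency number like this can be measured with fio against a mapped krbd device (the block size, queue depth, device path and runtime are placeholders, not the exact benchmark we use):

fio --name=small-randwrite --filename=/dev/rbd0 --direct=1 --ioengine=libaio \
    --rw=randwrite --bs=4k --iodepth=1 --numjobs=1 --time_based --runtime=60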

I have been doing tuning tests recently.

The test environment is:

4+2 ec_profile

10TB EC RBD

fio is used to test with the following parameters:

direct=1
ioengine=libaio
numjobs=8
iodepth=16
group_reporting
rw=write/read
bs=4M/1M/512K/128K/64K
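Put together, the fio command line looks roughly like this (the job name and target device are placeholders: the mapped /dev/rbdX device for the krbd runs, the iSCSI /dev/sdX LUN for the others):

fio --name=ec-rbd-test --filename=/dev/rbd0 --direct=1 --ioengine=libaio \
    --numjobs=8 --iodepth=16 --group_reporting --rw=write --bs=4M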

 

Tests are done with the following configurations:

native rbd

tested with krbd

default iscsi

uses the default target and initiator settings
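The current defaults can be listed on the target node with targetcli before changing anything (the IQN below is only an example path; the real one appears under targetcli ls):

targetcli /iscsi/iqn.2016-05.com.example:00001/tpg1 get attribute
targetcli /iscsi/iqn.2016-05.com.example:00001/tpg1 get parameter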

petasan recommended iscsi settings

TPG attribute

set attribute default_cmdsn_depth=256

TPG parameter

set parameter FirstBurstLength=1048576
set parameter HeaderDigest=None
set parameter InitialR2T=No
set parameter MaxBurstLength=1048576
set parameter MaxOutstandingR2T=8
set parameter MaxRecvDataSegmentLength=1048576
set parameter MaxXmitDataSegmentLength=1048576
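These can be applied from the shell with targetcli against the target's TPG (the IQN is a placeholder). Negotiation parameters only apply to new sessions, so the initiator has to log out and back in afterwards:

targetcli /iscsi/iqn.2016-05.com.example:00001/tpg1 set attribute default_cmdsn_depth=256
targetcli /iscsi/iqn.2016-05.com.example:00001/tpg1 set parameter FirstBurstLength=1048576 \
    HeaderDigest=None InitialR2T=No MaxBurstLength=1048576 MaxOutstandingR2T=8 \
    MaxRecvDataSegmentLength=1048576 MaxXmitDataSegmentLength=1048576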

initiator(/etc/iscsi/iscsid.conf)

node.session.cmds_max = 256
node.conn[0].iscsi.MaxRecvDataSegmentLength = 1048576
node.session.iscsi.FirstBurstLength = 1048576
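Note that iscsid.conf normally only affects newly discovered nodes; for an existing node record the values have to be updated and the session re-established, roughly like this (target IQN and portal are placeholders, and the other settings are updated the same way):

iscsiadm -m node -T iqn.2016-05.com.example:00001 -p 192.168.1.10 -o update -n node.session.cmds_max -v 256
iscsiadm -m node -T iqn.2016-05.com.example:00001 -p 192.168.1.10 --logout
iscsiadm -m node -T iqn.2016-05.com.example:00001 -p 192.168.1.10 --login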

4m length setting

TPG attribute

set attribute default_cmdsn_depth=256

TPG parameter

set parameter FirstBurstLength=4194304
set parameter HeaderDigest=None
set parameter InitialR2T=No
set parameter MaxBurstLength=4194304
set parameter MaxOutstandingR2T=8
set parameter MaxRecvDataSegmentLength=4194304
set parameter MaxXmitDataSegmentLength=4194304

initiator(/etc/iscsi/iscsid.conf)

node.session.cmds_max = 256
node.session.iscsi.MaxBurstLength = 4194304
node.conn[0].iscsi.MaxRecvDataSegmentLength = 4194304
node.session.iscsi.FirstBurstLength = 4194304

4m length and 256qd

TPG attribute

set attribute default_cmdsn_depth=256

TPG parameter

set parameter FirstBurstLength=4194304
set parameter HeaderDigest=None
set parameter InitialR2T=No
set parameter MaxBurstLength=4194304
set parameter MaxOutstandingR2T=8
set parameter MaxRecvDataSegmentLength=4194304
set parameter MaxXmitDataSegmentLength=4194304

initiator(/etc/iscsi/iscsid.conf)

node.session.queue_depth = 256
node.session.cmds_max = 256
node.session.iscsi.MaxBurstLength = 4194304
node.conn[0].iscsi.MaxRecvDataSegmentLength = 4194304
node.session.iscsi.FirstBurstLength = 4194304
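For the 256 queue depth variant, the effective per-LUN queue depth on the initiator can be double-checked from sysfs after login (sdX is a placeholder for whatever device the LUN maps to):

cat /sys/block/sdX/device/queue_depth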

The test results:

4M write

name                      bs    io mode  bandwidth (MB/s)  latency (ms)
native rbd                4M    write    907.15            564.16
iscsi-default             4M    write    260.94            1960.69
iscsi-petasan-optimized   4M    write    257.63            1986.7
4m-block-optimized        4M    write    266.35            1921.1
4m-block-256qd-optimized  4M    write    427.74            1196.55

4M read

name                      bs    io mode  bandwidth (MB/s)  latency (ms)
native rbd                4M    read     1113.6            459.54
iscsi-default             4M    read     384.83            1329.95
iscsi-petasan-optimized   4M    read     429.44            1191.9
4m-block-optimized        4M    read     400.26            1278.65
4m-block-256qd-optimized  4M    read     683.84            748.52

1M write

name                      bs    io mode  bandwidth (MB/s)  latency (ms)
native rbd                1M    write    446.32            286.66
iscsi-default             1M    write    428.8             298.44
iscsi-petasan-optimized   1M    write    426.63            299.98
4m-block-optimized        1M    write    446.12            286.87
4m-block-256qd-optimized  1M    write    491.25            260.52

1M read

name                      bs    io mode  bandwidth (MB/s)  latency (ms)
native rbd                1M    read     1048.1            122
iscsi-default             1M    read     654.89            195.4
iscsi-petasan-optimized   1M    read     687.94            186.02
4m-block-optimized        1M    read     684.09            187
4m-block-256qd-optimized  1M    read     737.3             173.59

512k write

name                      bs    io mode  bandwidth (MB/s)  latency (ms)
native rbd                512K  write    248.34            257.65
iscsi-default             512K  write    280.95            227.78
iscsi-petasan-optimized   512K  write    297.32            215.09
4m-block-optimized        512K  write    305.48            209.49
4m-block-256qd-optimized  512K  write    300.86            212.62

512k read

name                      bs    io mode  bandwidth (MB/s)  latency (ms)
native rbd                512K  read     706.3             90.59
iscsi-default             512K  read     545.07            117.4
iscsi-petasan-optimized   512K  read     593.45            107.83
4m-block-optimized        512K  read     588.62            108.72
4m-block-256qd-optimized  512K  read     675.75            94.7

128k write

name                      bs    io mode  bandwidth (MB/s)  latency (ms)
native rbd                128K  write    92.81             172.35
iscsi-default             128K  write    98.01             163.23
iscsi-petasan-optimized   128K  write    103.7             154.24
4m-block-optimized        128K  write    103.3             154.88
4m-block-256qd-optimized  128K  write    104.74            152.53

128k read

name                      bs    io mode  bandwidth (MB/s)  latency (ms)
native rbd                128K  read     271.06            59.02
iscsi-default             128K  read     279.62            57.22
iscsi-petasan-optimized   128K  read     348.87            45.86
4m-block-optimized        128K  read     342.6             46.7
4m-block-256qd-optimized  128K  read     362.62            44.12

64k write

name                      bs    io mode  bandwidth (MB/s)  latency (ms)
native rbd                64K   write    58.93             135.73
iscsi-default             64K   write    66.98             119.42
iscsi-petasan-optimized   64K   write    72.06             111.01
4m-block-optimized        64K   write    68.6              116.61
4m-block-256qd-optimized  64K   write    65.96             121.25

64k read

name                      bs    io mode  bandwidth (MB/s)  latency (ms)
native rbd                64K   read     161.25            49.6
iscsi-default             64K   read     198.62            40.28
iscsi-petasan-optimized   64K   read     246.77            32.42
4m-block-optimized        64K   read     240.92            33.2
4m-block-256qd-optimized  64K   read     239.24            33.43

 

The questions:

1. It seems that when the block size is below 1M, the write performance of native rbd and rbd over iscsi is nearly the same. What is the reason behind this? How can I improve iscsi performance for block sizes above 1M?

2. It seems that when the block size is above 512k, the read performance of native rbd is always better than rbd over iscsi. How can I improve it?

 

Thank you

 

Thanks a lot for sharing this.

Good to hear the iSCSI speed matches native below 1M; as indicated, this is our focus when we tune, since block sizes above this are uncommon in virtualization. Also, initiators can impose a limit below this: VMware, for example, has an iSCSI block size limit of 512k (its default is 128k, but it can be bumped to 512k). It is quite possible your initiator imposed a limit when negotiating.
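One way to check is to dump the negotiated values of the live session on the initiator side, for example:

iscsiadm -m session -P 3 | grep -E 'FirstBurstLength|MaxBurstLength|MaxRecvDataSegmentLength'

If the negotiated MaxBurstLength comes back lower than what was set on the target, the initiator capped it during login.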

For 2), hard to say; maybe it is related to the above?

I do not know your environment, but generally the native Ceph latencies look high. Our benchmarks are done on replicated pools; we see 64k latencies in the 1 ms range for SSDs.

I am using open-iscsi as the initiator.

The environment is a pure HDD cluster, so the latency is not great, especially when using EC.
