
Built-in benchmark is using multiple nodes only one at a time in series, shouldn't it be testing from all clients in parallel?

Hi guys, great work on PetaSAN and thanks for making it available.

I'm playing around with the built-in benchmark on a 6-node cluster, 3 of which run the data services while the other 3 are left without any services so I can use them as benchmarking servers.

When I select the 3 service-less servers to run an RBD 4M benchmark on a test pool, I see the resource utilisation go up on only 1 benchmark server at a time, not on all 3 simultaneously.

I'm limited to just over 1000 MB/s in the benchmark result no matter how many threads or benchmark servers I choose... which looks very much like the 10GbE limit of a single benchmark server (roughly 1.25 GB/s of raw line rate).

Is there a quick fix or alternative solution I can try?

Thanks

It should run them in parallel.

Can you check on the client nodes:

ps aux | grep rados

It should show the rados bench commands running on all clients.

You can also run a test for 5 minutes and look at the node stats charts on the dashboard to check the network traffic.
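If it is easier, a small loop like the one below should show whether rados bench is active on all clients at the same time (the hostnames are only placeholders and it assumes ssh access between the nodes):

for h in client1 client2 client3; do
    echo "== $h =="
    # the [r] keeps grep from matching its own process line
    ssh "$h" 'ps aux | grep "[r]ados bench"'
done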

I can confirm that writes seem to be happening from all three benchmark clients, but when it switches to reads I get this odd behavior where only a single benchmark client is doing work.

https://ibb.co/7WvRCX2

 

Did you run

ps aux | grep rados

Does the test happen in parallel across nodes? Or does it run on only 1 node, or 1 node at a time?

Yes, all 3 benchmark clients run rados during the write phase, confirmed with ps.

But during the read phase, two of the clients no longer run rados and only a single client runs the rest of the benchmark.

So you cannot reproduce this behavior on one of your own clusters? I'm running PetaSAN 3.1.

Thanks.

It seems rados bench itself is erroring out...

I tried running this on all 3 clients:
rados bench -p test 60 write --no-cleanup
Which ran without issues on all 3 clients.
Followed by:
rados bench -p test 60 seq

Which failed on clients 1 and 2 with this message:
root@petasanbench01:~# rados bench -p test 60 seq
hints = 1
sec Cur ops started finished avg MB/s cur MB/s last lat(s) avg lat(s)
0 0 0 0 0 0 - 0
benchmark_data_petasanbench01_378340_object1 is not correct!
read got -2
error during benchmark: (2) No such file or directory
error 2: (2) No such file or directory

and

root@petasanbench02:~# rados bench -p test 60 seq
hints = 1
sec Cur ops started finished avg MB/s cur MB/s last lat(s) avg lat(s)
0 13 13 0 0 0 - 0
benchmark_data_petasanbench02_378340_object1 is not correct!
read got -2
error during benchmark: (2) No such file or directory
error 2: (2) No such file or directory

and lastly the working 3rd client:

root@petasanbench03:~# rados bench -p test 60 seq
hints = 1
sec Cur ops started finished avg MB/s cur MB/s last lat(s) avg lat(s)
0 0 0 0 0 0 - 0
1 16 195 179 715.659 716 0.064593 0.0810232
[...]
18 15 3969 3954 878.17 832 0.0283596 0.0718273
Total time run: 18.5487
Total reads made: 4073
Read size: 4194304
Object size: 4194304
Bandwidth (MB/sec): 878.335
[...]

This was on a fresh EC4/2 pool with nothing on it.

So I redid the experiment on a fresh pool, again running rados bench manually, but this time added the argument "--run-name=$HOSTNAME", and it worked flawlessly.

I reran it a few more times with "--run-name=$HOSTNAME" and everything works fine.
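For reference, the manual sequence on each client was roughly the following ("test" is just my test pool, and the client's hostname is what I use as the unique run name):

# write phase, keeping the objects so they can be read back
rados bench -p test 60 write --no-cleanup --run-name=$HOSTNAME
# sequential read phase, reading back this client's own objects
rados bench -p test 60 seq --run-name=$HOSTNAME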

I can point at /usr/lib/python3/dist-packages/PetaSAN/core/ceph/api.py as the place to fix def rados_write(), def rados_read() and def rados_benchmark_clean() but I'm unable to submit clean code to fix the issue.

In rados_read/rados_write we need to add an argument for --run-name=[some unique identifier like hostname]

And in rados_benchmark_clean() we need to add an argument for --prefix=benchmark_data_[same unique identifier as above, like hostname].

Although, as per the documentation, rados cleanup also supports --run-name, so add that instead.

And that would fix the issue for benchmarking.
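With that change, the cleanup call on each client would end up looking roughly like this (again using the hostname as the run name and "test" as the pool, both only examples):

# removes only this client's benchmark objects, looked up via its run name
rados -p test cleanup --run-name=$HOSTNAME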

Cheers 🙂

ref. https://docs.ceph.com/en/latest/man/8/rados/

Oh, and I noticed that rados_read() only runs the random read benchmark... shouldn't it be running "seq" instead of "rand" when doing the 4M throughput benchmark?

Thanks

After doing the dirty fix (hardcoding --run-name=<hostname> on each client) in PetaSAN/core/ceph/api.py, I can confirm that the benchmark runs as it should on all three clients, and I'm getting much improved (more accurate) results from the built-in benchmark.

Cheers

Thanks a lot for the detailed info. Yes, I confirm reads were not running in parallel and --run-name does fix this as you suggested. We will add this in the next release, thanks so much 🙂

To use sequential reads, you should define your object size (-O) to be much larger than the block size (-b) during writes; otherwise the object size will be the same as the block size and the reads effectively become random. In general, when using large block sizes like 4MB there is no advantage to sequential over random reads. Also, in the case of Ceph, sequential reads will hit the same OSD with many threads, whereas random threads will read from different OSDs, which could be better and more reflective of true load.
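For example (the sizes are only illustrative), writing with a small block size packed into much larger objects lets the seq test read sequentially within each object:

# 64 KB blocks written into 4 MB objects, kept for the read test
rados bench -p test 60 write -b 65536 -O 4194304 --no-cleanup --run-name=$HOSTNAME
rados bench -p test 60 seq --run-name=$HOSTNAME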

Thank you, and the whole team on PetaSAN!