
Sanity Check on NVMe system throughput

I set up PetaSAN with 7 nodes: 3 for management only and 4 with 4x U.2 NVMe OSD drives each (16 NVMe OSDs total), on a 10Gb network. Running the PetaSAN benchmark on the 3:2 replicated RBD pool (512 PGs) I am getting the following:

Test: 4M throughput rados benchmark with 16 threads per client and 1 min duration, using 1 client

Write: 622MB/s             Read: 1398MB/s
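
If it helps, the equivalent run from the command line would be roughly the following; this is only a sketch, assuming a pool named rbd (substitute the actual pool name):

rados bench -p rbd 60 write -b 4194304 -t 16 --no-cleanup   # 4M writes, 16 threads, 60s, keep objects for the read test
rados bench -p rbd 60 seq -t 16                             # sequential reads of the objects just written
rados -p rbd cleanup                                        # remove the benchmark objects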

 

Test: 4k IOPS rados benchmark

Write: 18812 IOPS       Read: 36909 IOPS
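
Same idea for the IOPS run, just with a 4k block size and random reads (again only a sketch, pool name assumed):

rados bench -p rbd 60 write -b 4096 -t 16 --no-cleanup
rados bench -p rbd 60 rand -t 16
rados -p rbd cleanup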

 

Does this seem reasonable? I was expecting these numbers to be a lot higher. CPU utilization is under 20% max, memory is under 10% max, and network is well under 10% max. Disk Util% is 99-100% on each OSD when running the IOPS test, which leads me to believe everything is fine and my expectations were just too high. Disk Util% averages 21-41% with a max of 25-90% when running the throughput test. Oh, and the NVMe drives are Samsung MZQLW960HMJP-00003 960GB U.2 drives.
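
For the disk numbers I was effectively watching what iostat reports on the OSD nodes; something like the following shows the same kind of per-drive stats (device names will differ):

iostat -dxm 2    # watch %util, aqu-sz and r_await/w_await for each NVMe device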

 

Thank you.

 

Update: I went into "Show Details" and found 2 OSDs with high aqu-sz and await. I replaced them and my speeds went up to 1051MB/s write and 1392MB/s read. It turns out I had some Dell 960GB drives in the mix, and they seem to be having problems.
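
In case it helps anyone else chasing a slow OSD, two quick checks from any node with the admin keyring (the OSD id below is just an example):

ceph osd perf            # per-OSD commit/apply latency in ms; outliers stand out quickly
ceph tell osd.12 bench   # write benchmark against a single OSD to compare drives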

In any case, if someone can let me know whether I am in the ballpark of the expected speed, I would appreciate it. The read speed still seems rather low to me.

 

After closer examination I found that 1392MB/s translates to roughly 11Gbps (I have bonded 2x 10Gbps links). The problem now is that the PetaSAN benchmark only works every once in a while; most of the time I get "Alert: Error loading benchmark test report". I tried with multiple clients and it still happens. I watched the switch and it is performing the test: I see the input rate go up for a bit, then the output rate from the client I am testing with, but then the error is returned.

Anyone know how to fix the issue with "Alert: Error loading benchmark test report"?
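
A note on the bandwidth math: 1392 MB/s × 8 is roughly 11.1 Gbps, which is more than a single 10Gbps link, so the reads really are being spread across the bonded pair. To confirm the bond itself is healthy from the shell (assuming the interface is named bond0):

cat /proc/net/bonding/bond0    # shows bond mode, hash policy and per-slave link status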


The latency is approx 1 ms for writes (1000 IOPS for 1 thread) and 0.3 ms for reads (3000 IOPS for 1 thread). If you increase the number of client threads, you will get higher results.
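
As a rough check against the single-client numbers above (16 threads): 18812 write IOPS / 16 ≈ 1176 IOPS per thread, i.e. about 0.85 ms per write, and 36909 read IOPS / 16 ≈ 2307 IOPS per thread, i.e. about 0.43 ms per read. Both are in the same ballpark as the ~1 ms write / ~0.3 ms read latencies quoted, so adding client threads (or more clients) is how you push the total IOPS higher.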