HDD and SSD layout
mirek@gsnet.cz
3 Posts
February 16, 2018, 9:37 am
Hello,
I have a question about disk layout.
I have 4 identical Supermicro nodes, each with 2x 6-core CPUs and 144 GB RAM.
1st node) 8x 146 GB SAS HDD, 2x 10GbE, IB, IT-mode RAID ---> 8 OSDs
2nd node) 8x 146 GB SAS HDD, 2x 10GbE, IB, IT-mode RAID ---> 8 OSDs
3rd node) 8x 146 GB SAS HDD, 2x 10GbE, IB, IT-mode RAID ---> 8 OSDs
4th node) 8x 64 GB Patriot Flare SSD, 2x 10GbE, IB, IT-mode RAID, SATA3 ---> 8 journals
All servers are placed in the same rack, with a redundant 10GbE backend, jumbo frames, and IB for connectivity to the ESXi blades.
Is this a good layout?
Thank you very much for your answer.
Mirek
admin
2,930 Posts
February 16, 2018, 11:38 am
Hi Mirek,
1) We do not currently support InfiniBand, though some users have set it up:
http://www.petasan.org/forums/?view=thread&id=226
http://www.petasan.org/forums/?view=thread&id=151
2) The journals need to be distributed among the nodes; a journal is used to speed up the operation of a local OSD on the same node. Typically, an SSD journal can speed up up to 4 local HDDs.
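To make the locality rule concrete, here is a rough Python sketch (a hypothetical helper, not PetaSAN code) that checks a proposed layout against it, using the ~4 HDDs per SSD journal rule of thumb:

import math

# Rule of thumb from above: one SSD journal serves up to 4 local HDD OSDs.
HDDS_PER_SSD_JOURNAL = 4

def check_layout(nodes):
    """nodes: list of dicts like {"name": ..., "hdds": int, "ssds": int}.
    A journal only helps OSDs on the *same* node, so each node must
    carry its own SSDs."""
    for node in nodes:
        needed = math.ceil(node["hdds"] / HDDS_PER_SSD_JOURNAL)
        if node["ssds"] >= needed:
            print(f"{node['name']}: OK ({node['ssds']} SSDs for {node['hdds']} HDDs)")
        else:
            print(f"{node['name']}: needs {needed} local SSD journals, has {node['ssds']}")

# The proposed layout: all journals concentrated on node 4 cannot
# accelerate the HDD OSDs sitting on nodes 1-3.
check_layout([
    {"name": "node1", "hdds": 8, "ssds": 0},
    {"name": "node2", "hdds": 8, "ssds": 0},
    {"name": "node3", "hdds": 8, "ssds": 0},
    {"name": "node4", "hdds": 0, "ssds": 8},
])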
Hatem
mirek@gsnet.cz
3 Posts
February 17, 2018, 3:14 pm
Hi Hatem,
Does that mean I have to have 2 SSDs for 8 HDDs in each node?
- Is there any recommendation for SSD capacity per HDD in a node?
- Does that mean that if I have one node in the cluster without SSDs, I will destroy the performance of the whole cluster?
- Does the SSD journal still need to be assigned to a particular OSD (via the CLI), or can I just add it in the PetaSAN web interface?
Actually, I have 3 nodes for testing, each with 144 GB RAM, 2x 6-core CPUs, and 8x 146 GB SAS OSDs, with a 2x 10GbE backend and IB for iSCSI. Everything works, but the speed is terrible:
read 700 MB/s, but write only 41 MB/s.
- Is that a typical value?
Thank you very much for your time. Would it not be worthwhile, in the future, to jointly create some performance recommendations on how to achieve a reasonable result? 🙂
Thank you
Mirek
admin
2,930 Posts
February 17, 2018, 6:02 pm
Does that mean I have to have 2 SSDs for 8 HDDs in each node?
Yes, it is a good ratio.
Is there any recommendation for SSD capacity per HDD in a node?
We use a 20 GB partition for each HDD OSD, so it does not need to be large.
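As a rough capacity check (plain arithmetic from the figures above, assuming you run the full 4 journals per SSD):

JOURNAL_PARTITION_GB = 20   # per HDD OSD, per the answer above
OSDS_PER_JOURNAL = 4        # rule-of-thumb maximum

min_ssd_gb = JOURNAL_PARTITION_GB * OSDS_PER_JOURNAL
print(f"SSD size needed to serve 4 journals: {min_ssd_gb} GB")       # 80 GB
print(f"Journals a 64 GB SSD can hold: {64 // JOURNAL_PARTITION_GB}")  # 3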
Does that mean that if I have one node in the cluster without SSDs, I will destroy the performance of the whole cluster?
You should make your storage nodes as symmetric as possible. It will not "destroy" performance: if you have 1 slow node out of 10, it will slow down about 10% of read requests and about 30% of write requests. But again, avoid this.
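Where do the 10% and 30% come from? A back-of-the-envelope sketch (assuming 3 replicas on distinct nodes and uniform placement; exact numbers depend on your CRUSH map): reads are served by the primary OSD only, while writes must complete on every replica.

from math import comb

nodes = 10       # cluster size in the example above
slow = 1         # one slow node
replicas = 3     # assumed replica count

# A read is slowed only when the primary happens to sit on the slow node.
p_read_slow = slow / nodes

# A write waits for all replicas, so it is slowed whenever any of the
# `replicas` distinct nodes holding the object is the slow one.
p_write_slow = 1 - comb(nodes - slow, replicas) / comb(nodes, replicas)

print(f"reads affected:  {p_read_slow:.0%}")    # 10%
print(f"writes affected: {p_write_slow:.0%}")   # 30%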
Does the SSD journal still need to be assigned to a particular OSD (via the CLI), or can I just add it in the PetaSAN web interface?
The maximum is about 4 OSDs per SSD, and a journal does not have to be assigned to a particular OSD. It is best to balance the OSDs among the journals; if you select "Auto" assign, PetaSAN will pick the least loaded journal.
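The "Auto" option behaves like a greedy least-loaded choice. A minimal sketch of the idea (illustrative only, not PetaSAN's actual implementation):

def auto_assign(osds, journals, max_per_journal=4):
    """Assign each OSD to the journal currently serving the fewest OSDs."""
    load = {j: 0 for j in journals}
    assignment = {}
    for osd in osds:
        free = [j for j in journals if load[j] < max_per_journal]
        if not free:
            raise RuntimeError("every journal already serves its maximum of OSDs")
        target = min(free, key=lambda j: load[j])
        assignment[osd] = target
        load[target] += 1
    return assignment

# Eight OSDs spread evenly over two SSD journals:
print(auto_assign([f"osd.{i}" for i in range(8)], ["sdb", "sdc"]))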
Actually, I have 3 nodes for testing, each with 144 GB RAM, 2x 6-core CPUs, and 8x 146 GB SAS OSDs, with a 2x 10GbE backend and IB for iSCSI. Everything works, but the speed is terrible: read 700 MB/s but write only 41 MB/s. Is that a typical value?
How did you measure this speed?
How many replicas do you have?
Can you test the speed using the cluster benchmark for throughput at 1/16/64 threads?
RutgerK
1 Post
February 28, 2018, 9:50 am
I'm having the same performance issues using 10k HDDs.
Setup:
3x DL380 G7 with LSI 9207-8i HBA
Per node:
32GB RAM
6x 300GB 10k SAS HDD
2x10Gb Chelsio T520-LL-CR (Backend1 and iSCSI1 mapped to one port, Backend2 and iSCSI2 to another)
1Gb Management
I've tested sequential speeds using the cluster benchmark with one of the three nodes as the client (16 threads), and I get around 380 MB/s write and 1040 MB/s read.
After that, I created an iSCSI disk and connected to it from a Server 2016 host with 2x 10Gb iSCSI using 4 paths.
The speed from the Server 2016 host is around 980 MB/s read but only 36 MB/s write. I've tested this with CrystalDiskMark on the iSCSI volume and inside a VM that is stored on the iSCSI volume. Increasing the threads in CrystalDiskMark makes no difference to the write speed.
Is this performance to be expected? The raw write speed of a single 300 GB 10k HDD is around 130 MB/s using the benchmark utility in the console.
Any help would be appreciated 🙂
admin
2,930 Posts
February 28, 2018, 3:58 pm
Hi,
Can you please run the Windows test for about 10 minutes and watch the dashboard node resource graphs for "disk iops" and "disk utilization" on the OSD disks? They should show almost similar values across the different OSDs/nodes. Can you please post these values?
The rados throughput benchmark uses a 4M block size; CrystalDiskMark most probably uses a smaller one. The rados benchmark also runs against a test pool we create with a replica count of 2, while you are probably using 3. Lastly, CrystalDiskMark's sequential queues most probably hit the same OSD disks. I would recommend trying a different test, such as copying several large files at the same time, or better, use iometer (http://www.iometer.org/doc/downloads.html), which gives much more control.
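For a rough feel of why replica count matters so much for writes, here is a crude throughput model (a sketch with strong simplifying assumptions: large sequential writes, journals co-located on the same HDDs so each byte hits the spindle twice, and the network not being the bottleneck):

def est_write_ceiling(nodes, disks_per_node, disk_mb_s, replicas,
                      journal_on_hdd=True):
    """Upper bound on aggregate client write throughput: every client
    byte is stored `replicas` times, and a co-located journal writes
    each byte twice on the same spindle."""
    raw = nodes * disks_per_node * disk_mb_s
    amplification = replicas * (2 if journal_on_hdd else 1)
    return raw / amplification

# The figures from this thread: 3 nodes x 6 disks x ~130 MB/s raw.
print(est_write_ceiling(3, 6, 130, replicas=2))   # ~585 MB/s (benchmark test pool)
print(est_write_ceiling(3, 6, 130, replicas=3))   # ~390 MB/s (typical pool)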
Please let me know the chart values and also if you were able to use a different test.