Low IOPS performance
thodinh
13 Posts
November 14, 2019, 4:38 pm
Hi,
I've set up a system using the following hardware:
- Monitor nodes: 3 VMs, each with 8 cores and 8 GB RAM, on SSD
- Storage nodes: 3 nodes, each with a 4-core CPU, 48 GB RAM and 6 OSDs. Each OSD is a 2 TB SATA 5400 rpm disk rated at around 300 IOPS.
- NIC: 1 Gbps, 2 ports
The speed of the newly set-up system was acceptable, but it slows down over time. Currently I only use around 6 TB of data as a shared VM disk, and the IOPS of the whole system is only about 300 (I expected at least 1000 or so).
CPU utilization is around 25%, RAM is at 75% on my storage nodes, and the network hardly exceeds 10%.
Overall disk utilization is low, but there are 2 OSDs at 100% utilization. I think there is a bottleneck somewhere? Could you help me out with this issue?
PS: PetaSAN 2.2.0
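For reference, one way to see which physical disks are saturated and which OSDs they belong to is to watch iostat on each storage node and compare it with the per-OSD latencies Ceph reports; a rough sketch (iostat is in the sysstat package, and the OSD id 0 is only a placeholder):
iostat -x 2                            # per-device %util on a storage node, refreshed every 2 seconds
ceph osd perf                          # commit/apply latency per OSD; a slow OSD stands out here
ceph osd metadata 0 | grep -i device   # map OSD id 0 back to its /dev device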
Last edited on November 14, 2019, 4:39 pm by thodinh · #1
admin
2,930 Posts
November 14, 2019, 7:41 pm
I'd start by looking into why you have 2 disks that are 100% busy.
Are they bad? Too high a capacity relative to the others? Some bad CRUSH setup or weights?
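A quick way to check the capacity, weight and health questions above; a sketch (the device name /dev/sdc is just a placeholder for one of the busy disks):
ceph osd tree          # CRUSH hierarchy and weight of every OSD
ceph osd df tree       # capacity, reweight, %use and PG count per OSD; a large spread points at weighting
smartctl -a /dev/sdc   # SMART health of a suspect drive (smartmontools package)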
thodinh
13 Posts
November 15, 2019, 3:14 am
These are brand new HDDs across the whole system, and the disks are all 2 TB.
CRUSH is set to default; I have never touched the CRUSH menu item.
The point is that, after logging for a longer period, I noticed there are always 2 OSDs at 100% utilization at any given time. Sometimes it was sdc, sometimes sde, sdf... The 100% disk utilization happens on all 3 storage nodes, one after another, although node 2 seems less busy.
Every time one OSD reaches 100%, performance is terrible.
So I think the default configuration itself might not spread the load over all OSDs, which causes this bottleneck-like situation.
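One way to catch what the two busy OSDs are doing at the moment of a spike is to check the cluster status and the operations queued on those OSDs; a sketch (osd.5 is a placeholder id, and the daemon command has to run on the node hosting that OSD):
ceph -s                                              # shows PGs currently in scrubbing / deep-scrubbing states
ceph pg dump pgs_brief 2>/dev/null | grep -i scrub   # list the PGs being scrubbed and their acting OSDs
ceph daemon osd.5 dump_ops_in_flight                 # ops currently queued on the busy OSD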
Last edited on November 15, 2019, 4:52 am by thodinh · #3
admin
2,930 Posts
November 15, 2019, 11:05 am
Since this is not fixed to 2 specific disks, it is most probably due to scrubbing.
In v2.2.0 we had already lowered scrub activity from the Ceph default; we lowered it even further in later versions.
Set the following configuration params:
osd_scrub_sleep = 1
osd_scrub_load_threshold = 0.3
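A minimal sketch of one way to apply and verify these, assuming shell access to the nodes (the runtime injectargs call avoids an OSD restart; to persist them, the same two lines go into the [osd] section of the cluster's ceph.conf on each node):
ceph tell osd.* injectargs '--osd_scrub_sleep 1 --osd_scrub_load_threshold 0.3'
ceph daemon osd.0 config show | grep -E 'osd_scrub_sleep|osd_scrub_load_threshold'   # verify, run on the node hosting osd.0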
thodinh
13 Posts
November 15, 2019, 11:53 am
Thanks for your help, I will try it today.