iSCSI disks are gone after update or adding nodes
eazyadm
25 Posts
June 22, 2021, 6:29 pm
Hi,
I've done some pre-live tests with new hardware.
First I updated the lab from 2.6.3 to the latest release, 2.7.3,
node by node, with a reboot after each upgrade.
Everything looked good, but I hadn't checked the iSCSI disk.
Then I added 3 nodes successfully, but after that I realized that my iSCSI disk is missing, not accessible and not listed in the PetaSAN GUI.
What went wrong? How can I debug this, and how can I prevent something like this from happening in the production environment?
In the warnings I see:
Reduced data availability: 259 pgs inactive
44 slow ops, oldest one blocked for 3380 sec, daemons [osd.15,osd.35] have slow ops.
And in Pools I see an inactive rbd pool.
Thanks
bye
eazy
Last edited on June 22, 2021, 6:53 pm by eazyadm · #1
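(A note for anyone hitting the same state: the stuck PGs and the missing disk can be inspected with a few standard Ceph/RBD commands. This is only a generic sketch; the pool name is a placeholder, not taken from this cluster.)
# Overall cluster state and the full text of the health warnings
ceph status
ceph health detail
# List the PGs stuck inactive and the OSDs they map to
ceph pg dump_stuck inactive
# Check whether the RBD image backing the iSCSI disk still exists in the pool
rbd ls <pool-name>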
eazyadm
25 Posts
June 23, 2021, 9:59 am
Hi,
I have no idea what triggered the problem, but I was able to fix it.
I had to set the min_size of the inactive pool to 1. After that change, the pool became active and the iSCSI disk was available again.
Once everything was fine again, I changed the min_size value back to 2.
bye
eazy
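(For reference, the equivalent Ceph CLI steps would look roughly like this; the pool name rbd is taken from the inactive pool mentioned above and may need adjusting. Keep in mind that min_size 1 accepts I/O with only a single surviving replica, so it should only ever be a temporary measure.)
# Check the current replication settings of the pool
ceph osd pool get rbd size
ceph osd pool get rbd min_size
# Temporarily allow I/O with only one active replica
ceph osd pool set rbd min_size 1
# Once the cluster is healthy again, restore the previous value
ceph osd pool set rbd min_size 2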
admin
2,930 Posts
June 23, 2021, 12:56 pm
Thanks for the update.
Look at your backfill speed and the %utilization of your disks. If you are using HDDs and the backfill speed is too high, the data rebalance load when adding new nodes can saturate the HDDs.
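(If the rebalance load does turn out to be the cause, backfill and recovery can also be throttled directly in Ceph; PetaSAN's backfill speed setting presumably maps onto parameters like these. The values below are only examples, not recommendations:)
# Limit concurrent backfill/recovery operations per OSD
ceph config set osd osd_max_backfills 1
ceph config set osd osd_recovery_max_active 1
# Add a short pause between recovery ops on HDD-backed OSDs
ceph config set osd osd_recovery_sleep_hdd 0.1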
eazyadm
25 Posts
June 23, 2021, 6:26 pm
Backfill is set to medium.
The production environment has 72x 14 TB HDDs and 6x 2 TB NVMes for journals.
So 6 nodes, each with one 2 TB NVMe journal and 12x 14 TB HDDs.
The disk utilization looks unusual: all HDDs are at ~20%, but the NVMe is at almost 100% on all nodes!?
Is that normal?
Thanks
Last edited on June 23, 2021, 6:26 pm by eazyadm · #4
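(To double-check what the GUI shows, the per-device numbers can be read directly on a node with iostat from the sysstat package; this is a generic check, not PetaSAN-specific:)
# Extended per-device statistics, refreshed every 2 seconds;
# compare the %util column of the HDDs with that of the NVMe journal
iostat -x 2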
admin
2,930 Posts
June 23, 2021, 6:45 pm
The NVMe utilization stat is a kernel issue:
https://github.com/sysstat/sysstat/issues/187
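(In other words, per the linked issue, the %util that iostat/sysstat reports for multi-queue NVMe devices can sit near 100% without the device actually being saturated. The latency columns are a more reliable indicator; the device name below is just an example:)
# r_await / w_await show the average read/write latency in ms;
# low single-digit values mean the NVMe is not actually the bottleneck
iostat -x nvme0n1 2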