
Pools and iSCSI disks sometimes fluctuate between active and inactive


Hi, we are testing a setup for production:

3 management nodes running as VMs,

5 nodes with iSCSI and some storage,

6 nodes with storage only (50 x 1 TB SATA, 2 x 1 TB SSD each).

We have created pools and CRUSH maps, and everything seems to be working fine. We created some disks, mounted them in oVirt and are testing some VMs on the storage; no issues with anything there.

But in the PetaSAN pool configuration the pools fluctuate between active and inactive, and sometimes the iSCSI disk list shows only "No data available in table"; after refreshing a few times the disks become visible again.

This is rather frustrating, since the cluster is in an OK state and there are no visible issues anywhere.

 

ceph status:

root@av-petasan-mgm-ash1-001:~# ceph status
  cluster:
    id:     48cc93e7-b2ee-4fe1-8b01-b6aacb9dda66
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum av-petasan-mgm-ash1-003,av-petasan-mgm-ash1-001,av-petasan-mgm-ash1-002 (age 8h)
    mgr: av-petasan-mgm-ash1-003(active, since 4d), standbys: av-petasan-mgm-ash1-002, av-petasan-mgm-ash1-001
    osd: 367 osds: 367 up (since 8h), 367 in (since 8h)

  data:
    pools:   4 pools, 16384 pgs
    objects: 48.27k objects, 188 GiB
    usage:   11 TiB used, 499 TiB / 510 TiB avail
    pgs:     16384 active+clean

  io:
    client: 29 KiB/s rd, 11 KiB/s wr, 0 op/s rd, 0 op/s wr

root@av-petasan-mgm-ash1-001:~#

 

And lastly, is it possible to add more management servers?

 

 

This is most likely either underpowered hardware or a network hardware problem.

You can add more Ceph monitor and Consul servers via the CLI if you want. PetaSAN sets up 3 nodes by default, but you can increase the count manually.
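For the monitor side this is not PetaSAN-specific; a rough sketch of the standard manual procedure for adding an extra Ceph monitor is below (the host name mon4 is just a placeholder, you would normally grow from 3 to 5 monitors to keep an odd quorum, and you should check the Ceph documentation for your release before running it):

# on the new monitor host
ceph auth get mon. -o /tmp/mon.keyring     # fetch the monitor keyring
ceph mon getmap -o /tmp/monmap             # fetch the current monitor map
ceph-mon -i mon4 --mkfs --monmap /tmp/monmap --keyring /tmp/mon.keyring
systemctl start ceph-mon@mon4

# joining an additional Consul server is similar in spirit (generic Consul flags,
# not a PetaSAN-tested command line)
consul agent -server -retry-join <existing-consul-server-ip> -data-dir /opt/consul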

Hi again,

 

Do I need especially powerful management nodes depending on how many OSDs or PGs we are using?

hardware:

2 x HP DL380 G9, 64 GB RAM, 4 x 25G NICs, 4 x 1G NICs (9 x 8 TB SATA disks, 1 x 1 TB SSD) - iSCSI/storage nodes

3 x HP DL380 G9, 128 GB RAM, 4 x 25G NICs, 4 x 1G NICs (22 x 2 TB SAS disks, 2 x 1 TB SSD) - iSCSI/storage nodes

6 x HP DL360 G8, 64 GB RAM, 2 x 25G NICs, 4 x 1G NICs (50 x 1 TB SATA, 2 x 1 TB SSD) - storage nodes (we added 2 dummy NICs when deploying). On these we changed bluestore_block_db_size = 32212238227 so that the journal (DB) partition is 30 GB per disk, and set osd_memory_target = 1073741824 (see the quick check after this list).

3 x virtual management servers, 16 GB RAM, 8 CPU cores.
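A quick way to confirm that those two overrides actually took effect is to query an OSD over its admin socket on one of the storage nodes (osd.0 below is just an example id; use an OSD that lives on the node you are logged in to):

ceph daemon osd.0 config get osd_memory_target
ceph daemon osd.0 config get bluestore_block_db_size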

cluster_info.txt:

{
    "backend_1_base_ip": "10.118.64.0",
    "backend_1_eth_name": "bond0",
    "backend_1_mask": "255.255.255.0",
    "backend_1_vlan_id": "2064",
    "backend_2_base_ip": "10.118.65.0",
    "backend_2_eth_name": "bond0",
    "backend_2_mask": "255.255.255.0",
    "backend_2_vlan_id": "2065",
    "bonds": [
        {
            "interfaces": "eth2,eth3",
            "is_jumbo_frames": true,
            "mode": "802.3ad",
            "name": "bond0",
            "primary_interface": ""
        },
        {
            "interfaces": "eth4,eth5",
            "is_jumbo_frames": true,
            "mode": "802.3ad",
            "name": "bond1",
            "primary_interface": ""
        }
    ],
    "eth_count": 8,
    "iscsi_1_eth_name": "bond1",
    "iscsi_2_eth_name": "bond1",
    "jumbo_frames": [
        "eth4",
        "eth2",
        "eth5",
        "eth3"
    ],
    "management_eth_name": "eth1",
    "management_nodes": [
        {
            "backend_1_ip": "10.118.64.101",
            "backend_2_ip": "10.118.65.101",
            "is_backup": false,
            "is_iscsi": false,
            "is_management": true,
            "is_storage": false,
            "management_ip": "10.117.64.101",
            "name": "av-petasan-mgm-ash1-001"
        },
        {
            "backend_1_ip": "10.118.64.102",
            "backend_2_ip": "10.118.65.102",
            "is_backup": false,
            "is_iscsi": false,
            "is_management": true,
            "is_storage": false,
            "management_ip": "10.117.64.102",
            "name": "av-petasan-mgm-ash1-002"
        },
        {
            "backend_1_ip": "10.118.64.103",
            "backend_2_ip": "10.118.65.103",
            "is_backup": false,
            "is_iscsi": false,
            "is_management": true,
            "is_storage": false,
            "management_ip": "10.117.64.103",
            "name": "av-petasan-mgm-ash1-003"
        }
    ],
    "name": "ash1-petasan",
    "storage_engine": "bluestore"
}

We did some ping tests for all the IP addresses in the cluster and detected no packet loss.
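Since the backend bonds use jumbo frames, it may also be worth checking that a full 9000-byte MTU actually passes end to end; a plain ping uses small packets and will not catch an MTU mismatch on a switch port. A minimal check from one node to another node's backend IP (the address below is just the first backend IP from the config above):

# 8972 bytes of payload + 28 bytes of IP/ICMP headers = 9000; -M do forbids fragmentation
ping -M do -s 8972 -c 5 10.118.64.101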

 

 

It seems the management nodes are slow to respond; it is hard to say why. 16 GB of RAM should be enough, but it could be an issue with the VM setup or maybe the network connection between them.

When you see pools going active/inactive in the UI, can you SSH to the VM you are connecting to and run:

ceph osd dump
ceph pg ls-by-pool POOL_NAME

for example:

ceph pg ls-by-pool rbd

Run them a couple of times in a row and see if they are responsive or if they take a long time to complete.
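One quick way to do that is a small timing loop, as a sketch (rbd is just the example pool name; substitute your own):

for i in 1 2 3 4 5; do time ceph pg ls-by-pool rbd > /dev/null; done
for i in 1 2 3 4 5; do time ceph osd dump > /dev/null; done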

Yes, they take a long time to respond:

root@av-petasan-mgm-ash1-001:~# time ceph pg ls-by-pool SATA1T3 | wc
8195 155688 1761710

real 0m16.016s
user 0m0.559s
sys 0m0.052s

root@av-petasan-mgm-ash1-001:~# time ceph pg ls-by-pool SATA1T3 | wc
8195 155688 1761710

real 0m2.016s
user 0m0.559s
sys 0m0.052s
root@av-petasan-mgm-ash1-001:~# time ceph pg ls-by-pool SATA1T3 | wc
8195 155688 1761710

real 0m3.540s
user 0m0.524s
sys 0m0.113s
root@av-petasan-mgm-ash1-001:~# time ceph pg ls-by-pool SATA1T3 | wc
8195 155688 1761710

real 0m3.206s
user 0m0.609s
sys 0m0.052s
root@av-petasan-mgm-ash1-001:~# time ceph pg ls-by-pool SATA1T3 | wc
8195 155688 1761710

real 0m5.542s
user 0m0.496s
sys 0m0.096s
root@av-petasan-mgm-ash1-001:~# time ceph pg ls-by-pool SATA1T3 | wc
8195 155688 1761710

real 0m3.810s
user 0m0.507s
sys 0m0.052s

 

It started out slow and then got quicker.

But running "ceph osd dump" always completes within 1 second.

 

root@av-petasan-mgm-ash1-001:~# time ceph osd dump | wc
385 6367 95761

real 0m0.700s
user 0m0.603s
sys 0m0.048s

This is taking a long time. Apart from checking hardware speed and network connections, you can increase the timeout as follows.

Around line 72 in /usr/lib/python2.7/dist-packages/PetaSAN/core/ceph/pool_checker.py:

class PoolChecker():

    def __init__(self, timeout=5.0):
        self.timeout = timeout

Change timeout=5.0 to, for example, timeout=30.0.

Even if this fixes the issue, I cannot say all is OK, as it could be masking some other issue.
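To get a feel for whether the default 5-second budget is actually being exceeded during one of these episodes, a quick probe using the coreutils timeout wrapper (just a sketch; SATA1T3 is the pool from the timings above) is:

# prints "exceeded 5s" if the query does not finish within 5 seconds
timeout 5 ceph pg ls-by-pool SATA1T3 > /dev/null && echo "within 5s" || echo "exceeded 5s"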

 

Thank you, this is working.

Hi again,

 

We have the same problem with the iSCSI path assignment list: it fluctuates between being there and not being there.

We are not seeing any actual problems; it is just frustrating when the list is populated, then empty, then populated, then empty.

 

I was wondering if there is another timeout we can adjust to get rid of the fluctuation?

 

Thanks in advance

hjalli.

It is the same timeout used for both.

Basically, Ceph will block if a pool is not responding, because a pool could be in the process of recovery. From a UI we need to specify a timeout so we do not hang forever, but it should also not be too long, so that if a pool is down we do not delay the UI too much. We chose 5 seconds and have not seen issues with this value.

If you wish, you can increase the timeout beyond your current 30 seconds, which is already quite high. As indicated earlier, increasing the timeout could mask the root cause of why your setup takes so long to respond; it could be underpowered hardware or network issues.
