
Pools and iSCSI disks sometimes fluctuate between active and inactive


Hi, we are testing a setup for production:

3 management nodes running as VMs,

5 nodes with iSCSI and some storage,

6 nodes with storage only (50 x 1 TB SATA, 2 x 1 TB SSD each).

We have created pools and CRUSH maps, and everything seems to be working fine. We created some disks, mounted them in oVirt and are testing some VMs on the storage; no issues with anything there.

But in the PetaSAN pool configuration the pools fluctuate between active and inactive, and sometimes the iSCSI disk list shows only "No data available in table"; after refreshing a few times the disks become visible again.

This is rather frustrating, since the cluster is in an OK state and there are no visible issues anywhere.

 

ceph status:

root@av-petasan-mgm-ash1-001:~# ceph status
  cluster:
    id:     48cc93e7-b2ee-4fe1-8b01-b6aacb9dda66
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum av-petasan-mgm-ash1-003,av-petasan-mgm-ash1-001,av-petasan-mgm-ash1-002 (age 8h)
    mgr: av-petasan-mgm-ash1-003(active, since 4d), standbys: av-petasan-mgm-ash1-002, av-petasan-mgm-ash1-001
    osd: 367 osds: 367 up (since 8h), 367 in (since 8h)

  data:
    pools:   4 pools, 16384 pgs
    objects: 48.27k objects, 188 GiB
    usage:   11 TiB used, 499 TiB / 510 TiB avail
    pgs:     16384 active+clean

  io:
    client: 29 KiB/s rd, 11 KiB/s wr, 0 op/s rd, 0 op/s wr

root@av-petasan-mgm-ash1-001:~#

 

And lastly, is it possible to add more management servers?

 

 

This is most likely either underpowered hardware or a network hardware problem.

You can add more Ceph monitor and Consul servers via the CLI if you want. PetaSAN sets up 3 nodes by default, but you can increase the count manually.
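For the monitor side this is not PetaSAN-specific; a rough sketch of the standard manual procedure for adding an extra Ceph monitor is below (the host name mon4 is just a placeholder, you would normally grow from 3 to 5 monitors to keep an odd quorum, and you should check the Ceph documentation for your release before running it):

# on the new monitor host
ceph auth get mon. -o /tmp/mon.keyring     # fetch the monitor keyring
ceph mon getmap -o /tmp/monmap             # fetch the current monitor map
ceph-mon -i mon4 --mkfs --monmap /tmp/monmap --keyring /tmp/mon.keyring
systemctl start ceph-mon@mon4

# joining an additional Consul server is similar in spirit (generic Consul flags,
# not a PetaSAN-tested command line)
consul agent -server -retry-join <existing-consul-server-ip> -data-dir /opt/consul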

Hi again,

 

Do I need especially powerful management nodes depending on how many OSDs or PGs we are using?

hardware:

2 x HP DL380 G9, 64 GB RAM, 4 x 25G NICs, 4 x 1G NICs (9 x 8 TB SATA disks, 1 x 1 TB SSD) - iSCSI/storage nodes

3 x HP DL380 G9, 128 GB RAM, 4 x 25G NICs, 4 x 1G NICs (22 x 2 TB SAS disks, 2 x 1 TB SSD) - iSCSI/storage nodes

6 x HP DL360 G8, 64 GB RAM, 2 x 25G NICs, 4 x 1G NICs (50 x 1 TB SATA, 2 x 1 TB SSD) - storage nodes (we added 2 dummy NICs when deploying). On these we changed bluestore_block_db_size = 32212238227 so that the journal (DB) partition is 30 GB per disk, and set osd_memory_target = 1073741824 (see the quick check after this list).

3 x virtual management servers, 16 GB RAM, 8 CPU cores.
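A quick way to confirm that those two overrides actually took effect is to query an OSD over its admin socket on one of the storage nodes (osd.0 below is just an example id; use an OSD that lives on the node you are logged in to):

ceph daemon osd.0 config get osd_memory_target
ceph daemon osd.0 config get bluestore_block_db_size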

cluster_info.txt:

{
    "backend_1_base_ip": "10.118.64.0",
    "backend_1_eth_name": "bond0",
    "backend_1_mask": "255.255.255.0",
    "backend_1_vlan_id": "2064",
    "backend_2_base_ip": "10.118.65.0",
    "backend_2_eth_name": "bond0",
    "backend_2_mask": "255.255.255.0",
    "backend_2_vlan_id": "2065",
    "bonds": [
        {
            "interfaces": "eth2,eth3",
            "is_jumbo_frames": true,
            "mode": "802.3ad",
            "name": "bond0",
            "primary_interface": ""
        },
        {
            "interfaces": "eth4,eth5",
            "is_jumbo_frames": true,
            "mode": "802.3ad",
            "name": "bond1",
            "primary_interface": ""
        }
    ],
    "eth_count": 8,
    "iscsi_1_eth_name": "bond1",
    "iscsi_2_eth_name": "bond1",
    "jumbo_frames": [
        "eth4",
        "eth2",
        "eth5",
        "eth3"
    ],
    "management_eth_name": "eth1",
    "management_nodes": [
        {
            "backend_1_ip": "10.118.64.101",
            "backend_2_ip": "10.118.65.101",
            "is_backup": false,
            "is_iscsi": false,
            "is_management": true,
            "is_storage": false,
            "management_ip": "10.117.64.101",
            "name": "av-petasan-mgm-ash1-001"
        },
        {
            "backend_1_ip": "10.118.64.102",
            "backend_2_ip": "10.118.65.102",
            "is_backup": false,
            "is_iscsi": false,
            "is_management": true,
            "is_storage": false,
            "management_ip": "10.117.64.102",
            "name": "av-petasan-mgm-ash1-002"
        },
        {
            "backend_1_ip": "10.118.64.103",
            "backend_2_ip": "10.118.65.103",
            "is_backup": false,
            "is_iscsi": false,
            "is_management": true,
            "is_storage": false,
            "management_ip": "10.117.64.103",
            "name": "av-petasan-mgm-ash1-003"
        }
    ],
    "name": "ash1-petasan",
    "storage_engine": "bluestore"
}

We did some ping tests for all the IP addresses in the cluster and detected no packet loss.
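Since the backend bonds use jumbo frames, it may also be worth checking that a full 9000-byte MTU actually passes end to end; a plain ping uses small packets and will not catch an MTU mismatch on a switch port. A minimal check from one node to another node's backend IP (the address below is just the first backend IP from the config above):

# 8972 bytes of payload + 28 bytes of IP/ICMP headers = 9000; -M do forbids fragmentation
ping -M do -s 8972 -c 5 10.118.64.101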

 

 

It seems the management nodes are slow to respond; it is hard to say why. 16 GB of RAM should be enough, but it could be an issue with the VM setup or maybe the network connection between them.

When you see pools going active/inactive in the UI, can you SSH to the VM you are connecting to and run:

ceph osd dump
ceph pg ls-by-pool POOL_NAME

for example:

ceph pg ls-by-pool rbd

Run them a couple of times in a row and see if they are responsive or if they take a long time to complete.
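One quick way to do that is a small timing loop, as a sketch (rbd is just the example pool name; substitute your own):

for i in 1 2 3 4 5; do time ceph pg ls-by-pool rbd > /dev/null; done
for i in 1 2 3 4 5; do time ceph osd dump > /dev/null; done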

Yes, they take a long time to respond:

root@av-petasan-mgm-ash1-001:~# time ceph pg ls-by-pool SATA1T3 | wc
8195 155688 1761710

real 0m16.016s
user 0m0.559s
sys 0m0.052s

root@av-petasan-mgm-ash1-001:~# time ceph pg ls-by-pool SATA1T3 | wc
8195 155688 1761710

real 0m2.016s
user 0m0.559s
sys 0m0.052s
root@av-petasan-mgm-ash1-001:~# time ceph pg ls-by-pool SATA1T3 | wc
8195 155688 1761710

real 0m3.540s
user 0m0.524s
sys 0m0.113s
root@av-petasan-mgm-ash1-001:~# time ceph pg ls-by-pool SATA1T3 | wc
8195 155688 1761710

real 0m3.206s
user 0m0.609s
sys 0m0.052s
root@av-petasan-mgm-ash1-001:~# time ceph pg ls-by-pool SATA1T3 | wc
8195 155688 1761710

real 0m5.542s
user 0m0.496s
sys 0m0.096s
root@av-petasan-mgm-ash1-001:~# time ceph pg ls-by-pool SATA1T3 | wc
8195 155688 1761710

real 0m3.810s
user 0m0.507s
sys 0m0.052s

 

It started out slow and then got quicker.

But running "ceph osd dump" always completes within 1 second.

 

root@av-petasan-mgm-ash1-001:~# time ceph osd dump | wc
385 6367 95761

real 0m0.700s
user 0m0.603s
sys 0m0.048s

This is taking a long time. Apart from checking hardware speed and network connections, you can increase the timeout as follows.

Around line 72 in /usr/lib/python2.7/dist-packages/PetaSAN/core/ceph/pool_checker.py:

class PoolChecker():

    def __init__(self, timeout=5.0):
        self.timeout = timeout

Change timeout=5.0 to, for example, timeout=30.0.

Even if this fixes the issue, I cannot say all is OK, as it could be masking some other issue.
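To get a feel for whether the default 5-second budget is actually being exceeded during one of these episodes, a quick probe using the coreutils timeout wrapper (just a sketch; SATA1T3 is the pool from the timings above) is:

# prints "exceeded 5s" if the query does not finish within 5 seconds
timeout 5 ceph pg ls-by-pool SATA1T3 > /dev/null && echo "within 5s" || echo "exceeded 5s"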

 

Thank you, this is working.

Hi again,

 

We have the same problem with the iSCSI path assignment list: it fluctuates between being there and not being there.

We are not seeing any actual problems; it is just frustrating when the list is populated, then empty, then populated, then empty.

 

I was wondering if there is another timeout we can adjust to get rid of the fluctuation?

 

Thanks in advance

hjalli.

It is the same timeout used for both.

Basically, Ceph will block if a pool is not responding, because a pool could be in the process of recovery. From a UI we need to specify a timeout so we do not hang forever, but it should also not be too long, so that if a pool is down we do not delay the UI too much. We chose 5 seconds and have not seen issues with this value.

If you wish, you can increase the timeout beyond your current 30 seconds, which is already quite high. As indicated earlier, increasing the timeout could mask the root cause of why your setup takes so long to respond; it could be underpowered hardware or network issues.
