
How to add cephfs to cluster?

Hi!

I have a PetaSAN cluster without CephFS.

Now I want to add CephFS support to the cluster (without CIFS/NFS, just plain CephFS).

I created two pools (meta and data).

I created a new filesystem.
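
For reference, the pools and the filesystem were created with the standard Ceph commands, roughly like this (the pool names match what ceph fs ls shows below; the PG count of 64 is only an example):

# metadata and data pools (PG count is just an example)
ceph osd pool create fs-meta 64
ceph osd pool create fs-data 64
# create the filesystem on top of them
ceph fs new fs fs-meta fs-data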

Now I have the following state:

root@petasan-mon1:~# ceph -s
  cluster:
    id:     982c2213-6936-4285-a641-56d1ab906e04
    health: HEALTH_ERR
            1 filesystem is offline
            1 filesystem is online with fewer MDS than max_mds

  services:
    mon: 3 daemons, quorum petasan-mon3,petasan-mon1,petasan-mon2 (age 27s)
    mgr: petasan-mon1(active, since 18h), standbys: petasan-mon3, petasan-mon2
    mds: fs:0 1 up:standby
    osd: 120 osds: 120 up (since 2w), 120 in (since 5M)

How do I activate the MDS servers for CephFS?

WBR,

Fyodor.

What version are you using?

What is the output of:

ceph fs ls
ceph fs dump
ceph fs status

And on the first 3 nodes:
systemctl status ceph-mds@HOSTNAME

Version - 2.6.2

root@petasan-mon1:~# ceph fs ls
name: fs, metadata pool: fs-meta, data pools: [fs-data ]


root@petasan-mon1:~# ceph fs dump
dumped fsmap epoch 361
e361
enable_multiple, ever_enabled_multiple: 0,0
compat: compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=no anchor table,9=file layout v2,10=snaprealm v2}
legacy client fscid: 2

Filesystem 'fs' (2)
fs_name fs
epoch   361
flags   12
created 2020-11-02 12:32:28.456865
modified        2020-11-02 12:32:28.456883
tableserver     0
root    0
session_timeout 60
session_autoclose       300
max_file_size   1099511627776
min_compat_client       -1 (unspecified)
last_failure    0
last_failure_osd_epoch  0
compat  compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=no anchor table,9=file layout v2,10=snaprealm v2}
max_mds 1
in
up      {}
failed
damaged
stopped
data_pools      [5]
metadata_pool   6
inline_data     disabled
balancer
standby_count_wanted    0

Standby daemons:

35016481:       [v2:10.5.108.13:6800/1614510040,v1:10.5.108.13:6801/1614510040] 'petasan-mon3' mds.-1.0 up:standby seq 268228 laggy since 2020-10-31 08:24:21.955938


root@petasan-mon1:~# ceph fs status
Error EINVAL: Traceback (most recent call last):
  File "/usr/share/ceph/mgr/mgr_module.py", line 914, in _handle_command
    return self.handle_command(inbuf, cmd)
  File "/usr/share/ceph/mgr/status/module.py", line 251, in handle_command
    return self.handle_fs_status(cmd)
  File "/usr/share/ceph/mgr/status/module.py", line 176, in handle_fs_status
    mds_versions[metadata.get('ceph_version', "unknown")].append(standby['name'])
AttributeError: 'NoneType' object has no attribute 'get'


root@petasan-mon1:~# systemctl status ceph-mds@petasan-mon1
ceph-mds@petasan-mon1.service - Ceph metadata server daemon
Loaded: loaded (/lib/systemd/system/ceph-mds@.service; disabled; vendor preset: enabled)
Active: inactive (dead)

root@petasan-mon2:~# systemctl status ceph-mds@petasan-mon2
ceph-mds@petasan-mon2.service - Ceph metadata server daemon
Loaded: loaded (/lib/systemd/system/ceph-mds@.service; disabled; vendor preset: enabled)
Active: inactive (dead)

root@petasan-mon3:~# systemctl status ceph-mds@petasan-mon3
ceph-mds@petasan-mon3.service - Ceph metadata server daemon
Loaded: loaded (/lib/systemd/system/ceph-mds@.service; disabled; vendor preset: enabled)
Active: inactive (dead)
 

Can you start the MDS services with systemctl, or do you get an error?

Error.

-- The start-up result is RESULT.
Nov 02 20:19:04 petasan-mon2 systemd[1]: Started Ceph metadata server daemon.
-- Subject: Unit ceph-mds@petasan-mon2.service has finished start-up
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
--
-- Unit ceph-mds@petasan-mon2.service has finished starting up.
--
-- The start-up result is RESULT.
Nov 02 20:19:04 petasan-mon2 ceph-mds[675065]: 2020-11-02 20:19:04.788 7fcb28971700 -1 monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [2]
Nov 02 20:19:04 petasan-mon2 ceph-mds[675065]: 2020-11-02 20:19:04.788 7fcb2796f700 -1 monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [2]
Nov 02 20:19:04 petasan-mon2 ceph-mds[675065]: 2020-11-02 20:19:04.788 7fcb28170700 -1 monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [2]
Nov 02 20:19:04 petasan-mon2 ceph-mds[675065]: failed to fetch mon config (--no-mon-config to skip)
Nov 02 20:19:04 petasan-mon2 systemd[1]: ceph-mds@petasan-mon2.service: Main process exited, code=exited, status=1/FAILURE
Nov 02 20:19:04 petasan-mon2 systemd[1]: ceph-mds@petasan-mon2.service: Failed with result 'exit-code'.
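
For context, that handle_auth_bad_method / "failed to fetch mon config" error usually means the daemon's local keyring does not match (or is missing from) the cluster's auth database. A quick way to compare the two, assuming the default keyring location, is:

# key registered in the cluster for this mds
ceph auth get mds.petasan-mon2
# key the daemon actually uses locally
cat /var/lib/ceph/mds/ceph-petasan-mon2/keyring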

1) Can you run:
ceph versions

2) Was this a fresh 2.6 install, or was this an older cluster that was upgraded?

3) Did you try to manually add other MDS servers yourself?

4) Try to re-create the MDS servers.
On the management node:

# edit installed flag
nano /opt/petasan/config/flags/flags.json
change line
"ceph_mds_installed": true
to
"ceph_mds_installed": false

# delete key
ceph auth del mds.HOSTNAME

# recreate mds
/opt/petasan/scripts/create_mds.py

# check if up

systemctl status ceph-mds@HOSTNAME

If it works, do the same on the other 2 management nodes.
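
For reference, the standard upstream way to set up an MDS by hand looks roughly like this; the capabilities below are the stock Ceph MDS profile and the keyring path assumes the default layout, so create_mds.py may differ in details:

# default mds data dir
mkdir -p /var/lib/ceph/mds/ceph-HOSTNAME
# create a key with the stock MDS capabilities and write it to the keyring
ceph auth get-or-create mds.HOSTNAME mon 'profile mds' mgr 'profile mds' mds 'allow *' osd 'allow *' > /var/lib/ceph/mds/ceph-HOSTNAME/keyring
chown -R ceph:ceph /var/lib/ceph/mds/ceph-HOSTNAME
# enable and start the daemon
systemctl enable --now ceph-mds@HOSTNAME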

root@petasan-mon2:~# ceph versions
{
    "mon": {
        "ceph version 14.2.7 (3d58626ebeec02d8385a4cefb92c6cbc3a45bfe8) nautilus (stable)": 3
    },
    "mgr": {
        "ceph version 14.2.7 (3d58626ebeec02d8385a4cefb92c6cbc3a45bfe8) nautilus (stable)": 3
    },
    "osd": {
        "ceph version 14.2.7 (3d58626ebeec02d8385a4cefb92c6cbc3a45bfe8) nautilus (stable)": 120
    },
    "mds": {},
    "overall": {
        "ceph version 14.2.7 (3d58626ebeec02d8385a4cefb92c6cbc3a45bfe8) nautilus (stable)": 126
    }
}

2 - It's a 2.5 cluster upgraded to the current state. Additionally, all management servers were moved to new hardware (one at a time, through the standard replacement function).

3 - No.

4 - I don't have such a line. This is what the file looks like:

{
    "ceph_mgr_installed": true,
    "ceph_config_upload": true
}

I have added the specified line.
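
With the added line the file now looks like this (exact ordering/formatting aside):

{
    "ceph_mgr_installed": true,
    "ceph_config_upload": true,
    "ceph_mds_installed": false
}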

After deleting the key and recreating the MDS, everything looks fine:

root@petasan-mon3:~# ceph -s
  cluster:
    id:     982c2213-6936-4285-a641-56d1ab906e04
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum petasan-mon3,petasan-mon1,petasan-mon2 (age 2d)
    mgr: petasan-mon1(active, since 2d), standbys: petasan-mon3, petasan-mon2
    mds: fs:1 {0=petasan-mon1=up:active} 2 up:standby
    osd: 120 osds: 120 up (since 2w), 120 in (since 5M)

Thank you very much for the help. It only remains to understand: what was it?


Glad it worked 🙂 Not sure of the cause, but my suspicion is that replacing all the management nodes may have dropped something.