Can't get MDS services started
ghbiz
76 Posts
March 30, 2024, 3:42 am
Quote from ghbiz on March 30, 2024, 3:42 am
MDS services appear to be down and not starting.
See below; the service seems to be having an issue with the ${CLUSTER} variable...
root@ceph-public1:~# systemctl status ceph-mds@ceph-public1
● ceph-mds@ceph-public1.service - Ceph metadata server daemon
Loaded: loaded (/lib/systemd/system/ceph-mds@.service; disabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Fri 2024-03-29 20:08:41 EDT; 3h 31min ago
Process: 1235714 ExecStart=/usr/bin/ceph-mds -f --cluster ${CLUSTER} --id ceph-public1 --setuser ceph --setgroup ceph (code=exited, status=1/FAILURE)
Main PID: 1235714 (code=exited, status=1/FAILURE)
Mar 29 20:08:41 ceph-public1 systemd[1]: ceph-mds@ceph-public1.service: Service hold-off time over, scheduling restart.
Mar 29 20:08:41 ceph-public1 systemd[1]: ceph-mds@ceph-public1.service: Scheduled restart job, restart counter is at 3.
Mar 29 20:08:41 ceph-public1 systemd[1]: Stopped Ceph metadata server daemon.
Mar 29 20:08:41 ceph-public1 systemd[1]: ceph-mds@ceph-public1.service: Start request repeated too quickly.
Mar 29 20:08:41 ceph-public1 systemd[1]: ceph-mds@ceph-public1.service: Failed with result 'exit-code'.
Mar 29 20:08:41 ceph-public1 systemd[1]: Failed to start Ceph metadata server daemon.
root@ceph-public1:~#
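For reference, the unexpanded ${CLUSTER} in the systemctl output is normally just the unit template text; the stock ceph-mds@.service sets Environment=CLUSTER=ceph itself. A minimal way to confirm that and to see the real exit error (a sketch, assuming the default Ceph unit and log paths):
# show the effective unit file, including its Environment=CLUSTER=... line
systemctl cat ceph-mds@ceph-public1
# view the actual error the daemon exited with
journalctl -u ceph-mds@ceph-public1 -n 50 --no-pager
# the MDS log may hold more detail (default path for a cluster named "ceph")
tail -n 50 /var/log/ceph/ceph-mds.ceph-public1.log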
ghbiz
76 Posts
March 30, 2024, 7:15 pm
Quote from ghbiz on March 30, 2024, 7:15 pm
Magid, I forgot to mention that the above is a 2.8.1 cluster. For internal reasons we are hesitant to migrate to 3.2.1 (latest), as we are still working through other issues underlying this cluster with regard to iSCSI...
ghbiz
76 Posts
March 30, 2024, 9:10 pm
Quote from ghbiz on March 30, 2024, 9:10 pm
As an update, I followed your instructions at the following link to recreate the MDS services.
https://www.petasan.org/forums/?view=thread&id=790
Quoted reply from November 2, 2020, 9:54 pm:
1) can you run
ceph versions
2) Was this a fresh 2.6 install, or was this an older cluster that was upgraded?
3) Did you try to manually add other MDS servers yourself?
4) Try to re-create the mds servers
on management node:
# edit installed flag
nano /opt/petasan/config/flags/flags.json
change line
"ceph_mds_installed": true
to
"ceph_mds_installed": false
# delete key
ceph auth del mds.HOSTNAME
# recreate mds
/opt/petasan/scripts/create_mds.py
# check if up
systemctl status ceph-mds@HOSTNAME
if it works, do the same on the other 2 management nodes.
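Once all three are done, the cluster-wide MDS state can be confirmed from any node; a minimal check, assuming a CephFS filesystem already exists:
# show active/standby MDS daemons
ceph mds stat
# per-filesystem MDS layout and standby count
ceph fs status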