Can't get MDS services started
ghbiz
76 Posts
March 30, 2024, 3:42 am
Quote from ghbiz on March 30, 2024, 3:42 am
MDS services appear to be down and not starting.
See below; the service seems to be having an issue with the ${CLUSTER} variable...
root@ceph-public1:~# systemctl status ceph-mds@ceph-public1
● ceph-mds@ceph-public1.service - Ceph metadata server daemon
Loaded: loaded (/lib/systemd/system/ceph-mds@.service; disabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Fri 2024-03-29 20:08:41 EDT; 3h 31min ago
Process: 1235714 ExecStart=/usr/bin/ceph-mds -f --cluster ${CLUSTER} --id ceph-public1 --setuser ceph --setgroup ceph (code=exited, status=1/FAILURE)
Main PID: 1235714 (code=exited, status=1/FAILURE)
Mar 29 20:08:41 ceph-public1 systemd[1]: ceph-mds@ceph-public1.service: Service hold-off time over, scheduling restart.
Mar 29 20:08:41 ceph-public1 systemd[1]: ceph-mds@ceph-public1.service: Scheduled restart job, restart counter is at 3.
Mar 29 20:08:41 ceph-public1 systemd[1]: Stopped Ceph metadata server daemon.
Mar 29 20:08:41 ceph-public1 systemd[1]: ceph-mds@ceph-public1.service: Start request repeated too quickly.
Mar 29 20:08:41 ceph-public1 systemd[1]: ceph-mds@ceph-public1.service: Failed with result 'exit-code'.
Mar 29 20:08:41 ceph-public1 systemd[1]: Failed to start Ceph metadata server daemon.
root@ceph-public1:~#
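For reference, the unexpanded ${CLUSTER} in the systemctl output is normally just the unit template text; the stock ceph-mds@.service sets Environment=CLUSTER=ceph itself. A minimal way to confirm that and to see the real exit error (a sketch, assuming the default Ceph unit and log paths):
# show the effective unit file, including its Environment=CLUSTER=... line
systemctl cat ceph-mds@ceph-public1
# view the actual error the daemon exited with
journalctl -u ceph-mds@ceph-public1 -n 50 --no-pager
# the MDS log may hold more detail (default path for a cluster named "ceph")
tail -n 50 /var/log/ceph/ceph-mds.ceph-public1.log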
ghbiz
76 Posts
March 30, 2024, 7:15 pm
Quote from ghbiz on March 30, 2024, 7:15 pm
Magid, I forgot to mention that the above is a 2.8.1 cluster. For internal reasons we are hesitant to migrate to 3.2.1 (latest), as we are still working through other issues underlying this cluster with regard to iSCSI...
ghbiz
76 Posts
March 30, 2024, 9:10 pm
Quote from ghbiz on March 30, 2024, 9:10 pm
As an update, I followed your instructions at the following link to recreate the MDS services.
https://www.petasan.org/forums/?view=thread&id=790
Quoted reply from November 2, 2020, 9:54 pm:
1) can you run
ceph versions
2) Was this a fresh 2.6 install, or was this an older cluster that was upgraded?
3) Did you try to manually add other MDS servers yourself?
4) Try to re-create the mds servers
on management node:
# edit installed flag
nano /opt/petasan/config/flags/flags.json
change line
"ceph_mds_installed": true
to
"ceph_mds_installed": false
# delete key
ceph auth del mds.HOSTNAME
# recreate mds
/opt/petasan/scripts/create_mds.py
# check if up
systemctl status ceph-mds@HOSTNAME
if it works, do the same on the other 2 management nodes.
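Once all three are done, the cluster-wide MDS state can be confirmed from any node; a minimal check, assuming a CephFS filesystem already exists:
# show active/standby MDS daemons
ceph mds stat
# per-filesystem MDS layout and standby count
ceph fs status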