How to add mds service

kpiti
26 Posts
January 2, 2025, 5:12 pm
Hi,
we've expanded our cluster with some additional 16TB disks and wanted to create another EC cephfs pool, but I got a warning that there are not enough MDS servers. Before, there were 3 installed (still are), 1 active / 2 standby; now I've stopped the new pool but there are still 2 active + 1 standby. At the time of the warning all 3 were in the active state..
I've tried to restart the installation on one of the new (and unused) nodes to add the MDS service, but the management services option was grayed out. I didn't destroy the node beforehand, just restarted the installation on the previously joined node (I set noout+norebalance first).. Is there a way to add the MDS service to an already installed node somehow?
Thanks..
Jure
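For reference, the exact maintenance flags I mean above, in case anyone wants them (run from a node with the admin keyring):
ceph osd set noout          # keep OSDs "in" while the node is down
ceph osd set norebalance    # don't start shuffling data around
# ... do the maintenance / reinstall ...
ceph osd unset norebalance
ceph osd unset noout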

admin
2,957 Posts
January 2, 2025, 7:43 pm
Having 1 active and 2 standby is correct. Is your existing cephfs working now? What is the exact warning you get?

kpiti
26 Posts
January 2, 2025, 8:51 pm
That was the case until I added another cephfs pool. The original one is REP3 and the new one is supposed to be an EC low-performance / big-capacity one for testing first.. Our existing one is working OK, I stopped the new one for the moment. Currently I have:
# ceph fs status
cephfs - 32 clients
======
RANK STATE MDS ACTIVITY DNS INOS DIRS CAPS
0 active CEPH02 Reqs: 0 /s 1504k 1496k 4124 4838
1 active CEPH01 Reqs: 0 /s 3139 3121 78 327
POOL TYPE USED AVAIL
cephfs_metadata metadata 12.0G 63.4T
cephfs_data data 156T 63.4T
cephfs16 - 0 clients
========
POOL TYPE USED AVAIL
cephfs_16t_nocach_metadata metadata 760k 63.4T
cephfs_16t_nocache data 0 61.4T
STANDBY MDS
CEPH03
MDS version: ceph version 17.2.7 (b12291d110049b2f35e32e0de30d70e9a4c060d2) quincy (stable)
The cephfs_16t pools are the new ones.. When I tried to mount the new fs on a client I got this error on the client:
# mount error: no mds server is up or the cluster is laggy
and this in ceph -s:
# ceph -s
cluster:
id: 2e7a0a56-89a1-481d-b78b-7ed5a44f1881
health: HEALTH_WARN
insufficient standby MDS daemons available
207 pgs not deep-scrubbed in time
services:
mon: 3 daemons, quorum CEPH03,CEPH01,CEPH02 (age 3M)
mgr: CEPH01(active, since 3M), standbys: CEPH02, CEPH03
mds: 3/3 daemons up
osd: 70 osds: 70 up (since 5d), 70 in (since 5d); 1374 remapped pgs
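If I understand the docs right, each CephFS filesystem needs its own active MDS rank, and this warning is tied to each filesystem's standby_count_wanted setting, so the second fs grabbed the last standby. A rough sketch of the knobs involved, for anyone hitting the same thing (fs name as above; the mount path is just a placeholder):
ceph fs dump | grep -E 'fs_name|standby_count_wanted'     # how many standbys each fs asks for (default 1)
ceph fs set cephfs16 standby_count_wanted 0               # or add another MDS daemon instead
mount -t ceph :/ /mnt/cephfs16 -o name=admin,fs=cephfs16  # fs= selects which filesystem to mount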
At the moment (with the new fs stopped) I have:
# ceph -s
cluster:
id: 2e7a0a56-89a1-481d-b78b-7ed5a44f1881
health: HEALTH_OK
services:
mon: 3 daemons, quorum CEPH03,CEPH01,CEPH02 (age 3M)
mgr: CEPH01(active, since 3M), standbys: CEPH02, CEPH03
mds: 2/2 daemons up, 1 standby
osd: 70 osds: 70 up (since 3d), 70 in (since 11d)
What useful info can I provide?
Thanx
Jure
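For completeness, a minimal sketch of how an EC-backed second filesystem like cephfs16 can be created (PG counts and the EC profile name below are placeholders, not what this cluster uses; pool names as in the status output above):
ceph osd pool create cephfs_16t_nocache 512 512 erasure my-ec-profile
ceph osd pool set cephfs_16t_nocache allow_ec_overwrites true    # required for CephFS on EC pools
ceph osd pool create cephfs_16t_nocach_metadata 64 64 replicated
ceph fs flag set enable_multiple true                            # only if multiple filesystems are still disabled
ceph fs new cephfs16 cephfs_16t_nocach_metadata cephfs_16t_nocache --force   # --force because the default data pool is EC
The docs generally recommend keeping a replicated default data pool and attaching the EC pool with ceph fs add_data_pool plus a file layout, but the --force route also works for a test fs.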

kpiti
26 Posts
January 5, 2025, 1:12 am
Hi, I've been digging into this a bit. The Ceph docs say: "Each CephFS file system requires at least one MDS..."
So having 2 active at the moment seems OK. I followed the docs and added another MDS on a new host (our hosts are named CEPH{id} for location reference):
mkdir -p /var/lib/ceph/mds/ceph-CEPH{id}
ceph auth get-or-create mds.CEPH{id} mon 'profile mds' mgr 'profile mds' mds 'allow *' osd 'allow *' > /var/lib/ceph/mds/ceph-CEPH{id}/keyring
chown -R ceph:ceph /var/lib/ceph/mds
chmod 600 /var/lib/ceph/mds/ceph-CEPH{id}/keyring
Afterwards you need to start the service:
systemctl start ceph-mds@CEPH{id}
At first it wouldn't start, saying: "Failed. Start request repeated too quickly."
But after a while I tried again and it started.. Don't know why or how, rather annoying, but..
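In case it helps anyone else: that message comes from systemd's start-rate limiter kicking in after a few quick failures, and it can be cleared without waiting, something like:
systemctl reset-failed ceph-mds@CEPH{id}
systemctl start ceph-mds@CEPH{id}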
Afterwards you have to enable the service so it comes back after a reboot, I presume:
systemctl enable ceph-mds.target
systemctl enable ceph-mds\@CEPH{id}.service
And now I have 4 MDSs running:
# ceph -s
cluster:
id: 2e7a0a56-89a1-481d-b78b-7ed5a44f1881
health: HEALTH_OK
services:
mon: 3 daemons, quorum CEPH03,CEPH01,CEPH02 (age 3M)
mgr: CEPH01(active, since 3M), standbys: CEPH02, CEPH03
mds: 2/2 daemons up, 2 standby
osd: 70 osds: 70 up (since 2d), 70 in (since 2d)
data:
volumes: 1/2 healthy, 1 stopped
pools: 6 pools, 2241 pgs
objects: 37.06M objects, 52 TiB
usage: 161 TiB used, 398 TiB / 560 TiB avail
pgs: 2238 active+clean
3 active+clean+scrubbing+deep
I think I need to fiddle a bit with the fs settings, as ceph fs dump shows some differences in some params (mainly MDS related)..
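For reference, this is roughly how those per-fs MDS parameters can be inspected and changed (fs names as above; the values are only examples, not recommendations):
ceph fs get cephfs        # per-fs settings: max_mds, standby_count_wanted, flags, ...
ceph fs get cephfs16
ceph fs set cephfs16 max_mds 1
ceph fs set cephfs16 allow_standby_replay false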
Otherwise this seems OK to me. Do you think I have missed something regarding PetaSAN distro compliance (e.g. will it survive future upgrades)?
Thanks a lot..
Jure