cluster sizing
rophee
8 Posts
February 28, 2024, 6:37 pm
We have at least 5TB of data. Our cluster is made up of 4 nodes with 8 x 898GB OSDs each (32 OSDs total). We created 2 x 5TB iSCSI disks and connected both to a Windows Server 2022 host, creating two drives. We then started migrating our 5TB of data, splitting it roughly 2.5TB per drive. Is our cluster undersized? And how does PetaSAN (Ceph) reclaim space where data/files were deleted? We kept our data under 5TB in total by deleting files, yet the space does not seem to be reused.
898GB x 32 = 28,736GB ≈ 28.7TB (total raw space)
3 x 5TB = 15TB (theoretical used space at 3x replication)
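For context on the numbers above, assuming the default 3x replicated pools (which the 3 x 5TB figure implies), usable capacity is roughly a third of the raw space:
32 x 898GB = 28,736GB ≈ 28.7TB raw
28.7TB / 3 replicas ≈ 9.6TB usable
2 x 5TB iSCSI disks = 10TB provisioned, slightly more than the usable capacity
So two fully written 5TB volumes cannot quite fit on this cluster as laid out, even before Ceph's nearfull/full safety ratios kick in.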
admin
2,930 Posts
February 28, 2024, 10:41 pm
Are you actually running out of space? Look at the storage wheel on the dashboard, the OSD % usage, and the storage chart or pool list on the dashboard. You can also use the ceph df command.
When you delete large files, the Windows filesystem (NTFS) does not actually erase the data blocks/sectors that were written; it just updates metadata to mark the previously written blocks as available for overwriting. Windows therefore reports the space as free, but PetaSAN works at the block layer and does not understand the client filesystem, so it still reports that data as used. It is possible to edit the disk and enable trim/reclaim, which makes the filesystem periodically send block commands to discard the sectors marked as free, but this is not generally recommended for Windows: it slows things down without a real advantage.
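For reference, both checks can also be done from any node's shell (the output layout varies by Ceph release):
ceph df            # cluster-wide and per-pool usage
ceph osd df tree   # per-OSD utilization, CRUSH weight and variance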
rophee
8 Posts
February 29, 2024, 1:09 am
Your comment has the same tone as what I've read from googling. Our cluster is completely out of space. Since we are still at a point where we can rework the cluster, what could we do differently?
root@ps-node-01:~# ceph -s
  cluster:
    id:     b544c7c8-9f3b-4572-b999-d863ed0f99dc
    health: HEALTH_ERR
            10 backfillfull osd(s)
            1 full osd(s)
            5 nearfull osd(s)
            4 pool(s) full
  services:
    mon: 3 daemons, quorum ps-node-03,ps-node-01,ps-node-02 (age 2w)
    mgr: ps-node-02(active, since 2w), standbys: ps-node-03, ps-node-01
    mds: 1/1 daemons up, 2 standby
    osd: 32 osds: 32 up (since 2w), 32 in (since 5w)
  data:
    volumes: 1/1 healthy
    pools:   4 pools, 1696 pgs
    objects: 1.96M objects, 7.5 TiB
    usage:   24 TiB used, 3.8 TiB / 28 TiB avail
    pgs:     1695 active+clean
             1 active+clean+scrubbing+deep
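A rough sanity check on this output, assuming the 3x replicated pools implied above:
7.5 TiB objects x 3 replicas ≈ 22.5 TiB
22.5 TiB + BlueStore metadata and uneven OSD fill ≈ 24 TiB used of 28 TiB raw
So the cluster is holding about 7.5 TiB of block data even though only ~5TB of files are live, because blocks freed by deletion were never discarded at the iSCSI layer.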
admin
2,930 Posts
February 29, 2024, 5:53 am
Best is to add new OSD drives. Start with the host that has the full OSD; you can see which one it is from the dashboard.
Less recommended is to slightly lower the CRUSH weight of the full OSDs, but this may end up filling other OSDs instead. Next would be to enable trim/reclaim by editing the iSCSI disk and issuing commands from Windows to invoke trim. Last would be to back up the data from one iSCSI disk, delete it, and recreate a smaller one.
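A minimal sketch of the less-recommended reweight option; osd.12 is a hypothetical id here, so identify the actual full OSD first:
ceph osd df                           # find the full OSD and its current CRUSH weight
ceph osd crush reweight osd.12 0.80   # slightly lower its weight (example value)
Lowering the weight moves placement groups off that OSD onto its peers, which is exactly why it can end up filling other OSDs instead.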
rophee
8 Posts
March 3, 2024, 2:13 pm
For completeness and posterity, and to thank PetaSAN for the continued help to people like us:
As recommended, and since we had an unused drive in each storage sled, we added a new OSD to the node that had the full OSD (identified from the dashboard). As the cluster started to backfill, we monitored usage and added a new OSD wherever usage was high, until we had added one new OSD on every node. It took a while, and we let the cluster heal itself. Only after the cluster was back to "HEALTH_OK" did we start modifying each iSCSI disk and tick 'Yes' under 'Enable Trim/Discard'. At first we thought of letting the cluster do its own cleanup, but it needed a nudge from our Windows server. We ran 'Optimize-Volume -DriveLetter <drive letter> -Defrag -Verbose' and the cluster started to gain free space. We ran the PowerShell command a couple of times on each iSCSI drive as the cluster slowly reclaimed space. Usage settled almost exactly at the theoretical used space, 15.2TB.
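A note for anyone reproducing this: the Defrag pass evidently triggered discards here, but on thin-provisioned volumes the documented Optimize-Volume switch for sending unmap/trim is ReTrim. A hypothetical example for drive E:
Optimize-Volume -DriveLetter E -ReTrim -Verbose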
admin
2,930 Posts
March 3, 2024, 7:19 pm
Thanks for the feedback, and glad things worked.
Just a recommendation: since you have added storage, there is no need for trim. As noted earlier, the filesystem will re-use the space freed by deleting files, so no space is wasted even without trim. Trim overhead can reduce performance if done regularly.
rophee
8 Posts
March 8, 2024, 1:26 am
So your recommendation is to turn off 'Enable Trim/Discard'?