
Internal Server Error while adding OSD

Hi admin,

I've tried to add an OSD disk with a journal and a cache disk. After clicking the "Add" button, we get the following message:

Internal Server Error

The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application.

So we have two questions:

1st: Can you fix that? 🙂
2nd: Do we have to add a second cache disk if we have more than eight OSDs?

Thank you and kind regards,
Reto

Typically we see this when you do not have enough RAM.

We show an information message when adding OSDs: each OSD needs 4 GB of RAM, and the write cache requires 2% of the cache partition size in RAM.

We have 96 GB per node.

12 disks x 4 GB = 48 GB
2% of 600 GB = 12 GB

Total: 60 GB --> so it should work (quick check below).
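
For reference, the same arithmetic as a short Python sketch (the 4 GB per OSD and 2% figures come from the information message above; treating the whole 600 GB disk as the cache partition is an assumption):

    # RAM check per the rule above: 4 GB per OSD + 2% of cache partition size.
    osd_count = 12            # disks intended as OSDs on this node
    ram_per_osd_gb = 4        # from the "add OSD" information message
    cache_partition_gb = 600  # assumption: whole 600 GB disk used as cache
    node_ram_gb = 96

    required_gb = osd_count * ram_per_osd_gb + 0.02 * cache_partition_gb
    print(f"need {required_gb:.0f} GB of {node_ram_gb} GB")  # need 60 GB of 96 GB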

 

We hit the same fault when adding the 10th OSD (on all nodes).

The first 9 OSDs are linked to journal partitions sda1 to sda9. When we try to add the 10th disk, the Internal Server Error is shown.

Memory usage is 1.9 GB of 96 GB, and the load average is 0.46/0.44/0.46.

Is it possible that we need one cache disk per 8 OSDs?

Yes, we put a max limit, but I believe we do (or at least should) error out gracefully. If you confirm this, we can double check and fix it.

2% of 600 GB = 12 GB

I take it the 600 GB is your entire disk size, not the cache partition size.

Note we recommend one cache device serve 2-4 HDDs; if you go beyond that you may improve latency, but your throughput will be limited by the single SSD.

Quote from admin on July 28, 2020, 7:35 pm

Is it possible that we need one cache disk per 8 OSDs?

Yes, we put a max limit, but I believe we do (or at least should) error out gracefully. If you confirm this, we can double check and fix it.

Yes, that would be nice.

Quote from admin on July 28, 2020, 7:35 pm

2% of 600 GB = 12 GB

I take it the 600 GB is your entire disk size, not the cache partition size.

Note we recommend one cache device serve 2-4 HDDs; if you go beyond that you may improve latency, but your throughput will be limited by the single SSD.

Sorry, but I don't understand what you want to tell me 🙂
We have 12 disks, 600 GB, 15k RPM each. Currently we have 1 disk for journal, 1 disk for cache, and 10 OSD disks per node. Isn't that what you recommend?

The problem is that each OSD requires a 64 GB partition on the journal, and you ran out of disk space on the journal. I can confirm there was a bug in 2.5: it throws this exception when you do not have enough journal space. It was not in 2.4 and is not in 2.6 (due in 1 day), both of which return a meaningful error.
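
To make the failure concrete, a minimal sketch of the space math (the 64 GB per-OSD partition size is from the explanation above; the 600 GB journal disk is the setup described in this thread):

    # Each OSD takes a 64 GB journal partition, so a 600 GB journal disk
    # fits only 600 // 64 = 9 partitions (sda1..sda9); the 10th OSD has
    # no journal space left, hence the error.
    journal_disk_gb = 600
    journal_partition_gb = 64

    max_osds_per_journal = journal_disk_gb // journal_partition_gb
    print(max_osds_per_journal)  # 9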

Having said this, the idea of having a journal and/or cache is to have a faster device backing a slower device, so you need an SSD or NVMe to act as journal and cache. The recommended SSD-to-HDD ratio is 1:4 for journals and 1:4 for cache. For NVMe you can go 1:12 for HDD journals and 1:4-8 for cache.

Your existing setup of having 1 HDD act as journal (or cache) for 10 other HDDs is not correct and will cause a deep performance drop, on the order of a factor of 10 versus a pure HDD OSD setup. Our recommendation is all flash, or at least SSDs as journals with the above ratios.
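
To make those ratios easy to check, a small hypothetical sketch (the helper and its table are illustrative only, not a PetaSAN API):

    # Recommended max HDDs per fast device, per the ratios above:
    # SSD 1:4 for journal and cache; NVMe 1:12 journal, up to 1:8 cache.
    RECOMMENDED_MAX_HDDS = {
        ("ssd", "journal"): 4,
        ("ssd", "cache"): 4,
        ("nvme", "journal"): 12,
        ("nvme", "cache"): 8,
    }

    def within_ratio(device: str, role: str, hdd_count: int) -> bool:
        """True if one fast device serving hdd_count HDDs stays within the ratio."""
        return hdd_count <= RECOMMENDED_MAX_HDDS[(device, role)]

    print(within_ratio("ssd", "journal", 10))   # False: 10 HDDs exceeds 1:4
    print(within_ratio("nvme", "journal", 10))  # True: within 1:12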