Crash due to big mem usage of monitor
therm
121 Posts
September 11, 2017, 9:48 am
Hi,
this weekend we had a crash (of one ESX server), probably because of a memory leak in one mon daemon, which had 32 GB of RAM allocated. journalctl says the PetaSAN services did not have enough RAM (64 GB installed), so they stopped and restarted. In addition there were 'wrongly marked me down' messages. On a backend port of the switch (the one the server with the huge RAM-consuming mon daemon is attached to) there were pause frames, leading me to assume that the server no longer accepted IO.
Any idea how to prevent this, or whether it is fixed in 1.4 (due to the Ceph upgrade)?
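For reference, the kernel's OOM messages can be pulled from the journal roughly like this (a sketch; the time range and exact message text are assumptions that depend on the kernel version):
# Kernel out-of-memory killer messages from the last few days
journalctl -k --since "3 days ago" | grep -iE 'out of memory|oom'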
Regards,
Dennis
Last edited on September 11, 2017, 10:25 am by therm · #1
admin
2,930 Posts
September 11, 2017, 12:45 pm
Hi,
Can you please clarify a few points:
- Are you running PetaSAN virtualized with ESX?
- When you say "probably because of a memory leak of one mon-daemon", do you mean the Ceph monitor daemon specifically, or that the leak happened on a management node in general and is not specific to a particular service?
- "This one had 32GB of RAM allocated" - do you mean the management node has 32 GB RAM? What about the 64 GB installed?
- How many OSDs do you have per node?
- Were you able to measure memory usage per process using ps/top/pidstat?
- Does the problem with the PetaSAN node still happen, or is it working normally now after the services restarted?
- Were there any changes you made to /etc/sysctl.conf or /etc/ceph/ceph.conf?
v1.4 updates Ceph from 10.2.5 to 10.2.9, but we are not aware of an issue like this in the older version.
Last edited on September 11, 2017, 12:45 pm by admin · #2
therm
121 Posts
September 11, 2017, 12:56 pm
- Are you running PetaSAN virtualized with ESX?
No, bare metal.
- When you say "probably because of a memory leak of one mon-daemon", do you mean the Ceph monitor daemon specifically, or that the leak happened on a management node in general and is not specific to a particular service?
The Ceph monitor consumed the 32 GB.
- "This one had 32GB of RAM allocated" - do you mean the management node has 32 GB RAM? What about the 64 GB installed?
No, the ceph-mon did. The node has 64 GB RAM in total, so the ceph-mon consumed half of the node's RAM.
- How many OSDs do you have per node?
24.
- Were you able to measure memory usage per process using ps/top/pidstat?
This is a top line:
11077 ceph 20 0 31.724g 0.014t 9652 S 0.3 23.3 91:04.57 ceph-mon
- Does the problem with the PetaSAN node still happen, or is it working normally now after the services restarted?
I restarted the mon service and let things recover. Top line now:
12436 ceph 20 0 1917792 134548 19356 S 0.7 0.2 1:15.53 ceph-mon
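If it helps to watch this live rather than via single top snapshots, pidstat can report the mon's memory periodically (a sketch; pidstat comes with the sysstat package, and the pgrep pattern and log path are just examples):
# Report RSS/VSZ/%MEM of the oldest ceph-mon process every 60 seconds
pidstat -r -p $(pgrep -o ceph-mon) 60
# Or append just the RSS (in kB) to a log file, e.g. once a minute from cron
ps -o rss= -p $(pgrep -o ceph-mon) >> /var/log/ceph-mon-rss.log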
- Were there any changes you made to /etc/sysctl.conf or /etc/ceph/ceph.conf?
This is my rc.local:
# Set vm.min_free_kbytes and vm.zone_reclaim_mode to prevent 'page allocation failure'
# See https://access.redhat.com/solutions/641323
/sbin/sysctl -w vm.zone_reclaim_mode=1
echo '4194304' > /proc/sys/vm/min_free_kbytes
As you can see, we were suffering page allocation failures in the past. With these settings it has not happened again.
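For completeness, the same settings could also be made persistent via /etc/sysctl.conf instead of rc.local (a sketch of the equivalent entries, not what we currently run):
# /etc/sysctl.conf entries equivalent to the rc.local commands above
vm.zone_reclaim_mode = 1
vm.min_free_kbytes = 4194304
# apply without a reboot
sysctl -p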
admin
2,930 Posts
September 11, 2017, 5:03 pm
I would suggest keeping an eye on the memory usage of the monitor daemon.
It may also be worth checking the size of the monitor db:
du -sch /var/lib/ceph/mon/CLUSTER_NAME-NODE_NAME/store.db/
example:
du -sch /var/lib/ceph/mon/demo-ps-node-01/store.db/
Also increase the log level on the Ceph monitor daemon
example:
ceph daemon mon.ps-node-01 config set debug_mon 10/10 --cluster demo
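If the memory grows again, a small loop like the one below could record the mon's RSS together with the store.db size over time, so we can see whether the growth is gradual or sudden (a rough sketch; the log path is arbitrary and the store.db glob assumes one monitor per node):
# Log ceph-mon RSS (kB) and monitor store size every 5 minutes
while true; do
  echo "$(date '+%F %T') rss_kb=$(ps -o rss= -p $(pgrep -o ceph-mon)) store=$(du -sh /var/lib/ceph/mon/*/store.db/ | cut -f1)" >> /var/log/ceph-mon-usage.log
  sleep 300
done
Remember to lower debug_mon again afterwards, since level 10 logging is quite verbose.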
Last edited on September 11, 2017, 5:04 pm by admin · #4