Forums

All 3 monitors down !

Pages: 1 2
Quote from admin on September 1, 2021, 10:20 am

-For RAM: 32 GB is not a lot, depending on OSDs/services. Can you make sure you meet the recommendations?

Unfortunately I do not meet the requirements. When I configured the nodes I remembered 2 GB RAM/OSD, but now I read 4 GB/OSD, and as I have 10 OSDs I'm certainly below the minimum. Could this cause the nodes to power off? I suspect this happens when heavy jobs are running.
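
For reference, the arithmetic behind that worry can be sketched in a few lines of shell. The numbers are the ones quoted in this post (10 OSDs, 4 GB per OSD, 32 GB installed). The variable names are made up for illustration, and the result only covers the OSDs, not the monitors, iSCSI or the OS itself.

```shell
#!/bin/sh
# Rough RAM budget for the OSDs on one node, using the figures from this post.
osd_count=10        # OSDs per node
gb_per_osd=4        # per-OSD recommendation mentioned above
installed_gb=32     # RAM currently installed in the node

required_gb=$((osd_count * gb_per_osd))
echo "OSDs alone need about ${required_gb} GB; node has ${installed_gb} GB"

if [ "$required_gb" -gt "$installed_gb" ]; then
    # Shortfall before any other service is counted
    echo "short by $((required_gb - installed_gb)) GB"
fi
```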

 

-For missing file(s) such as /etc/fstab, maybe you are not booting from the correct drive / root partition?

Exactly, when I boot with the grub commands reported above all files are at the right place.

 

-You could try to re-install grub

if you boot using bios:
grub-install --target=i386-pc --no-floppy /dev/sdXX # where sdXX is the drive name

if you boot EFI ( you should see a /sys/firmware/efi directory present )

grub-install --target=x86_64-efi --efi-directory=/boot/efi --no-floppy --bootloader-id=petasan
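
A minimal sketch of the BIOS/EFI check described above, assuming the only signal used is the presence of /sys/firmware/efi. The helper name detect_grub_target is hypothetical and not part of grub or PetaSAN; it only prints which --target value would apply and does not run grub-install.

```shell
#!/bin/sh
# detect_grub_target is a hypothetical helper for illustration only.
# It prints the grub-install --target value matching how the node booted.
detect_grub_target() {
    # $1: directory to test; defaults to the path mentioned in the post
    if [ -d "${1:-/sys/firmware/efi}" ]; then
        echo "x86_64-efi"   # firmware exposed EFI variables, so the node booted via EFI
    else
        echo "i386-pc"      # no EFI directory, so assume legacy BIOS boot
    fi
}

target=$(detect_grub_target)
echo "this node would use --target=$target"
```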

I tried to reinstall grub, but I was not successful. I tried both options, here are the commands:

root@petasan04:~# ll /sys/firmware/efi/
total 0
drwxr-xr-x  7 root root    0 Sep  3 11:49 ./
drwxr-xr-x  6 root root    0 Sep  1 18:51 ../
-r--r--r--  1 root root 4096 Sep  3 11:49 config_table
drwxr-xr-x  2 root root    0 Sep  1 18:51 efivars/
drwxr-xr-x  3 root root    0 Sep  3 11:49 esrt/
-r--r--r--  1 root root 4096 Sep  3 11:49 fw_platform_size
-r--r--r--  1 root root 4096 Sep  3 11:49 fw_vendor
-r--r--r--  1 root root 4096 Sep  3 11:49 runtime
drwxr-xr-x 11 root root    0 Sep  3 11:49 runtime-map/
drwxr-xr-x  2 root root    0 Sep  3 11:49 secret-key/
-r--------  1 root root 4096 Sep  3 11:49 systab
drwxr-xr-x 78 root root    0 Sep  3 11:49 vars/
root@petasan04:~#
root@petasan04:~#
root@petasan04:~#
root@petasan04:~#
root@petasan04:~# grub-install --target=x86_64-efi --efi-directory=/boot/efi --no-floppy --bootloader-id=petasan
Installing for x86_64-efi platform.
grub-install: error: /boot/efi doesn't look like an EFI partition.
root@petasan04:~#

root@petasan04:~# grub-install --target=i386-pc --no-floppy /dev/sdk3
grub-install: error: /usr/lib/grub/i386-pc/modinfo.sh doesn't exist. Please specify --target or --directory.

 

can you list output of:

mount | grep boot
find /boot/efi
cat /etc/fstab
blkid -s UUID -o value /dev/sdX2 # where /dev/sdX is your boot disk
blkid -s UUID -o value /dev/sdX3 # where /dev/sdX is your boot disk

 

Hi, sorry for the delay, but I was away last week. Here is the output of the commands, I list it only for one monitor, the others show consistent outputs:

root@petasan01:~# mount | grep boot
/dev/sdk2 on /boot/efi type vfat (rw,relatime,fmask=0022,dmask=0022,codepage=437,iocharset=iso8859-1,shortname=mixed,errors=remount-ro)

root@petasan01:~# find /boot/efi
/boot/efi
/boot/efi/EFI
/boot/efi/EFI/petasan
/boot/efi/EFI/petasan/grubx64.efi
/boot/efi/EFI/petasan/grub.cfg
/boot/efi/EFI/BOOT
/boot/efi/EFI/BOOT/BOOTX64.EFI

root@petasan01:~# cat /etc/fstab
UUID=8225-9FE1 /boot/efi vfat defaults 0 0
UUID=97018ecd-ca17-4e80-b429-c12e929d5f04 / ext4 defaults 0 0
UUID=0a92b47c-b9e2-4616-9099-02c6870ffc88 /var/lib/ceph ext4 defaults 0 0
UUID=ee53ffa1-0d8a-4253-8d69-d5bb17d9fcc6 /opt/petasan/config ext4 defaults 0 0

root@petasan01:~# blkid -s UUID -o value /dev/sdk2
8225-9FE1
root@petasan01:~# blkid -s UUID -o value /dev/sdk3
97018ecd-ca17-4e80-b429-c12e929d5f04
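
The manual comparison above (the UUID in /etc/fstab against the UUID blkid reports) can be scripted roughly as follows. This is only a sketch: fstab_uuid and check_uuid are hypothetical helper names, /dev/sdk2 is the EFI partition from this particular node, and blkid usually needs root.

```shell
#!/bin/sh
# fstab_uuid (hypothetical helper): print the UUID an fstab file lists
# for a given mount point.
fstab_uuid() {
    # $1: mount point, $2: fstab path (defaults to /etc/fstab)
    awk -v mp="$1" '$1 ~ /^UUID=/ && $2 == mp { sub(/^UUID=/, "", $1); print $1 }' \
        "${2:-/etc/fstab}"
}

# check_uuid (hypothetical helper): succeed only if fstab and blkid agree.
check_uuid() {
    # $1: mount point, $2: block device, $3: optional fstab path
    expected=$(fstab_uuid "$1" "$3")
    actual=$(blkid -s UUID -o value "$2" 2>/dev/null)
    [ -n "$expected" ] && [ "$expected" = "$actual" ]
}

# /dev/sdk2 is taken from the output above; adjust the device per node.
if check_uuid /boot/efi /dev/sdk2; then
    echo "/boot/efi: fstab and blkid agree"
else
    echo "/boot/efi: fstab and blkid differ, or blkid could not read the device"
fi
```

In the output posted above both values are 8225-9FE1, so on that node the check would be expected to pass.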

 

In the meantime, maybe the cause of the reboots was fencing. I disabled it in the management tab and today all nodes and OSDs are up and online, although I received some emails stating that some nodes and OSDs were down for a certain time last week. Here are the graphs related to the high load operation:

try:

grub-install --target=x86_64-efi --efi-directory=/boot/efi --no-floppy --bootloader-id=petasan
update-grub

 

It seems to work:

root@petasan01:~# grub-install --target=x86_64-efi --efi-directory=/boot/efi --no-floppy --bootloader-id=petasan
Installing for x86_64-efi platform.
Installation finished. No error reported.

root@petasan01:~# update-grub
Sourcing file `/etc/default/grub'
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-4.12.14-28-petasan
Found initrd image: /boot/initrd.img-4.12.14-28-petasan
Adding boot menu entry for EFI firmware configuration
done

but when I reboot, it still drops to the "grub>" command prompt and I must manually issue the commands to boot the node.

 

Are your BIOS settings set to boot from EFI?

Running 10 OSDs on a single node with only 32 GB of RAM is extremely problematic, especially if you are also running iSCSI on those same nodes. What do you have your OSD memory target set to?

Quote from admin on September 14, 2021, 2:20 pm

Are your BIOS settings set to boot from EFI?

BIOS settings are the same that have been working for over a year, I didn't change anything:

I also switched the two entries but I experience the same behaviour.

Quote from DividedByPi on September 14, 2021, 4:50 pm

What do you have your OSD memory target set to?

It is set to 4294967296 (4 GB I guess). Yes, I understood that with this new setting RAM is not enough, but this can explain the node failures (due to fencing), not the fact that nodes can't automatically boot anymore. From one day to another...
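
To confirm the guess about the unit, the byte value can be converted directly. A small sketch, using the number quoted in this post; on a live cluster the value could be read with "ceph config get osd osd_memory_target" instead of being hard-coded.

```shell
#!/bin/sh
# Convert the osd_memory_target quoted above from bytes to GiB.
# On a running cluster it could be read with:
#   ceph config get osd osd_memory_target
osd_memory_target=4294967296

bytes_per_gib=$((1024 * 1024 * 1024))
target_gib=$((osd_memory_target / bytes_per_gib))

echo "osd_memory_target = ${target_gib} GiB per OSD"
```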

Please upgrade to 2.8.1 as it includes a bug fix for the grub bootloader. There was a recent Ubuntu grub update which resulted in a mix of both grub 2.02 and 2.04, which could cause issues with UEFI. In version 2.8.1 we do not rely on the Ubuntu version, so please perform an online update to fix this.
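
One way to see whether a node ended up with such a mix is to list the installed grub packages and compare their upstream versions. This is only a sketch, assuming a dpkg-based system; grub_upstream_versions is a hypothetical helper name, and it simply treats the text before the first "-" in each version string as the upstream version.

```shell
#!/bin/sh
# grub_upstream_versions (hypothetical helper): read "package version" lines
# on stdin and print the distinct upstream versions of grub packages.
grub_upstream_versions() {
    awk 'NF >= 2 && $1 ~ /^grub/ { split($2, v, "-"); print v[1] }' | sort -u
}

if command -v dpkg-query >/dev/null 2>&1; then
    versions=$(dpkg-query -W -f='${Package} ${Version}\n' 'grub*' 2>/dev/null \
        | grub_upstream_versions)
    # Count non-empty lines; more than one means several upstream versions
    count=$(printf '%s\n' "$versions" | awk 'NF { n++ } END { print n + 0 }')
    if [ "$count" -gt 1 ]; then
        echo "mixed grub versions installed:"
        printf '%s\n' "$versions"
    else
        echo "grub packages share one upstream version: ${versions:-none found}"
    fi
else
    echo "dpkg-query not available on this system"
fi
```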

Quote from admin on September 27, 2021, 9:09 am

Please upgrade to 2.8.1 as it includes a bug fix for the grub bootloader.

Finally I got some RAM and added it to my cluster nodes, which now have 48 GB of RAM. Then I upgraded to version 2.8.1 and now the 3 monitors boot normally! So it was definitely an Ubuntu issue...

Thanks and bye. Dk

 
