Forums - PetaSAN

504 Gateway Time-out on every node, 100% disk usage, I believe on a journal drive

admin
2,959 Posts

March 2, 2023, 8:37 pm
Can you follow the suggestion from my earlier post and look at the monitors: are they started? Can you start them, or do they error out? Do they communicate with each other once they start?

You say there is nothing in the mon logs: are you sure the monitors are even running?
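
For reference, those checks might look like the following on one node. This is only a sketch: it assumes the monitors run as the usual systemd units named ceph-mon@<id>, with the short hostname as the id, and the helper names mon_unit and check_mon are made up for illustration.

```shell
#!/bin/sh
# Sketch only: assumes systemd units named ceph-mon@<id>, with <id> being
# the short hostname. mon_unit and check_mon are illustrative helper names.

# Build the systemd unit name for a monitor id (defaults to this host).
mon_unit() {
    printf 'ceph-mon@%s\n' "${1:-$(hostname -s)}"
}

# Start the monitor if it is not running, then show its status, its recent
# log lines (so startup errors are visible), and its own view of the quorum.
check_mon() {
    unit=$(mon_unit "$1")
    if ! systemctl is-active --quiet "$unit"; then
        echo "$unit is not running, trying to start it"
        systemctl start "$unit"
    fi
    systemctl status "$unit" --no-pager
    journalctl -u "$unit" -n 50 --no-pager
    # Query the local monitor over its admin socket. This does not need a
    # working quorum, so it can answer even when 'ceph status' times out.
    ceph daemon "mon.${1:-$(hostname -s)}" mon_status
}
```

Run check_mon on each monitor node. In the mon_status output, a "state" of probing or electing rather than leader or peon would suggest the monitor is up but cannot reach its peers.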

Last edited on March 2, 2023, 8:39 pm by admin · #11

moose999
9 Posts

March 14, 2023, 11:32 am
Please can you explain "look at the monitors" further?

I have been starting services manually - ceph is up now - but when I try and start smbd I get the following error:

Mar 14 09:47:13 san02 cifs_service.py[2001532]: mkdir: cannot create directory ‘/opt/petasan/config/shared’: Transport endpoint is not connected
Mar 14 09:47:13 san02 cifs_service.py[2001534]: touch: cannot touch '/opt/petasan/config/shared/ctdb/nodes': Transport endpoint is not connected
Mar 14 09:47:13 san02 cifs_service.py[2001536]: touch: cannot touch '/opt/petasan/config/shared/ctdb/public_addresses': Transport endpoint is not connected

When I try to cd to /opt/petasan/config/shared I get the same:

shared: Transport endpoint is not connected
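
"Transport endpoint is not connected" is the message for ENOTCONN, and on a directory it usually means the path is a FUSE or network mount whose backing process or server has gone away. One way to see what is mounted there is to look the path up in the mount table. The sketch below assumes a Linux /proc/mounts; mount_fstype is an illustrative helper, not a PetaSAN tool.

```shell
#!/bin/sh
# Print the filesystem type mounted at a given mount point.
# $1: mount point to look up
# $2: mount table to read (defaults to /proc/mounts)
# Mount table fields: device, mount point, type, options, dump, pass.
mount_fstype() {
    awk -v mp="$1" '$2 == mp { print $3 }' "${2:-/proc/mounts}"
}

# Example:
#   mount_fstype /opt/petasan/config/shared
```

If this prints a type beginning with fuse, the mount is most likely still registered while the process behind it has died, so the shared filesystem would need to be brought back before smbd or ctdb can use the directory. Empty output means nothing is mounted at that path.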

Which of these services do I need to get started?

[ - ] apache-htcacheclean
[ + ] apache2
[ + ] atop
[ + ] atopacct
[ + ] carbon-cache
[ + ] ceph
[ + ] collectd
[ + ] collectl
[ - ] console-setup.sh
[ + ] cron
[ - ] ctdb
[ + ] dbus
[ - ] fio
[ + ] grafana-server
[ + ] grub-common
[ - ] hwclock.sh
[ - ] iscsid
[ - ] keyboard-setup.sh
[ + ] kmod
[ - ] lvm2
[ - ] lvm2-lvmpolld
[ + ] multipath-tools
[ + ] networking
[ - ] nfs-common
[ + ] nginx
[ - ] nmbd
[ - ] ntp
[ - ] open-iscsi
[ + ] opensm
[ + ] procps
[ + ] radosgw
[ + ] rpcbind
[ + ] rsyslog
[ - ] samba-ad-dc
[ - ] smartmontools
[ - ] smbd
[ + ] ssh
[ - ] sysstat
[ + ] udev
[ - ] uuidd
[ - ] winbind
[ + ] zabbix-agent

Many thanks for all your assistance!

#12

admin
2,959 Posts

March 14, 2023, 1:29 pm
> Please can you explain "look at the monitors" further?

Basically what I posted in my previous replies: check whether the monitors are started, whether they start or error out, and whether they communicate with each other.

> I have been starting services manually - ceph is up now

Is it healthy? What is the output of:
ceph status

Earlier you had this error:

root@san01:~# ceph status
2023-03-01T17:09:02.922+0000 7f87455c3700 0 monclient(hunting): authenticate timed out after 300

How did you fix it?
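
The "monclient(hunting): authenticate timed out" line means the client tried the monitor addresses it knows about and none of them answered in time. To see which addresses it is trying and whether they are reachable, something like the following could help. It is a sketch under assumptions: the client config is /etc/ceph/ceph.conf with a plain comma-separated mon_host line (bare addresses, no ports or v1:/v2: brackets), nc is installed, and mon_hosts and probe_mons are illustrative names.

```shell
#!/bin/sh
# Print the monitor addresses from a ceph.conf, one per line.
# $1: config file (defaults to /etc/ceph/ceph.conf)
# Assumes a simple "mon_host = a,b,c" (or "mon host = ...") line.
mon_hosts() {
    sed -n 's/^[[:space:]]*mon[_ ]host[[:space:]]*=[[:space:]]*//p' \
        "${1:-/etc/ceph/ceph.conf}" \
        | tr ',' '\n' \
        | sed -e 's/[[:space:]]//g' -e '/^$/d'
}

# Try a TCP connection to each monitor on the two standard monitor ports
# (3300 for msgr2, 6789 for the legacy protocol).
probe_mons() {
    for host in $(mon_hosts "$1"); do
        for port in 3300 6789; do
            if nc -z -w 3 "$host" "$port"; then
                echo "$host:$port reachable"
            else
                echo "$host:$port NOT reachable"
            fi
        done
    done
}
```

If probe_mons reports every address as not reachable, either the monitor daemons are not listening or the network between the nodes is down.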

#13

moose999
9 Posts

March 14, 2023, 4:22 pm
Ah, I have not fixed anything then!

The service says 'active' - service ceph status shows:

● ceph.service - LSB: Start Ceph distributed file system daemons at boot time
Loaded: loaded (/etc/init.d/ceph; generated)
Active: active (exited) since Mon 2023-03-06 20:16:08 GMT; 1 weeks 0 days ago
Docs: man:systemd-sysv-generator(8)
Process: 4164194 ExecStart=/etc/init.d/ceph start (code=exited, status=0/SUCCESS)

Mar 06 20:16:08 san03 systemd[1]: Starting LSB: Start Ceph distributed file system daemons at boot time...
Mar 06 20:16:08 san03 systemd[1]: Started LSB: Start Ceph distributed file system daemons at boot time.

However, ceph status still shows:

2023-03-14T14:19:22.108+0000 7fc2cdb5c700 0 monclient(hunting): authenticate timed out after 300

So I guess ceph is not up!
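
The "active (exited)" state is worth a second look: for an init-script unit like this one it only says that /etc/init.d/ceph start ran and returned 0, not that any daemon is still running. A more direct check is to look for the daemon processes themselves. A small sketch, assuming pgrep from procps is available; list_ceph_daemons is an illustrative name.

```shell
#!/bin/sh
# Report whether Ceph daemon processes of the given kinds are running.
# With no arguments it checks mon, mgr and osd.
# Prints "<kind>: running" or "<kind>: not running" for each kind.
list_ceph_daemons() {
    [ $# -gt 0 ] || set -- mon mgr osd
    for kind in "$@"; do
        # -x matches the exact process name, e.g. ceph-mon
        if pgrep -x "ceph-$kind" >/dev/null 2>&1; then
            echo "$kind: running"
        else
            echo "$kind: not running"
        fi
    done
}

# Example:
#   list_ceph_daemons
#   list_ceph_daemons mon
```

A "mon: not running" result on a monitor node would fit with both the timeout from ceph status and the mon log files not being written to.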

What is the command to start the ceph monitor?

The /var/log/ceph/ceph-mon.HOSTNAME.log files still do not show anything after the day the cluster was created in early January - the cluster ran well for over a month after that.

Many thanks!

#14
