504 Gateway Time-out on every node, 100% disk usage, I believe on a journal drive
admin
2,930 Posts
March 2, 2023, 8:37 pm
Can you follow the suggestion in my earlier post and look at the monitors: are they started? Can you start them, or do they error out? Do they communicate with each other when they start?
Not having anything in the mon logs: are you sure they are even running?
Last edited on March 2, 2023, 8:39 pm by admin · #11
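The monitor checks suggested above can be sketched as follows (this assumes a standard systemd-managed Ceph layout where each monitor runs as the unit `ceph-mon@<short hostname>`; verify the unit name and paths on your own nodes):

```shell
# Sketch of the three monitor checks: started? erroring? communicating?
# (hostname-derived names are assumptions; adjust for your cluster)
MON_ID=$(hostname -s 2>/dev/null || echo HOSTNAME)
MON_UNIT="ceph-mon@${MON_ID}"
MON_LOG="/var/log/ceph/ceph-mon.${MON_ID}.log"

# 1. Are they started?
if command -v systemctl >/dev/null 2>&1; then
    systemctl --no-pager status "$MON_UNIT" || echo "$MON_UNIT is not active"
fi

# 2. Do they error out on start? The monitor log is the place to look.
if [ -f "$MON_LOG" ]; then
    tail -n 20 "$MON_LOG"
fi

# 3. Do they communicate? Ask a running mon for its view of the quorum
#    over its local admin socket (works even when 'ceph status' hangs).
ASOK="/var/run/ceph/ceph-mon.${MON_ID}.asok"
if [ -S "$ASOK" ] && command -v ceph >/dev/null 2>&1; then
    ceph daemon "$ASOK" mon_status
fi
```

Running this on each node in turn shows whether the mons are up and whether they agree on a quorum.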
moose999
9 Posts
March 14, 2023, 11:32 am
Please can you explain "look at the monitors" further?
I have been starting services manually - ceph is up now - but when I try to start smbd I get the following error:
Mar 14 09:47:13 san02 cifs_service.py[2001532]: mkdir: cannot create directory ‘/opt/petasan/config/shared’: Transport endpoint is not connected
Mar 14 09:47:13 san02 cifs_service.py[2001534]: touch: cannot touch '/opt/petasan/config/shared/ctdb/nodes': Transport endpoint is not connected
Mar 14 09:47:13 san02 cifs_service.py[2001536]: touch: cannot touch '/opt/petasan/config/shared/ctdb/public_addresses': Transport endpoint is not connected
When I try to cd to /opt/petasan/config/shared I get the same error:
shared: Transport endpoint is not connected
Which of these services do I need to start?
[ - ] apache-htcacheclean
[ + ] apache2
[ + ] atop
[ + ] atopacct
[ + ] carbon-cache
[ + ] ceph
[ + ] collectd
[ + ] collectl
[ - ] console-setup.sh
[ + ] cron
[ - ] ctdb
[ + ] dbus
[ - ] fio
[ + ] grafana-server
[ + ] grub-common
[ - ] hwclock.sh
[ - ] iscsid
[ - ] keyboard-setup.sh
[ + ] kmod
[ - ] lvm2
[ - ] lvm2-lvmpolld
[ + ] multipath-tools
[ + ] networking
[ - ] nfs-common
[ + ] nginx
[ - ] nmbd
[ - ] ntp
[ - ] open-iscsi
[ + ] opensm
[ + ] procps
[ + ] radosgw
[ + ] rpcbind
[ + ] rsyslog
[ - ] samba-ad-dc
[ - ] smartmontools
[ - ] smbd
[ + ] ssh
[ - ] sysstat
[ + ] udev
[ - ] uuidd
[ - ] winbind
[ + ] zabbix-agent
Many thanks for all your assistance!
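For context, "Transport endpoint is not connected" on a path is the classic symptom of a stale FUSE/CephFS mount whose backing connection died; in PetaSAN, /opt/petasan/config/shared is, as far as I know, such a shared mount. A hypothetical triage sketch (the unmount is shown commented out, since the cluster should be healthy before anything remounts):

```shell
# Triage of a stale shared-config mount; the path comes from the error
# log above. "Transport endpoint is not connected" means the mount's
# backing connection died, not a permissions problem.
SHARED=/opt/petasan/config/shared

# Is the path a live mount point, and of what fstype?
if command -v findmnt >/dev/null 2>&1; then
    findmnt "$SHARED" || echo "$SHARED is not a healthy mount"
fi

# A lazy unmount detaches the dead endpoint so the path stops erroring:
#   umount -l "$SHARED"
# but PetaSAN's own services remount it once Ceph is healthy, so fixing
# 'ceph status' first is the safer order.
```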
admin
2,930 Posts
March 14, 2023, 1:29 pm
"Please can you explain 'look at the monitors' further?"
Basically what I posted in my previous replies.
"I have been starting services manually - ceph is up now"
Is it healthy? What is the output of:
ceph status
Earlier you had this error:
root@san01:~# ceph status
2023-03-01T17:09:02.922+0000 7f87455c3700 0 monclient(hunting): authenticate timed out after 300
How did you fix it?
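The "monclient(hunting): authenticate timed out" message means the ceph CLI could not reach, or authenticate to, any monitor within the 300s default. A sketch for failing fast and seeing which monitor addresses the client is hunting (paths are the Ceph defaults):

```shell
# Fail after 10s instead of hanging for the default 300s auth timeout.
if command -v ceph >/dev/null 2>&1; then
    ceph status --connect-timeout 10 || echo "no monitor reachable"
fi

# The client hunts the addresses listed under mon_host in the local conf:
CONF=/etc/ceph/ceph.conf
if [ -f "$CONF" ]; then
    grep -iE 'mon[_ ]host' "$CONF" || echo "no mon_host entry in $CONF"
fi
```

If no monitor answers at those addresses, the problem is the mon daemons (or the network between them), not the client.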
moose999
9 Posts
March 14, 2023, 4:22 pm
Ah, I have not fixed anything then!
The service says 'active' - service ceph status shows:
● ceph.service - LSB: Start Ceph distributed file system daemons at boot time
Loaded: loaded (/etc/init.d/ceph; generated)
Active: active (exited) since Mon 2023-03-06 20:16:08 GMT; 1 weeks 0 days ago
Docs: man:systemd-sysv-generator(8)
Process: 4164194 ExecStart=/etc/init.d/ceph start (code=exited, status=0/SUCCESS)
Mar 06 20:16:08 san03 systemd[1]: Starting LSB: Start Ceph distributed file system daemons at boot time...
Mar 06 20:16:08 san03 systemd[1]: Started LSB: Start Ceph distributed file system daemons at boot time.
However, ceph status still shows:
2023-03-14T14:19:22.108+0000 7fc2cdb5c700 0 monclient(hunting): authenticate timed out after 300
So I guess ceph is not up!
What is the command to start the ceph monitor?
The /var/log/ceph/ceph-mon.HOSTNAME.log files still do not show anything after the day the cluster was created in early January - the cluster ran well for over a month after that.
Many thanks!
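On the question of how to start the ceph monitor: the LSB ceph.service shown above is only a generated wrapper, so "active (exited)" does not mean the mon daemon is running. With a standard systemd-managed Ceph install the monitor's own unit is ceph-mon@<short hostname>; a sketch (the unit name is an assumption, confirm it on your nodes):

```shell
# Start this node's monitor, then inspect its log for the reason it
# has been silent (unit naming assumes the stock ceph-mon@<id> layout).
MON_ID=$(hostname -s 2>/dev/null || echo HOSTNAME)
if command -v systemctl >/dev/null 2>&1; then
    systemctl start "ceph-mon@${MON_ID}" || echo "ceph-mon@${MON_ID} failed to start"
    systemctl --no-pager status "ceph-mon@${MON_ID}" || true
fi

# If the log has been empty since January, the daemon likely never ran;
# a fresh start attempt should write something here immediately:
MON_LOG="/var/log/ceph/ceph-mon.${MON_ID}.log"
if [ -f "$MON_LOG" ]; then
    tail -n 50 "$MON_LOG"
fi
```

An empty mon log combined with the authenticate timeout is consistent with the daemons simply not running, which matches the admin's earlier question.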