[SOLVED] No data points, graphs stop working every few hours

wailer
75 Posts
January 16, 2020, 11:29 pm
Quote from wailer on January 16, 2020, 11:29 pm
Hi,
After replacing all management nodes we are facing this new issue.
We have 3 nodes, ceph-21, 22 and 23. Stats server is 23.
When graphs show "no data points" we see this error:
root@CEPH-23:~# cd /opt/petasan/config/shared
-bash: cd: /opt/petasan/config/shared: Transport endpoint is not connected
Gluster peer status looks OK to me
root@CEPH-23:~# gluster peer status
Number of Peers: 2
Hostname: 10.74.2.221
Uuid: 1441ff3a-635d-4375-873f-b3f51e1b419e
State: Peer in Cluster (Connected)
Hostname: 10.74.2.222
Uuid: af6ee2b5-3b47-4778-956d-32143965891f
State: Peer in Cluster (Connected)
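As far as I understand, peer status only shows glusterd connectivity between the nodes, so I guess the volume itself and the stale client mount need to be checked directly too, something like:
# show all gluster volumes and whether their bricks are online
gluster volume status
# show volume definitions and options
gluster volume info
# check whether the FUSE client mount is still registered on this node
mount | grep glusterfs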
To fix, we do:
root@CEPH-23:~# systemctl restart petasan-mount-sharedfs
root@CEPH-23:~# cd /opt/petasan/config/shared
root@CEPH-23:/opt/petasan/config/shared#
Now the shared gluster folder is working again.
We then execute these two commands:
root@CEPH-23:/opt/petasan/config/shared# /opt/petasan/scripts/stats-setup.sh
root@CEPH-23:/opt/petasan/config/shared# /opt/petasan/scripts/stats-start.sh
volume set: success
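To double-check that the shared folder is really mounted again, a couple of generic checks (nothing PetaSAN-specific):
# exits 0 and prints a message only if the path is an active mount point
mountpoint /opt/petasan/config/shared
# show which gluster volume backs the shared folder and its free space
df -h /opt/petasan/config/shared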
Now the graphs start working again, but PetaSAN.log keeps showing these lines:
17/01/2020 00:27:08 INFO GlusterFS mount attempt
17/01/2020 00:27:38 INFO GlusterFS mount attempt
17/01/2020 00:28:08 INFO GlusterFS mount attempt
17/01/2020 00:28:38 INFO GlusterFS mount attempt
17/01/2020 00:29:08 INFO GlusterFS mount attempt
Not sure if this is related to something broken with the gluster volume.
Any idea where to look next?
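In the meantime I will probably leave a small watch running so I can catch the exact moment the mount drops again, along these lines (rough, untested sketch; the log file path is just an example):
# append a timestamp whenever the shared folder stops being a mount point
while true; do
    mountpoint -q /opt/petasan/config/shared || echo "$(date) shared mount gone" >> /root/sharedfs-watch.log
    sleep 30
done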
Thanks as usual,
Last edited on January 21, 2020, 1:37 am by wailer · #1

wailer
75 Posts
January 18, 2020, 1:31 am
Quote from wailer on January 18, 2020, 1:31 am
UPDATE: Just noticed all three nodes are showing the same message:
NODE1:
18/01/2020 02:24:37 INFO GlusterFS mount attempt
18/01/2020 02:25:08 INFO GlusterFS mount attempt
18/01/2020 02:25:38 INFO GlusterFS mount attempt
18/01/2020 02:26:08 INFO GlusterFS mount attempt
18/01/2020 02:26:38 INFO GlusterFS mount attempt
18/01/2020 02:27:08 INFO GlusterFS mount attempt
NODE 2
18/01/2020 02:24:20 INFO GlusterFS mount attempt
18/01/2020 02:24:50 INFO GlusterFS mount attempt
18/01/2020 02:25:20 INFO GlusterFS mount attempt
18/01/2020 02:25:50 INFO GlusterFS mount attempt
18/01/2020 02:26:21 INFO GlusterFS mount attempt
18/01/2020 02:26:51 INFO GlusterFS mount attempt
18/01/2020 02:27:21 INFO GlusterFS mount attempt
NODE 3
18/01/2020 02:24:28 INFO GlusterFS mount attempt
18/01/2020 02:24:58 INFO GlusterFS mount attempt
18/01/2020 02:25:28 INFO GlusterFS mount attempt
18/01/2020 02:25:58 INFO GlusterFS mount attempt
18/01/2020 02:26:28 INFO GlusterFS mount attempt
18/01/2020 02:26:58 INFO GlusterFS mount attempt
18/01/2020 02:27:28 INFO GlusterFS mount attempt
18/01/2020 02:27:58 INFO GlusterFS mount attempt
Stats server is node 3.
I compared the same logs on the 3 nodes of another PetaSAN cluster and I don't see all this spam, so it looks like something is not working here.
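To quantify the spam I simply counted the repeated lines on each node (log path as it is on my nodes):
grep -c "GlusterFS mount attempt" /opt/petasan/log/PetaSAN.log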
UPDATE 2: Solved by deleting and recreating the gluster volume gfs-vol.
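For reference, the delete/recreate sequence was along these lines (from memory; node names and the brick path are placeholders, double-check your own volume layout before deleting anything):
# stop and remove the broken shared volume (run on one management node)
gluster volume stop gfs-vol
gluster volume delete gfs-vol
# recreate it as a 3-way replica across the management nodes
# (replace NODE1/NODE2/NODE3 and BRICK_PATH with your actual hosts and brick directory)
gluster volume create gfs-vol replica 3 NODE1:/BRICK_PATH NODE2:/BRICK_PATH NODE3:/BRICK_PATH force
gluster volume start gfs-vol
# then re-run the PetaSAN stats scripts as in my first post
/opt/petasan/scripts/stats-setup.sh
/opt/petasan/scripts/stats-start.sh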
Last edited on January 21, 2020, 1:37 am by wailer · #2