Forums

[SOLVED] No data points, graphs stop working every few hours

Hi,

After replacing all management nodes, we are facing a new issue.

We have 3 nodes: ceph-21, ceph-22 and ceph-23. The stats server is ceph-23.

When graphs show "no data points" we see this error:

root@CEPH-23:~# cd /opt/petasan/config/shared
-bash: cd: /opt/petasan/config/shared: Transport endpoint is not connected

Gluster peer status looks OK to me:

root@CEPH-23:~# gluster peer status
Number of Peers: 2

Hostname: 10.74.2.221
Uuid: 1441ff3a-635d-4375-873f-b3f51e1b419e
State: Peer in Cluster (Connected)

Hostname: 10.74.2.222
Uuid: af6ee2b5-3b47-4778-956d-32143965891f
State: Peer in Cluster (Connected)
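
For anyone checking the same thing: peer status only reports whether the glusterd daemons can reach each other, so it can look fine while the volume or the client mount is broken. A minimal sketch for scripting the check; the function name `disconnected_peers` is made up for illustration, and it assumes the output layout shown above:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the hostname of every peer whose State line
# is anything other than "(Connected)". Reads `gluster peer status`
# output from stdin or from the files given as arguments.
disconnected_peers() {
    awk '/^Hostname:/ { host = $2 }
         /^State:/ && $0 !~ /\(Connected\)/ { print host }' "$@"
}

# Usage (run on a node):
#   gluster peer status | disconnected_peers
#   gluster volume status gfs-vol   # brick/process state for the volume
```

No output from `disconnected_peers` means every listed peer is connected; `gluster volume status` is the next place to look, since it shows the volume's bricks rather than the peers.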

To fix it, we run:

root@CEPH-23:~# systemctl restart petasan-mount-sharedfs
root@CEPH-23:~# cd /opt/petasan/config/shared
root@CEPH-23:/opt/petasan/config/shared#

Now the shared Gluster folder is working again.

We then run these two commands:

root@CEPH-23:/opt/petasan/config/shared# /opt/petasan/scripts/stats-setup.sh
root@CEPH-23:/opt/petasan/config/shared# /opt/petasan/scripts/stats-start.sh
volume set: success
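
The manual steps above can be wrapped in a small script, roughly like this. It is only a sketch of the workaround, not a fix for the underlying problem; the function names are made up, and the `timeout` values are arbitrary assumptions:

```shell
#!/usr/bin/env bash
# Sketch of the workaround: if the shared folder cannot be listed,
# restart the mount service and re-run the two stats scripts.
SHARED_DIR="${SHARED_DIR:-/opt/petasan/config/shared}"

# Returns 0 when the directory can be listed within 10 seconds.
# A dead FUSE mount fails here ("Transport endpoint is not connected");
# note that a missing directory fails the same way.
shared_dir_ok() {
    timeout 10 ls "$1" >/dev/null 2>&1
}

# Hypothetical wrapper around the commands from this post.
remount_if_stale() {
    local dir="$1"
    if shared_dir_ok "$dir"; then
        echo "ok: $dir"
        return 0
    fi
    echo "stale: $dir, restarting petasan-mount-sharedfs"
    systemctl restart petasan-mount-sharedfs
    sleep 5   # assumed grace period for the mount to come back
    /opt/petasan/scripts/stats-setup.sh
    /opt/petasan/scripts/stats-start.sh
}

# Usage (as root on the stats server):
#   remount_if_stale "$SHARED_DIR"
```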

Now the graphs work again, but PetaSAN.log keeps showing these lines:
17/01/2020 00:27:08 INFO GlusterFS mount attempt
17/01/2020 00:27:38 INFO GlusterFS mount attempt
17/01/2020 00:28:08 INFO GlusterFS mount attempt
17/01/2020 00:28:38 INFO GlusterFS mount attempt
17/01/2020 00:29:08 INFO GlusterFS mount attempt

Not sure if this is related to something broken with the Gluster volume.

Any idea where to look next?

Thanks as usual,

 

 

UPDATE: just noticed all three nodes are showing the same message:

NODE 1:

18/01/2020 02:24:37 INFO GlusterFS mount attempt
18/01/2020 02:25:08 INFO GlusterFS mount attempt
18/01/2020 02:25:38 INFO GlusterFS mount attempt
18/01/2020 02:26:08 INFO GlusterFS mount attempt
18/01/2020 02:26:38 INFO GlusterFS mount attempt
18/01/2020 02:27:08 INFO GlusterFS mount attempt

NODE 2:
18/01/2020 02:24:20 INFO GlusterFS mount attempt
18/01/2020 02:24:50 INFO GlusterFS mount attempt
18/01/2020 02:25:20 INFO GlusterFS mount attempt
18/01/2020 02:25:50 INFO GlusterFS mount attempt
18/01/2020 02:26:21 INFO GlusterFS mount attempt
18/01/2020 02:26:51 INFO GlusterFS mount attempt
18/01/2020 02:27:21 INFO GlusterFS mount attempt

NODE 3:

18/01/2020 02:24:28 INFO GlusterFS mount attempt
18/01/2020 02:24:58 INFO GlusterFS mount attempt
18/01/2020 02:25:28 INFO GlusterFS mount attempt
18/01/2020 02:25:58 INFO GlusterFS mount attempt
18/01/2020 02:26:28 INFO GlusterFS mount attempt
18/01/2020 02:26:58 INFO GlusterFS mount attempt
18/01/2020 02:27:28 INFO GlusterFS mount attempt
18/01/2020 02:27:58 INFO GlusterFS mount attempt

 

Stats server is node 3.

I compared the same logs on the 3 nodes of another PetaSAN cluster and I don't see all this spam there, so it looks like something is not working on this cluster.
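
To make that comparison less of an eyeball job, the mount-attempt lines can be counted per node. A minimal sketch, assuming the log line format shown above; the function names are made up, and the log file path is passed in rather than assumed:

```shell
#!/usr/bin/env bash
# Count "GlusterFS mount attempt" lines in a log (file argument or stdin).
count_mount_attempts() {
    grep -c 'INFO GlusterFS mount attempt' "$@" || true
}

# Break the count down per date and hour, e.g. "17/01/2020 00 5".
# Output is sorted as plain text, which is not chronological across
# months because the dates are in dd/mm/yyyy form.
attempts_per_hour() {
    awk '/INFO GlusterFS mount attempt/ {
             split($2, t, ":")        # t[1] is the hour
             n[$1 " " t[1]]++
         }
         END { for (k in n) print k, n[k] }' "$@" | sort
}

# Usage, on each node:
#   count_mount_attempts /path/to/PetaSAN.log
#   attempts_per_hour /path/to/PetaSAN.log
```

A healthy node should show few or no attempts, while a node stuck retrying every 30 seconds shows around 120 per hour.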

 

UPDATE 2: Solved by deleting and recreating the Gluster volume gfs-vol.
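
For reference, a generic delete-and-recreate with the standard gluster CLI looks roughly like the plan printed below. This is a sketch only, not necessarily the exact steps used here: the brick arguments are hypothetical placeholders, a plain replicated volume with one brick per node is an assumption, and the helper only prints the commands so they can be reviewed before anything is run.

```shell
#!/usr/bin/env bash
# Hypothetical dry-run helper: print the gluster commands for dropping
# and recreating a replicated volume. Nothing is executed.
#   $1      volume name
#   $2...   bricks as HOST:/brick/path (placeholders, not from this post)
print_recreate_plan() {
    local vol="$1"
    shift
    echo "gluster volume stop $vol"
    echo "gluster volume delete $vol"
    # Assumes replica count == number of bricks (pure replicated volume).
    echo "gluster volume create $vol replica $# $*"
    echo "gluster volume start $vol"
}

# Usage:
#   print_recreate_plan gfs-vol NODE1:/path/to/brick NODE2:/path/to/brick NODE3:/path/to/brick
```

The stop and delete steps ask for confirmation when run interactively, and deleting the volume does not clean the brick directories, so reusing the same paths may need extra cleanup. After recreating, the mount service and the stats scripts from earlier in this post would presumably need to be run again.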