
Lost dashboard chart data


Hello,

Just updated from 2.3.1 to 2.6.2. After a little while the charts on the dashboard stopped working and now show "no data points" on the screen. There is no timing skew indicated. This happened after I made a Ceph change to help with the PG scrub issues. Grafana itself shows as running:

grafana-server.service loaded active running Grafana instance

Here's what was changed:

osd_scrub_begin_hour = 0
osd_scrub_end_hour = 24
osd_scrub_load_threshold = 0.500000
osd_scrub_sleep = 0.200000
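Settings like these can go either in ceph.conf or, on recent Ceph releases, into the monitor config database; for reference, the latter would look roughly like this (illustrative only, not necessarily how it was done here):

ceph config set osd osd_scrub_begin_hour 0
ceph config set osd osd_scrub_end_hour 24
ceph config set osd osd_scrub_load_threshold 0.5
ceph config set osd osd_scrub_sleep 0.2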

If you're seeing this Grafana has failed to load its application files

1. This could be caused by your reverse proxy settings.

2. If you host grafana under subpath make sure your grafana.ini root_url setting includes subpath

3. If you have a local dev build make sure you build frontend using: yarn start, yarn start:hot, or yarn build

4. Sometimes restarting grafana-server can help
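For point 2 above, a subpath setup in grafana.ini would look roughly like this (the /grafana/ subpath is just an example, and serve_from_sub_path only exists on newer Grafana releases):

[server]
root_url = %(protocol)s://%(domain)s:%(http_port)s/grafana/
serve_from_sub_path = true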

So did it work for a while after the upgrade and then stop later, or did it stop during the upgrade itself?

Is the cluster under load? Can you check your disk % utilization?
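For example with something like iostat from the sysstat package (just a suggestion, any similar tool will do):

iostat -x 2

The %util column is the per-disk utilization.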

Yes, it was working just after the upgrade until I added the Ceph settings, which I don't think have anything to do with it. Disk utilization is 50%.

CPU usage (the column just after the "S" in top) is not much:
1201936 ceph 20 0 943680 292704 35304 S 5.0 0.1 17:00.70 ceph-mgr
1203132 root 20 0 1091672 101732 30172 S 3.6 0.0 1:29.79 admin.py
2111 root 20 0 222476 103536 59276 S 2.3 0.0 5328:32 consul
1202608 ceph 20 0 4443716 3.532g 35244 S 2.3 1.4 28:30.15 ceph-osd
1202629 ceph 20 0 4281164 3.378g 35084 S 2.0 1.3 24:37.48 ceph-osd
1201572 ceph 20 0 843856 398372 28360 S 1.7 0.2 2:58.50 ceph-mon
1202666 ceph 20 0 4367512 3.432g 35308 S 1.7 1.4 27:30.19 ceph-osd
1374122 root 20 0 40752 3992 3068 R 1.3 0.0 0:00.12 top

Try restarting the stats service.

Find out which node the stats service is currently running on:
/opt/petasan/scripts/util/get_cluster_leader.py

Then restart it:
/opt/petasan/scripts/stats-stop.sh
/opt/petasan/scripts/stats-start.sh

Looks like it's showing:

Connection failed. Please check if gluster daemon is operational.

Showing where?

root@PS-Node3:~# /opt/petasan/scripts/util/get_cluster_leader.py
{u'PS-Node3': u'172.16.14.57'}
root@PS-Node3:~# /opt/petasan/scripts/stats-stop.sh
root@PS-Node3:~# /opt/petasan/scripts/stats-start.sh
Connection failed. Please check if gluster daemon is operational.

 

systemctl status glusterfs-server
Unit glusterfs-server.service could not be found.

What is the output of

gluster vol status

and, on the first 3 nodes:

systemctl status glusterd
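If the nodes can reach each other over ssh, you can check all three in one go with something like this (hostnames are just examples):

for h in PS-Node1 PS-Node2 PS-Node3; do ssh root@$h systemctl status glusterd --no-pager; done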

The service was not started.

systemctl status glusterd
● glusterd.service - GlusterFS, a clustered file-system server
   Loaded: loaded (/lib/systemd/system/glusterd.service; disabled; vendor preset: enabled)
   Active: inactive (dead) since Wed 2020-09-30 03:49:17 CDT; 12h ago
 Main PID: 2194 (code=exited, status=15)

Started it on all 3 nodes

systemctl status glusterd
● glusterd.service - GlusterFS, a clustered file-system server
   Loaded: loaded (/lib/systemd/system/glusterd.service; disabled; vendor preset: enabled)
   Active: active (running) since Wed 2020-09-30 15:50:35 CDT; 8s ago
  Process: 1594429 ExecStart=/usr/sbin/glusterd -p /var/run/glusterd.pid --log-level $LOG_LEVEL $GLUSTERD_OPTIONS (code=exited, status=0/SUCCESS)
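Note the unit still shows "disabled", so systemd will not start it again by itself after a reboot. If you want it persistent, this should do it on each node, though it may be that PetaSAN normally starts it through its own services, so treat this as optional:

systemctl enable --now glusterd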

gluster vol status on all 3 nodes

 

root@PS-Node1:/etc/ceph# gluster vol status
Status of volume: gfs-vol
Gluster process                                   TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.20.4.55:/opt/petasan/config/gfs-brick    49152     0          Y       2685
Brick 10.20.4.56:/opt/petasan/config/gfs-brick    49152     0          Y       2677
Brick 10.20.4.57:/opt/petasan/config/gfs-brick    49152     0          Y       2887
Self-heal Daemon on localhost                     N/A       N/A        Y       1594519
Self-heal Daemon on 10.20.4.56                    N/A       N/A        Y       2685484
Self-heal Daemon on 10.20.4.57                    N/A       N/A        Y       745739

Task Status of Volume gfs-vol

 

peer status on all 3 nodes:

gluster peer status
Number of Peers: 2

Hostname: 10.20.4.56
Uuid: b1e2d005-953a-4e81-af9e-34f7431aa257
State: Peer in Cluster (Connected)

Hostname: 10.20.4.57
Uuid: 41554629-07d7-4c9d-b074-c7980e018566
State: Peer in Cluster (Connected)
root@PS-Node1:/etc/ceph# systemctl restart grafana-server
root@PS-Node1:/etc/ceph# gluster vol status
Status of volume: gfs-vol
Gluster process                                   TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.20.4.55:/opt/petasan/config/gfs-brick    49152     0          Y       2685
Brick 10.20.4.56:/opt/petasan/config/gfs-brick    49152     0          Y       2677
Brick 10.20.4.57:/opt/petasan/config/gfs-brick    49152     0          Y       2887
Self-heal Daemon on localhost                     N/A       N/A        Y       1594519
Self-heal Daemon on 10.20.4.57                    N/A       N/A        Y       745739
Self-heal Daemon on 10.20.4.56                    N/A       N/A        Y       2685484

Task Status of Volume gfs-vol
------------------------------------------------------------------------------
There are no active volume tasks

 

Restarted grafana-server on all 3 nodes; still the same result of

If you're seeing this Grafana has failed to load its application files

1. This could be caused by your reverse proxy settings.

2. If you host grafana under subpath make sure your grafana.ini root_url setting includes subpath

3. If you have a local dev build make sure you build frontend using: yarn start, yarn start:hot, or yarn build

4. Sometimes restarting grafana-server can help

Can you check the shared file system is mounted:

mount | grep shared
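If nothing shows up, the gluster volume from the earlier output can usually be mounted by hand with something along these lines (the mount point is an assumption here, check where PetaSAN expects it):

mount -t glusterfs 127.0.0.1:/gfs-vol /opt/petasan/config/shared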

Grafana should be started on one node only.

Find the stats leader:

/opt/petasan/scripts/util/get_cluster_leader.py

Restart the services there:
/opt/petasan/scripts/stats-stop.sh
/opt/petasan/scripts/stats-start.sh

On the other management nodes, make sure you stop them, else it could cause issues:
/opt/petasan/scripts/stats-stop.sh
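For example, from the leader you could stop it on the other two management nodes like this (hostnames are just examples):

for h in PS-Node1 PS-Node2; do ssh root@$h /opt/petasan/scripts/stats-stop.sh; done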
