
Graphs not working - PetaSAN Version 3.1.0 Upgrade


Hello,

After upgrading PetaSAN from version 3.0.2 to 3.1.0, the graphs on the dashboard stopped working.

When hovering over the exclamation mark on a graph, the error text says "Request error".

We have already tried restarting services. It seems Grafana has had problems since the update.

Any suggestions on what else we could try to get it working normally again?

Best regards

1) Get the stats server IP from:
/opt/petasan/scripts/util/get_cluster_leader.py
2) On the node which is the stats server, try (a combined sketch of steps 1 and 2 follows below):
/opt/petasan/scripts/stats-stop.sh
/opt/petasan/scripts/stats-setup.sh
/opt/petasan/scripts/stats-start.sh

3) If this does not fix the issue, on all nodes run:
systemctl restart petasan-node-stats
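
If you prefer to run steps 1 and 2 in one go from a management node, a minimal sketch (this assumes passwordless root SSH between nodes and that the leader script prints a dict like {'hostname': 'ip'}):

# Parse the stats server IP from the leader script output, then restart the stats stack there.
STATS_IP=$(/opt/petasan/scripts/util/get_cluster_leader.py | grep -oE '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+')
ssh root@"$STATS_IP" '/opt/petasan/scripts/stats-stop.sh && /opt/petasan/scripts/stats-setup.sh && /opt/petasan/scripts/stats-start.sh'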

Hello,

We tried these steps. Unfortunately they didn't fix the problem.

Any other suggestions?

Thank you very much.

Hello,

We still have the problem. Any suggestions?

Thanks in advance.

1) Get the stats server IP from:
/opt/petasan/scripts/util/get_cluster_leader.py

2) On that node, what is the output of the following (a sketch to collect them in one go is below the list):

systemctl status carbon-cache
gluster vol status
mount | grep shared
ls /opt/petasan/config/shared/graphite/whisper/PetaSAN/
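
If it is easier, here is a small convenience sketch that captures all four outputs into one file you can paste here (the output path is just an example):

{
  systemctl status carbon-cache
  gluster vol status
  mount | grep shared
  ls /opt/petasan/config/shared/graphite/whisper/PetaSAN/
} > /tmp/petasan-stats-diag.txt 2>&1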

Hello,

1) Output:
{'ps-cl02-node02': '10.10.12.32'}

2) Output:
root@ps-cl02-node02:~# gluster vol status
Status of volume: gfs-vol
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 192.168.44.31:/opt/petasan/config/gfs
-brick                                      49152     0          Y       4815
Brick 192.168.44.32:/opt/petasan/config/gfs
-brick                                      49152     0          Y       3311
Brick 192.168.44.33:/opt/petasan/config/gfs
-brick                                      49152     0          Y       3314
Self-heal Daemon on localhost               N/A       N/A        N       N/A
Self-heal Daemon on 192.168.44.31           N/A       N/A        N       N/A
Self-heal Daemon on 192.168.44.33           N/A       N/A        N       N/A

Task Status of Volume gfs-vol
------------------------------------------------------------------------------
There are no active volume tasks

root@ps-cl02-node02:~# mount | grep shared
192.168.44.31:gfs-vol on /opt/petasan/config/shared type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
root@ps-cl02-node02:~# ls /opt/petasan/config/shared/graphite/whisper/PetaSAN/
ClusterStats  NodeStats

Thank you.

The first command in 2) is missing: systemctl status carbon-cache

Hello,

root@ps-cl02-node02:~# systemctl status carbon-cache
● carbon-cache.service - Graphite Carbon Cache
Loaded: loaded (/lib/systemd/system/carbon-cache.service; disabled; vendor preset: enabled)
Active: active (running) since Thu 2022-10-20 16:55:00 CEST; 3 days ago
Docs: https://graphite.readthedocs.io
Process: 1268423 ExecStart=/usr/bin/carbon-cache --config=/etc/carbon/carbon.conf --pidfile=/var/run/carbon-cache.pid --logdir=/var/log/carbon/ start (code=exited, status=0/SUCCESS)
Main PID: 1268430 (carbon-cache)
Tasks: 3 (limit: 154598)
Memory: 121.6M
CGroup: /system.slice/carbon-cache.service
└─1268430 /usr/bin/python3 /usr/bin/carbon-cache --config=/etc/carbon/carbon.conf --pidfile=/var/run/carbon-cache.pid --logdir=/var/log/carbon/ start

Oct 20 16:54:59 ps-cl02-node02 systemd[1]: Starting Graphite Carbon Cache...
Oct 20 16:55:00 ps-cl02-node02 systemd[1]: Started Graphite Carbon Cache.

Find the current stats server:
/opt/petasan/scripts/util/get_cluster_leader.py

On this node:
systemctl stop petasan-cluster-leader
systemctl stop petasan-notification
/opt/petasan/scripts/stats-stop.sh

Refresh the dashboard in the UI; it should show "bad gateway". Then run:
consul kv delete PetaSAN/Services/ClusterLeader
Wait approx. 1 minute, then refresh the dashboard.
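
To verify the handoff, a quick check on that node could be (same consul CLI as in the delete step; the exact error text may differ):

consul kv get PetaSAN/Services/ClusterLeader      # right after the delete this should report that no key exists, until a new leader registers
/opt/petasan/scripts/util/get_cluster_leader.py   # after about a minute this should print the new stats server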
If this does not solve the issue:

Can you post a screenshot of the error you see in Grafana?

Do you have the error for both cluster stats as well as node stats?

Can you post the following from the stats server (a one-command way to collect them is sketched below the list):
/var/log/apache2/graphite-web_error.log
/var/log/carbon/console.log
/var/log/apache2/error.log
/var/log/grafana/grafana.log
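
One possible way to grab the tails of all four logs in a single command (the line count is arbitrary):

tail -n 100 /var/log/apache2/graphite-web_error.log \
            /var/log/carbon/console.log \
            /var/log/apache2/error.log \
            /var/log/grafana/grafana.log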

Hello,

This unfortunately did not resolve the issue.

[Screenshot attached: grafana_error]

Notice:
After running /opt/petasan/scripts/stats-stop.sh, the stats server changed from

{'ps-cl02-node02': '10.10.12.32'} to {'ps-cl02-node01': '10.10.12.31'}

On the new node:
/var/log/apache2/graphite-web_error.log:
[Wed Oct 26 09:54:33.475067 2022] [wsgi:error] [pid 2861410:tid 139699158735936]   warn('SECRET_KEY is set to an unsafe default. This should be set in local_settings.py for better security')
[Wed Oct 26 09:54:33.475083 2022] [wsgi:error] [pid 2861412:tid 139699158735936]   warn('SECRET_KEY is set to an unsafe default. This should be set in local_settings.py for better security')

/var/log/carbon/console.log:
26/10/2022 09:59:19 :: Queue consumed in 28.196389 seconds
26/10/2022 09:59:19 :: Sorted 130 cache queues in 0.000103 seconds
26/10/2022 09:59:33 :: Queue consumed in 14.465051 seconds
26/10/2022 09:59:33 :: Sorted 18 cache queues in 0.000029 seconds

/var/log/apache2/error.log:
[Wed Oct 26 09:54:32.969256 2022] [mpm_event:notice] [pid 2861407:tid 139699158735936] AH00489: Apache/2.4.41 (Ubuntu) mod_wsgi/4.6.8 Python/3.8 configured -- resuming normal operations
[Wed Oct 26 09:54:32.969510 2022] [core:notice] [pid 2861407:tid 139699158735936] AH00094: Command line: '/usr/sbin/apache2'

/var/log/grafana/grafana.log:
t=2022-10-26T09:59:34+0200 lvl=info msg="Request Completed" logger=context userId=0 orgId=1 uname= method=POST path=/api/datasources/proxy/1/render status=502 remote_addr=10.10.12.31 time_ms=2 size=0 referer="https://10.10.12.31/grafana/d-solo/000000011/node-stats?orgId=1&from=now-60m&to=now&var-node=ps-cl02-node01&panelId=4&refresh=30s"
t=2022-10-26T10:00:04+0200 lvl=info msg="Request Completed" logger=context userId=0 orgId=1 uname= method=POST path=/api/datasources/proxy/1/render status=502 remote_addr=10.10.12.31 time_ms=2 size=0 referer="https://10.10.12.31/grafana/d-solo/000000011/node-stats?orgId=1&from=now-60m&to=now&var-node=ps-cl02-node01&panelId=4&refresh=30s"

Thank you.
