gfs-brick has disconnected from glusterd

While reviewing the log files, I found the following repeated over and over in etc-glusterfs-glusterd.vol.log. I found some info via Google suggesting that this may be caused by starting gluster-nfs when NFS is disabled?

The message "I [MSGID: 106006] [glusterd-svc-mgmt.c:323:glusterd_svc_common_rpc_notify] 0-management: nfs has disconnected from glusterd." repeated 39 times between [2017-07-27 15:16:43.538691] and [2017-07-27 15:18:40.547953]
The message "I [MSGID: 106005] [glusterd-handler.c:4908:__glusterd_brick_rpc_notify] 0-management: Brick 10.60.124.12:/opt/petasan/config/gfs-brick has disconnected from glusterd." repeated 39 times between [2017-07-27 15:16:43.538882] and [2017-07-27 15:18:40.548124]
[2017-07-27 15:18:43.547974] W [socket.c:588:__socket_rwv] 0-nfs: readv on /var/run/gluster/992c8f2b1b91c6b724fe775e24fbf538.socket failed (Invalid argument)
[2017-07-27 15:18:43.548208] W [socket.c:588:__socket_rwv] 0-management: readv on /var/run/gluster/8f17601b1e8ee2e4cc3ff2b161ce34a8.socket failed (Invalid argument)
[2017-07-27 15:18:43.548209] I [MSGID: 106006] [glusterd-svc-mgmt.c:323:glusterd_svc_common_rpc_notify] 0-management: nfs has disconnected from glusterd.
[2017-07-27 15:18:43.548401] I [MSGID: 106005] [glusterd-handler.c:4908:__glusterd_brick_rpc_notify] 0-management: Brick 10.60.124.12:/opt/petasan/config/gfs-brick has disconnected from glusterd.
[2017-07-27 15:18:46.548201] W [socket.c:588:__socket_rwv] 0-nfs: readv on /var/run/gluster/992c8f2b1b91c6b724fe775e24fbf538.socket failed (Invalid argument)
[2017-07-27 15:18:46.548437] W [socket.c:588:__socket_rwv] 0-management: readv on /var/run/gluster/8f17601b1e8ee2e4cc3ff2b161ce34a8.socket failed (Invalid argument)
[2017-07-27 15:18:49.548424] W [socket.c:588:__socket_rwv] 0-nfs: readv on /var/run/gluster/992c8f2b1b91c6b724fe775e24fbf538.socket failed (Invalid argument)
[2017-07-27 15:18:49.548657] W [socket.c:588:__socket_rwv] 0-management: readv on /var/run/gluster/8f17601b1e8ee2e4cc3ff2b161ce34a8.socket failed (Invalid argument)
[2017-07-27 15:18:52.548666] W [socket.c:588:__socket_rwv] 0-nfs: readv on /var/run/gluster/992c8f2b1b91c6b724fe775e24fbf538.socket failed (Invalid argument)
[2017-07-27 15:18:52.548900] W [socket.c:588:__socket_rwv] 0-management: readv on /var/run/gluster/8f17601b1e8ee2e4cc3ff2b161ce34a8.socket failed (Invalid argument)
[2017-07-27 15:18:55.548898] W [socket.c:588:__socket_rwv] 0-nfs: readv on /var/run/gluster/992c8f2b1b91c6b724fe775e24fbf538.socket failed (Invalid argument)
[2017-07-27 15:18:55.549149] W [socket.c:588:__socket_rwv] 0-management: readv on /var/run/gluster/8f17601b1e8ee2e4cc3ff2b161ce34a8.socket failed (Invalid argument)
[2017-07-27 15:18:58.549127] W [socket.c:588:__socket_rwv] 0-nfs: readv on /var/run/gluster/992c8f2b1b91c6b724fe775e24fbf538.socket failed (Invalid argument)
[2017-07-27 15:18:58.549366] W [socket.c:588:__socket_rwv] 0-management: readv on /var/run/gluster/8f17601b1e8ee2e4cc3ff2b161ce34a8.socket failed (Invalid argument)
[2017-07-27 15:19:01.549351] W [socket.c:588:__socket_rwv] 0-nfs: readv on /var/run/gluster/992c8f2b1b91c6b724fe775e24fbf538.socket failed (Invalid argument)
[2017-07-27 15:19:01.549587] W [socket.c:588:__socket_rwv] 0-management: readv on /var/run/gluster/8f17601b1e8ee2e4cc3ff2b161ce34a8.socket failed (Invalid argument)
[2017-07-27 15:19:04.549593] W [socket.c:588:__socket_rwv] 0-nfs: readv on /var/run/gluster/992c8f2b1b91c6b724fe775e24fbf538.socket failed (Invalid argument)
[2017-07-27 15:19:04.549831] W [socket.c:588:__socket_rwv] 0-management: readv on /var/run/gluster/8f17601b1e8ee2e4cc3ff2b161ce34a8.socket failed (Invalid argument)
[2017-07-27 15:19:07.549821] W [socket.c:588:__socket_rwv] 0-nfs: readv on /var/run/gluster/992c8f2b1b91c6b724fe775e24fbf538.socket failed (Invalid argument)
[2017-07-27 15:19:07.550062] W [socket.c:588:__socket_rwv] 0-management: readv on /var/run/gluster/8f17601b1e8ee2e4cc3ff2b161ce34a8.socket failed (Invalid argument)
[2017-07-27 15:19:10.550052] W [socket.c:588:__socket_rwv] 0-nfs: readv on /var/run/gluster/992c8f2b1b91c6b724fe775e24fbf538.socket failed (Invalid argument)
[2017-07-27 15:19:10.550276] W [socket.c:588:__socket_rwv] 0-management: readv on /var/run/gluster/8f17601b1e8ee2e4cc3ff2b161ce34a8.socket failed (Invalid argument)
[2017-07-27 15:19:13.550300] W [socket.c:588:__socket_rwv] 0-nfs: readv on /var/run/gluster/992c8f2b1b91c6b724fe775e24fbf538.socket failed (Invalid argument)
[2017-07-27 15:19:13.550552] W [socket.c:588:__socket_rwv] 0-management: readv on /var/run/gluster/8f17601b1e8ee2e4cc3ff2b161ce34a8.socket failed (Invalid argument)
[2017-07-27 15:19:16.550533] W [socket.c:588:__socket_rwv] 0-nfs: readv on /var/run/gluster/992c8f2b1b91c6b724fe775e24fbf538.socket failed (Invalid argument)
[2017-07-27 15:19:16.550775] W [socket.c:588:__socket_rwv] 0-management: readv on /var/run/gluster/8f17601b1e8ee2e4cc3ff2b161ce34a8.socket failed (Invalid argument)
[2017-07-27 15:19:19.550766] W [socket.c:588:__socket_rwv] 0-nfs: readv on /var/run/gluster/992c8f2b1b91c6b724fe775e24fbf538.socket failed (Invalid argument)
[2017-07-27 15:19:19.551009] W [socket.c:588:__socket_rwv] 0-management: readv on /var/run/gluster/8f17601b1e8ee2e4cc3ff2b161ce34a8.socket failed (Invalid argument)

 

I think there are 2 different issues: etc-glusterfs-glusterd.vol.log filling up, and the brick disconnect.

1) etc-glusterfs-glusterd.vol.log filling

I could reproduce the issue. It logs a line every 3 seconds (MSGID: 106006), which takes about 5 MB of storage space per day. It is probably a Gluster bug. You can fix it with this command:

gluster volume set gfs-vol nfs.disable true

You only need to run it once, from any management node. This will stop the continuous logging. If you need to clear the old log, run the following on each management node:

systemctl stop glusterfs-server

rm /var/log/glusterfs/etc-glusterfs-glusterd.vol.log

systemctl start glusterfs-server
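
One way to check whether the setting took effect is to count the lines in the log that carry each message ID, before and after the change. The following is only a minimal sketch: the function name count_msgid is made up for illustration, and the default log path is the one used in the commands above. Note that a "repeated N times" summary line counts as a single line here.

```shell
#!/bin/sh
# Count the log lines that carry a given glusterd message ID.
# count_msgid is a hypothetical helper name, not part of gluster.
count_msgid() {
    # $1 = message ID (for example 106006), $2 = path to the log file
    # -F matches the bracketed tag literally, -c prints the number of matching lines
    grep -cF "[MSGID: $1]" "$2"
}

# Default path taken from the commands in this thread; override with LOG=...
LOG=${LOG:-/var/log/glusterfs/etc-glusterfs-glusterd.vol.log}

if [ -r "$LOG" ]; then
    echo "nfs disconnect lines (106006):   $(count_msgid 106006 "$LOG")"
    echo "brick disconnect lines (106005): $(count_msgid 106005 "$LOG")"
fi
```

If the 106006 count stops growing between two runs a few minutes apart, the continuous logging has probably stopped.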

 

2) brick disconnect

This is a more serious error (MSGID: 106005), and I could not reproduce it. Does it happen on all 3 management nodes or only on 1 node? Do you still see your chart data, or is it gone?

Can you restart the service:

systemctl restart glusterfs-server

Please let me know if your brick disconnect problem persists and I will follow up with you.

I will implement the steps from #1 and keep an eye out for another brick disconnect. I will also look at the logs on the other nodes when I have a chance. Thank you!

I have the same issue.

I implemented all of the steps.

It does not work.

Any help?

What effect do you see? Is it just the charts in the dashboard not working?
Can the first 3 nodes ping each other on Backend 1?
Have there been any changes to the network, such as IP changes?

On each of the first 3 nodes, what is the output of:

gluster peer status
gluster vol info gfs-vol
gluster vol status gfs-vol
systemctl status glusterfs-server
mount | grep shared
ls /opt/petasan/config/shared/graphite/whisper/PetaSAN/
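
If it is easier, the commands above could be wrapped in a small script so that the output from each node ends up in one file. This is only a rough sketch: run_check and collect_diagnostics are made-up helper names, and the output file name in the usage comment is just an example.

```shell
#!/bin/sh
# Run one diagnostic command and print its output under a header line.
# A failing or missing command is reported instead of aborting the whole run.
# run_check and collect_diagnostics are hypothetical helper names.
run_check() {
    echo "=== $* ==="
    sh -c "$*" 2>&1 || echo "(command exited with status $?)"
}

# The same commands as listed in the post, in the same order.
collect_diagnostics() {
    run_check "gluster peer status"
    run_check "gluster vol info gfs-vol"
    run_check "gluster vol status gfs-vol"
    run_check "systemctl status glusterfs-server"
    run_check "mount | grep shared"
    run_check "ls /opt/petasan/config/shared/graphite/whisper/PetaSAN/"
}

# Usage on each of the first 3 nodes (the file name is only an example):
#   collect_diagnostics > "/tmp/gluster-diag-$(hostname).txt"
```

Each section starts with a "=== command ===" header, so the reports from the 3 nodes can be compared side by side.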