GlusterFS mount attempt every 30 seconds
wid
47 Posts
March 6, 2023, 5:18 pm
Hey,
I see a GlusterFS mount attempt in the logs every 30 seconds.
Is this expected behavior?
Where can I see what triggers such a mount, and whether it completed successfully or failed?
Since the last update, I have been having a recurring problem with:
1 clients failing to respond to cache pressure
I checked; cache usage is at 100 MB.
Increasing the cache from 4 GB to 8 GB did not help.
I also gave the client more time to empty its caches:
mds_cache_trim_interval = 2
But this did not give a result.
Interestingly, only NFS is exposed from the cluster, and only the NFS clients connect to it.
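(For reference, a minimal sketch of how those two settings could be applied cluster-wide with the ceph CLI; applying them this way is an assumption, not necessarily how it was done here:)
# raise the MDS cache memory limit from 4 GiB to 8 GiB
ceph config set mds mds_cache_memory_limit 8589934592
# run MDS cache trimming every 2 seconds
ceph config set mds mds_cache_trim_interval 2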
/opt/petasan/log/PetaSAN.log
06/03/2023 17:52:55 INFO GlusterFS mount attempt
06/03/2023 17:53:25 INFO GlusterFS mount attempt
06/03/2023 17:53:55 INFO GlusterFS mount attempt
06/03/2023 17:54:25 INFO GlusterFS mount attempt
06/03/2023 17:54:55 INFO GlusterFS mount attempt
06/03/2023 17:55:25 INFO GlusterFS mount attempt
06/03/2023 17:55:55 INFO GlusterFS mount attempt
06/03/2023 17:56:25 INFO GlusterFS mount attempt
06/03/2023 17:56:55 INFO GlusterFS mount attempt
06/03/2023 17:57:25 INFO GlusterFS mount attempt
06/03/2023 17:57:55 INFO GlusterFS mount attempt
06/03/2023 17:58:25 INFO GlusterFS mount attempt
06/03/2023 17:58:55 INFO GlusterFS mount attempt
06/03/2023 17:59:25 INFO GlusterFS mount attempt
06/03/2023 17:59:56 INFO GlusterFS mount attempt
06/03/2023 18:00:26 INFO GlusterFS mount attempt
06/03/2023 18:00:56 INFO GlusterFS mount attempt
06/03/2023 18:01:26 INFO GlusterFS mount attempt
06/03/2023 18:01:56 INFO GlusterFS mount attempt
06/03/2023 18:02:26 INFO GlusterFS mount attempt
06/03/2023 18:02:56 INFO GlusterFS mount attempt
06/03/2023 18:03:26 INFO GlusterFS mount attempt
06/03/2023 18:03:56 INFO GlusterFS mount attempt
06/03/2023 18:04:26 INFO GlusterFS mount attempt
06/03/2023 18:04:56 INFO GlusterFS mount attempt
06/03/2023 18:05:26 INFO GlusterFS mount attempt
06/03/2023 18:05:56 INFO GlusterFS mount attempt
ceph tell mds.* client ls | grep hostname
2023-03-06T18:16:50.990+0100 7f3dca7fc700 0 client.67696816 ms_handle_reset on v2:172.30.0.43:6800/352678348
2023-03-06T18:16:51.018+0100 7f3dcb7fe700 0 client.67696822 ms_handle_reset on v2:172.30.0.43:6800/352678348
Error ENOSYS:
2023-03-06T18:16:51.022+0100 7f3dca7fc700 0 client.67696828 ms_handle_reset on v2:172.30.0.41:6800/1134846623
2023-03-06T18:16:51.126+0100 7f3dcb7fe700 0 client.67696834 ms_handle_reset on v2:172.30.0.41:6800/1134846623
"hostname": "NFS-172-30-0-142",
"hostname": "NFS-172-30-0-142",
"hostname": "NFS-172-30-0-141",
"hostname": "NFS-172-30-0-142",
"hostname": "NFS-172-30-0-141",
"hostname": "NFS-172-30-0-141",
"hostname": "ceph03",
"hostname": "NFS-172-30-0-143",
"hostname": "NFS-172-30-0-143",
"hostname": "NFS-172-30-0-143",
"hostname": "ceph01",
2023-03-06T18:16:51.262+0100 7f3dca7fc700 0 client.67696840 ms_handle_reset on v2:172.30.0.42:6800/1102300439
2023-03-06T18:16:51.290+0100 7f3dcb7fe700 0 client.67696852 ms_handle_reset on v2:172.30.0.42:6800/1102300439
"hostname": "ceph02",
"hostname": "NFS-172-30-0-142",
"hostname": "NFS-172-30-0-141",
"hostname": "NFS-172-30-0-142",
"hostname": "NFS-172-30-0-141",
"hostname": "NFS-172-30-0-142",
"hostname": "NFS-172-30-0-141",
"hostname": "NFS-172-30-0-143",
"hostname": "NFS-172-30-0-143",
"hostname": "NFS-172-30-0-143",
"hostname": "ceph01",
admin
2,930 Posts
March 6, 2023, 7:06 pm
GlusterFS is used as a shared file system for shared configuration and for storing stats for graphs.
On any one of the first 3 nodes, what is the output of:
gluster vol status
gluster peer status
On the first 3 nodes, what is the output of:
systemctl status glusterd
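(A sketch of one way to collect all of this in a single pass, assuming passwordless ssh between the nodes and the hostnames that appear later in this thread:)
for n in ceph01 ceph02 ceph03; do
  echo "=== [ $n ] ==="
  ssh "$n" 'gluster vol status; gluster peer status; systemctl --no-pager status glusterd'
done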
wid
47 Posts
March 6, 2023, 9:16 pm
It's a 3-node cluster.
You made a perfect shot, as always 🙂
gluster vol status
Staging failed on ceph03. Please check log file for details.
gluster peer status
Number of Peers: 2
Hostname: 172.30.0.122
Uuid: fda17bf0-1fa8-4874-ae87-4be8f1f503a8
State: Peer in Cluster (Connected)
Hostname: ceph03
Uuid: 5197c531-dff4-44e8-bbd3-9ac4090bef4f
State: Peer in Cluster (Connected)
tail -f /var/log/glusterfs/glusterd.log
[2023-03-06 21:15:44.066365] E [MSGID: 106165] [glusterd-handshake.c:2060:__glusterd_mgmt_hndsk_version_cbk] 0-management: failed to get the 'versions' from peer (172.30.0.122:24007) [Invalid argument]
[2023-03-06 21:15:44.066416] I [MSGID: 106004] [glusterd-handler.c:6200:__glusterd_peer_rpc_notify] 0-management: Peer <172.30.0.122> (<fda17bf0-1fa8-4874-ae87-4be8f1f503a8>), in state <Peer in Cluster>, has disconnected from glusterd.
[2023-03-06 21:15:47.022417] E [MSGID: 106170] [glusterd-handshake.c:1264:gd_validate_mgmt_hndsk_req] 0-management: Request from peer 172.30.0.43:49139 has an entry in peerinfo, but uuid does not match
[2023-03-06 21:15:47.022471] E [MSGID: 106170] [glusterd-handshake.c:1274:gd_validate_mgmt_hndsk_req] 0-management: Rejecting management handshake request from unknown peer 172.30.0.43:49139
[2023-03-06 21:15:47.022595] E [MSGID: 106165] [glusterd-handshake.c:2060:__glusterd_mgmt_hndsk_version_cbk] 0-management: failed to get the 'versions' from peer (172.30.0.43:24007) [Invalid argument]
[2023-03-06 21:15:47.022696] I [MSGID: 106004] [glusterd-handler.c:6200:__glusterd_peer_rpc_notify] 0-management: Peer <ceph03> (<5197c531-dff4-44e8-bbd3-9ac4090bef4f>), in state <Peer Rejected>, has disconnected from glusterd.
[2023-03-06 21:15:47.022834] E [MSGID: 106165] [glusterd-handshake.c:2060:__glusterd_mgmt_hndsk_version_cbk] 0-management: failed to get the 'versions' from peer (172.30.0.121:24007) [Invalid argument]
[2023-03-06 21:15:47.022879] I [MSGID: 106004] [glusterd-handler.c:6200:__glusterd_peer_rpc_notify] 0-management: Peer <172.30.0.121> (<1331e99e-83bb-45c7-a726-bdc99483b70a>), in state <Peer in Cluster>, has disconnected from glusterd.
[2023-03-06 21:15:47.067863] E [MSGID: 106165] [glusterd-handshake.c:2060:__glusterd_mgmt_hndsk_version_cbk] 0-management: failed to get the 'versions' from peer (172.30.0.122:24007) [Invalid argument]
[2023-03-06 21:15:47.067946] I [MSGID: 106004] [glusterd-handler.c:6200:__glusterd_peer_rpc_notify] 0-management: Peer <172.30.0.122> (<fda17bf0-1fa8-4874-ae87-4be8f1f503a8>), in state <Peer in Cluster>, has disconnected from glusterd.
Ceph01
● glusterd.service - GlusterFS, a clustered file-system server
Loaded: loaded (/lib/systemd/system/glusterd.service; disabled; vendor preset: enabled)
Active: active (running) since Sun 2022-12-04 14:36:31 CET; 3 months 1 days ago
Docs: man:glusterd(8)
Main PID: 2389 (glusterd)
Tasks: 9 (limit: 115873)
Memory: 18.2M
CGroup: /system.slice/glusterd.service
└─2389 /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO
Mar 06 22:10:55 ceph01 glusterd[2389]: [2023-03-06 21:10:55.958491] C [MSGID: 106147] [glusterd-syncop.c:783:_gd_syncop_stage_op_cbk] 0-management: Staging response for 'Volume Status' received from unknown peer: 9f85c001-5f22-4868-aea9-c8c9d85b0b5b
Mar 06 22:11:12 ceph01 glusterd[2389]: [2023-03-06 21:11:12.243447] C [MSGID: 106147] [glusterd-syncop.c:783:_gd_syncop_stage_op_cbk] 0-management: Staging response for 'Volume Status' received from unknown peer: 9f85c001-5f22-4868-aea9-c8c9d85b0b5b
Mar 06 22:11:32 ceph01 glusterd[2389]: [2023-03-06 21:11:32.804939] C [MSGID: 106147] [glusterd-syncop.c:783:_gd_syncop_stage_op_cbk] 0-management: Staging response for 'Volume Status' received from unknown peer: 9f85c001-5f22-4868-aea9-c8c9d85b0b5b
Mar 06 22:11:34 ceph01 glusterd[2389]: [2023-03-06 21:11:34.397162] C [MSGID: 106147] [glusterd-syncop.c:783:_gd_syncop_stage_op_cbk] 0-management: Staging response for 'Volume Status' received from unknown peer: 9f85c001-5f22-4868-aea9-c8c9d85b0b5b
Mar 06 22:11:34 ceph01 glusterd[2389]: [2023-03-06 21:11:34.964503] C [MSGID: 106147] [glusterd-syncop.c:783:_gd_syncop_stage_op_cbk] 0-management: Staging response for 'Volume Status' received from unknown peer: 9f85c001-5f22-4868-aea9-c8c9d85b0b5b
Mar 06 22:11:35 ceph01 glusterd[2389]: [2023-03-06 21:11:35.428406] C [MSGID: 106147] [glusterd-syncop.c:783:_gd_syncop_stage_op_cbk] 0-management: Staging response for 'Volume Status' received from unknown peer: 9f85c001-5f22-4868-aea9-c8c9d85b0b5b
Mar 06 22:11:35 ceph01 glusterd[2389]: [2023-03-06 21:11:35.845524] C [MSGID: 106147] [glusterd-syncop.c:783:_gd_syncop_stage_op_cbk] 0-management: Staging response for 'Volume Status' received from unknown peer: 9f85c001-5f22-4868-aea9-c8c9d85b0b5b
CEPH02
● glusterd.service - GlusterFS, a clustered file-system server
Loaded: loaded (/lib/systemd/system/glusterd.service; disabled; vendor preset: enabled)
Active: active (running) since Fri 2023-01-13 18:04:59 CET; 1 months 21 days ago
Docs: man:glusterd(8)
Main PID: 3956 (glusterd)
Tasks: 9 (limit: 115873)
Memory: 24.8M
CGroup: /system.slice/glusterd.service
└─3956 /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO
Jan 13 18:04:56 ceph02 systemd[1]: Starting GlusterFS, a clustered file-system server...
Jan 13 18:04:59 ceph02 systemd[1]: Started GlusterFS, a clustered file-system server.
CEPH03
● glusterd.service - GlusterFS, a clustered file-system server
Loaded: loaded (/lib/systemd/system/glusterd.service; disabled; vendor preset: enabled)
Active: active (running) since Mon 2023-03-06 22:14:49 CET; 3min 54s ago
Docs: man:glusterd(8)
Process: 52546 ExecStart=/usr/sbin/glusterd -p /var/run/glusterd.pid --log-level $LOG_LEVEL $GLUSTERD_OPTIONS (code=exited, status=0/SUCCESS)
Main PID: 52547 (glusterd)
Tasks: 9 (limit: 115614)
Memory: 6.6M
CGroup: /system.slice/glusterd.service
└─52547 /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO
Mar 06 22:14:47 ceph03 systemd[1]: Starting GlusterFS, a clustered file-system server...
Mar 06 22:14:49 ceph03 systemd[1]: Started GlusterFS, a clustered file-system server.
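(The "uuid does not match" / "Peer Rejected" lines in the glusterd.log excerpt above usually point at stale peer state. One way to compare the UUID each node reports with what its peers have stored, assuming the standard glusterd layout:)
# the node's own UUID
cat /var/lib/glusterd/glusterd.info
# the UUIDs this node has recorded for its peers
grep -H uuid /var/lib/glusterd/peers/*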
Last edited on March 6, 2023, 9:18 pm by wid · #3
admin
2,930 Posts
March 7, 2023, 7:40 am
Can you run
gluster vol status
gluster peer status
on all 3 nodes?
wid
47 Posts
March 7, 2023, 12:33 pm
=== [ ceph01 ] ===
Staging failed on ceph03. Please check log file for details.
Number of Peers: 2
Hostname: 172.30.0.122
Uuid: fda17bf0-1fa8-4874-ae87-4be8f1f503a8
State: Peer in Cluster (Connected)
Hostname: ceph03
Uuid: 5197c531-dff4-44e8-bbd3-9ac4090bef4f
State: Peer in Cluster (Connected)
=== [ ceph02 ] ===
Staging failed on ceph03. Please check log file for details.
Number of Peers: 2
Hostname: ceph03
Uuid: 5197c531-dff4-44e8-bbd3-9ac4090bef4f
State: Peer in Cluster (Connected)
Hostname: 172.30.0.121
Uuid: 1331e99e-83bb-45c7-a726-bdc99483b70a
State: Peer in Cluster (Connected)
=== [ ceph03 ] ===
No volumes present
Number of Peers: 3
Hostname: 172.30.0.122
Uuid: fda17bf0-1fa8-4874-ae87-4be8f1f503a8
State: Peer in Cluster (Disconnected)
Hostname: ceph03
Uuid: 5197c531-dff4-44e8-bbd3-9ac4090bef4f
State: Peer Rejected (Disconnected)
Hostname: 172.30.0.121
Uuid: 1331e99e-83bb-45c7-a726-bdc99483b70a
State: Peer in Cluster (Disconnected)
wid
47 Posts
March 13, 2023, 2:12 am
Hi,
is my information sufficient for diagnostics?
Thank You
admin
2,930 Posts
March 13, 2023, 11:42 am
I assume the errors you posted are from running the 2 commands I had posted. If so, the gluster server configuration is screwed up; the error says to look at the log, which is located in
/var/log/glusterfs/
It may be better to just re-configure the gluster server on the first 3 nodes using the backend IP. Delete the existing configuration in:
/var/lib/glusterd/peers
/var/lib/glusterd/vols/
and set up a 3x replicated volume following the gluster documentation:
volume name: gfs-vol
brick path: /opt/petasan/config/gfs-brick
Make sure you run gluster on the backend IP.
In case you need the old statistics/charts data, you may want to back up /opt/petasan/config/gfs-brick on one server before deleting the brick; you can then write the old data back once the servers are up, onto the mount point
/opt/petasan/config/shared
You can also buy support from us if you need. Good luck.
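(A rough outline of that procedure, as a sketch only; the angle-bracket placeholders stand for the backend IPs, which are not spelled out in this thread, and the exact commands should be checked against the gluster documentation first:)
# 0. optionally keep the old stats/charts data
cp -a /opt/petasan/config/gfs-brick /root/gfs-brick-backup
# 1. on each of the first 3 nodes: stop glusterd and clear old peer/volume state
systemctl stop glusterd
rm -rf /var/lib/glusterd/peers/* /var/lib/glusterd/vols/*
systemctl start glusterd
# 2. from one node: re-probe the other two over the backend network
gluster peer probe <BACKEND_IP_NODE2>
gluster peer probe <BACKEND_IP_NODE3>
# 3. create and start the 3x replicated volume
# (the brick directories may need to be emptied or recreated first if they still carry old volume xattrs)
gluster volume create gfs-vol replica 3 \
  <BACKEND_IP_NODE1>:/opt/petasan/config/gfs-brick \
  <BACKEND_IP_NODE2>:/opt/petasan/config/gfs-brick \
  <BACKEND_IP_NODE3>:/opt/petasan/config/gfs-brick force
gluster volume start gfs-vol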
Last edited on March 13, 2023, 11:43 am by admin · #7
wid
47 Posts
March 14, 2023, 3:13 pm
It works, thank you!
admin
2,930 Posts
March 14, 2023, 3:48 pm
Glad things are working 🙂