[PetaSAN 1.4.0] NTP is not working
fx882
17 Posts
September 18, 2017, 2:14 pm
I installed PetaSAN 1.4.0 and the clocks of the different nodes were far off. (At the moment I also have the problem that my cluster isn't initializing.)
I noticed the following:
- "ntpq -p" produced the error message "name or service not known". The issue was fixed by adding the following line to /etc/hosts (which was empty after setup):
"127.0.0.1 hostname.domain.tld hostname localhost"
I additionally changed ntp.conf to use my organization's NTP servers and did an initial manual time sync with ntpdate.
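To see whether a node is affected by the empty /etc/hosts described above, a minimal sketch along these lines may help. The function name `hosts_has_name` is hypothetical and not part of PetaSAN; it only tests whether a hosts-format file lists a given name on a non-comment line, and it changes nothing.

```shell
#!/bin/sh
# hosts_has_name FILE NAME
# Succeeds (exit 0) if NAME appears as a hostname field (column 2 or later)
# on a non-comment line of FILE. Hypothetical helper, not part of PetaSAN.
hosts_has_name() {
    awk -v name="$2" '
        /^[ \t]*#/ { next }                    # skip full-line comments
        {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break           # trailing comment: stop scanning
                if ($i == name) { found = 1; exit }
            }
        }
        END { exit (found ? 0 : 1) }
    ' "$1"
}

# Example use on a node (left commented out so the sketch has no side effects):
# hosts_has_name /etc/hosts "$(hostname)" || echo "hostname missing from /etc/hosts"
```

If the check fails for the node's own hostname, that would be consistent with the "name or service not known" error above, although other resolver settings can also play a role.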
Last edited on September 18, 2017, 2:14 pm by fx882 · #1
admin
2,930 Posts
September 18, 2017, 5:09 pm
NTP should be working in v1.4, either after a fresh install or after an upgrade; we tested this several times. Node 1 acts as the main NTP server for the cluster, followed by node 2, then node 3. If you define an external NTP server via the Cluster Settings page, node 1 uses it to adjust its own time and relays that time to the other nodes.
Did you get any other errors apart from this?
Last edited on September 18, 2017, 5:16 pm by admin · #2
fx882
17 Posts
September 19, 2017, 9:08 am
I just wanted to point out a possible bug. I may check it again once the scenario is set up as you described. I opened another topic with more details about my setup and my problem building a cluster.
Last edited on September 19, 2017, 9:09 am by fx882 · #3
davlaw
35 Posts
November 21, 2017, 8:06 pm
I do see this in my logs; my cluster was built last week, around 11/15. There is nothing on node 01, so I guess this is by design? My concern was just the "Permission denied".
Nov 21 14:36:43 ps-node-02 ntpd[1509]: frequency file /var/lib/ntp/ntp.drift.TEMP: Permission denied
Nov 21 14:36:43 ps-node-02 ntpd[1509]: 21 Nov 14:36:43 ntpd[1509]: frequency file /var/lib/ntp/ntp.drift.TEMP: Permission denied
Nov 21 13:35:40 ps-node-03 ntpd[1473]: frequency file /var/lib/ntp/ntp.drift.TEMP: Permission denied
Nov 21 13:35:40 ps-node-03 ntpd[1473]: 21 Nov 13:35:40 ntpd[1473]: frequency file /var/lib/ntp/ntp.drift.TEMP: Permission denied
Nov 21 14:35:41 ps-node-03 ntpd[1473]: frequency file /var/lib/ntp/ntp.drift.TEMP: Permission denied
Nov 21 14:35:41 ps-node-03 ntpd[1473]: 21 Nov 14:35:41 ntpd[1473]: frequency file /var/lib/ntp/ntp.drift.TEMP: Permission denied
"ntpq -p" on the first node seems OK:
root@ps-node-01:/var/log# ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*10.10.1.1       10.10.0.96       3 u   79  128  377    0.391   -0.197   0.130
 LOCAL(0)        .LOCL.           7 l  43m   64    0    0.000    0.000   0.000
root@ps-node-02:/var/log# ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*ps-node-01      10.10.1.1        5 u  359  512  377    0.178    3.341   0.017
 LOCAL(0)        .LOCL.           9 l  96m   64    0    0.000    0.000   0.000
root@ps-node-03:/var/log# ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*ps-node-01      10.10.1.1        5 u  552 1024  377    0.147  -12.752   0.029
+ps-node-02      10.10.0.111      6 u  547 1024  277    0.141  -16.502   0.195
 LOCAL(0)        .LOCL.          11 l   7h   64    0    0.000    0.000   0.000
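For comparing nodes quickly, the peer each node has selected and its offset can be pulled out of the "ntpq -p" output above. This is only a sketch: `selected_peer` is a hypothetical helper name, and it assumes the standard ten-column ntpq layout shown in this post, where the selected peer is flagged with a leading "*" and the offset (in milliseconds) is the ninth column.

```shell
#!/bin/sh
# selected_peer: read `ntpq -p` output on stdin and print the peer that ntpd
# has selected for synchronisation (the line flagged with a leading "*")
# followed by its offset in milliseconds (column 9).
# Exits non-zero if no peer is currently selected.
selected_peer() {
    awk '/^\*/ { print substr($1, 2), $9; found = 1; exit }
         END   { exit (found ? 0 : 1) }'
}

# On a node (commented out, since it needs a running ntpd):
# ntpq -p | selected_peer
```

For the ps-node-03 output above this would print "ps-node-01 -12.752". A non-zero exit status means ntpd has not selected any peer yet, which is worth checking before looking at offsets.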
Last edited on November 21, 2017, 8:14 pm by davlaw · #4
admin
2,930 Posts
November 22, 2017, 11:32 am
Thanks for reporting this. I am trying to replicate it; so far I do not see these log entries, but it may take time for them to show up, so we will add this to our test cases. The good thing is that syncing is working, as you mentioned.
It would be helpful if you could supply the output of:
ps aux | grep ntp
systemctl status ntp
fcatanza
4 Posts
August 29, 2018, 3:13 pm
I am seeing this EXACT behavior on a fresh 2.0 install. We have two separate clusters in two physical locations; one is perfect, the other is showing this issue with the "frequency file /var/lib/ntp/ntp.drift.TEMP: Permission denied" message.
The servers are now all about 8 minutes behind real time and falling further behind. I don't want to do anything rash in case it will kill the array, but this is DEFINITELY an issue.
Below is the output of "ps -ef | grep ntp" for all 3 nodes. Node 2 looks weird, but "systemctl status ntp" is IDENTICAL on all 3.
Node 1:
root 1881286 1 0 10:55 ? 00:00:00 /usr/sbin/ntpd -n -g
Node 2:
957535 ? 00:00:00 ntpd
Node 3:
root 1697210 1 0 10:47 ? 00:00:00 /usr/sbin/ntpd -n -g
admin
2,930 Posts
August 29, 2018, 9:58 pm
I do not know why this happens, but the following may help.
On all nodes:
Check the ownership of /var/lib/ntp; it should be ntp. If it is not, set it:
chown -R ntp:ntp /var/lib/ntp
Disable NTP auto start; the services are started from within the PetaSAN scripts:
update-rc.d ntp disable
systemctl disable ntp
systemctl disable systemd-timesyncd
systemctl restart ntp
After approximately 30 minutes, start monitoring the nodes via
ntpq -p
The time offset between nodes should start to decrease.
If the offset is not decreasing, you can force the other nodes to sync to the first node. On every node except the first:
systemctl stop ntp
ntpdate ip_address_of_first_node
hwclock --systohc --utc
systemctl start ntp
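Before running the chown above, the current ownership can be checked without changing anything. The following is a rough sketch; `owner_of` and `check_owner` are hypothetical helper names, and `stat -c` assumes GNU coreutils, which is an assumption about the node's userland.

```shell
#!/bin/sh
# owner_of PATH: print "user:group" owning PATH (GNU coreutils stat).
owner_of() {
    stat -c '%U:%G' "$1"
}

# check_owner PATH EXPECTED: succeed if PATH is owned by EXPECTED ("user:group"),
# otherwise report the mismatch on stderr and fail. Read-only; changes nothing.
check_owner() {
    actual=$(owner_of "$1") || return 2
    if [ "$actual" = "$2" ]; then
        return 0
    fi
    echo "$1 is owned by $actual, expected $2" >&2
    return 1
}

# On a node (commented out so the sketch has no side effects):
# check_owner /var/lib/ntp ntp:ntp
```

Only if this reports a mismatch would the chown step be needed; a matching owner suggests the "Permission denied" message has another cause, such as how ntpd was started.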
Last edited on August 29, 2018, 10:00 pm by admin · #7
fcatanza
4 Posts
August 30, 2018, 3:49 am
Permissions are fine. All 3 servers are in sync with each other time-wise, but they are ALL a few minutes behind real time. The array is working, but it thinks it's about 10 minutes ago, which makes all the monitoring screwy.
admin
2,930 Posts
August 30, 2018, 11:37 am
OK, I understand better now: the internal syncing is working, but all nodes are off by several minutes. To fix this you need to define an external NTP time server in the Cluster Settings page. If you already have one, then there is an issue connecting to it. Make sure the external NTP server can be pinged; you may need to route your management network for external access. You can monitor the time offset to your external NTP server via
ntpq -p
If this does not show your external NTP server, or does not show the offset decreasing, try another external NTP server.
Without an external NTP server, the nodes will stay in sync with each other but could drift by 1 or 2 seconds per day if they have a low-grade hardware clock (20 ppm accuracy).
For the second node showing different ps output, make sure the ntp service is started by systemctl rather than by the sysinit system:
update-rc.d ntp disable
systemctl restart ntp
If the permission error exists on just the second node, this may be the cause. The installer should have run the disable command during installation, but maybe it failed or something overwrote it.
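The "1 or 2 seconds per day" figure for a 20 ppm clock follows from simple arithmetic: 1 ppm is one microsecond of error per second, and a day has 86400 seconds. A small sketch of that calculation (`drift_per_day` is a hypothetical helper name, and the result is a worst-case figure for a clock that is off by the full stated amount):

```shell
#!/bin/sh
# drift_per_day PPM: print the worst-case clock drift in seconds per day
# for a clock whose frequency error is PPM parts per million.
# 1 ppm = 1 microsecond of error per second; a day has 86400 seconds.
drift_per_day() {
    awk -v ppm="$1" 'BEGIN { printf "%.3f\n", ppm * 86400 / 1000000 }'
}

drift_per_day 20   # prints 1.728
```

By the same arithmetic, an error of several minutes accumulating over a short period is far larger than a 20 ppm clock would normally produce, which fits the advice above to look at the external NTP server first.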
fcatanza
4 Posts
August 30, 2018, 12:30 pm
Thank you! There were two problems and now they both seem fixed! The outside ntp server was indeed unreachable, and the node showing the odd ps output was cured by following the two commands you gave. After about 30 minutes all is in sync and keeping up. Also, I saw 2.1 is out, can't wait to test it! I appreciate the amazing work you've done with this!