Five-node cluster: two non-monitor nodes crash

I changed the time settings in ntp.conf using nano on only the first three nodes (the monitors). This caused the remaining two nodes to not be seen and to crash. I have of course since reverted it. My question: if I am going to use the Google time servers, do I remove the server and fudge lines and the drift file? And on the other servers, do I remove the monitors? Or do I just make sure all five servers have the Google time servers and leave everything else alone? I want to do it right the first time, since the cluster is already in production.

  GNU nano 2.9.3                                        /etc/ntp.conf                                         Modified  

driftfile /var/lib/ntp/ntp.drift

server time1.google.com iburst

server time2.google.com iburst

server time3.google.com iburst

server time4.google.com iburst

server  127.127.1.0

fudge   127.127.1.0 stratum 7

Hard to say. I would recommend changing the conf on just the first node. The other 2 monitors should not include an external NTP server; leave them with their original settings (which point to node 1 using different stratum values). The same goes for the remaining nodes. Restart the ntp service on node 1 rather than rebooting it. This will sync the time on node 1 slowly, in gradual steps, and the rest of the cluster will follow.

You could also use the UI, under the general settings, to add an external NTP server, but make sure you revert ntp.conf on the other nodes to the original values as described above.

What if the main node goes down? Wouldn't that be more or less a single point of failure?

No, not at all. Node 2 will become the sync master, then node 3. Technically there is no master as such, just stratum numbers that give more weight to a node. It is also generally recommended not to have all your servers access an external time server, but to create one or more internal time servers that have external access and have the other servers sync from them. If you have many internal servers accessing the same external server, the latency of each connection will be different, which can create issues. PetaSAN requires less than 0.3 s of drift between the monitors.
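To make the 0.3 s requirement concrete, here is a minimal Python sketch that checks whether the worst pairwise clock difference between monitors stays under that limit. The node names and offset values are hypothetical, and how you collect the offsets (for example from your NTP daemon's status output) is left out; this is only an illustration of the arithmetic, not how PetaSAN itself performs the check.

```python
from itertools import combinations

MAX_DRIFT_S = 0.3  # limit quoted in the post above for PetaSAN monitors


def max_pairwise_drift(offsets):
    """Return (drift, node_a, node_b) for the pair of nodes furthest apart.

    `offsets` maps a node name to its clock offset in seconds,
    all measured against the same reference.
    """
    worst = (0.0, None, None)
    for (a, off_a), (b, off_b) in combinations(offsets.items(), 2):
        drift = abs(off_a - off_b)
        if drift > worst[0]:
            worst = (drift, a, b)
    return worst


def monitors_in_sync(offsets, limit=MAX_DRIFT_S):
    """True if every pair of nodes differs by less than `limit` seconds."""
    drift, _, _ = max_pairwise_drift(offsets)
    return drift < limit


# Hypothetical offsets in seconds (assumed values, not measured data)
offsets = {"node1": 0.002, "node2": -0.010, "node3": 0.350}
print(max_pairwise_drift(offsets))
print(monitors_in_sync(offsets))
```

Note that it is the difference between nodes that matters here, not how far each node is from the external reference, which is why syncing all nodes from one internal source helps.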

Got it. I'm going to deploy an NTP time server on the network for this.

You can, but you do not have to. If you do not modify the conf file manually, the out-of-the-box settings create this internal hierarchy, with node 1 having higher priority than node 2, then node 3, then all other nodes. When you add an external NTP server from the UI, it is configured on node 1 only, but the time is propagated from node 1 to the other nodes. This is what I meant in my previous post by having only 1 node access the external NTP server: that node is node 1. This results in more accurate syncing between the nodes, since having all of them connect externally would cause high drift due to the large differences in external latency. Note that the nodes act as both NTP clients and servers. If node 1 fails, node 2 will have the highest weight (lowest stratum), and so on.
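The failover described above can be pictured with a small Python sketch: among the nodes that are still reachable, the one with the lowest stratum is preferred. This is a deliberately simplified model. The stratum values and node names are hypothetical, and a real NTP daemon also weighs delay, jitter and other factors when selecting a source.

```python
def pick_sync_source(strata, alive):
    """Return the reachable node with the lowest stratum, or None.

    `strata` maps node name -> stratum number (lower means higher weight).
    `alive` is the set of node names that are currently reachable.
    """
    candidates = {node: s for node, s in strata.items() if node in alive}
    if not candidates:
        return None
    return min(candidates, key=candidates.get)


# Hypothetical stratum values (assumed for illustration only)
strata = {"node1": 7, "node2": 8, "node3": 9}

print(pick_sync_source(strata, {"node1", "node2", "node3"}))
print(pick_sync_source(strata, {"node2", "node3"}))
```

With all three nodes up the sketch picks node1; with node1 removed from the reachable set it falls back to node2, which is the behaviour described in the post.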