Service IP not failing back after a node reboot

Hi, I've noticed that the service IP addresses (rgw in this case) do not fail back, or rebalance evenly, after a node reboot. I end up with two service IP addresses on a single node and one node with no service IP address. The cluster is otherwise healthy (PetaSAN 3.0.1).

Is there something I need to enable to have them rebalance? Can it be done manually?

 

Thanks

You can manually re-assign IPs only for iSCSI. Manual re-assignment for NFS/SMB/S3 will be supported in PetaSAN 3.1 (the next release).

By design, PetaSAN is a distributed system and IPs are not owned by any node. If a node goes down, its resources fail over to other nodes, but when the failed node comes back online we do not automatically move those IPs back to it.

Note for S3: the public/floating IPs belong to the load balancer(s) (nginx). Each load balancer internally balances requests across all running rgw services, so you are still actively using every rgw service, just through a smaller number of load balancers.
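To make the S3 note concrete, here is a minimal toy sketch in Python. It is not PetaSAN or nginx code: the function name `spread_requests`, the node labels, and the plain round-robin policy are assumptions made for illustration (the thread does not say which balancing method is configured). It only shows that requests entering through a single floating IP can still reach every rgw backend.

```python
from collections import Counter
from itertools import cycle


def spread_requests(entry_points, backends, n_requests):
    """Count how many requests each backend receives.

    entry_points: load balancers that currently hold a floating IP (hypothetical labels).
    backends:     rgw services running in the cluster (hypothetical labels).
    Assumption: clients alternate between entry points, and each balancer
    forwards to the backends in simple round-robin order.
    """
    # Each load balancer keeps its own round-robin position over ALL backends.
    rotation = {lb: cycle(backends) for lb in entry_points}
    hits = Counter()
    for i in range(n_requests):
        lb = entry_points[i % len(entry_points)]
        hits[next(rotation[lb])] += 1
    return hits


if __name__ == "__main__":
    rgw = ["rgw@node1", "rgw@node2", "rgw@node3"]
    # Both floating IPs ended up on node1, so only one balancer is in use.
    print(spread_requests(["nginx@node1"], rgw, 9))
```

Under these assumptions the rgw services on all three nodes receive traffic even though every request enters through node1; what is concentrated is the load-balancer work, not the rgw work.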

As you said, now works in 3.1.0. Thanks 😀

Though I wonder why it doesn't do this automatically, without any intervention.

We actually need to select "auto" and then hit "assign" for the IPs to be rebalanced.

Cheers!

Just to clarify, HA failover has been working since release 1. What was added in PetaSAN 3.1 is the ability to manually move IPs among nodes. This is more general than returning IPs when a failed node reboots; it can also be used in other scenarios, such as better load distribution.

As to why we do not automatically return IPs when a down node comes back up:

  • Moving IPs and service resources takes time, and client I/O is suspended while it happens. For production traffic you may not want to interrupt things while everything is working; for HA failover, of course, you have no choice.
  • In a 2-node system, it may make sense to assign ownership of IPs to nodes. In a scale-out system, service IPs are not owned by nodes but can be locked by any node in a distributed way. This makes the system more scalable and more robust.
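The behaviour described in this thread (failover on node loss, no automatic failback, and an explicit rebalance step) can be sketched with a small toy model in Python. This is only an illustration under stated assumptions, not PetaSAN's implementation: the `ToyCluster` class, its method names, and the "least loaded node takes the IP" rule are all hypothetical, and the real system uses distributed locking rather than a single in-memory dictionary.

```python
class ToyCluster:
    """Toy model: a service IP is held by whichever live node took it."""

    def __init__(self, nodes, ips):
        self.nodes = list(nodes)
        self.up = set(nodes)
        # Start with an even spread; no node "owns" an IP permanently.
        self.holder = {ip: self.nodes[i % len(self.nodes)]
                       for i, ip in enumerate(ips)}

    def load(self):
        """Number of service IPs currently held by each live node."""
        counts = {n: 0 for n in self.nodes if n in self.up}
        for node in self.holder.values():
            if node in counts:
                counts[node] += 1
        return counts

    def node_down(self, node):
        """HA failover: surviving nodes take over the failed node's IPs."""
        self.up.discard(node)
        for ip in sorted(self.holder):
            if self.holder[ip] == node:
                counts = self.load()
                # Assumed rule: the least loaded survivor takes the IP.
                self.holder[ip] = min(counts, key=lambda n: (counts[n], n))

    def node_up(self, node):
        """The node rejoins, but no IP is moved back automatically."""
        self.up.add(node)

    def rebalance(self):
        """Explicit rebalance; returns how many IPs had to move."""
        moves = 0
        while True:
            counts = self.load()
            busiest = max(counts, key=lambda n: (counts[n], n))
            idlest = min(counts, key=lambda n: (counts[n], n))
            if counts[busiest] - counts[idlest] <= 1:
                return moves
            ip = sorted(i for i, n in self.holder.items() if n == busiest)[0]
            self.holder[ip] = idlest  # each move briefly interrupts clients
            moves += 1


if __name__ == "__main__":
    cluster = ToyCluster(["node1", "node2", "node3"], ["ip1", "ip2", "ip3"])
    cluster.node_down("node3")
    cluster.node_up("node3")
    print("after reboot:   ", cluster.load())
    print("IPs moved:      ", cluster.rebalance())
    print("after rebalance:", cluster.load())
```

In this model a reboot of node3 leaves one node holding two IPs and node3 holding none, which matches the situation in the original question. Nothing moves until `rebalance()` is called, so the interruption caused by each move happens only when an operator asks for it.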