
[solved][PetaSAN 1.4.0] Cluster building process stuck (build won't finish)

Hi,

I have set up a new test cluster with PetaSAN 1.4.0. When adding the third node, the cluster build does not finish. The web interface has been showing "Final Deployment Stage: Please do not close this page until processing is complete. Processing..." for about 30 hours now.

Here is my setup:

  • 3 nodes, 5 network adapters (1 x onboard Intel 1 GBit, 1 x quad-port Intel 1 GBit)
  • IP configuration: Management: 10.1.0.0/22 (existing management subnet), Backend-1: 10.5.3.0/24, Backend-2: 10.5.4.0/24
  • HDD sizes differ. 4 HDDs in each host: 2 hosts with 4 x SATA 500 GB, 1 host with Adaptec AS2405 RAID and 4 x SAS 1 TB. All drives are configured as single HDDs. The OS is installed on the first HDD
  • Motherboard Intel DQ67SW with core i7-2600 cpu and 16 GB RAM
  • 48 Port HP Network Switch separated into VLANs
  • All network cables checked manually (all cables in all nodes are connected in the same order)
  • IP connectivity checked and OK in all configured LANs (management, backend-1, backend-2)
  • After the first unsuccessful setup, I rebooted all nodes and restarted the setup several times, same outcome
  • IP connectivity from all nodes to the outside (management interface) works

Debug information collected via script is available as a paste:
https://paste.ubuntu.com/25571214/

Thanks for any advice.

 

P. S.: The diagnostic information collector script is here: https://github.com/megabert/script-pastebin/tree/master/petasan-diag

I saw that an SSH key is distributed to the cluster nodes, so I switched off strict host key checking. It can block an ssh command that is spawned as a subprocess, because on the first connection ssh asks whether the host key is trusted.

I put this into /root/.ssh/config

Host *
StrictHostKeyChecking no

I additionally put my own key in /root/.ssh/authorized_keys, so I'm authenticated via key. I left the existing cluster key as is in authorized_keys.

I further checked key-based authentication for root between all the nodes: it is working properly.

This did not help.

I observed strange DNS queries (for localhost) and found that /etc/hosts, which is a symlink to /opt/petasan/config/etc/hosts, is empty. I added some entries by hand on all hosts:

127.0.0.1 localhost
10.1.0.101 ceph1a.mgt.mydomain.de ceph1a
10.1.0.102 ceph1b.mgt.mydomain.de ceph1b
10.1.0.103 ceph1c.mgt.mydomain.de ceph1c
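A quick way to confirm that such a hosts file really covers every node name is to parse it and look for gaps. The following is only a sketch: the helper names `parse_hosts` and `missing_names` are made up for illustration, and the sample text simply repeats the entries above. On a node one would pass in the contents of /etc/hosts instead.

```python
def parse_hosts(text):
    """Map every hostname and alias in hosts-file text to its IP address."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            # Keep the first address seen for a name, as resolvers usually do.
            mapping.setdefault(name, ip)
    return mapping


def missing_names(text, required):
    """Return the required names that have no entry in the hosts text."""
    mapping = parse_hosts(text)
    return [name for name in required if name not in mapping]


# Sample input: the entries added by hand in this thread.
SAMPLE_HOSTS = """\
127.0.0.1 localhost
10.1.0.101 ceph1a.mgt.mydomain.de ceph1a
10.1.0.102 ceph1b.mgt.mydomain.de ceph1b
10.1.0.103 ceph1c.mgt.mydomain.de ceph1c
"""

REQUIRED = ["localhost", "ceph1a", "ceph1b", "ceph1c"]
print(missing_names(SAMPLE_HOSTS, REQUIRED))
```

An empty hosts file, as found here after installation, would report all four names as missing.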

No effect so far.

I reinstalled all 3 nodes and made no manual changes on the servers. In particular, I did not select "Jumbo-Frames".

The cluster setup has now finished successfully, and a fourth node could be added as well.

Edit: Yes, I realized that I had not enabled jumbo frames on the switch. My bad. That was probably the cause of the error.
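For anyone hitting the same thing: a mismatch like this can be spotted before deployment by pinging across the backend networks with fragmentation forbidden and a payload that fills a jumbo frame. Below is a rough sketch, not part of PetaSAN. It assumes Linux iputils `ping` (the `-M do` option sets the don't-fragment bit) and an MTU of 9000, which is a common jumbo frame size but an assumption here. The function names and the address 10.5.3.11 are hypothetical.

```python
import subprocess

IP_HEADER_BYTES = 20    # IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo header


def max_ping_payload(mtu):
    """Largest ICMP payload that fits into one unfragmented IPv4 packet."""
    return mtu - IP_HEADER_BYTES - ICMP_HEADER_BYTES


def jumbo_ping_command(host, mtu=9000, count=3):
    """Build a ping command that forbids fragmentation (Linux iputils)."""
    return ["ping", "-M", "do", "-s", str(max_ping_payload(mtu)),
            "-c", str(count), host]


def path_supports_mtu(host, mtu=9000):
    """Return True if full-size, unfragmented pings to host succeed."""
    result = subprocess.run(
        jumbo_ping_command(host, mtu),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


# Show the command that would be run against a hypothetical backend address.
print(" ".join(jumbo_ping_command("10.5.3.11")))
```

Calling `path_supports_mtu` from each node against the backend addresses of the other nodes should return True everywhere if the NICs and the switch all pass jumbo frames. If normal pings work but this check fails, a device in the path is probably dropping the large frames.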

The jumbo frames option should work out of the box.