[solved][PetaSAN 1.4.0] Cluster building process stuck (build won't finish)
fx882
17 Posts
September 19, 2017, 9:07 am
Hi,
I have set up a new test cluster with PetaSAN 1.4.0. When adding the third node, the cluster build does not finish. The web interface has been showing "Final Deployment Stage: Please do not close this page until processing is complete. Processing..." for about 30 hours now.
Here is my setup:
- 3 nodes, 5 network adapters each (1 x onboard Intel 1 Gbit, 1 x quad-port Intel 1 Gbit)
- IP configuration: Management 10.1.0.0/22 (existing management subnet), Backend-1: 10.5.3.0/24, Backend-2: 10.5.4.0/24
- HDD sizes differ. 4 HDDs in each host: 2 hosts with 4 x SATA 500 GB, 1 host with an Adaptec AS2405 RAID controller and 4 x SAS 1 TB. All drives are configured as single HDDs. The OS is installed on the first HDD
- Motherboard: Intel DQ67SW with Core i7-2600 CPU and 16 GB RAM
- 48-port HP network switch separated into VLANs
- All network cables checked manually (the cables in all nodes are connected in the same order)
- IP connectivity checked and OK in all configured LANs (management, backend-1, backend-2)
- After the first unsuccessful setup, I rebooted all nodes and restarted the setup several times, with the same outcome
- IP connectivity from all nodes to the outside (via the management interface) works
Debug information collected via script is available as a paste:
https://paste.ubuntu.com/25571214/
Thanks for any advice.
P.S.: The diagnostic information collector script is here: https://github.com/megabert/script-pastebin/tree/master/petasan-diag
Last edited on September 19, 2017, 12:15 pm by fx882 · #1
fx882
17 Posts
September 19, 2017, 10:11 am
I saw that an SSH key is distributed to the cluster nodes, so I switched off strict host key checking. Otherwise an ssh command spawned by another process could block on the first connection while asking whether the host key is trusted.
I put this into /root/.ssh/config:
Host *
StrictHostKeyChecking no
I additionally put my own key in /root/.ssh/authorized_keys, so I'm authenticated via key. I left the existing cluster key in authorized_keys as is.
I further checked key-based authentication for root between all the nodes: it's working properly.
This did not help.
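For anyone who wants to repeat the key-based login check without risking an interactive prompt, here is a minimal sketch in Python. It is not part of PetaSAN. The function names are hypothetical, and it only relies on the standard OpenSSH client options BatchMode and ConnectTimeout. With BatchMode=yes, ssh should fail rather than wait for input when a password or a host key confirmation would be needed.

```python
import subprocess
from typing import List


def ssh_probe_command(host: str, user: str = "root", timeout: int = 5) -> List[str]:
    """Build an ssh command that runs `true` remotely and never prompts."""
    return [
        "ssh",
        "-o", "BatchMode=yes",              # never ask for passwords or confirmations
        "-o", f"ConnectTimeout={timeout}",  # give up quickly on unreachable nodes
        f"{user}@{host}",
        "true",                             # harmless remote command
    ]


def can_login_noninteractively(host: str) -> bool:
    """Return True if the probe command exits with status 0."""
    result = subprocess.run(ssh_probe_command(host), capture_output=True)
    return result.returncode == 0
```

Calling can_login_noninteractively("ceph1b") from ceph1a, and likewise for every other pair of nodes, would confirm that nothing in the SSH path needs a human at the keyboard.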
Last edited on September 19, 2017, 10:14 am by fx882 · #2
fx882
17 Posts
September 19, 2017, 10:31 am
I observed strange DNS queries (for localhost) and found that /etc/hosts, which is a symlink to /opt/petasan/config/etc/hosts, is empty. I added some entries by hand on all hosts:
127.0.0.1 localhost
10.1.0.101 ceph1a.mgt.mydomain.de ceph1a
10.1.0.102 ceph1b.mgt.mydomain.de ceph1b
10.1.0.103 ceph1c.mgt.mydomain.de ceph1c
No effect so far.
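A quick way to confirm that a hosts file really contains the expected names is sketched below in Python. This is only an illustration: parse_hosts and missing_names are hypothetical helpers, not PetaSAN tools, and the parsing covers just the plain "IP name alias" line format used above.

```python
from typing import Dict, List


def parse_hosts(text: str) -> Dict[str, str]:
    """Map every hostname and alias in hosts-file text to its IP address."""
    table: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding blanks
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            table.setdefault(name, ip)        # keep the first mapping seen
    return table


def missing_names(text: str, expected: List[str]) -> List[str]:
    """Return the expected names that have no entry in the hosts text."""
    table = parse_hosts(text)
    return [name for name in expected if name not in table]
```

Reading the file with open("/etc/hosts").read() and passing the result to missing_names(text, ["localhost", "ceph1a", "ceph1b", "ceph1c"]) would list every name that is still unresolved on that node. An empty file reports all four.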
Last edited on September 19, 2017, 10:35 am by fx882 · #3
fx882
17 Posts
September 19, 2017, 12:16 pm
I reinstalled all 3 nodes and did not change anything on the servers myself. In particular, I did not select "Jumbo-Frames".
The cluster setup has now finished successfully, and a 4th node could be added as well.
Edit: Yes, I realized I had not enabled jumbo frames on the switch. My bad. That was probably the cause of the error.
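A mismatch like this can be hard to spot because small packets still get through while full-size frames are dropped. Below is a minimal sketch in Python of a check that would have caught it. It assumes an MTU of 9000 for jumbo frames (the value PetaSAN actually sets is not stated in this thread) and the Linux iputils ping, whose -M do option forbids fragmentation. The function names and the address in the usage note are hypothetical.

```python
import subprocess
from typing import List

IP_HEADER = 20    # IPv4 header without options, in bytes
ICMP_HEADER = 8   # ICMP echo header, in bytes


def icmp_payload_for_mtu(mtu: int) -> int:
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    return mtu - IP_HEADER - ICMP_HEADER


def ping_command(host: str, mtu: int = 9000) -> List[str]:
    """Build a ping that sends three full-size probes with fragmentation forbidden."""
    payload = icmp_payload_for_mtu(mtu)
    return ["ping", "-M", "do", "-c", "3", "-s", str(payload), host]


def path_supports_mtu(host: str, mtu: int = 9000) -> bool:
    """Return True if full-size, unfragmented packets reach the host."""
    result = subprocess.run(ping_command(host, mtu), capture_output=True)
    return result.returncode == 0
```

For an MTU of 9000 the payload works out to 9000 - 20 - 8 = 8972 bytes. Running path_supports_mtu("10.5.3.12") against another node's backend address (10.5.3.12 is a made-up example in the Backend-1 subnet) should return False when the switch does not pass jumbo frames, even though a normal ping to the same address succeeds.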
Last edited on September 20, 2017, 11:30 am by fx882 · #4
admin
2,930 Posts
September 20, 2017, 4:37 pm
The jumbo frames option should work out of the box.
Last edited on September 20, 2017, 4:37 pm by admin · #5