
Error building Consul cluster


After trying to add the third node to the cluster (Step 6), I get an error:
"Error building Consul cluster"

Hi,

This error can happen if:

- There is a network problem, specifically on the back-end 1 subnet

- There is not enough RAM

If you SSH into the nodes using the cluster password, can they ping each other on all of their different subnets?

Are you testing this in a desktop virtualization setup (VMware / VirtualBox)? If so, try using a separate interface per subnet, and make sure the interfaces are assigned the correct subnets and wired together correctly.
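To run that ping check from one node against all the others in one go, a small shell sketch like this works (the addresses are examples only; substitute each node's actual back-end 1 and back-end 2 IPs):

```shell
#!/bin/sh
# Ping each address once and report reachability. The IP lists below are
# example addresses only -- replace them with your cluster's back-end IPs.
check_reachable() {
  for ip in "$@"; do
    if ping -c 1 -W 1 "$ip" >/dev/null 2>&1; then
      echo "$ip reachable"
    else
      echo "$ip UNREACHABLE"
    fi
  done
}

# back-end 1 subnet:
check_reachable 10.0.4.8 10.0.4.9 10.0.4.10
# back-end 2 subnet:
check_reachable 10.0.5.8 10.0.5.9 10.0.5.10
```

Any `UNREACHABLE` line points at an interface/subnet pairing to recheck in the hypervisor's network settings.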


Good day!

I'm testing the product on VMware.
To verify that the infrastructure is correct, I built a similar configuration of 3 virtual machines.
My configuration:
3 nodes with 5 interfaces each.

1. mgmt 10.16.246.70
2. iSCSI-1
3. iSCSI-2
4. back-end-1 10.0.4.8
5. back-end-2 10.0.5.8

1. mgmt 10.16.246.71
2. iSCSI-1
3. iSCSI-2
4. back-end-1 10.0.4.9
5. back-end-2 10.0.5.9

1. mgmt 10.16.246.72
2. iSCSI-1
3. iSCSI-2
4. back-end-1 10.0.4.10
5. back-end-2 10.0.5.10

I check the reachability from node 1:
livecd ~ # ping 10.0.4.8
PING 10.0.4.8 (10.0.4.8) 56(84) bytes of data.
64 bytes from 10.0.4.8: icmp_seq=1 ttl=64 time=0.047 ms
64 bytes from 10.0.4.8: icmp_seq=2 ttl=64 time=0.019 ms
^C
--- 10.0.4.8 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 999ms
rtt min/avg/max/mdev = 0.019/0.033/0.047/0.014 ms
livecd ~ # ping 10.0.4.9
PING 10.0.4.9 (10.0.4.9) 56(84) bytes of data.
64 bytes from 10.0.4.9: icmp_seq=1 ttl=64 time=0.453 ms
64 bytes from 10.0.4.9: icmp_seq=2 ttl=64 time=0.489 ms
^C
--- 10.0.4.9 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 999ms
rtt min/avg/max/mdev = 0.453/0.471/0.489/0.018 ms
livecd ~ # ping 10.0.4.8
PING 10.0.4.8 (10.0.4.8) 56(84) bytes of data.
64 bytes from 10.0.4.8: icmp_seq=1 ttl=64 time=0.019 ms
64 bytes from 10.0.4.8: icmp_seq=2 ttl=64 time=0.012 ms
^C
--- 10.0.4.8 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 999ms
rtt min/avg/max/mdev = 0.012/0.015/0.019/0.005 ms
livecd ~ # ping 10.0.5.8
PING 10.0.5.8 (10.0.5.8) 56(84) bytes of data.
64 bytes from 10.0.5.8: icmp_seq=1 ttl=64 time=0.020 ms
64 bytes from 10.0.5.8: icmp_seq=2 ttl=64 time=0.019 ms
^C
--- 10.0.5.8 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 999ms
rtt min/avg/max/mdev = 0.019/0.019/0.020/0.004 ms
livecd ~ # ping 10.0.4.9
PING 10.0.4.9 (10.0.4.9) 56(84) bytes of data.
64 bytes from 10.0.4.9: icmp_seq=1 ttl=64 time=0.484 ms
64 bytes from 10.0.4.9: icmp_seq=2 ttl=64 time=0.399 ms
64 bytes from 10.0.4.9: icmp_seq=3 ttl=64 time=0.438 ms
^C
--- 10.0.4.9 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 1998ms
rtt min/avg/max/mdev = 0.399/0.440/0.484/0.038 ms
livecd ~ # ping 10.0.5.9
PING 10.0.5.9 (10.0.5.9) 56(84) bytes of data.
64 bytes from 10.0.5.9: icmp_seq=1 ttl=64 time=0.516 ms
64 bytes from 10.0.5.9: icmp_seq=2 ttl=64 time=0.381 ms
64 bytes from 10.0.5.9: icmp_seq=3 ttl=64 time=0.376 ms
^C
--- 10.0.5.9 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 1998ms
rtt min/avg/max/mdev = 0.376/0.424/0.516/0.067 ms
livecd ~ # ping 10.0.4.10
PING 10.0.4.10 (10.0.4.10) 56(84) bytes of data.
64 bytes from 10.0.4.10: icmp_seq=1 ttl=64 time=0.478 ms
64 bytes from 10.0.4.10: icmp_seq=2 ttl=64 time=0.443 ms
^C
--- 10.0.4.10 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 999ms
rtt min/avg/max/mdev = 0.443/0.460/0.478/0.027 ms
livecd ~ # ping 10.0.5.10
PING 10.0.5.10 (10.0.5.10) 56(84) bytes of data.
64 bytes from 10.0.5.10: icmp_seq=1 ttl=64 time=0.569 ms
64 bytes from 10.0.5.10: icmp_seq=2 ttl=64 time=0.567 ms
^C
--- 10.0.5.10 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1000ms
rtt min/avg/max/mdev = 0.567/0.568/0.569/0.001 ms
livecd ~ #

All back-end interfaces are reachable.

Node deployment completed successfully. Another 2 nodes need to join for cluster to be built.
Node deployment completed successfully. Another node needs to join for cluster to be built.

For the third node:

Error List

Error building Consul cluster

Each node configuration: 8 CPU, 8 GB RAM, 40 GB HDD.

For the VMware interfaces, you can use either a custom internal network or a host-only network.

Also, if you use host-only networking, make sure the subnets configured in the VMware network editor match the subnets you enter in PetaSAN. There is also a video on our website showing an installation on VMware; can you try following it?

So I recommend that you retry once more. If you still have issues, I can debug it with you in more detail, but I have a feeling it is a VMware configuration issue. Also, the ping delay is somewhat large on some of your links.

Would you be kind enough to provide a link to the video on your site?

On the first page there is a link at the top right, under the download button.

Hello!
I have four SuperMicro servers with these specs:
Form factor - 1U;
12 HDD + 1 SSD (for the system);
2 network interfaces 10 Gb/s -> Arista;
2 network interfaces 1 Gb/s -> Cisco;
1 network interface - IPMI.
These servers connect to an Arista DCS-7150S-52-CL-F switch on the 10 Gb/s links and a Cisco WS-C3750-24TS on the 1 Gb/s links.

My subnets
iSCSI-1 - 10.X.X.X/29
iSCSI-2 - 10.X.X.X/29
Backend-1 - 10.X.X.X/29
Backend-2 - 10.X.X.X/29
MGMT - 10.X.X.X/29

When deploying the third node of this cluster, I get the error "Error building cluster, please check detail below. Error building Consul cluster".

What am I doing wrong?

And one more question: how should the ports on the switches be configured, in trunk or access mode?

Part of the node-03 log:

14/02/2018 17:05:26 INFO     Starting local clean_ceph.
14/02/2018 17:05:26 INFO     Starting clean_ceph
14/02/2018 17:05:26 INFO     Stopping ceph services
14/02/2018 17:05:26 INFO     Start cleaning config files
14/02/2018 17:05:26 INFO     End cleaning config files
14/02/2018 17:05:26 INFO     Starting ceph services
14/02/2018 17:05:27 INFO     Starting local clean_consul.
14/02/2018 17:05:27 INFO     Trying to clean Consul on local node
14/02/2018 17:05:27 INFO     delete /opt/petasan/config/etc/consul.d
14/02/2018 17:05:27 INFO     delete /opt/petasan/config/var/consul
14/02/2018 17:05:27 INFO     Trying to clean Consul on (IP MGMT node-01)
14/02/2018 17:05:27 INFO     Trying to clean Consul on (IP MGMT node-02)
14/02/2018 17:05:28 INFO     cluster_name: Test_iSCSI_Cluster
14/02/2018 17:05:28 INFO     local_node_info.name: alma-sds-ps-node-03
14/02/2018 17:05:31 ERROR    Could not create Consul Configuration on node: (IP Backend-1 node-01)
14/02/2018 17:05:31 ERROR    Error building Consul cluster
14/02/2018 17:05:31 ERROR    Could not build consul.
14/02/2018 17:05:31 ERROR    ['core_consul_deploy_build_error_build_consul_cluster', 'core_consul_deploy_build_error_build_consul_cluster']
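For what it's worth, beyond plain ping you can also check whether Consul's ports answer across the back-end 1 subnet: by default Consul uses TCP 8300 for server RPC and 8301 for serf LAN. A quick probe sketch (not part of PetaSAN; the IPs are placeholders for the other nodes' back-end 1 addresses, and it relies on bash being available for `/dev/tcp`):

```shell
#!/bin/sh
# Probe Consul's default ports (8300 = server RPC, 8301 = serf LAN) on each
# peer with a one-second TCP connect via bash's /dev/tcp.
# The IPs below are placeholders -- use your own back-end 1 addresses.
probe_port() {
  ip="$1"; port="$2"
  if timeout 1 bash -c "exec 3<>/dev/tcp/$ip/$port" 2>/dev/null; then
    echo "$ip:$port open"
  else
    echo "$ip:$port closed/unreachable"
  fi
}

for ip in 10.0.4.8 10.0.4.9; do
  for port in 8300 8301; do
    probe_port "$ip" "$port"
  done
done
```

If the ports show closed on an otherwise pingable address, look for a firewall or a wrong interface binding rather than a routing problem.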

In the majority of cases this means the back-end 1 subnet is not set up correctly. This subnet is used by several backend components, but Consul (our distributed resource management framework) is the first to be set up on it, so it is the first to fail.

Test whether the nodes can ping each other on their back-end 1 IPs (either within SSH or using the blue node console). If that works, I suggest you retry the install; otherwise, double-check on each node that the IP on this subnet is correct.
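One way to double-check the IP assignment from each node is to verify that an address on the back-end 1 subnet is actually bound to an interface. A small sketch, assuming the 10.0.4.x subnet from earlier in the thread (replace the prefix with your own):

```shell
#!/bin/sh
# has_backend_addr succeeds if any local IPv4 address starts with the given
# prefix (checked with iproute2's "ip" command).
has_backend_addr() {
  ip -o -4 addr show | grep -q "inet $1"
}

# "10.0.4." is the example back-end 1 subnet from this thread; adjust it.
if has_backend_addr "10.0.4."; then
  echo "back-end 1 address present"
else
  echo "back-end 1 address missing"
fi
```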

We do not use VLAN tagging, so turn off port trunking; this may also be the issue.

OK, I understand. But how then should I configure the ports and distribute the subnets?

May be like this:
MGMT IP node-0X        - eth0 (1Gb/s)
iSCSI-1 IP node-0X     - eth2 (10Gb/s)
Backend-1 IP node-0X    - eth3 (10Gb/s)
iSCSI-2 IP node-0X     - eth2 (10Gb/s)
Backend-2 IP node-0X    - eth1 (1Gb/s)

(all ports on switch in access mode)

But if the switch ports are configured in access mode, then the iSCSI-2 subnet will not work, because the switch port connected to the server's eth2 will be in access mode for the iSCSI-1 subnet only.
Can iSCSI-2 be neglected? Or which subnet can be neglected?
