Error building Consul cluster
admin
2,918 Posts
February 15, 2018, 7:17 am
It will work. At layer 2, in no-trunk / access mode, all your Ethernet interfaces will be on the same default VLAN. iSCSI 1 and iSCSI 2 are different layer 3 (IP) subnets, but at layer 2 the switch sees them on the same VLAN. There are some small drawbacks, such as all broadcast traffic being shared.
Ideally iSCSI 1 and iSCSI 2 should be put on separate switches and interfaces; the idea of MPIO is to provide high availability in case of a network link failure. For example, you may set up a highly available storage system like PetaSAN, but if you have one switch it is a single point of failure: if it dies, your system stops no matter how good the storage redundancy was. For the backend networks, you will typically create a bond across interfaces connected to different switches if you need to provide HA.
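For illustration only, here is a minimal sketch of what such a bond could look like at the OS level on a Debian/Ubuntu-style system. The interface names, IP address and active-backup mode are placeholders, not taken from this thread, and on PetaSAN the bond is normally defined through the deployment wizard rather than edited by hand:
# /etc/network/interfaces fragment: active-backup bond over two NICs
# that are cabled to two different switches (hypothetical names/addresses)
auto bond0
iface bond0 inet static
    address 10.0.3.11
    netmask 255.255.255.0
    bond-slaves eth2 eth3
    bond-mode active-backup
    bond-miimon 100
Either switch can then fail without taking the backend network down, since the bond fails over to the surviving link.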
Last edited on February 15, 2018, 7:25 am by admin · #11
Alex
7 Posts
February 15, 2018, 8:48 am
Ok, thank you. I will try to deploy my cluster with the new settings.
Alex
7 Posts
February 15, 2018, 10:25 am
I have deployed my cluster. You were right: my problem was that the switch ports were configured in trunk mode.
But I have a new problem: all OSDs on the first node (node-01) in my cluster show as down. I tried deleting the down OSDs and re-adding them, but after creating the new OSDs (both with a journal and without one) their status is still down. What could be the problem?
admin
2,918 Posts
February 15, 2018, 11:03 am
Hi,
Are the OSDs on the other 2 nodes up?
How many OSDs do you have?
Can your first node ping the other 2 nodes on the backend 1 and backend 2 IPs?
Can you please post the output of the following (replace CLUSTER_NAME with the name of your cluster and OSD_ID with the id of your OSD):
ceph status --cluster CLUSTER_NAME
/usr/lib/ceph/ceph-osd-prestart.sh --cluster CLUSTER_NAME --id OSD_ID
/usr/bin/ceph-osd -f --cluster CLUSTER_NAME --id OSD_ID --setuser ceph --setgroup ceph
example:
ceph status --cluster demo
/usr/lib/ceph/ceph-osd-prestart.sh --cluster demo --id 2
/usr/bin/ceph-osd -f --cluster demo --id 2 --setuser ceph --setgroup ceph
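(As a side note, if the OSDs were set up as systemd services, which is the usual Ceph layout but an assumption about this particular install, the recent startup errors can also be read from the journal; the OSD id here is just an example:
journalctl -u ceph-osd@2 -n 50
systemctl status ceph-osd@2
This is only an additional way to see the same errors; the manual ceph-osd run above shows them directly.)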
Last edited on February 15, 2018, 11:07 am by admin · #14
Alex
7 Posts
February 15, 2018, 11:31 am
Hi, I have 4 nodes in my cluster:
node-01 - 12 HDD (4 journal & 8 OSD), all down;
node-02 - 12 HDD (4 journal & 8 OSD), all up;
node-03 - 12 HDD (4 journal & 8 OSD), all up;
node-04 - 12 HDD (12 OSD), all up;
Each HDD has a capacity of 3.64 TB.
The first node can ping the other 3 nodes on the backend 1 IPs only; the backend 2 IPs do not respond to ping on any node.
root@alma-sds-ps-node-01:~# ceph status --cluster Test_iSCSI_Cluster
cluster:
id: 2efd7538-76dc-4411-9e66-077fb1cf40f6
health: HEALTH_OK
services:
mon: 3 daemons, quorum alma-sds-ps-node-01,alma-sds-ps-node-02,alma-sds-ps-node-03
mgr: alma-sds-ps-node-01(active), standbys: alma-sds-ps-node-02, alma-sds-ps-node-03
osd: 36 osds: 28 up, 28 in
data:
pools: 1 pools, 256 pgs
objects: 0 objects, 0 bytes
usage: 350 GB used, 101 TB / 102 TB avail
pgs: 256 active+clean
root@alma-sds-ps-node-01:~# ceph osd tree --cluster Test_iSCSI_Cluster
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 131.45746 root default
-5 29.26477 host alma-sds-ps-node-01
8 hdd 3.65810 osd.8 down 0 1.00000
9 hdd 3.65810 osd.9 down 0 1.00000
10 hdd 3.65810 osd.10 down 0 1.00000
11 hdd 3.65810 osd.11 down 0 1.00000
12 hdd 3.65810 osd.12 down 0 1.00000
13 hdd 3.65810 osd.13 down 0 1.00000
14 hdd 3.65810 osd.14 down 0 1.00000
15 hdd 3.65810 osd.15 down 0 1.00000
-7 29.26477 host alma-sds-ps-node-02
16 hdd 3.65810 osd.16 up 1.00000 1.00000
17 hdd 3.65810 osd.17 up 1.00000 1.00000
18 hdd 3.65810 osd.18 up 1.00000 1.00000
19 hdd 3.65810 osd.19 up 1.00000 1.00000
20 hdd 3.65810 osd.20 up 1.00000 1.00000
21 hdd 3.65810 osd.21 up 1.00000 1.00000
22 hdd 3.65810 osd.22 up 1.00000 1.00000
23 hdd 3.65810 osd.23 up 1.00000 1.00000
-3 29.26477 host alma-sds-ps-node-03
0 hdd 3.65810 osd.0 up 1.00000 1.00000
1 hdd 3.65810 osd.1 up 1.00000 1.00000
2 hdd 3.65810 osd.2 up 1.00000 1.00000
3 hdd 3.65810 osd.3 up 1.00000 1.00000
4 hdd 3.65810 osd.4 up 1.00000 1.00000
5 hdd 3.65810 osd.5 up 1.00000 1.00000
6 hdd 3.65810 osd.6 up 1.00000 1.00000
7 hdd 3.65810 osd.7 up 1.00000 1.00000
-9 43.66315 host alma-sds-ps-node-04
24 hdd 3.63860 osd.24 up 1.00000 1.00000
25 hdd 3.63860 osd.25 up 1.00000 1.00000
26 hdd 3.63860 osd.26 up 1.00000 1.00000
27 hdd 3.63860 osd.27 up 1.00000 1.00000
28 hdd 3.63860 osd.28 up 1.00000 1.00000
29 hdd 3.63860 osd.29 up 1.00000 1.00000
30 hdd 3.63860 osd.30 up 1.00000 1.00000
31 hdd 3.63860 osd.31 up 1.00000 1.00000
32 hdd 3.63860 osd.32 up 1.00000 1.00000
33 hdd 3.63860 osd.33 up 1.00000 1.00000
34 hdd 3.63860 osd.34 up 1.00000 1.00000
35 hdd 3.63860 osd.35 up 1.00000 1.00000
root@alma-sds-ps-node-01:~# /usr/lib/ceph/ceph-osd-prestart.sh --cluster Test_iSCSI_Cluster --id 8
root@alma-sds-ps-node-01:~#
root@alma-sds-ps-node-01:~# /usr/bin/ceph-osd -f --cluster Test_iSCSI-Cluster --id 8 --setuser ceph --setgroup ceph
2018-02-15 17:25:20.086717 7f150d989e00 -1 did not load config file, using default settings.
2018-02-15 17:25:20.088994 7f150d989e00 -1 Errors while parsing config file!
2018-02-15 17:25:20.089017 7f150d989e00 -1 parse_file: cannot open /etc/ceph/Test_iSCSI-Cluster.conf: (2) No such file or directory
2018-02-15 17:25:20.089042 7f150d989e00 -1 parse_file: cannot open ~/.ceph/Test_iSCSI-Cluster.conf: (2) No such file or directory
2018-02-15 17:25:20.089043 7f150d989e00 -1 parse_file: cannot open Test_iSCSI-Cluster.conf: (2) No such file or directory
2018-02-15 17:25:20.091421 7f150d989e00 -1 Errors while parsing config file!
2018-02-15 17:25:20.091422 7f150d989e00 -1 parse_file: cannot open /etc/ceph/Test_iSCSI-Cluster.conf: (2) No such file or directory
2018-02-15 17:25:20.091423 7f150d989e00 -1 parse_file: cannot open ~/.ceph/Test_iSCSI-Cluster.conf: (2) No such file or directory
2018-02-15 17:25:20.091424 7f150d989e00 -1 parse_file: cannot open Test_iSCSI-Cluster.conf: (2) No such file or directory
2018-02-15 17:25:20.091665 7f150d989e00 -1 ** ERROR: unable to open OSD superblock on /var/lib/ceph/osd/Test_iSCSI-Cluster-8: (2) No such file or directory
root@alma-sds-ps-node-01:~# /usr/lib/ceph/ceph-osd-prestart.sh --cluster Test_iSCSI_Cluster --id 9
root@alma-sds-ps-node-01:~#
root@alma-sds-ps-node-01:~# /usr/bin/ceph-osd -f --cluster Test_iSCSI-Cluster --id 9 --setuser ceph --setgroup ceph
2018-02-15 17:26:13.854577 7fa3b1e82e00 -1 did not load config file, using default settings.
2018-02-15 17:26:13.856985 7fa3b1e82e00 -1 Errors while parsing config file!
2018-02-15 17:26:13.856988 7fa3b1e82e00 -1 parse_file: cannot open /etc/ceph/Test_iSCSI-Cluster.conf: (2) No such file or directory
2018-02-15 17:26:13.856994 7fa3b1e82e00 -1 parse_file: cannot open ~/.ceph/Test_iSCSI-Cluster.conf: (2) No such file or directory
2018-02-15 17:26:13.857013 7fa3b1e82e00 -1 parse_file: cannot open Test_iSCSI-Cluster.conf: (2) No such file or directory
2018-02-15 17:26:13.859253 7fa3b1e82e00 -1 Errors while parsing config file!
2018-02-15 17:26:13.859253 7fa3b1e82e00 -1 parse_file: cannot open /etc/ceph/Test_iSCSI-Cluster.conf: (2) No such file or directory
2018-02-15 17:26:13.859255 7fa3b1e82e00 -1 parse_file: cannot open ~/.ceph/Test_iSCSI-Cluster.conf: (2) No such file or directory
2018-02-15 17:26:13.859256 7fa3b1e82e00 -1 parse_file: cannot open Test_iSCSI-Cluster.conf: (2) No such file or directory
2018-02-15 17:26:13.859474 7fa3b1e82e00 -1 ** ERROR: unable to open OSD superblock on /var/lib/ceph/osd/Test_iSCSI-Cluster-9: (2) No such file or directory
admin
2,918 Posts
February 15, 2018, 11:58 am
2 issues:
A) You need to find out why you cannot ping on backend 2; maybe the IPs were not entered correctly, or maybe there is a switch issue.
To see what IPs are currently assigned to your interfaces:
ip addr
To view the configuration files for the network settings:
cat /opt/petasan/config/node_info.json
cat /opt/petasan/config/cluster_info.json
B) Regarding the error output while the OSD is starting: you mis-typed your cluster name (it should be _ rather than -), so please retry:
/usr/bin/ceph-osd -f --cluster Test_iSCSI_Cluster --id 9 --setuser ceph --setgroup ceph
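For example (a sketch only; the interface name and peer IP are placeholders for whichever NIC and address carry backend 2 on your nodes), you can confirm which interface holds the backend 2 IP and ping the peer from that specific interface:
ip addr show eth3
ping -c 3 -I eth3 10.0.4.12
ethtool eth3 | grep "Link detected"
If the link is detected but the ping fails, the problem is usually on the switch side (port mode / VLAN), which matches the earlier iSCSI issue in this thread.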
Last edited on February 15, 2018, 11:58 am by admin · #16
Alex
7 Posts
February 15, 2018, 12:00 pm
I found the error! I had configured the switch ports incorrectly!
Now it is working!
I am so sorry, I needed to be more attentive...
Thank you very much for the support!
admin
2,918 Posts
February 15, 2018, 4:20 pm
No problem 🙂
gbujerin
4 Posts
September 9, 2019, 2:43 pm
Error List
Cluster Node petasan failed to join the cluster or is not alive.
Cluster Node PetaSAN-node2 failed to join the cluster or is not alive.
I am getting this error while adding the 3rd node. Please help.
admin
2,918 Posts
September 9, 2019, 8:24 pm
I recommend you recheck your IP/subnet assignments and wiring, and re-install.
If it happens again, do let us know.
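As a quick check when a node "failed to join the cluster or is not alive" (assuming the consul binary is available on the management nodes, which is how PetaSAN builds its cluster), you can list which members have actually joined and their state:
consul members
Any node missing from that list, or shown as failed, is the one whose backend connectivity and subnet settings to recheck first.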