Error building Consul cluster
admin
2,918 Posts
February 15, 2018, 7:17 am
It will work. At layer 2, in no-trunk / access mode, all your Ethernet interfaces will be on the same default VLAN. iSCSI 1 and iSCSI 2 are different layer 3 (IP) subnets, but at layer 2 the switch sees them on the same VLAN. There are some small drawbacks, such as all broadcast traffic being shared.
Ideally iSCSI 1 and iSCSI 2 should be put on separate switches and interfaces; the idea of MPIO is to provide high availability in case of a network link failure. For example, you may set up a highly available storage system like PetaSAN, but if you have one switch it is a single point of failure: if it dies, your system stops no matter how good the storage redundancy was. For the backend networks, you will typically create a bond across interfaces connected to different switches if you need to provide HA.
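For illustration only, here is a minimal sketch of what such a bond could look like at the OS level on a Debian/Ubuntu-style system. The interface names, IP address and active-backup mode are placeholders, not taken from this thread, and on PetaSAN the bond is normally defined through the deployment wizard rather than edited by hand:
# /etc/network/interfaces fragment: active-backup bond over two NICs
# that are cabled to two different switches (hypothetical names/addresses)
auto bond0
iface bond0 inet static
    address 10.0.3.11
    netmask 255.255.255.0
    bond-slaves eth2 eth3
    bond-mode active-backup
    bond-miimon 100
Either switch can then fail without taking the backend network down, since the bond fails over to the surviving link.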
Last edited on February 15, 2018, 7:25 am by admin · #11
Alex
7 Posts
February 15, 2018, 8:48 am
Ok, thank you. I will try to deploy my cluster with the new settings.
Alex
7 Posts
February 15, 2018, 10:25 am
I have deployed my cluster. You were right: my problem was that the switch ports were configured in trunk mode.
But I have a new problem: all OSDs on the first node (node-01) in my cluster show as down. I tried deleting the down OSDs and re-adding them, but after creating the new OSDs (both with a journal and without one) their status is still down. What could be the problem?
admin
2,918 Posts
February 15, 2018, 11:03 am
Hi,
Are the OSDs on the other 2 nodes up?
How many OSDs do you have?
Can your first node ping the other 2 nodes on the backend 1 and backend 2 IPs?
Can you please post the output of the following (replace CLUSTER_NAME with the name of your cluster and OSD_ID with the id of your OSD):
ceph status --cluster CLUSTER_NAME
/usr/lib/ceph/ceph-osd-prestart.sh --cluster CLUSTER_NAME --id OSD_ID
/usr/bin/ceph-osd -f --cluster CLUSTER_NAME --id OSD_ID --setuser ceph --setgroup ceph
example:
ceph status --cluster demo
/usr/lib/ceph/ceph-osd-prestart.sh --cluster demo --id 2
/usr/bin/ceph-osd -f --cluster demo --id 2 --setuser ceph --setgroup ceph
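(As a side note, if the OSDs were set up as systemd services, which is the usual Ceph layout but an assumption about this particular install, the recent startup errors can also be read from the journal; the OSD id here is just an example:
journalctl -u ceph-osd@2 -n 50
systemctl status ceph-osd@2
This is only an additional way to see the same errors; the manual ceph-osd run above shows them directly.)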
Last edited on February 15, 2018, 11:07 am by admin · #14
Alex
7 Posts
February 15, 2018, 11:31 am
Hi, I have 4 nodes in my cluster:
node-01 - 12 HDD (4 journal & 8 OSD), all down;
node-02 - 12 HDD (4 journal & 8 OSD), all up;
node-03 - 12 HDD (4 journal & 8 OSD), all up;
node-04 - 12 HDD (12 OSD), all up;
Each HDD has a capacity of 3.64 TB.
The first node can ping the other 3 nodes on the backend 1 IPs only; the backend 2 IPs do not respond to ping on any node.
root@alma-sds-ps-node-01:~# ceph status --cluster Test_iSCSI_Cluster
cluster:
id: 2efd7538-76dc-4411-9e66-077fb1cf40f6
health: HEALTH_OK
services:
mon: 3 daemons, quorum alma-sds-ps-node-01,alma-sds-ps-node-02,alma-sds-ps-node-03
mgr: alma-sds-ps-node-01(active), standbys: alma-sds-ps-node-02, alma-sds-ps-node-03
osd: 36 osds: 28 up, 28 in
data:
pools: 1 pools, 256 pgs
objects: 0 objects, 0 bytes
usage: 350 GB used, 101 TB / 102 TB avail
pgs: 256 active+clean
root@alma-sds-ps-node-01:~# ceph osd tree --cluster Test_iSCSI_Cluster
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 131.45746 root default
-5 29.26477 host alma-sds-ps-node-01
8 hdd 3.65810 osd.8 down 0 1.00000
9 hdd 3.65810 osd.9 down 0 1.00000
10 hdd 3.65810 osd.10 down 0 1.00000
11 hdd 3.65810 osd.11 down 0 1.00000
12 hdd 3.65810 osd.12 down 0 1.00000
13 hdd 3.65810 osd.13 down 0 1.00000
14 hdd 3.65810 osd.14 down 0 1.00000
15 hdd 3.65810 osd.15 down 0 1.00000
-7 29.26477 host alma-sds-ps-node-02
16 hdd 3.65810 osd.16 up 1.00000 1.00000
17 hdd 3.65810 osd.17 up 1.00000 1.00000
18 hdd 3.65810 osd.18 up 1.00000 1.00000
19 hdd 3.65810 osd.19 up 1.00000 1.00000
20 hdd 3.65810 osd.20 up 1.00000 1.00000
21 hdd 3.65810 osd.21 up 1.00000 1.00000
22 hdd 3.65810 osd.22 up 1.00000 1.00000
23 hdd 3.65810 osd.23 up 1.00000 1.00000
-3 29.26477 host alma-sds-ps-node-03
0 hdd 3.65810 osd.0 up 1.00000 1.00000
1 hdd 3.65810 osd.1 up 1.00000 1.00000
2 hdd 3.65810 osd.2 up 1.00000 1.00000
3 hdd 3.65810 osd.3 up 1.00000 1.00000
4 hdd 3.65810 osd.4 up 1.00000 1.00000
5 hdd 3.65810 osd.5 up 1.00000 1.00000
6 hdd 3.65810 osd.6 up 1.00000 1.00000
7 hdd 3.65810 osd.7 up 1.00000 1.00000
-9 43.66315 host alma-sds-ps-node-04
24 hdd 3.63860 osd.24 up 1.00000 1.00000
25 hdd 3.63860 osd.25 up 1.00000 1.00000
26 hdd 3.63860 osd.26 up 1.00000 1.00000
27 hdd 3.63860 osd.27 up 1.00000 1.00000
28 hdd 3.63860 osd.28 up 1.00000 1.00000
29 hdd 3.63860 osd.29 up 1.00000 1.00000
30 hdd 3.63860 osd.30 up 1.00000 1.00000
31 hdd 3.63860 osd.31 up 1.00000 1.00000
32 hdd 3.63860 osd.32 up 1.00000 1.00000
33 hdd 3.63860 osd.33 up 1.00000 1.00000
34 hdd 3.63860 osd.34 up 1.00000 1.00000
35 hdd 3.63860 osd.35 up 1.00000 1.00000
root@alma-sds-ps-node-01:~# /usr/lib/ceph/ceph-osd-prestart.sh --cluster Test_iSCSI_Cluster --id 8
root@alma-sds-ps-node-01:~#
root@alma-sds-ps-node-01:~# /usr/bin/ceph-osd -f --cluster Test_iSCSI-Cluster --id 8 --setuser ceph --setgroup ceph
2018-02-15 17:25:20.086717 7f150d989e00 -1 did not load config file, using default settings.
2018-02-15 17:25:20.088994 7f150d989e00 -1 Errors while parsing config file!
2018-02-15 17:25:20.089017 7f150d989e00 -1 parse_file: cannot open /etc/ceph/Test_iSCSI-Cluster.conf: (2) No such file or directory
2018-02-15 17:25:20.089042 7f150d989e00 -1 parse_file: cannot open ~/.ceph/Test_iSCSI-Cluster.conf: (2) No such file or directory
2018-02-15 17:25:20.089043 7f150d989e00 -1 parse_file: cannot open Test_iSCSI-Cluster.conf: (2) No such file or directory
2018-02-15 17:25:20.091421 7f150d989e00 -1 Errors while parsing config file!
2018-02-15 17:25:20.091422 7f150d989e00 -1 parse_file: cannot open /etc/ceph/Test_iSCSI-Cluster.conf: (2) No such file or directory
2018-02-15 17:25:20.091423 7f150d989e00 -1 parse_file: cannot open ~/.ceph/Test_iSCSI-Cluster.conf: (2) No such file or directory
2018-02-15 17:25:20.091424 7f150d989e00 -1 parse_file: cannot open Test_iSCSI-Cluster.conf: (2) No such file or directory
2018-02-15 17:25:20.091665 7f150d989e00 -1 ** ERROR: unable to open OSD superblock on /var/lib/ceph/osd/Test_iSCSI-Cluster-8: (2) No such file or directory
root@alma-sds-ps-node-01:~# /usr/lib/ceph/ceph-osd-prestart.sh --cluster Test_iSCSI_Cluster --id 9
root@alma-sds-ps-node-01:~#
root@alma-sds-ps-node-01:~# /usr/bin/ceph-osd -f --cluster Test_iSCSI-Cluster --id 9 --setuser ceph --setgroup ceph
2018-02-15 17:26:13.854577 7fa3b1e82e00 -1 did not load config file, using default settings.
2018-02-15 17:26:13.856985 7fa3b1e82e00 -1 Errors while parsing config file!
2018-02-15 17:26:13.856988 7fa3b1e82e00 -1 parse_file: cannot open /etc/ceph/Test_iSCSI-Cluster.conf: (2) No such file or directory
2018-02-15 17:26:13.856994 7fa3b1e82e00 -1 parse_file: cannot open ~/.ceph/Test_iSCSI-Cluster.conf: (2) No such file or directory
2018-02-15 17:26:13.857013 7fa3b1e82e00 -1 parse_file: cannot open Test_iSCSI-Cluster.conf: (2) No such file or directory
2018-02-15 17:26:13.859253 7fa3b1e82e00 -1 Errors while parsing config file!
2018-02-15 17:26:13.859253 7fa3b1e82e00 -1 parse_file: cannot open /etc/ceph/Test_iSCSI-Cluster.conf: (2) No such file or directory
2018-02-15 17:26:13.859255 7fa3b1e82e00 -1 parse_file: cannot open ~/.ceph/Test_iSCSI-Cluster.conf: (2) No such file or directory
2018-02-15 17:26:13.859256 7fa3b1e82e00 -1 parse_file: cannot open Test_iSCSI-Cluster.conf: (2) No such file or directory
2018-02-15 17:26:13.859474 7fa3b1e82e00 -1 ** ERROR: unable to open OSD superblock on /var/lib/ceph/osd/Test_iSCSI-Cluster-9: (2) No such file or directory
admin
2,918 Posts
February 15, 2018, 11:58 am
2 issues:
A) You need to find out why you cannot ping on backend 2; maybe the IPs were not entered correctly, or maybe there is a switch issue.
To see what IPs are currently assigned to your interfaces:
ip addr
To view the configuration files for the network settings:
cat /opt/petasan/config/node_info.json
cat /opt/petasan/config/cluster_info.json
B) Regarding the error output while the OSD is starting: you mis-typed your cluster name (it should be _ rather than -), so please retry:
/usr/bin/ceph-osd -f --cluster Test_iSCSI_Cluster --id 9 --setuser ceph --setgroup ceph
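For example (a sketch only; the interface name and peer IP are placeholders for whichever NIC and address carry backend 2 on your nodes), you can confirm which interface holds the backend 2 IP and ping the peer from that specific interface:
ip addr show eth3
ping -c 3 -I eth3 10.0.4.12
ethtool eth3 | grep "Link detected"
If the link is detected but the ping fails, the problem is usually on the switch side (port mode / VLAN), which matches the earlier iSCSI issue in this thread.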
Last edited on February 15, 2018, 11:58 am by admin · #16
Alex
7 Posts
February 15, 2018, 12:00 pm
I found the error! I had configured the switch ports incorrectly!
Now it is working!
I am so sorry, I needed to be more attentive...
Thank you very much for the support!
admin
2,918 Posts
February 15, 2018, 4:20 pm
No problem 🙂
gbujerin
4 Posts
September 9, 2019, 2:43 pm
Error List
Cluster Node petasan failed to join the cluster or is not alive.
Cluster Node PetaSAN-node2 failed to join the cluster or is not alive.
I am getting this error while adding the 3rd node. Please help.
admin
2,918 Posts
September 9, 2019, 8:24 pm
I recommend you recheck your IP/subnet assignments and wiring, and re-install.
If it happens again, do let us know.
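As a quick check when a node "failed to join the cluster or is not alive" (assuming the consul binary is available on the management nodes, which is how PetaSAN builds its cluster), you can list which members have actually joined and their state:
consul members
Any node missing from that list, or shown as failed, is the one whose backend connectivity and subnet settings to recheck first.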