
New Lab Cluster issues

I've installed a standard 3-node lab cluster (cloud based); each node has one OSD and one journal, just for testing.

I've installed the cluster 10 times (not exaggerating) and followed the quick start guide to the letter.

No matter what I do, I always end up with this issue.

 

gluster> peer status
Number of Peers: 2

Hostname: 172.16.6.1
Uuid: 42a74f99-a842-419b-9552-d2d5386ce260
State: Peer in Cluster (Connected)

Hostname: 172.16.6.2
Uuid: a3ae1f16-c417-4be7-8ecf-8e6b60c52b9c
State: Peer in Cluster (Connected)
gluster>
root@ps-node-3:~# gluster vol create gfs-vol replica 3 172.16.6.1:/opt/petasan/config/gfs-brick 172.16.6.2:/opt/petasan/config/gfs-brick 172.16.6.3:/opt/petasan/config/gfs-brick

volume create: gfs-vol: failed: /opt/petasan/config/gfs-brick is already part of a volume

root@ps-node-3:~# gluster volume status
No volumes present

root@ps-node-3:~# gluster volume list
No volumes present in cluster

root@ps-node-3:~# ls /opt/petasan/config/

ls: cannot access '/opt/petasan/config/shared': Transport endpoint is not connected

certificates cluster_info.json crush etc flags gfs-brick lost+found node_info.json pages.json replication rolepages.json roles.json root services_interfaces.json shared stats tuning var
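For reference, both errors above have common Gluster-level causes: "already part of a volume" usually means the brick directory still carries volume metadata (extended attributes) from a previous install, and "Transport endpoint is not connected" usually means the FUSE mount of the shared volume has gone stale. A rough cleanup sketch, assuming the paths shown in this thread, and only sensible on a lab cluster you are about to rebuild:

# Clear leftover brick metadata on each node that reports "already part of a volume".
setfattr -x trusted.glusterfs.volume-id /opt/petasan/config/gfs-brick
setfattr -x trusted.gfid /opt/petasan/config/gfs-brick
rm -rf /opt/petasan/config/gfs-brick/.glusterfs

# Detach the stale mount behind "Transport endpoint is not connected";
# PetaSAN's services should remount it once the volume is up again.
umount -l /opt/petasan/config/shared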

------

21/11/2020 18:18:11 ERROR
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/PetaSAN/web/admin_controller/manage_cifs.py", line 365, in get_cifs_status
    cifs_status = manage_cifs.get_cifs_status()
  File "/usr/lib/python2.7/dist-packages/PetaSAN/backend/manage_cifs.py", line 214, in get_cifs_status
    raise CIFSException(CIFSException.CIFS_CLUSTER_NOT_UP, '')
CIFSException

 

As you will see, node 3 (172.16.6.3) is NOT in the gluster peer list, and I do not see how to get it to sync up.

I love the concept of this software, and it would fill a need for a project I have coming up, but if it can't deploy reliably I'm not sure.
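For what it's worth, gluster peer status never lists the node it is run on, so node 3 not appearing in its own output is expected; the outputs further down in this thread show ps-node-3 connected on the other nodes. If a node really were missing from the trusted pool, the usual fix is a peer probe from an existing member, roughly as follows (172.16.6.3 is node 3's backend IP from this thread):

# Run from a node that is already in the pool (node 1 or 2).
gluster peer probe 172.16.6.3
# Then confirm from each node that all peers show "State: Peer in Cluster (Connected)".
gluster peer status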

 

What do you mean by cloud based?

What is the output on all 3 nodes of:

systemctl status glusterd
gluster peer status
gluster vol status

 

I simply mean the nodes are all VMs; it's a lab and I'm not investing in hardware just to test.

--

root@ps-node-1:~# systemctl status glusterd
● glusterd.service - GlusterFS, a clustered file-system server
Loaded: loaded (/lib/systemd/system/glusterd.service; disabled; vendor preset: enabled)
Active: active (running) since Sun 2020-11-22 12:28:03 CST; 1min 51s ago
Process: 1365 ExecStart=/usr/sbin/glusterd -p /var/run/glusterd.pid --log-level $LOG_LEVEL $GLUSTERD_OPTIONS (code=exited, status=0/SUCCESS)
Main PID: 1367 (glusterd)
Tasks: 8 (limit: 4666)
CGroup: /system.slice/glusterd.service
└─1367 /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO

Nov 22 12:27:57 ps-node-1 systemd[1]: Starting GlusterFS, a clustered file-system server...
Nov 22 12:28:03 ps-node-1 systemd[1]: Started GlusterFS, a clustered file-system server.
root@ps-node-1:~# gluster peer status
Number of Peers: 2

Hostname: ps-node-3
Uuid: 1fec843b-85ab-44fd-a64a-ce9c292706be
State: Peer in Cluster (Connected)

Hostname: 172.16.6.2
Uuid: a3ae1f16-c417-4be7-8ecf-8e6b60c52b9c
State: Peer in Cluster (Connected)
root@ps-node-1:~# gluster vol status
No volumes present

--

root@ps-node-2:~# systemctl status glusterd
● glusterd.service - GlusterFS, a clustered file-system server
Loaded: loaded (/lib/systemd/system/glusterd.service; disabled; vendor preset: enabled)
Active: active (running) since Sun 2020-11-22 12:29:00 CST; 9s ago
Process: 1372 ExecStart=/usr/sbin/glusterd -p /var/run/glusterd.pid --log-level $LOG_LEVEL $GLUSTERD_OPTIONS (code=exited, status=0/SUCCESS)
Main PID: 1373 (glusterd)
Tasks: 8 (limit: 4666)
CGroup: /system.slice/glusterd.service
└─1373 /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO

Nov 22 12:28:55 ps-node-2 systemd[1]: Starting GlusterFS, a clustered file-system server...
Nov 22 12:29:00 ps-node-2 systemd[1]: Started GlusterFS, a clustered file-system server.
root@ps-node-2:~# gluster peer status
Number of Peers: 2

Hostname: 172.16.6.1
Uuid: 42a74f99-a842-419b-9552-d2d5386ce260
State: Peer in Cluster (Connected)

Hostname: ps-node-3
Uuid: 1fec843b-85ab-44fd-a64a-ce9c292706be
State: Peer in Cluster (Connected)
root@ps-node-2:~# gluster vol status
No volumes present

---

root@ps-node-3:~# systemctl status glusterd
● glusterd.service - GlusterFS, a clustered file-system server
Loaded: loaded (/lib/systemd/system/glusterd.service; disabled; vendor preset: enabled)
Active: active (running) since Sun 2020-11-22 12:28:40 CST; 1min 59s ago
Process: 1298 ExecStart=/usr/sbin/glusterd -p /var/run/glusterd.pid --log-level $LOG_LEVEL $GLUSTERD_OPTIONS (code=exited, status=0/SUCCESS)
Main PID: 1303 (glusterd)
Tasks: 8 (limit: 4666)
CGroup: /system.slice/glusterd.service
└─1303 /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO

Nov 22 12:28:33 ps-node-3 systemd[1]: Starting GlusterFS, a clustered file-system server...
Nov 22 12:28:40 ps-node-3 systemd[1]: Started GlusterFS, a clustered file-system server.
root@ps-node-3:~# gluster peer status
Number of Peers: 2

Hostname: 172.16.6.2
Uuid: a3ae1f16-c417-4be7-8ecf-8e6b60c52b9c
State: Peer in Cluster (Connected)

Hostname: 172.16.6.1
Uuid: 42a74f99-a842-419b-9552-d2d5386ce260
State: Peer in Cluster (Connected)
root@ps-node-3:~# gluster vol status
No volumes present

1. Can you also post the contents of

/opt/petasan/config/cluster_info.json

2. Can you manually start the volume via

gluster vol start gfs-vol

3. On node 3, in the log file /opt/petasan/log/PetaSAN.log, do you see any errors for gfs-vol?

4. Can you double-check that your management and backend networks are two distinct, non-overlapping subnets?

5. Can you check whether you have an external DNS, and if so, whether it resolves node names to management IPs rather than backend IPs? (Commands for points 3 to 5 are sketched below.)
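Something like the following should cover points 3 to 5; the grep pattern is just a guess, adjust as needed:

# 3. Look for gfs-vol related errors in the PetaSAN log on node 3.
grep -i gfs-vol /opt/petasan/log/PetaSAN.log | tail -n 50

# 4. Show every interface's IPv4 address and mask so the subnets can be compared.
ip -o -4 addr show

# 5. See what the node names resolve to; they should resolve to the management IPs
#    (or not at all), never the backend IPs.
getent hosts ps-node-1 ps-node-2 ps-node-3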

cat /opt/petasan/config/cluster_info.json
{
    "backend_1_base_ip": "172.16.6.0",
    "backend_1_eth_name": "eth1",
    "backend_1_mask": "255.255.0.0",
    "backend_1_vlan_id": "",
    "backend_2_base_ip": "",
    "backend_2_eth_name": "",
    "backend_2_mask": "",
    "backend_2_vlan_id": "",
    "bonds": [],
    "default_pool": "both",
    "default_pool_pgs": "256",
    "default_pool_replicas": "3",
    "eth_count": 2,
    "jf_mtu_size": "",
    "jumbo_frames": [],
    "management_eth_name": "eth0",
    "management_nodes": [
        {
            "backend_1_ip": "172.16.6.1",
            "backend_2_ip": "",
            "is_backup": false,
            "is_cifs": true,
            "is_iscsi": true,
            "is_management": true,
            "is_nfs": true,
            "is_storage": true,
            "management_ip": "172.16.5.1",
            "name": "ps-node-1"
        },
        {
            "backend_1_ip": "172.16.6.2",
            "backend_2_ip": "",
            "is_backup": false,
            "is_cifs": true,
            "is_iscsi": true,
            "is_management": true,
            "is_nfs": true,
            "is_storage": true,
            "management_ip": "172.16.5.2",
            "name": "ps-node-2"
        },
        {
            "backend_1_ip": "172.16.6.3",
            "backend_2_ip": "",
            "is_backup": false,
            "is_cifs": true,
            "is_iscsi": true,
            "is_management": true,
            "is_nfs": true,
            "is_storage": true,
            "management_ip": "172.16.5.3",
            "name": "ps-node-3"
        }
    ],
    "name": "san",
    "storage_engine": "bluestore"
}

gluster vol start gfs-vol
volume start: gfs-vol: failed: Volume gfs-vol does not exist

I can see the backend subnet mask is 255.255.0.0, so the management and backend subnets overlap; with a /16 mask, the 172.16.5.x management addresses and the 172.16.6.x backend addresses all fall in 172.16.0.0/16. This will cause a lot of issues at the network layer.

Make sure your management and backend networks are distinct, non-overlapping subnets; check the mask on your management subnet as well.
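A quick way to see the overlap, assuming the Debian ipcalc utility is installed on the nodes (any subnet calculator will show the same thing):

# With the current 255.255.0.0 mask, both "networks" are really the same /16:
ipcalc 172.16.5.1/255.255.0.0 | grep Network    # -> network 172.16.0.0/16
ipcalc 172.16.6.1/255.255.0.0 | grep Network    # -> network 172.16.0.0/16

# With a 255.255.255.0 mask they become two distinct subnets:
ipcalc 172.16.5.1/255.255.255.0 | grep Network  # -> network 172.16.5.0/24
ipcalc 172.16.6.1/255.255.255.0 | grep Network  # -> network 172.16.6.0/24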

I'll reinstall it again tomorrow and let you know. It just seems odd to me when they have separate addresses.

Gotta love people who blame software because they don't understand simple networking.