iSCSI doesn't start automatically after all servers reboot (using version 1.3.1)
clsaad
8 Posts
August 15, 2017, 6:23 pm
Quote from clsaad on August 15, 2017, 6:23 pm
Hello,
I now have a lab with 3 physical servers, each with 4 × 250 GB SATA disks.
I created the lab with 2 iSCSI connections, and after all 3 servers restart, the iSCSI disks don't start automatically.
I tried to find information about the iSCSI service start in the node log but found nothing. If I start the service manually, it works fine.
Below is the node log from ps-node-02:
admin
2,930 Posts
August 16, 2017, 2:38 pm
Quote from admin on August 16, 2017, 2:38 pm
Thanks for your feedback. It is quite important for us to know how to improve PetaSAN.
Currently, PetaSAN supports nodes going down and coming back up, and we automatically handle the redistribution of the iSCSI load. However, this assumes there is a running cluster. We do not auto-start iSCSI disks if the entire cluster was down; the admin must do this manually.
In a traditional 2-node active/passive storage system it is easy to support this: a node can operate on its local storage, independently of the backup node. In PetaSAN, storage is provided by the Ceph cluster, which must be up. In addition, paths are not owned by the nodes but are distributed via the Consul service, which also needs to be up for things to work. Moreover, even if both Ceph and Consul are up, we do not want one node taking all the iSCSI load while many other iSCSI server nodes are still being started. Most of the technology used in PetaSAN is geared toward cloud usage and is not designed to handle a full restart of the entire system without user intervention.
But what you mention is definitely something we will try to make easier from the user's point of view.
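For reference, before starting the iSCSI disks manually after a full shutdown, it is worth confirming that both Ceph and Consul are back up. A minimal check might look like the following (a sketch only; it assumes the standard Ceph and Consul CLIs are available on the node, with CLUSTER_NAME as a placeholder for your cluster name):
# confirm the Ceph monitors have quorum and the cluster is healthy
ceph status --cluster CLUSTER_NAME
# confirm the cluster nodes are listed as alive by Consul
consul members
If ceph status reports an active quorum and consul members shows the nodes as alive, starting the disks manually should succeed.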
Last edited on August 16, 2017, 2:41 pm by admin · #2
clsaad
8 Posts
August 21, 2017, 6:38 pm
Quote from clsaad on August 21, 2017, 6:38 pm
Hello,
Thanks for the explanation. I think I have found a BUG. After shutting down all servers, starting them all again, and waiting for replication to finish, starting the iSCSI connection appears to work (it reports started), but neither of the 2 IPs responds. I tried stopping and restarting the iSCSI configuration (for the single mapped disk). I found this error in the node log:
BTW, I'm using the latest version (1.4). The upgrade process worked very well 😉
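One simple way to check whether an iSCSI portal IP is actually responding, assuming open-iscsi's iscsiadm on an initiator host (PORTAL_IP is a placeholder for one of the two portal addresses):
# run a sendtargets discovery against the portal
iscsiadm -m discovery -t sendtargets -p PORTAL_IP
If the portal is being served, this returns the target IQN(s); a connection refusal or timeout means the path is not up.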
admin
2,930 Posts
August 22, 2017, 1:08 pm
Quote from admin on August 22, 2017, 1:08 pm
Hi there, and we're glad you like the upgrade process.
Re the iSCSI disk not starting: we do not see this problem. We also tried to follow your case: shut down a 1.3.1 cluster, restarted it, upgraded, and everything worked. Can you give us info on how to reproduce the issue? The error listed shows that the kernel cannot map the rbd image: have you upgraded the system outside PetaSAN, or created/modified the rbd image manually? Again, any more info to reproduce the problem would help us look into it.
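As a side note, when the kernel client fails to map an rbd image, the reason is normally printed in the kernel log. A quick way to see it, assuming standard tools on the node:
# show the most recent kernel messages, including any rbd map error
dmesg | tail -n 20
A common cause is an image-feature mismatch, i.e. the image has features enabled that the kernel rbd client does not support; the features line in rbd info output shows what is enabled on the image.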
clsaad
8 Posts
August 23, 2017, 2:29 pm
Quote from clsaad on August 23, 2017, 2:29 pm
Hello,
I used the auto-update process (USB boot -> it identified the old version -> ran the update -> rebooted).
The problem started after that. I'm trying to find more details about the problem and will get back to you.
admin
2,930 Posts
August 24, 2017, 12:09 pm
Quote from admin on August 24, 2017, 12:09 pm
Yes, these are the steps we followed, but it is working here... so additional info will help:
- Are all your disks failing to start, or only disk 00003?
- Could you start/stop this disk in version 1.3.1, before you upgraded to 1.4?
- Have you used any CLI commands, either to add/upgrade packages or to modify this disk outside PetaSAN?
- What is the output of:
rbd info image-00003 --cluster CLUSTER_NAME
- Also, if this is not a production environment, try to map the disk manually, but first stop the PetaSAN iSCSI service, otherwise the disk will be unmapped automatically:
systemctl stop petasan-iscsi
rbd map image-00003 --cluster CLUSTER_NAME
rbd showmapped --cluster CLUSTER_NAME
Is the disk mapped successfully?
When done:
rbd unmap image-00003 --cluster CLUSTER_NAME
systemctl start petasan-iscsi
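If the map succeeds, rbd map prints the device it created (e.g. /dev/rbd0) and rbd showmapped lists the image against that device; if it fails, the kernel log (dmesg) should show the reason.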
Thanks very much for your help.
Last edited on August 24, 2017, 12:12 pm by admin · #6