Unable to add new node

Hi there,

I'm trying to set up a 4th storage node for my cluster. However, whenever I attempt to join the existing nodes, I put in the 2 backend IPs and end up with:

Internal Server Error

The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application.

 

It seems that the node never gets configured (after a reboot it shows the management details of the existing nodes, but it never shows up in the management interface of any node), and from that point on the web interface of the new node is unusable.

Logs on existing nodes via the web interface do not show any errors so I am a little stuck at this point as to where to go from here.

 

This is the stage when we configure the networking for the node. If it fails here it will not have reached the page to add osds and join the cluster so it has not successfully joined the cluster.

Some things to try:

  • Try to redeploy the node (port 5001).
  • Empty the cache of your client browser and refresh.
  • Log on to the node itself, either from the blue screen menu or via ssh if the management IP is working, using the cluster password (it should have been set at this stage). Look at the logs (PetaSAN, kernel dmesg and syslog) and check whether some other processes are eating up memory or CPU.
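When the syslog is long, it can help to pull out just the exception lines written by the deployment app. The following is a minimal sketch, not part of PetaSAN: the function name `deploy_exceptions` and the regular expression are illustrative assumptions, and the line format is assumed to be standard syslog with `deploy.py[PID]:` as the process tag.

```python
import re

# Assumed syslog layout: "<Mon> <day> <hh:mm:ss> <host> deploy.py[<pid>]: <message>"
DEPLOY_LINE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s[\d:]{8})\s(?P<host>\S+)\sdeploy\.py\[\d+\]:\s(?P<msg>.*)$"
)
# An exception summary line starts with a name ending in Error or Exception.
EXC_LINE = re.compile(r"^[A-Za-z_][\w.]*(Error|Exception)\b")


def deploy_exceptions(lines):
    """Return (timestamp, message) for the closing line of each traceback."""
    found = []
    in_traceback = False
    for line in lines:
        m = DEPLOY_LINE.match(line.rstrip("\n"))
        if not m:
            continue  # not written by the deployment app
        msg = m.group("msg").strip()
        if msg.startswith("Traceback (most recent call last):"):
            in_traceback = True
        elif in_traceback and EXC_LINE.match(msg):
            found.append((m.group("stamp"), msg))
            in_traceback = False
    return found


# Possible usage on the node (log path assumed to be /var/log/syslog):
# with open("/var/log/syslog") as fh:
#     for stamp, msg in deploy_exceptions(fh):
#         print(stamp, msg)
```

This only narrows down where to look; the full traceback in syslog is still needed to see which call failed.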

Thanks.

I tried clearing the cache etc., and even a re-install, only to be greeted by the same error on every deployment attempt (I used a separate browser install to be sure).

I've taken a look in the logs and see the following:

In syslog (nothing in dmesg):

Sep 19 14:50:44 PETASAN04 deploy.py[1132]: [2019-09-19 14:50:44,571] ERROR in app: Exception on / [GET]
Sep 19 14:50:44 PETASAN04 deploy.py[1132]: Traceback (most recent call last):
Sep 19 14:50:44 PETASAN04 deploy.py[1132]: File "/usr/lib/python2.7/dist-packages/flask/app.py", line 1982, in wsgi_app
Sep 19 14:50:44 PETASAN04 deploy.py[1132]: response = self.full_dispatch_request()
Sep 19 14:50:44 PETASAN04 deploy.py[1132]: File "/usr/lib/python2.7/dist-packages/flask/app.py", line 1614, in full_dispatch_request
Sep 19 14:50:44 PETASAN04 deploy.py[1132]: rv = self.handle_user_exception(e)
Sep 19 14:50:44 PETASAN04 deploy.py[1132]: File "/usr/lib/python2.7/dist-packages/flask/app.py", line 1517, in handle_user_exception
Sep 19 14:50:44 PETASAN04 deploy.py[1132]: reraise(exc_type, exc_value, tb)
Sep 19 14:50:44 PETASAN04 deploy.py[1132]: File "/usr/lib/python2.7/dist-packages/flask/app.py", line 1612, in full_dispatch_request
Sep 19 14:50:44 PETASAN04 deploy.py[1132]: rv = self.dispatch_request()
Sep 19 14:50:44 PETASAN04 deploy.py[1132]: File "/usr/lib/python2.7/dist-packages/flask/app.py", line 1598, in dispatch_request
Sep 19 14:50:44 PETASAN04 deploy.py[1132]: return self.view_functions[rule.endpoint](**req.view_args)
Sep 19 14:50:44 PETASAN04 deploy.py[1132]: File "/usr/lib/python2.7/dist-packages/PetaSAN/web/deploy_controller/wizard.py", line 233, in main
Sep 19 14:50:44 PETASAN04 deploy.py[1132]: disk_list = ceph_disk_lib.get_disk_list_deploy()
Sep 19 14:50:44 PETASAN04 deploy.py[1132]: File "/usr/lib/python2.7/dist-packages/PetaSAN/core/ceph/ceph_disk_lib.py", line 269, in get_disk_list_deploy
Sep 19 14:50:44 PETASAN04 deploy.py[1132]: ceph_disk_list = get_disk_list()
Sep 19 14:50:44 PETASAN04 deploy.py[1132]: File "/usr/lib/python2.7/dist-packages/PetaSAN/core/ceph/ceph_disk_lib.py", line 196, in get_disk_list
Sep 19 14:50:44 PETASAN04 deploy.py[1132]: ceph_disk_list = get_ceph_disk_list()
Sep 19 14:50:44 PETASAN04 deploy.py[1132]: File "/usr/lib/python2.7/dist-packages/PetaSAN/core/ceph/ceph_disk_lib.py", line 104, in get_ceph_disk_list
Sep 19 14:50:44 PETASAN04 deploy.py[1132]: for device in ceph_disk.list_devices():
Sep 19 14:50:44 PETASAN04 deploy.py[1132]: File "/usr/lib/python2.7/dist-packages/PetaSAN/core/ceph/ceph_disk.py", line 720, in list_devices
Sep 19 14:50:44 PETASAN04 deploy.py[1132]: space_map))
Sep 19 14:50:44 PETASAN04 deploy.py[1132]: File "/usr/lib/python2.7/dist-packages/PetaSAN/core/ceph/ceph_disk.py", line 516, in list_dev
Sep 19 14:50:44 PETASAN04 deploy.py[1132]: if ptype in (PTYPE['regular']['osd']['ready']):
Sep 19 14:50:44 PETASAN04 deploy.py[1132]: TypeError: 'in <string>' requires string as left operand, not NoneType
Sep 19 14:54:23 PETASAN04 deploy.py[1132]: [2019-09-19 14:54:23,916] ERROR in app: Exception on / [GET]
Sep 19 14:54:23 PETASAN04 deploy.py[1132]: Traceback (most recent call last):
Sep 19 14:54:23 PETASAN04 deploy.py[1132]: File "/usr/lib/python2.7/dist-packages/flask/app.py", line 1982, in wsgi_app
Sep 19 14:54:23 PETASAN04 deploy.py[1132]: response = self.full_dispatch_request()
Sep 19 14:54:23 PETASAN04 deploy.py[1132]: File "/usr/lib/python2.7/dist-packages/flask/app.py", line 1614, in full_dispatch_request
Sep 19 14:54:23 PETASAN04 deploy.py[1132]: rv = self.handle_user_exception(e)
Sep 19 14:54:23 PETASAN04 deploy.py[1132]: File "/usr/lib/python2.7/dist-packages/flask/app.py", line 1517, in handle_user_exception
Sep 19 14:54:23 PETASAN04 deploy.py[1132]: reraise(exc_type, exc_value, tb)
Sep 19 14:54:23 PETASAN04 deploy.py[1132]: File "/usr/lib/python2.7/dist-packages/flask/app.py", line 1612, in full_dispatch_request
Sep 19 14:54:23 PETASAN04 deploy.py[1132]: rv = self.dispatch_request()
Sep 19 14:54:23 PETASAN04 deploy.py[1132]: File "/usr/lib/python2.7/dist-packages/flask/app.py", line 1598, in dispatch_request
Sep 19 14:54:23 PETASAN04 deploy.py[1132]: return self.view_functions[rule.endpoint](**req.view_args)
Sep 19 14:54:23 PETASAN04 deploy.py[1132]: File "/usr/lib/python2.7/dist-packages/PetaSAN/web/deploy_controller/wizard.py", line 233, in main
Sep 19 14:54:23 PETASAN04 deploy.py[1132]: disk_list = ceph_disk_lib.get_disk_list_deploy()
Sep 19 14:54:23 PETASAN04 deploy.py[1132]: File "/usr/lib/python2.7/dist-packages/PetaSAN/core/ceph/ceph_disk_lib.py", line 269, in get_disk_list_deploy
Sep 19 14:54:23 PETASAN04 deploy.py[1132]: ceph_disk_list = get_disk_list()
Sep 19 14:54:23 PETASAN04 deploy.py[1132]: File "/usr/lib/python2.7/dist-packages/PetaSAN/core/ceph/ceph_disk_lib.py", line 196, in get_disk_list
Sep 19 14:54:23 PETASAN04 deploy.py[1132]: ceph_disk_list = get_ceph_disk_list()
Sep 19 14:54:23 PETASAN04 deploy.py[1132]: File "/usr/lib/python2.7/dist-packages/PetaSAN/core/ceph/ceph_disk_lib.py", line 104, in get_ceph_disk_list
Sep 19 14:54:23 PETASAN04 deploy.py[1132]: for device in ceph_disk.list_devices():
Sep 19 14:54:23 PETASAN04 deploy.py[1132]: File "/usr/lib/python2.7/dist-packages/PetaSAN/core/ceph/ceph_disk.py", line 720, in list_devices
Sep 19 14:54:23 PETASAN04 deploy.py[1132]: space_map))
Sep 19 14:54:23 PETASAN04 deploy.py[1132]: File "/usr/lib/python2.7/dist-packages/PetaSAN/core/ceph/ceph_disk.py", line 516, in list_dev
Sep 19 14:54:23 PETASAN04 deploy.py[1132]: if ptype in (PTYPE['regular']['osd']['ready']):
Sep 19 14:54:23 PETASAN04 deploy.py[1132]: TypeError: 'in <string>' requires string as left operand, not NoneType
Sep 19 14:55:01 PETASAN04 CRON[2237]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)

 

I can see all disks just fine in lsblk and have made sure that the disks have no existing partitions. Very stumped!

This seems to be the same issue as

http://www.petasan.org/forums/?view=thread&id=515

Can you apply the patch supplied in that thread and restart the deployment app:

systemctl restart petasan-deploy

Then restart the deployment (refresh the browser on port 5001).
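For context, the last line of the traceback is a generic Python error: `in` with a string on the right and `None` on the left. Parentheses without a trailing comma do not create a tuple, so `ptype in (PTYPE['regular']['osd']['ready'])` is a substring test against a single string. The sketch below reproduces this in isolation. The `PTYPE` value is a made-up placeholder and both function names are hypothetical; what the linked patch actually changes is not shown in this thread, so the guarded variant is only one possible approach.

```python
# Placeholder table; the real PetaSAN/Ceph partition-type values are not reproduced here.
READY = "example-osd-ready-type"
PTYPE = {"regular": {"osd": {"ready": READY}}}


def is_ready_osd_unguarded(ptype):
    # (x) is just x, not a tuple, so this is "ptype in <string>".
    # With ptype=None it raises the TypeError seen in the log.
    return ptype in (PTYPE["regular"]["osd"]["ready"])


def is_ready_osd_guarded(ptype):
    # Hypothetical guard: treat a missing partition type as "not an OSD".
    if ptype is None:
        return False
    return ptype == PTYPE["regular"]["osd"]["ready"]


if __name__ == "__main__":
    try:
        is_ready_osd_unguarded(None)
    except TypeError as exc:
        print("unguarded:", exc)
    print("guarded:", is_ready_osd_guarded(None))
```

A partition type of `None` suggests the disk scan could not read a type from one of the partitions it found, which is worth checking on the disk itself as well.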

 

Hi. Thanks for that. I did try searching but I couldn't find anything, obviously not using the right search criteria.

So last night, before I saw your reply, I had a bit of a play around. I ran the deployment and ended up with the same errors as before, then rebooted the host and retried the deployment, and got the same errors again. This time, though, I was able to get a little more from syslog: it turns out I had at least one failed disk that was passing SMART but was causing the whole system to hang.

I removed that disk and replaced it, rebooted to be sure, and re-ran the setup (without the patch), and it went through just fine. So the moral of the story is, as you suggested, something was munching on my system, but it was a failed disk not correctly reporting its failure state. I didn't apply the patch in the end.

Thanks so much for the help.