Accidentally deleted the device_health_metrics pool
RobertH
27 Posts
January 21, 2022, 3:53 pm
I might not call this a "bug", but PetaSAN should probably either prevent the user from deleting the health pool or at least make it easy to recreate if it is deleted.
During a new cluster install I inadvertently selected the option to create the RADOS gateway pools and was trying to clean up. I found a post in the discussion forum where someone listed that pool along with the rgw pools, and the admin said it was OK to delete them but didn't mention anything specifically about the health pool, so I figured it was safe. I had never seen it before on the older PetaSAN install we had. I came in the next morning to find the cluster health in an error state, complaining about a missing pool.
I recreated the pool, but the only options for pool use were cephfs, rbd, or radosgw, and I couldn't remember what it had been marked for. I googled and found zero details anywhere, just lots of other people doing the same thing with Proxmox.
The error message made it sound like it was supposed to be an rbd pool, so I created an rbd pool with that name. The error was still there and I still couldn't run the cluster disk health scan (ceph device scrape-health-metrics). I decided to reboot the cluster nodes, which made part of the error message go away, but I still couldn't run the health scan.
I built a new cluster to see what usage it was set to and found it was "mgr_devicehealth", with no way to set that up in the GUI. I tried deleting the pool I had created from the GUI so I could redo it from the command line. The pool deleted from the GUI, but by the time I typed in the ceph command to recreate it, it said the pool already existed. I went back to the GUI to find the pool was still there and the usage had now changed to "mgr_devicehealth".
The error was still showing on the cluster, so I rebooted all the nodes. The error went away and the cluster health scan was working again.
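For anyone landing here with the same problem, below is a rough sketch of what recreating the pool from the command line could look like. It is not an official PetaSAN procedure. The pool name and the "mgr_devicehealth" usage come from the post above; the PG count of 1, the `run` and `recreate_health_pool` helper names, and the `DRY_RUN` switch are assumptions made for illustration. By default the script only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# Sketch only: recreate the device_health_metrics pool and tag it with the
# application seen on a fresh cluster. Review before running on a real cluster.

POOL="device_health_metrics"   # pool name from the post
APP="mgr_devicehealth"         # usage/application observed on a new cluster
PG_NUM=1                       # assumption: a tiny pool is enough for health data

# Print commands by default; set DRY_RUN=0 to actually execute them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

recreate_health_pool() {
    # Create the pool (reports that it already exists if something recreated it).
    run ceph osd pool create "$POOL" "$PG_NUM"
    # Tag the pool. If the pool already carries another application (e.g. rbd),
    # Ceph may refuse this without an explicit override.
    run ceph osd pool application enable "$POOL" "$APP"
    # Show which application the pool is tagged with now.
    run ceph osd pool application get "$POOL"
    # Retry the health scan mentioned in the post.
    run ceph device scrape-health-metrics
}

recreate_health_pool
```

Since the pool in the post came back on its own with the right usage after being deleted, it may be enough to delete the wrongly tagged pool and check `ceph osd pool application get device_health_metrics` before creating anything by hand.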