PetaSAN doesn't appear to be running SMART on NVMe devices

PetaSAN 2.6.2

In this particular cluster we have 3 nodes, each with three 10k SAS HDDs and two NVMe drives acting as their journals (2 OSDs journal on one NVMe and 1 OSD on the other; we sized the NVMe drives to allow for adding up to 5 more HDDs).

The node > disk screen has a column for SMART status. For spinning drives and SATA/SAS-attached SSDs it appears to run SMART monitoring, but it does not for NVMe devices: that column is blank for them.

Last night one of the nodes reported that an OSD went offline. Reviewing the logs on the system, it appears the OSD stopped because it couldn't write to the NVMe that was journaling the HDD (dmesg was spammed with critical target errors for the NVMe). The drive list in PetaSAN shows the NVMe with no errors or warnings, but running smartctl on the drive reported that the drive had failed and the media had been placed in read-only mode.

Is there a way to get PetaSAN to run the periodic SMART tests on the NVMe volumes the same way it does on SATA/SAS devices?

The NVMe in question shows up in the dev tree as /dev/nvme1n1.

We will likely end up having to take the node down and pull the NVMe to run the manufacturer's tools on it, to see if we can reset the drive or at the bare minimum get it replaced under warranty, since it's only 3 months old.
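For anyone wanting to script the manual check described above, here is a rough sketch in Python. It is not part of PetaSAN; the function names are made up for illustration, and it assumes smartctl prints the usual "SMART overall-health self-assessment test result:" line followed by "- ..." reason lines on failure.

```python
import subprocess
from typing import List, Optional, Tuple

# Assumed text of the health line printed by `smartctl -H`.
HEALTH_PREFIX = "SMART overall-health self-assessment test result:"


def parse_health(output: str) -> Tuple[Optional[bool], List[str]]:
    """Return (passed, reasons) parsed from `smartctl -H` text output.

    passed is None when no health line is found at all.
    """
    passed = None
    reasons = []
    for line in output.splitlines():
        line = line.strip()
        if line.startswith(HEALTH_PREFIX):
            verdict = line[len(HEALTH_PREFIX):].strip()
            passed = verdict.startswith("PASSED")
        elif passed is False and line.startswith("- "):
            # Failure reasons are listed as "- ..." lines after the verdict.
            reasons.append(line[2:])
    return passed, reasons


def check_device(dev: str = "/dev/nvme1n1") -> Tuple[Optional[bool], List[str]]:
    """Hypothetical helper: run smartctl on one device and parse the result."""
    # smartctl exits non-zero for a failing drive, so do not use check=True.
    result = subprocess.run(
        ["smartctl", "-H", dev], capture_output=True, text=True
    )
    return parse_health(result.stdout)
```

Calling check_device() from cron (as root) and alerting whenever the first value is not True would have flagged this drive, assuming the output format matches.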

I can confirm this. We rely on smartctl --scan to scan for devices, but this does not scan NVMe devices. We will fix this.
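Until then, one possible workaround is to run the scan twice and merge the results. The sketch below is only an illustration, not PetaSAN's actual code. It assumes the installed smartmontools lists NVMe devices when `-d nvme` is added to `--scan`, and that each scan line looks like "/dev/sda -d scsi # comment". The function names are hypothetical.

```python
import subprocess
from typing import List, Tuple


def parse_scan(output: str) -> List[Tuple[str, str]]:
    """Parse `smartctl --scan` output into (device_path, device_type) pairs."""
    devices = []
    for line in output.splitlines():
        # Drop the trailing "# ..." comment, then expect "<path> -d <type>".
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 3 and fields[1] == "-d":
            devices.append((fields[0], fields[2]))
    return devices


def scan_all_devices() -> List[Tuple[str, str]]:
    """Hypothetical helper: merge the default scan with an NVMe-only scan."""
    seen = {}
    for extra in ([], ["-d", "nvme"]):
        result = subprocess.run(
            ["smartctl", "--scan", *extra], capture_output=True, text=True
        )
        for path, dev_type in parse_scan(result.stdout):
            # Keep the first type reported for a path to avoid duplicates.
            seen.setdefault(path, dev_type)
    return sorted(seen.items())
```

Merging by device path keeps the result stable on newer smartmontools releases where the default scan may already include NVMe devices.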