Last night, while updating my Proxmox server, I restarted it. After five minutes I still couldn’t connect to it, so I connected a monitor and saw that a hard drive wasn’t being detected. I identified the failed drive, replaced it, and powered the server back up.
When Proxmox started up, I opened a shell on the server from the web interface and ran the following command.
zpool status
It displayed the following output.
pool: storage
state: ONLINE
scan: scrub repaired 0B in 0 days 01:04:34 with 0 errors on Sun Aug 8 01:28:36 2021
config:
NAME STATE READ WRITE CKSUM
TB_Drives ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
ata-WDC_********-******_WD-********** ONLINE 0 0 0
ata-WDC_********-******_WD-********** ONLINE 0 0 0
errors: No known data errors
pool: os
state: DEGRADED
status: One or more devices has been removed by the administrator. Sufficient replicas exist for the pool to continue functioning in a degraded state.
action: Online the device using 'zpool online' or replace the device with 'zpool replace'.
scan: scrub repaired 0B in 0 days 01:04:34 with 0 errors on Sun Aug 8 01:28:36 2021
config:
NAME STATE READ WRITE CKSUM
rpool ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
475234859233716278394 REMOVED 0 0 0
ata-SSD_**********_WD-******* ONLINE 0 0 0
errors: No known data errors
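On a server with several pools, it can help to filter this output down to the rows that need attention. The following is a minimal sketch, not part of the original procedure: `unhealthy_devices` is a hypothetical helper name, and it assumes each device row has the usual NAME STATE READ WRITE CKSUM columns on one line.

```shell
# List rows in `zpool status` output whose state is not ONLINE.
# Reads the status text on stdin and prints "<name> <state>" per match.
# Note: pool and mirror rows are printed too when they are DEGRADED.
unhealthy_devices() {
  awk '
    # Device rows carry at least five fields: NAME STATE READ WRITE CKSUM
    NF >= 5 && $2 ~ /^(DEGRADED|FAULTED|OFFLINE|REMOVED|UNAVAIL)$/ {
      print $1, $2
    }
  '
}

# Usage on the server (requires ZFS):
#   zpool status | unhealthy_devices
```

The name printed next to REMOVED is the old drive ID that the replace command needs later on.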
To replace the drive, we’ll need the device path of the new drive. Go back to the Proxmox admin dashboard, click on the server in the left side menu, then click on “Disks”. You’ll see a list of the disks plugged into your Proxmox server. Find the new hard drive you added and copy its value from the “Device” column. It should be something like /dev/sdb.
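Names like /dev/sdb can change between boots, so you may prefer the persistent name under /dev/disk/by-id, which zpool also accepts. The following is a rough sketch from the shell rather than the dashboard; `by_id_names` is a hypothetical helper, and the second argument exists only so the function can be pointed at a different directory.

```shell
# Print the persistent by-id links that point at a device node.
# $1: device node (e.g. /dev/sdb)
# $2: directory holding the links (defaults to /dev/disk/by-id)
by_id_names() {
  dev="$(readlink -f "$1")" || return 1
  dir="${2:-/dev/disk/by-id}"
  for link in "$dir"/*; do
    # Skip anything that is not a symlink (including an unmatched glob)
    [ -L "$link" ] || continue
    if [ "$(readlink -f "$link")" = "$dev" ]; then
      printf '%s\n' "$link"
    fi
  done
}

# Usage on the server:
#   by_id_names /dev/sdb
```

A disk usually has more than one by-id link (for example an ata- name and a wwn- name). Any of them can stand in for /dev/sdb in the commands that follow.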
Once you have the device path for the new hard drive, run the following command to add it to the pool.
zpool replace -f (pool name) (Old HD ID) (New HD ID)
Fill in the pool name along with the old and new hard drive identifiers. The old ID is the value listed next to REMOVED in the zpool status output, and the new ID is the device path copied from the dashboard. Using the IDs from this blog post, the command looks like this.
zpool replace -f os 475234859233716278394 /dev/sdb
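Because -f forces ZFS to use the new device even if it appears to be in use, it is worth reading the full command back before running it. The following is a small sketch that only prints the command for review; `replace_cmd` is a hypothetical helper and nothing here touches the pool.

```shell
# Print (do not run) the zpool replace command so it can be checked first.
# $1: pool name, $2: old drive ID, $3: new device path
replace_cmd() {
  if [ "$#" -ne 3 ]; then
    echo "usage: replace_cmd POOL OLD_ID NEW_DEVICE" >&2
    return 1
  fi
  printf 'zpool replace -f %s %s %s\n' "$1" "$2" "$3"
}

# Values taken from this post
replace_cmd os 475234859233716278394 /dev/sdb
```

Check that the printed pool name and both identifiers match what zpool status and the Disks page show, then run the printed command yourself.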
This will start the process of replacing the bad drive with the new drive, which ZFS calls resilvering. The time it takes depends on the size of the drive and how much data needs to be synced to the new drive. To check the status of the process, run the following command.
zpool status
While the replacement is running, the scan portion of the output states how much data has already been synced and the estimated time remaining. Once the sync has finished, the output will look similar to the output below.
pool: storage
state: ONLINE
scan: scrub repaired 0B in 0 days 01:04:34 with 0 errors on Sun Aug 8 01:28:36 2021
config:
NAME STATE READ WRITE CKSUM
TB_Drives ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
ata-WDC_********-******_WD-********** ONLINE 0 0 0
ata-WDC_********-******_WD-********** ONLINE 0 0 0
errors: No known data errors
pool: os
state: ONLINE
scan: resilvered 107.9G in 0 days 00:35:04 with 0 errors on Wed Sep 5 18:05:03 2021
config:
NAME STATE READ WRITE CKSUM
rpool ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
sdb ONLINE 0 0 0
ata-SSD_**********_WD-******* ONLINE 0 0 0
errors: No known data errors
After that, just let the server sync. When it’s done, the pool will show a state of ONLINE, and all of the drives should show ONLINE too.
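If you would rather not keep re-running zpool status by hand, the check can be put in a loop. The following is a minimal sketch, assuming the scan line contains the text "resilver in progress" while the sync is running; `resilver_running` and `wait_for_resilver` are hypothetical helper names.

```shell
# Succeeds while the `zpool status` text on stdin reports a running resilver.
resilver_running() {
  grep -q 'resilver in progress'
}

# Poll until the resilver on a pool finishes, then print the final status.
# $1: pool name
# $2: seconds between checks (defaults to 60)
wait_for_resilver() {
  while zpool status "$1" | resilver_running; do
    sleep "${2:-60}"
  done
  zpool status "$1"
}

# Usage on the server (requires ZFS):
#   wait_for_resilver os
```

Newer OpenZFS releases also include a zpool wait subcommand that can block until a resilver completes. Whether it is available depends on the ZFS version that ships with your Proxmox release.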