Can I use the pool during resilver?

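Yes, but performance will be reduced. ZFS prioritizes regular I/O over resilver I/O by default. Resilver priority is not a dataset property, so it cannot be changed with zfs set. On Linux it is tuned through OpenZFS kernel module parameters such as zfs_resilver_min_time_ms (under /sys/module/zfs/parameters/); raising that value favors faster resilvering at the cost of regular I/O performance.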
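Before changing any resilver tunable it helps to see its current value. The following is a minimal sketch, assuming a Linux host with the OpenZFS kernel module loaded; the helper name read_zfs_param is hypothetical, and the parameter directory is passed in so the function can be pointed at another location.

```python
from pathlib import Path

# Location where the OpenZFS kernel module exposes its tunables on Linux.
PARAM_DIR = Path("/sys/module/zfs/parameters")


def read_zfs_param(name, param_dir=PARAM_DIR):
    """Return a ZFS module parameter as an int, or None if it cannot be read.

    None is returned when the file is missing (module not loaded, or the
    parameter does not exist in this OpenZFS version) or is not an integer.
    """
    path = Path(param_dir) / name
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None
```

Calling read_zfs_param("zfs_resilver_min_time_ms") returns the current setting in milliseconds, or None on a machine without ZFS. Writing a new value requires root and is deliberately left out of this sketch.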

What if I have no redundancy (stripe)?

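If the pool is a stripe (no mirrors or raidz) and a disk fails, the data on that disk is lost. Because ZFS spreads data across all disks in a stripe, the pool as a whole usually becomes unusable and must be restored from backup. ZFS cannot resilver without redundancy. This is why production ZFS pools should always use mirrors or raidz.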

Linux ZFS Pool Degraded Resilver — What It Means & How to Fix It

Critical filesystem error

About Linux ZFS Pool Degraded Resilver

Fix a degraded ZFS storage pool by replacing the failed disk, and understand the resilvering process that rebuilds its data.

Quick Answer

How long does resilvering take?

It depends mainly on how much data the pool holds, and also on disk speed and how busy the pool is. Since ZFS resilvers only used blocks (not the entire disk), a 10 TB drive with 2 TB used may resilver in hours rather than the days a full-disk RAID rebuild could take.
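The arithmetic behind that comparison can be sketched as follows. The 150 MB/s sustained rate is an assumption chosen for illustration, not a measured figure, and the function name resilver_hours is hypothetical. Real resilvers are often slower because of fragmentation and competing I/O, so treat the result as an optimistic lower bound.

```python
def resilver_hours(bytes_to_copy, bytes_per_second):
    """Idealized copy time in hours at a constant sustained rate."""
    if bytes_per_second <= 0:
        raise ValueError("bytes_per_second must be positive")
    return bytes_to_copy / bytes_per_second / 3600


TB = 10**12
ASSUMED_RATE = 150 * 10**6  # assumption: 150 MB/s sustained throughput

used_only = resilver_hours(2 * TB, ASSUMED_RATE)   # ZFS copies only used blocks
full_disk = resilver_hours(10 * TB, ASSUMED_RATE)  # a whole-disk rebuild copies everything

print(f"{used_only:.1f} h vs {full_disk:.1f} h")  # prints: 3.7 h vs 18.5 h
```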

Key Details

A degraded ZFS pool means one or more disks in a redundant VDEV have failed but data is still accessible
Resilvering is ZFS's process of rebuilding data onto a replacement disk from the remaining copies
Unlike traditional RAID rebuilds, a ZFS resilver copies only used blocks, which is often faster
A degraded pool is at risk: if more disks fail in the same VDEV than its remaining redundancy allows (for example, one more in a two-way mirror or raidz1) before the resilver completes, data is lost
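How close a degraded VDEV is to data loss depends on its layout. The sketch below encodes the standard redundancy rules (an n-way mirror survives n - 1 failures; raidz1, raidz2, and raidz3 survive 1, 2, and 3). The function names and the "stripe" label for a non-redundant VDEV are illustrative choices, not ZFS terminology.

```python
# Disk failures each raidz level can survive within one VDEV.
PARITY = {"raidz1": 1, "raidz2": 2, "raidz3": 3}


def failures_tolerated(vdev_type, disks):
    """Number of member disks a healthy VDEV can lose without losing data."""
    if vdev_type == "stripe":
        return 0              # no redundancy at all
    if vdev_type == "mirror":
        return disks - 1      # one surviving copy is enough
    if vdev_type in PARITY:
        return PARITY[vdev_type]
    raise ValueError(f"unknown vdev type: {vdev_type}")


def remaining_margin(vdev_type, disks, failed):
    """Further failures the VDEV can absorb; 0 means the next failure loses data."""
    return failures_tolerated(vdev_type, disks) - failed
```

For example, remaining_margin("mirror", 2, 1) is 0, so a two-way mirror with one failed disk has no safety margin left, while remaining_margin("raidz2", 6, 1) is 1.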

Common Causes

Physical disk failure (bad sectors, drive death) causing the pool to lose a member
Disk temporarily disconnected (loose cable, USB disconnection) marked as faulted
Too many checksum errors on a disk causing ZFS to fault it out of the pool
Controller failure causing communication loss with one or more disks
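Whichever cause applies, it shows up in the device table of zpool status as a non-ONLINE state or non-zero READ, WRITE, or CKSUM counters. The following is a rough sketch of picking those rows out of captured output. SAMPLE_STATUS is made-up illustrative text, the function name is hypothetical, and the exact layout of real output can differ between OpenZFS versions.

```python
# Illustrative zpool status output; not captured from a real system.
SAMPLE_STATUS = """\
  pool: tank
 state: DEGRADED
  scan: resilver in progress since Sun Jan  1 10:00:00 2023
        150G resilvered, 29.30% done, 01:35:12 to go
config:

        NAME        STATE     READ WRITE CKSUM
        tank        DEGRADED     0     0     0
          mirror-0  DEGRADED     0     0     0
            sda     ONLINE       0     0     0
            sdb     FAULTED      0     0    12  too many errors

errors: No known data errors
"""


def problem_devices(status_text):
    """Return (name, state, read, write, cksum) rows that are not fully healthy."""
    problems = []
    in_table = False
    for line in status_text.splitlines():
        fields = line.split()
        if fields[:2] == ["NAME", "STATE"]:
            in_table = True   # device rows start after this header
            continue
        if not in_table:
            continue
        if not fields:
            break             # a blank line ends the device table
        if len(fields) < 5:
            continue          # section labels or rows without counters
        name, state, read, write, cksum = fields[:5]
        if state != "ONLINE" or (read, write, cksum) != ("0", "0", "0"):
            problems.append((name, state, read, write, cksum))
    return problems
```

On the sample text this returns rows for tank, mirror-0, and sdb, which points to sdb as the faulted disk and mirror-0 as the degraded VDEV. Counters are kept as strings because large counts may be printed in abbreviated form.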

Steps

1. Check pool status: zpool status to identify which disk is faulted and which VDEV is degraded
2. Replace the failed disk: zpool replace poolname /dev/old-disk /dev/new-disk
3. Monitor resilver progress: zpool status shows estimated time remaining
4. After resilver completes, verify: zpool scrub poolname and check for zero errors in zpool status
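The steps above can be sketched as a small helper that builds the commands and reads the progress line. This is a sketch under stated assumptions: the function names are hypothetical, the pool and device names are the placeholders used in the steps, and the progress pattern assumes a line of the form "29.30% done, 01:35:12 to go", which may differ between OpenZFS versions.

```python
import re


def recovery_commands(pool, old_disk, new_disk):
    """Argument lists for the four steps, in order.

    These are not meant to be run back to back: the scrub belongs
    after the resilver has finished.
    """
    return [
        ["zpool", "status", pool],                      # 1. find the faulted disk
        ["zpool", "replace", pool, old_disk, new_disk], # 2. start the resilver
        ["zpool", "status", pool],                      # 3. monitor progress
        ["zpool", "scrub", pool],                       # 4. verify afterwards
    ]


# Assumed shape of the progress line in zpool status output.
PROGRESS = re.compile(r"([\d.]+)% done, (\S+) to go")


def resilver_progress(status_text):
    """Return (percent_done, time_remaining) or None if no progress line is found."""
    match = PROGRESS.search(status_text)
    if match is None:
        return None
    return float(match.group(1)), match.group(2)
```

Each list can be passed to subprocess.run once the disk names have been double-checked; printing the commands first is a cheap safeguard, since replacing the wrong disk in a degraded pool can destroy the remaining copy.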