Linux RAID Errors — mdadm Degraded Array and Disk Failure Recovery
Overview
Fix Linux software RAID (mdadm) errors, including degraded arrays, failed-disk replacement, RAID rebuild procedures, and RAID health monitoring.
Key Details
- Linux software RAID uses mdadm to manage RAID arrays (RAID 0, 1, 5, 6, 10)
- A degraded RAID array has lost a disk but is still operational (RAID 1, 5, 6, 10)
- RAID 0 has no redundancy — any disk failure means total data loss
- RAID rebuild (resync) can take hours to days depending on array size and I/O load
- SMART monitoring can predict disk failures before they cause RAID degradation
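A degraded array shows up in /proc/mdstat as an underscore in the member-status brackets (e.g. [U_] instead of [UU]). A minimal sketch of checking for that marker, using an assumed sample of the output rather than a live system:

```shell
# Hypothetical /proc/mdstat excerpt for a degraded two-disk RAID 1;
# on a real system, read /proc/mdstat directly instead of this sample.
mdstat_sample='md0 : active raid1 sda1[0]
      976630464 blocks super 1.2 [2/1] [U_]'

# An underscore inside the member-status brackets ([U_]) marks a missing
# or failed member; [UU] would mean every member is up.
if printf '%s\n' "$mdstat_sample" | grep -q '\[U*_[U_]*\]'; then
  state=degraded
else
  state=healthy
fi
echo "$state"
```

The [2/1] pair on the same line carries the same information: two configured members, one active.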
Common Causes
- Hard drive failure causing RAID array to enter degraded state
- Disk removed or disconnected during operation
- RAID rebuild interrupted by power failure or system crash
- Disk experiencing bad sectors causing md to mark it as failed
- SATA cable or controller fault causing intermittent disk disconnections
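Most of these causes leave traces in the kernel log before md marks the disk failed. A sketch of pulling the suspect device name out of such messages, run here against an assumed dmesg excerpt:

```shell
# Hypothetical dmesg excerpt; on a real system use something like:
#   dmesg | grep -iE 'i/o error|ata|md/raid'
dmesg_sample='[1234.5678] blk_update_request: I/O error, dev sdb, sector 104853760
[1234.5679] md/raid1:md0: Disk failure on sdb1, disabling device.'

# Extract the device the kernel reported I/O errors for:
bad_dev=$(printf '%s\n' "$dmesg_sample" \
  | sed -n 's/.*I\/O error, dev \([a-z]*\).*/\1/p' | head -n1)
echo "suspect disk: /dev/${bad_dev}"
```

Once a candidate disk is identified, smartctl -a on it (from smartmontools) helps distinguish a dying drive from a cabling or controller fault.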
Steps
1. Check RAID status: cat /proc/mdstat or mdadm --detail /dev/md0
2. Identify the failed disk: mdadm --detail /dev/md0 shows the state of each member; failed members are listed as "faulty"
3. Remove the failed disk: mark it faulty first if md has not already (mdadm /dev/md0 --fail /dev/sdX), then mdadm /dev/md0 --remove /dev/sdX
4. Replace the physical disk, then partition it identically to the other members
5. Add the new disk: mdadm /dev/md0 --add /dev/sdY; the rebuild starts automatically
6. Monitor rebuild progress: watch cat /proc/mdstat
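The steps above can be sketched as a script. Device names are hypothetical, and run() only prints each command (a dry run) so nothing destructive happens until you change it to execute:

```shell
#!/bin/sh
# Sketch of the disk-replacement workflow. ARRAY/FAILED/NEW are assumed
# names; adjust them to your system.
ARRAY=/dev/md0     # the degraded array
FAILED=/dev/sdb1   # the failed member
NEW=/dev/sdc1      # partition on the replacement disk

run() { echo "+ $*"; }   # dry run: print the command; use "$@" to execute

run mdadm "$ARRAY" --fail "$FAILED"     # mark faulty if md has not already
run mdadm "$ARRAY" --remove "$FAILED"   # detach it from the array
# (swap the physical disk, then copy the partition table from a healthy
#  member, e.g. sfdisk -d /dev/sda | sfdisk /dev/sdc)
run mdadm "$ARRAY" --add "$NEW"         # rebuild starts automatically
run cat /proc/mdstat                    # watch resync progress
```

Keeping it as a dry run first is deliberate: remove/add against the wrong device on a degraded array can turn a recoverable failure into data loss.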
Tags
raid, mdadm, degraded, disk-failure, rebuild
Related Items
More in Filesystem
- Linux Read-Only File System (Error)
- Linux No Space Left on Device (Error)
- Linux EXT4 Filesystem Errors Detected — Requires fsck (Critical)
- Linux Mount Error — Wrong FS Type, Bad Superblock (Error)
- LVM Errors — Logical Volume Manager Troubleshooting Guide (Error)
- Linux Disk I/O Performance Errors — High iowait, Slow Disk, and Diagnostics (Warning)
Frequently Asked Questions
How long does a RAID rebuild take?
It depends on array size and activity. A 4TB drive can take 8-24 hours. During the rebuild, the array is vulnerable — a second disk failure in RAID 5 means data loss. RAID 6 tolerates two failures.
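The low end of that 8-24 hour range can be sanity-checked with simple arithmetic. The 130 MB/s rebuild speed below is an assumption (an idle array on spinning disks); a busy array is throttled toward the kernel's dev.raid.speed_limit_min setting and takes far longer:

```shell
# Back-of-envelope rebuild time: array size / sequential rebuild speed.
size_mb=$((4 * 1000 * 1000))             # 4 TB expressed in MB
speed_mb_s=130                           # assumed idle-array rebuild speed
hours=$(( size_mb / speed_mb_s / 3600 )) # ~8 hours at these numbers
echo "~${hours} hours"
```

To trade foreground I/O for a faster rebuild, the floor and ceiling can be raised with sysctl (dev.raid.speed_limit_min and dev.raid.speed_limit_max, in KB/s).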