Fix RAID that is rebuilding
This guide shows how to confirm rebuild status, monitor progress without interrupting the rebuild, troubleshoot stalls or failures, and decide when to escalate.
What you'll need
- SSH or console access with sudo
Step-by-step diagnostic
Steps
Goal: Confirm that a rebuild is actually running.
- Run cat /proc/mdstat. Look for a progress bar such as [=====>…] with a recovery or resync percentage.
- Good: Rebuild in progress. Proceed to Monitor.
- Bad: No rebuild. Run mdadm -D /dev/mdX to check for a degraded state.
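The check above can be scripted. This is a minimal sketch, not a definitive tool: the function name rebuild_state is hypothetical, and it assumes the kernel reports a running rebuild with a "recovery =" or "resync =" line in /proc/mdstat. A queued rebuild (for example resync=DELAYED) also matches.

```shell
# Report whether mdstat-format text shows a rebuild.
# Usage: rebuild_state [FILE]   (FILE defaults to /proc/mdstat)
# rebuild_state is a hypothetical helper name, not part of mdadm.
rebuild_state() {
    mdstat_file="${1:-/proc/mdstat}"
    # A running (or queued) rebuild appears as "recovery = 12.6% ..." or "resync=..."
    if grep -Eq '(recovery|resync) *=' "$mdstat_file"; then
        echo "rebuilding"
    else
        echo "idle"
    fi
}

# Example call on a live system:
#   rebuild_state
```

Taking the file as an argument makes it possible to run the same check against a saved copy of /proc/mdstat.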
Check status
Goal: Confirm the array state and which devices are in use.
- Run mdadm -D /dev/mdX. Note RAID level, devices, and any failed drives.
- Good: Rebuild running. Proceed to Monitor.
- Bad: Degraded with a failed drive. Replace the drive and add the new one with mdadm --fail and mdadm --add.
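The mdadm -D output is long, so it can help to filter it down to the fields this step asks about. This is a sketch under the assumption that your mdadm version labels them "Raid Level", "State", "Active Devices", "Failed Devices" and "Rebuild Status"; md_summary is a hypothetical helper name and /dev/md0 is a placeholder.

```shell
# Keep only the summary fields from `mdadm -D` output read on stdin.
# md_summary is a hypothetical helper name; the field labels are assumed
# to match what your mdadm version prints.
md_summary() {
    grep -E '^ *(Raid Level|State|Active Devices|Failed Devices|Rebuild Status) :'
}

# Example call (replace /dev/md0 with your array):
#   sudo mdadm -D /dev/md0 | md_summary
```

The per-device table at the bottom of the mdadm -D output is dropped by this filter, so read the full output when you need to know which member failed.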
Monitor
Goal: Watch rebuild progress without interrupting.
- Run watch -n 5 cat /proc/mdstat. Progress percentage increases over time.
- Good: Progress advances. Wait for completion.
- Bad: Progress stalls for hours. Proceed to Check stall.
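If you would rather not watch the screen, a stall can be detected by sampling the percentage twice. This is a rough sketch: rebuild_percent and rebuild_stalled are hypothetical helper names, the 600-second default is an arbitrary assumption, and because /proc/mdstat shows only one decimal place, a short interval on a very large or busy array can look like a stall when it is not.

```shell
# Print the rebuild percentage (e.g. 12.6) from mdstat-format text,
# or nothing when no rebuild is shown. Hypothetical helper name.
rebuild_percent() {
    sed -nE 's/.*(recovery|resync) *= *([0-9.]+)%.*/\2/p' "${1:-/proc/mdstat}" | head -n 1
}

# Succeed (exit 0) when two samples INTERVAL seconds apart show the same
# percentage, i.e. the rebuild appears stalled. Hypothetical helper name.
# Usage: rebuild_stalled [INTERVAL_SECONDS] [FILE]
rebuild_stalled() {
    interval="${1:-600}"
    file="${2:-/proc/mdstat}"
    before=$(rebuild_percent "$file")
    sleep "$interval"
    after=$(rebuild_percent "$file")
    # Only call it a stall if a rebuild was visible in the first sample.
    [ -n "$before" ] && [ "$before" = "$after" ]
}

# Example call:
#   rebuild_stalled 600 && echo "no progress in 10 minutes"
```

Both helpers only read /proc/mdstat, so running them does not interrupt the rebuild.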
Check stall
Goal: Troubleshoot when rebuild stalls.
- Run dmesg | tail -100 and smartctl -a /dev/sdX for each drive.
- If dmesg shows I/O errors or smartctl reports bad SMART data for a drive, replace that drive using mdadm --fail and mdadm --add.
- If neither shows errors, escalate with the output of /proc/mdstat, mdadm -D, dmesg, and smartctl.
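Kernel logs are noisy, so a filter can surface the lines that usually point at a failing disk. Treat this as a heuristic sketch: io_error_lines is a hypothetical helper name, and the three patterns are assumptions about common kernel messages, not a complete list.

```shell
# Keep kernel log lines (read on stdin) that commonly indicate disk trouble.
# The patterns are a heuristic assumption, not an exhaustive list.
io_error_lines() {
    grep -Ei 'i/o error|medium error|failed command'
}

# Example calls (replace /dev/sda and /dev/sdb with your member drives):
#   sudo dmesg | tail -100 | io_error_lines
#   for d in /dev/sda /dev/sdb; do sudo smartctl -a "$d"; done
```

An empty result does not prove the drives are healthy; it only means none of these patterns appeared in the lines that were checked.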
Check degraded
Goal: Handle a degraded array (failed drive).
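- Run mdadm -D /dev/mdX. If a drive is listed as faulty, removed, or missing (a failed member appears as (F) in /proc/mdstat), replace the drive.
- Run sudo mdadm --fail /dev/mdX /dev/sdX, then sudo mdadm --add /dev/mdX /dev/sdY, where /dev/sdY is the new drive.
- The rebuild starts automatically once the new drive is added.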
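Because a typo in a device name here can take the wrong disk out of the array, it may be safer to print the commands first and review them. The sketch below only echoes commands and changes nothing. replace_plan is a hypothetical helper name and the device names are placeholders. The --remove line is an addition that is not in the steps above; it is included on the assumption that the failed member has to be removed from the array before the disk is physically swapped.

```shell
# Print (do not run) the mdadm commands for swapping a failed member.
# Usage: replace_plan ARRAY FAILED_DEVICE NEW_DEVICE
# replace_plan is a hypothetical helper name; review its output before
# running any of the printed commands with sudo.
replace_plan() {
    array="$1"
    failed="$2"
    new="$3"
    echo "mdadm --manage $array --fail $failed"
    # Assumption: remove the failed member before pulling the disk.
    echo "mdadm --manage $array --remove $failed"
    echo "mdadm --manage $array --add $new"
}

# Example call with placeholder names:
#   replace_plan /dev/md0 /dev/sdb1 /dev/sdc1
```

Confirm the failed device name against mdadm -D before running anything, since device letters can change after a reboot or a disk swap.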
When to escalate
Escalate if:
- Rebuild stalls for hours with no progress.
- A second drive fails during rebuild.
- mdadm reports uncorrectable errors.
Provide /proc/mdstat, mdadm -D, dmesg, and smartctl output for all drives.
Verification
- /proc/mdstat shows no resync or recovery in progress.
- mdadm -D shows all devices active.
- No I/O errors in dmesg.
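The first two checks can be approximated from /proc/mdstat alone. This is a sketch under two assumptions: a rebuild shows up as a "recovery =" or "resync =" line, and a missing member shows up as an underscore in the status map (for example [U_] instead of [UU]). array_healthy is a hypothetical helper name.

```shell
# Succeed (exit 0) when mdstat-format text shows no rebuild and no missing member.
# Usage: array_healthy [FILE]   (FILE defaults to /proc/mdstat)
# array_healthy is a hypothetical helper name.
array_healthy() {
    file="${1:-/proc/mdstat}"
    # A rebuild or resync is still running or queued.
    if grep -Eq '(recovery|resync) *=' "$file"; then
        return 1
    fi
    # An underscore in the [UU_] status map marks a missing member.
    if grep -Eq '\[[U_]*_[U_]*\]' "$file"; then
        return 1
    fi
    return 0
}

# Example call:
#   array_healthy && echo "no rebuild running, all members present"
```

This looks at every array listed in the file, so one degraded array makes the whole check fail. It does not cover the dmesg check, which still needs to be done separately.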
Escalation ladder
Work from the device outward. Stop when the problem is fixed.
- Confirm status: Run cat /proc/mdstat and mdadm -D.
- Monitor: Run watch -n 5 cat /proc/mdstat.
- Check dmesg and SMART: When the rebuild stalls, check dmesg and smartctl.
- Replace failed drive: Run mdadm --fail and mdadm --add when a drive fails.
- Escalate: Provide /proc/mdstat, mdadm -D, dmesg, and smartctl output.
What to capture if you need help
Before calling support or posting for help, have these ready. It speeds everything up.
- /proc/mdstat output
- mdadm -D /dev/mdX output
- dmesg or journalctl -k output
- smartctl output for each drive
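The items above can be gathered into one directory in a single pass. This is a minimal sketch: collect_raid_info and the raid-report-... directory name are hypothetical, it assumes you run it as root (or adapt it to use sudo), and each command is allowed to fail so that one missing tool does not stop the rest.

```shell
# Collect RAID diagnostics into a timestamped directory and print its name.
# Usage: collect_raid_info ARRAY DRIVE [DRIVE...]
# collect_raid_info and the output directory name are hypothetical choices.
collect_raid_info() {
    array="$1"
    shift
    outdir="raid-report-$(date +%Y%m%d-%H%M%S)"
    mkdir -p "$outdir" || return 1
    # Each command may fail (missing tool, no permission); errors land in the file.
    cat /proc/mdstat  > "$outdir/mdstat.txt" 2>&1       || true
    mdadm -D "$array" > "$outdir/mdadm-detail.txt" 2>&1 || true
    dmesg             > "$outdir/dmesg.txt" 2>&1        || true
    for drive in "$@"; do
        smartctl -a "$drive" > "$outdir/smartctl-$(basename "$drive").txt" 2>&1 || true
    done
    echo "$outdir"
}

# Example call with placeholder names:
#   collect_raid_info /dev/md0 /dev/sda /dev/sdb
```

Check the files before sharing them, since smartctl output includes drive serial numbers.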
Reviewed by Blackbox Atlas
Frequently asked questions
- How long does a RAID rebuild take?
- Depends on array size and disk speed. A 1 TB drive can take several hours to a day. Monitor /proc/mdstat for progress.
- Can I use the array while it is rebuilding?
- Yes. RAID 1, 5, 6, 10 remain usable during rebuild, but performance is reduced. Avoid heavy I/O when possible.
- When should I escalate a RAID rebuild?
- Escalate if the rebuild makes no progress for hours or a second drive fails during the rebuild. Provide /proc/mdstat, mdadm -D, dmesg, and smartctl output.
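For the rebuild-time question, a back-of-the-envelope figure is drive size divided by sustained rebuild speed. The sketch below assumes decimal units (1 GB = 1000 MB) and a speed you supply yourself; estimate_hours and mdstat_finish_minutes are hypothetical helper names. The second helper reads the kernel's own estimate, assuming /proc/mdstat prints it as finish=...min.

```shell
# Estimate rebuild time in hours from drive size (GB) and rebuild speed (MB/s).
# Hypothetical helper; assumes 1 GB = 1000 MB and a constant speed.
estimate_hours() {
    awk -v gb="$1" -v mbs="$2" 'BEGIN { printf "%.1f\n", gb * 1000 / mbs / 3600 }'
}

# Print the kernel's remaining-time estimate in minutes from mdstat-format text.
# Hypothetical helper; prints nothing when no rebuild is shown.
mdstat_finish_minutes() {
    sed -nE 's/.*finish=([0-9.]+)min.*/\1/p' "${1:-/proc/mdstat}" | head -n 1
}

# Example: a 1000 GB drive rebuilt at a steady 100 MB/s
#   estimate_hours 1000 100    # prints 2.8
```

Real rebuilds are usually slower than this estimate when the array is serving other I/O, which is consistent with the range of several hours to a day given above.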