Skip to main content.
Last update: 09-17-2005
Contributed by A.P. Lawrence
"Recently I had a hard drive fail. It was part of a Linux software RAID 1 (mirrored drives), so we lost no data, and just needed to replace hardware. However, the raid does requires rebuilding. A hardware array would usually automatically rebuild upon drive replacement, but this needed some help."
Print Print E-mail Email


Rebuilding failed Linux software RAID

Recently I had a hard drive fail. It was part of a Linux software RAID 1 (mirrored drives), so we lost no data, and just needed to replace hardware. However, the raid does requires rebuilding. A hardware array would usually automatically rebuild upon drive replacement, but this needed some help.

When you look at a "normal" array, you see something like this:

# cat /proc/mdstat
Personalities : [raid1] 
read_ahead 1024 sectors
md2 : active raid1 hda3[1] hdb3[0]
      262016 blocks [2/2] [UU]
      
md1 : active raid1 hda2[1] hdb2[0]
      119684160 blocks [2/2] [UU]
      
md0 : active raid1 hda1[1] hdb1[0]
      102208 blocks [2/2] [UU]
      
unused devices: <none>

That's the normal state - what you want it to look like. When a drive has failed and been replaced, it looks like this:

Personalities : [raid1]
read_ahead 1024 sectors
md0 : active raid1  hda1[1]
      102208 blocks [2/1] [_U]

md2 : active raid1 hda3[1]
      262016 blocks [2/1] [_U]

md1 : active raid1 hda2[1]
      119684160 blocks [2/1] [_U]
unused devices: <none>

Notice that it doesn't list the failed drive parts, and that an underscore appears beside each U. This shows that only one drive is active in these arrays - we have no mirror.

Another command that will show us the state of the raid drives is "mdadm"

# mdadm -D /dev/md0
/dev/md0:
        Version : 00.90.00
  Creation Time : Thu Aug 21 12:22:43 2003
     Raid Level : raid1
     Array Size : 102208 (99.81 MiB 104.66 MB)
    Device Size : 102208 (99.81 MiB 104.66 MB)
   Raid Devices : 2
  Total Devices : 1
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Fri Oct 15 06:25:45 2004
          State : dirty, no-errors
 Active Devices : 1
Working Devices : 1
 Failed Devices : 0
  Spare Devices : 0


    Number   Major   Minor   RaidDevice State
       0       0        0        0      faulty removed
       1       3        1        1      active sync   /dev/hda1
           UUID : f9401842:995dc86c:b4102b57:f2996278

As this shows, we presently only have one drive in the array.

Although I already knew that /dev/hdb was the other part of the raid array, you can look at /etc/raidtab to see how the raid was defined:

raiddev             /dev/md1
raid-level                  1
nr-raid-disks               2
chunk-size                  64k
persistent-superblock       1
nr-spare-disks              0
    device          /dev/hda2
    raid-disk     0
    device          /dev/hdb2
    raid-disk     1
raiddev             /dev/md0
raid-level                  1
nr-raid-disks               2
chunk-size                  64k
persistent-superblock       1
nr-spare-disks              0
    device          /dev/hda1
    raid-disk     0
    device          /dev/hdb1
    raid-disk     1
raiddev             /dev/md2
raid-level                  1
nr-raid-disks               2
chunk-size                  64k
persistent-superblock       1
nr-spare-disks              0
    device          /dev/hda3
    raid-disk     0
    device          /dev/hdb3
    raid-disk     1

To get the mirrored drives working properly again, we need to run fdisk to see what partitions are on the working drive:

# fdisk /dev/hda

Command (m for help): p

Disk /dev/hda: 255 heads, 63 sectors, 14946 cylinders
Units = cylinders of 16065 * 512 bytes

   Device Boot    Start       End    Blocks   Id  System
/dev/hda1   *         1        13    104391   fd  Linux raid autodetect
/dev/hda2            14     14913 119684250   fd  Linux raid autodetect
/dev/hda3         14914     14946    265072+  fd  Linux raid autodetect

Duplicate that on /dev/hdb. Use "n" to create the parttions, and "t" to change their type to "fd" to match. Once this is done, use "raidhotadd":

# raidhotadd /dev/md0 /dev/hdb1
# raidhotadd /dev/md1 /dev/hdb2
# raidhotadd /dev/md2 /dev/hdb3

The rebuilding can be seen in /proc/mdstat:

# cat /proc/mdstat
Personalities : [raid1] 
read_ahead 1024 sectors
md0 : active raid1 hdb1[0] hda1[1]
      102208 blocks [2/2] [UU]
      
md2 : active raid1 hda3[1]
      262016 blocks [2/1] [_U]
      
md1 : active raid1 hdb2[2] hda2[1]
      119684160 blocks [2/1] [_U]
      [>....................]  recovery =  
0.2% (250108/119684160) finish=198.8min speed=10004K/sec
unused devices: <none>

The md0, a small array, has already completed rebuilding (UU), while md1 has only begun. After it finishes, it will show:

#  mdadm -D /dev/md1
/dev/md1:
        Version : 00.90.00
  Creation Time : Thu Aug 21 12:21:21 2003
     Raid Level : raid1
     Array Size : 119684160 (114.13 GiB 122.55 GB)
    Device Size : 119684160 (114.13 GiB 122.55 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 1
    Persistence : Superblock is persistent

    Update Time : Fri Oct 15 13:19:11 2004
          State : dirty, no-errors
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0


    Number   Major   Minor   RaidDevice State
       0       3       66        0      active sync   /dev/hdb2
       1       3        2        1      active sync   /dev/hda2
           UUID : ede70f08:0fdf752d:b408d85a:ada8922b

I was a little surprised that this process wasn't entirely automatic. There's no reason it couldn't be. This is an older Linux install; I don't know if more modern versions will just automatically rebuild.


Tony Lawrence