Replace a RAID1 array drive

  • Note: If your RAID array is degraded, /proc/mdstat will mark the failed member with (F) and show the missing slot as _ in the status field, like this:
cat /proc/mdstat 

Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] 
md0 : active raid1 sdb1[1] sda1[0](F)
      487104 blocks super 1.2 [2/1] [_U]
md1 : active raid1 sdb2[1] sda2[0](F)
      976142144 blocks super 1.2 [2/1] [_U]
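
The check above can be scripted. The following is a minimal sketch, not a definitive tool: degraded_arrays is a hypothetical helper name, and it assumes the mdstat layout shown above, where the status field (for example [_U]) sits on the line after the array name.

```shell
# List md arrays whose status field shows a missing member, e.g. [_U].
# degraded_arrays is a hypothetical helper; pass a file holding saved
# mdstat output, or omit the argument to read /proc/mdstat.
degraded_arrays() {
    awk '
        # Remember which array the following status line belongs to.
        /^md[0-9]+ :/ { name = $1 }
        # A "_" inside the [U_] status field marks a missing member.
        /\[[U_]*_[U_]*\]/ && name != "" { print name; name = "" }
    ' "${1:-/proc/mdstat}"
}
```

Running degraded_arrays with no argument on the machine above would be expected to list md0 and md1, one per line.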

Get information about the hard drive(s)

sudo hdparm -I /dev/sda

/dev/sda:
 HDIO_DRIVE_CMD(identify) failed: Input/output error

A healthy drive will show output similar to:

sudo hdparm -I /dev/sdb

/dev/sdb:

ATA device, with non-removable media
	Model Number:       WDC WD10EZEX-00WN4A0                    
	Serial Number:      WD-WCC6Y2ZHZJAH
	Firmware Revision:  01.01A01
	Transport:          Serial, SATA 1.0a, SATA II Extensions, SATA Rev 2.5, SATA Rev 2.6, SATA Rev 3.0
Standards:
	Used: unknown (minor revision code 0x001f) 
	Supported: 10 9 8 7 6 5 
	Likely used: 10
Configuration:
	Logical		max	current
	cylinders	16383	0
	heads		16	0
	sectors/track	63	0
	--
... etc

Alternatively, lshw lists every disk and controller:

sudo lshw -class disk -class storage

... etc

  *-scsi:0
       physical id: 1
       logical name: scsi0
       capabilities: emulated
     *-disk
          description: SCSI Disk
          physical id: 0.0.0
          bus info: scsi@0:0.0.0
          logical name: /dev/sda
          size: 931GiB (1TB)
          configuration: logicalsectorsize=512 sectorsize=512
  *-scsi:1
       physical id: 2
       logical name: scsi1
       capabilities: emulated
     *-disk
          description: ATA Disk
          product: WDC WD10EZEX-00W
          vendor: Western Digital
          physical id: 0.0.0
          bus info: scsi@1:0.0.0
          logical name: /dev/sdb
          version: 1A01
          serial: WD-WCC6Y2ZHZJAH
          size: 931GiB (1TB)
          capabilities: partitioned partitioned:dos
          configuration: ansiversion=5 logicalsectorsize=512 sectorsize=4096 signature=000e5fab


... etc

Another option, and possibly the best one, is smartctl:

sudo smartctl -d ata -a -i /dev/sdb

... etc

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Blue
Device Model:     WDC WD10EZEX-00WN4A0
Serial Number:    WD-WCC6Y2ZHZJAH
LU WWN Device Id: 5 0014ee 20d232e87
Firmware Version: 01.01A01
User Capacity:    1,000,204,886,016 bytes [1.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-3 T13/2161-D revision 3b
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Mon Jan 30 13:43:46 2017 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

... etc
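
Since the serial number is usually printed on the drive label, it is a handy way to tell the healthy drive from the failed one before opening the case. The sketch below pulls it out of smartctl output; serial_from_smartctl is a hypothetical helper name, and it assumes the "Serial Number:" line format shown above.

```shell
# Print the serial number from smartctl information output read on stdin.
# serial_from_smartctl is a hypothetical helper name.
serial_from_smartctl() {
    # Split each line on the colon plus any following spaces.
    awk -F': *' '/^Serial Number:/ { print $2 }'
}
```

For example: sudo smartctl -i /dev/sdb | serial_from_smartctl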

Remove the failed disk from the RAID1 array

In this example there are two disks, /dev/sda and /dev/sdb, each with two partitions (/dev/sda1, /dev/sda2, /dev/sdb1, /dev/sdb2), and /dev/sda has died. As root, mark its partitions as failed, then remove each one from the array it belongs to:

mdadm --manage /dev/md0 --fail /dev/sda1
mdadm --manage /dev/md1 --fail /dev/sda2

mdadm --manage /dev/md0 --remove /dev/sda1
mdadm --manage /dev/md1 --remove /dev/sda2
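
It is easy to pair a partition with the wrong array when typing these by hand. As a rough sketch, the commands can be derived from mdstat instead. print_removal_cmds is a hypothetical helper name. It only prints the commands so they can be reviewed before running, and it assumes sdX-style names where a partition is the disk name followed by a number.

```shell
# Print (do not run) the mdadm --fail and --remove commands for every
# array member that lives on the given disk, e.g. "sda".
# print_removal_cmds is a hypothetical helper; the optional second
# argument is a file of saved mdstat output (default: /proc/mdstat).
print_removal_cmds() {
    awk -v disk="$1" '
        /^md[0-9]+ :/ {
            for (i = 3; i <= NF; i++) {
                # Members look like sda1[0] or sda1[0](F); skip other fields.
                if ($i !~ /\[[0-9]+\]/) continue
                part = $i
                sub(/\[.*/, "", part)   # strip the [N](F) suffix
                if (part ~ ("^" disk "[0-9]+$")) {
                    printf "mdadm --manage /dev/%s --fail /dev/%s\n", $1, part
                    printf "mdadm --manage /dev/%s --remove /dev/%s\n", $1, part
                }
            }
        }
    ' "${2:-/proc/mdstat}"
}
```

Calling print_removal_cmds sda prints one --fail and one --remove line per affected array, each with the array name taken from the same mdstat line as the partition.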

shutdown -h now
  • With the machine powered off, physically remove the broken disk, install the new one, and boot again

Add the new disk to the RAID1 array

As before, /dev/sdb is the surviving disk and /dev/sda is now the new, empty replacement. With the failed partitions removed, mdstat shows each array running on a single member:

cat /proc/mdstat 
Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] 
md1 : active raid1 sdb2[1]
      976142144 blocks super 1.2 [2/1] [_U]
      
md0 : active raid1 sdb1[1]
      487104 blocks super 1.2 [2/1] [_U]
      
unused devices: <none>

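Copy the partition table from the surviving disk (/dev/sdb) to the new disk (/dev/sda). The source disk comes first; swapping the two would overwrite the partition table of the good disk.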

sfdisk -d /dev/sdb | sfdisk /dev/sda
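
Because the sfdisk target is overwritten, it may be worth confirming first that no array still uses it. The following is a minimal sketch of such a guard: disk_in_md is a hypothetical helper name, and it assumes the mdstat layout shown earlier.

```shell
# Succeed (exit 0) if any md array still has a member on the given disk,
# e.g. "sda"; fail (exit 1) otherwise.
# disk_in_md is a hypothetical helper; the optional second argument is a
# file of saved mdstat output (default: /proc/mdstat).
disk_in_md() {
    awk -v disk="$1" '
        /^md[0-9]+ :/ {
            for (i = 3; i <= NF; i++) {
                part = $i
                sub(/\[.*/, "", part)   # strip the [N] suffix
                # Match the whole disk or any numbered partition on it.
                if (part ~ ("^" disk "[0-9]*$")) found = 1
            }
        }
        END { exit !found }
    ' "${2:-/proc/mdstat}"
}
```

One way to use it: if disk_in_md sda; then echo "sda is still in an array"; else sfdisk -d /dev/sdb | sfdisk /dev/sda; fi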

Add the new disk to the array

mdadm --manage /dev/md0 --add /dev/sda1 
mdadm --manage /dev/md1 --add /dev/sda2 

Check the recovery status

cat /proc/mdstat 
Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] 
md1 : active raid1 sda2[2] sdb2[1]
      976142144 blocks super 1.2 [2/1] [_U]
      [>....................]  recovery =  0.3% (3282752/976142144) finish=103.7min speed=156321K/sec
      
md0 : active raid1 sda1[2] sdb1[1]
      487104 blocks super 1.2 [2/2] [UU]
      
unused devices: <none>
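
To follow the rebuild without reading the whole listing, the progress line can be reduced to a short summary. This is a sketch under the assumption that the kernel prints the recovery line in the format shown above; recovery_progress is a hypothetical helper name.

```shell
# Print "<array> <percent done> <estimated time left>" for every array
# that is currently rebuilding.
# recovery_progress is a hypothetical helper; pass a file of saved mdstat
# output, or omit the argument to read /proc/mdstat.
recovery_progress() {
    awk '
        /^md[0-9]+ :/ { name = $1 }
        /(recovery|resync) *=/ {
            pct = ""; eta = ""
            for (i = 1; i <= NF; i++) {
                if ($i ~ /%$/) pct = $i                   # e.g. 0.3%
                if ($i ~ /^finish=/) eta = substr($i, 8)  # text after "finish="
            }
            print name, pct, eta
        }
    ' "${1:-/proc/mdstat}"
}
```

It prints nothing once every array has finished syncing, so it can be rerun periodically until the output is empty.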

Install GRUB to the MBR of the new disk

grub-install /dev/sda