The initial step in this process is to identify which RAID Arrays have failed. To do that, check the RAID status with "cat /proc/mdstat".
A healthy set of RAID-1 mirrored partitions will show output like the one below.
[servermyserver]# cat /proc/mdstat
Personalities : [raid1]
read_ahead 1024 sectors
md2 : active raid1 sdb3[1] sda3[0]
      50304 blocks [2/2] [UU]
md1 : active raid1 sdb2[1] sda2[0]
      29632 blocks [2/2] [UU]
md0 : active raid1 sdb1[1] sda1[0]
      24576 blocks [2/2] [UU]
This is reflected by the [UU] shown for each mirrored partition; each "U" is a good partition.
To identify whether a RAID Array is good or failed, look at the string containing [UU]. Each "U" represents a healthy partition in the RAID Array. If you see [UU], the RAID Array is healthy. If a "U" is missing, for example [_U], the RAID Array is degraded or faulty.
A faulty set of RAID mirrored partitions will show output something like this:
[servermyserver]# cat /proc/mdstat
Personalities : [raid1]
read_ahead 1024 sectors
md2 : active raid1 sdb3[1] sda3[0]
      50304 blocks [2/2] [UU]
md1 : active raid1 sdb2[1] sda2[0](F)
      29632 blocks [2/1] [_U]
md0 : active raid1 sdb1[1] sda1[0]
      24576 blocks [2/2] [UU]
From the above output we can see that RAID Array "md1" is missing a "U" and is degraded or faulty.
Let's have a closer look at the failed RAID Array "md1":
md1 : active raid1 sdb2[1] sda2[0](F)
      29632 blocks [2/1] [_U]
which indicates that "sda2" has failed and "sdb2" is good.
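If the /proc/mdstat output is hard to read, the array can also be queried directly as an optional cross-check (not part of the original steps): "mdadm --detail" prints each member device of the array together with its state (active, faulty, removed).
# mdadm --detail /dev/md1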
Removing the failed partition(s) and disk:
Before we can physically remove the hard drive from the system, we must first "fail" the disk's partition(s) in all RAID Arrays that they belong to. Even though only partition /dev/sda2 of RAID Array md1 has failed, we must manually fail all the other /dev/sda# partitions that belong to RAID Arrays before we can remove the hard drive from the system.
Now we will fail the disk partitions using the following command:
# mdadm --manage /dev/md1 --fail /dev/sda2
We have to repeat this command for each partition, changing /dev/md# and /dev/sda# in the above command to match the output of "cat /proc/mdstat",
for example: # mdadm --manage /dev/md0 --fail /dev/sda1, and so on.
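If your layout matches the example above (sda1 in md0, sda2 in md1, sda3 in md2), a small shell loop can issue all of the fail commands in one go. This is only a sketch based on that assumed layout; adjust the md numbers and partition numbers to your own "cat /proc/mdstat" output.
# for n in 0 1 2; do mdadm --manage /dev/md$n --fail /dev/sda$((n+1)); done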
Removing the failed partitions
Now that all the partitions are failed, they can be removed from the RAID Arrays.
The command to use is:
# mdadm --manage /dev/md1 --remove /dev/sda2
We have to repeat this command for each partition, changing /dev/md# and /dev/sda# in the above command to match the output of "cat /proc/mdstat".
For example:
# mdadm --manage /dev/md0 --remove /dev/sda1
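As with the fail step, the removals can be scripted for the example layout assumed above; treat this as a sketch and match it to your own arrays.
# for n in 0 1 2; do mdadm --manage /dev/md$n --remove /dev/sda$((n+1)); done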
Since this is software RAID, power off the system, physically replace the failed hard drive, and then power it back on:
# shutdown -h now
How to add the new disk to the RAID Array
Once the new hard disk has been installed, we have to create on it the exact same partition layout as on the surviving drive, /dev/sdb.
We can do that using the following command, which copies the partition table from /dev/sdb to the new /dev/sda:
# sfdisk -d /dev/sdb | sfdisk /dev/sda
We can make sure that both hard drives have the same partition layout using:
# fdisk -l
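To limit the listing to just the two drives in question, fdisk can also be given the device names explicitly (assuming the drives are /dev/sda and /dev/sdb as in this example):
# fdisk -l /dev/sda /dev/sdb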
Then add each partition back to its RAID Array using "mdadm":
# mdadm --manage /dev/md0 --add /dev/sda1 (repeat this for each md# and sda#)
servermyserver:~# mdadm --manage /dev/md1 --add /dev/sda2
mdadm: re-added /dev/sda2
Do the same for md0 and md2, for example:
servermyserver:~# mdadm --manage /dev/md0 --add /dev/sda1
mdadm: re-added /dev/sda1
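Assuming the same layout as in the example output above, the md2 member is added back with the same pattern:
# mdadm --manage /dev/md2 --add /dev/sda3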
Check whether the partitions are being synchronised using "cat /proc/mdstat". It will show the current status of the synchronisation process.
[servermyserver]# cat /proc/mdstat
Personalities : [raid1]
read_ahead 1024 sectors
md2 : active raid1 sdb3[1] sda3[0]
      50304 blocks [2/2] [UU]
md1 : active raid1 sdb2[1] sda2[0]
      29632 blocks [2/1] [_U]
      [>....................]  recovery =  1.8% (179/29632) finish=193.6min speed=81086K/sec
md0 : active raid1 sdb1[1] sda1[0]
      24576 blocks [2/2] [UU]
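To follow the rebuild continuously instead of re-running the command by hand, the standard "watch" utility can be used (the 5-second interval here is just an arbitrary choice):
# watch -n 5 cat /proc/mdstat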
After a successful synchronisation the RAID will be back to normal. You can verify it by running "cat /proc/mdstat" again and checking the status of md1 to see that all members are good.
[servermyserver]# cat /proc/mdstat
Personalities : [raid1]
read_ahead 1024 sectors
md2 : active raid1 sdb3[1] sda3[0]
      50304 blocks [2/2] [UU]
md1 : active raid1 sdb2[1] sda2[0]
      29632 blocks [2/2] [UU]
md0 : active raid1 sdb1[1] sda1[0]
      24576 blocks [2/2] [UU]
After everything is fine and the recovery has completed, install GRUB on the drives so that the system can boot from either disk.
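A minimal sketch of that last step, assuming a GRUB-based system and the same /dev/sda and /dev/sdb drives as above (on some distributions the command is grub2-install rather than grub-install):
# grub-install /dev/sda
# grub-install /dev/sdb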