13 December 2020

Migrating an existing system to RAID1

While going through SMART data on my machines I noticed that my old server, which I had been migrating stuff off of, hadn't been running SMART tests in a while due to a mistake I made while merging the smartd configuration file: I had merged in a DEVICESCAN statement, which meant my manual settings further down were no longer being used. Oops. :(
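
For reference, a DEVICESCAN entry makes smartd ignore the explicit entries after it, so the fix is to drop it and keep only per-device lines. A rough sketch of what the corrected /etc/smartd.conf looks like (device name and test schedule are illustrative, not my actual settings):

# explicit entries instead of DEVICESCAN, so they actually get used
# short selftest daily at 02:00, long selftest Saturdays at 03:00
/dev/sda -a -o on -S on -s (S/../.././02|L/../../6/03)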

So after fixing that I kicked off a manual long selftest, which got stuck at 90%. Apparently that's a known issue with these drives. And, in case you're wondering about the age of that ticket, the warranty on this drive expired in 2012 (I put the serial into Seagate's website for laughs... ;) )
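
For the record, the long selftest and the data below come from the usual smartctl invocations (device name illustrative):

smartctl -t long /dev/sda
smartctl -a /dev/sda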

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.10
Device Model:     ST3320620AS
Serial Number:    
Firmware Version: 3.AAC
User Capacity:    320,072,933,376 bytes [320 GB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA/ATAPI-7 (minor revision not indicated)
Local Time is:    Wed Dec  2 22:45:49 2020 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
...snipped stuff...
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   117   088   006    Pre-fail  Always       -       134917032
  3 Spin_Up_Time            0x0003   096   095   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   099   099   020    Old_age   Always       -       1422
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       1
  7 Seek_Error_Rate         0x000f   078   060   030    Pre-fail  Always       -       59414292
  9 Power_On_Hours          0x0032   011   011   000    Old_age   Always       -       78048
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   099   099   020    Old_age   Always       -       1375
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   068   049   045    Old_age   Always       -       32 (Min/Max 26/32)
194 Temperature_Celsius     0x0022   032   051   000    Old_age   Always       -       32 (0 13 0 0 0)
195 Hardware_ECC_Recovered  0x001a   062   053   000    Old_age   Always       -       204144318
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0000   100   253   000    Old_age   Offline      -       0
202 Data_Address_Mark_Errs  0x0032   100   253   000    Old_age   Always       -       0

SMART Error Log Version: 1
No Errors Logged

The one reallocated sector was fairly recent, and combined with the selftest issue I figured it might be prudent to do something about this drive: it has been the lone system drive for a long time now, and while I do have backups I'd rather not have to deal with those at some inopportune time.

So instead of a sensible solution (where's the fun in that?) I decided to just plop in another drive and see if I could migrate the entire thing to RAID1 in place.

Now, after reading the Arch Wiki on the subject, "in place" seemed a bit of a misnomer, since it still requires a bunch of downtime and eventually rewriting the entire disk, but well...

Migrating an existing system to RAID1

Backups

To avoid anything writing to the disk while migrating, bring everything down and move the drive to a secondary system. In my case this made the original disk sdb and the new disk sdc.

First things first, dd a backup of the /boot and / partitions to an NFS share:

sudo dd if=/dev/sda2 | gzip > /mnt/backup/backup-superbia-root-20201202.gz

This should have been pretty fast, because infiniband, but this is a dual socket Harpertown room heater, so the CPU was the bottleneck.

In hindsight I should probably have just sent the data uncompressed and done the compression on the remote end (or just let ZFS deal with it, maybe? Not sure how effective that would be), then I would have been able to pump the data over as fast as the disk could deliver it.
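
A sketch of what that could have looked like, either writing the raw image to the share and letting the server's compression (ZFS or otherwise) deal with it, or piping over ssh and compressing on the remote end (hostname and remote path made up):

dd if=/dev/sda2 bs=1M status=progress of=/mnt/backup/backup-superbia-root-20201202.img

dd if=/dev/sda2 bs=1M status=progress | ssh backuphost 'gzip > /backup/backup-superbia-root-20201202.gz'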

Preparation

Make sure mdadm is available in the initramfs, in Gentoo this can be done by setting:

MDADM="yes"

in /etc/genkernel.conf, before (re)building the kernel.
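
Rebuilding afterwards pulls mdadm into the initramfs; on Gentoo that's roughly one of the following (a sketch, depending on how the kernel is normally built):

# rebuild just the initramfs and install it to /boot
genkernel --install initramfs
# or rebuild kernel plus initramfs in one go
genkernel all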

Prepare the new disk

The original layout of the "source" disk:

sdb     298.1G
├─sdb1      1K
├─sdb2  243.1M ext2
├─sdb3  296.9G ext3
└─sdb5  972.7M swap

Clone the partition table of the original disk to the new disk as per the wiki:

sfdisk -d /dev/sdb > raidinfo-partitions.sdb
sfdisk /dev/sdc < raidinfo-partitions.sdb
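
It doesn't hurt to double-check that the cloned layout actually matches before continuing, for instance with:

lsblk -o NAME,SIZE,TYPE /dev/sdb /dev/sdc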

Next, with the "new" disk for the RAID1 array in place, create the mdadm mirrors in degraded state for each partition (including swap):

mdadm --create /dev/md0 --level=1 --raid-devices=2 missing /dev/sdc2
mdadm --create /dev/md1 --level=1 --raid-devices=2 missing /dev/sdc3
mdadm --create /dev/md2 --level=1 --raid-devices=2 missing /dev/sdc5
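
At this point the three arrays should already show up, each running degraded with a single member:

cat /proc/mdstat
mdadm --detail /dev/md0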

Create the filesystems on the new disk. Since all the data has to be copied over later anyway, this is also a good time to change filesystems; I used the opportunity to upgrade from ext3 to ext4:

mkfs.ext2 /dev/md0
mkfs.ext4 /dev/md1
mkswap /dev/md2

Resulting in this layout:

sdb     298.1G
├─sdb1      1K
├─sdb2  243.1M ext2              /mnt/boot-orig
├─sdb3  296.9G ext3              /mnt/root-orig
└─sdb5  972.7M swap
sdc     465.8G
├─sdc1      1K
├─sdc2  243.1M linux_raid_member
│ └─md0 242.1M ext2              /mnt/boot-new
├─sdc3  296.9G linux_raid_member
│ └─md1 296.8G ext4              /mnt/root-new
└─sdc5  972.7M linux_raid_member
  └─md2 971.6M swap
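
For completeness, the mounts behind those mountpoints were roughly:

mkdir -p /mnt/boot-orig /mnt/root-orig /mnt/boot-new /mnt/root-new
mount /dev/sdb2 /mnt/boot-orig
mount /dev/sdb3 /mnt/root-orig
mount /dev/md0 /mnt/boot-new
mount /dev/md1 /mnt/root-new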

After mounting old and new partitions per above, time to copy the data:

rsync -aAXHv --exclude={"/dev/*","/proc/*","/sys/*","/tmp/*","/run/*","/mnt/*","/media/*","/lost+found"} /mnt/boot-orig/ /mnt/boot-new/
rsync -aAXHv --exclude={"/dev/*","/proc/*","/sys/*","/tmp/*","/run/*","/mnt/*","/media/*","/lost+found"} /mnt/root-orig/ /mnt/root-new/

Unfortunately that didn't quite work:

rsync: write failed on "/mnt/root-new/var/lib/ntopng/0/rrd/192/168/130/13/bytes.rrd": No space left on device (28)
rsync error: error in file IO (code 11) at receiver.c(374) [receiver=3.1.3]

# pydf
/dev/md1         291G  291G     0 100.0 [###################################################################] /mnt/root-new
/dev/sdb3        292G  242G   35G  83.0 [########################################################...........] /mnt/root-orig

Turns out rsync doesn't cope well with Docker's devicemapper storage (its large sparse files get expanded on copy, filling the destination), so exclude it:

rsync -aAXHv --exclude={"/dev/*","/proc/*","/sys/*","/tmp/*","/run/*","/mnt/*","/media/*","/lost+found","/var/lib/docker/devicemapper/"} /mnt/root-orig/ /mnt/root-new/

Edit /etc/default/grub, using the UUID of our new root partition as real_root:

GRUB_CMDLINE_LINUX_DEFAULT="domdadm real_root=UUID=ce3dd9dd-672c-4980-afc6-5e1dbe475845 rootfstype=ext4"
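
The UUID used there is that of the new ext4 filesystem on the root array, which can be read off with:

blkid /dev/md1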

/etc/fstab on the new array needs to be updated to point to our new raid devices:

# cat /mnt/root-new/etc/fstab
UUID="cb8c01bb-627d-4d8f-9c20-436badab3086"             /boot           ext2            noauto,noatime  1 2
UUID="ce3dd9dd-672c-4980-afc6-5e1dbe475845"             /               ext4            noatime         0 1
UUID="bc3e90c8-b19a-4b59-9a3a-3c3a8043287f"             none            swap            sw              0 0

Next chroot into the new system to configure mdadm and install Grub 2.

Since I had moved the drives to a different system running an entirely different distro, chrooting didn't work, so after moving both disks back to their original system I had to resort to a livecd for the next steps.

mount --bind /sys /mnt/root-new/sys
mount --bind /proc /mnt/root-new/proc
mount --bind /dev /mnt/root-new/dev
chroot /mnt/root-new/ /bin/bash
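
Inside the chroot the new /boot array also needs mounting (it's noauto in fstab) before the grub steps below; assuming it still shows up as md0:

mount /dev/md0 /boot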

Configure mdadm

Insert the active raid configuration into mdadm.conf:

# mdadm --detail --scan >> /etc/mdadm.conf
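
This appends one ARRAY line per array, roughly of the form below, with the scan filling in the real names and UUIDs:

ARRAY /dev/md/0 metadata=1.2 name=<host>:0 UUID=<uuid>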

Install Grub 2

Make sure to install grub to both drives' MBRs so the system can still boot should a drive fail.

grub-mkconfig -o /boot/grub/grub.cfg
grub-install --verbose /dev/sda
grub-install --verbose /dev/sdb

Don't be a dummy like me and use the old grub.conf filename from Grub 0.9 and then wonder why things don't work...

Boot into the new raid array

If the drives are still in a second system, move them back to their original system.

Change the boot order in the BIOS to boot from the second disk, containing the new raid array.

Confirm the new environment booted by checking in the output of mount that the partitions are now mounted from the new raid devices.
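
Something along these lines makes the check quick; / (and swap) should be backed by md devices rather than plain sdX partitions:

findmnt /
swapon --show
cat /proc/mdstat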

Prepare the original disk

First copy the partition table from sdb back to sda:

sfdisk -d /dev/sdb | sfdisk /dev/sda

and then, after double-checking that the partition tables match, add the new sda partitions to our arrays (note that after the reboot the arrays were assembled as md125, md126 and md127 rather than md0, md1 and md2):

mdadm /dev/md127 -a /dev/sda5
mdadm /dev/md125 -a /dev/sda3
mdadm /dev/md126 -a /dev/sda2

mdadm should now start rebuilding the array:

# cat /proc/mdstat
Personalities : [raid1]
md125 : active raid1 sda3[2] sdb3[1]
      311191488 blocks super 1.2 [2/1] [_U]
      [>....................]  recovery =  0.3% (1001664/311191488) finish=77.4min speed=66777K/sec
      bitmap: 3/3 pages [12KB], 65536KB chunk

md126 : active raid1 sda2[2] sdb2[1]
      247936 blocks super 1.2 [2/1] [_U]
        resync=DELAYED

md127 : active raid1 sda5[2] sdb5[1]
      994944 blocks super 1.2 [2/2] [UU]

unused devices: <none>
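
While it runs, progress can be followed with something like:

watch -n 10 cat /proc/mdstat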

Once the rebuild is finished, reboot, change the boot device order back to boot off of sda, and verify the system boots successfully from it, confirming that booting from either disk works.

Tags: RAID GNU/Linux