Basic tutorial: using the mdadm command to manage RAID in Linux

Time:2020-4-2

mdadm is a tool for building, managing, and monitoring RAID arrays.

Usage:

mdadm --create device options...
Create a new array from unused devices.
mdadm --assemble device options...
Assemble a previously created array.
mdadm --build device options...
Create or assemble an array without metadata.
mdadm --manage device options...
Make changes to an existing array.
mdadm --misc options... devices
Report on or modify various MD-related devices.
mdadm --grow options device
Resize or reshape an active array.
mdadm --incremental device
Add a device to an array, or remove it from one, as appropriate.
mdadm --monitor options...
Monitor one or more arrays for changes.
mdadm device options...
Shorthand for --manage.

Main parameters of mdadm --create

--auto=yes: create the array device node named on the command line, e.g. /dev/md0, /dev/md1
--raid-devices=N: number of disks used as active members of the array
--spare-devices=N: number of disks used as spares for the array
--level=[015]: RAID level of the array; 0, 1, and 5 are the common ones

Main parameters of mdadm --manage

--add: add the listed devices to the array
--remove: remove the listed devices from the array
--fail: mark the listed devices as faulty

1. On Linux, software RAID is implemented by the MD (multiple devices) virtual block device. MD builds a new virtual device on top of several underlying block devices. Striping distributes data blocks evenly across the disks to improve the read/write performance of the virtual device, and the various redundancy algorithms protect user data from being lost entirely when one block device fails: after the failed device is replaced, the lost data can be rebuilt onto the new device.
At present, MD supports linear, multipath, RAID 0 (striping), RAID 1 (mirroring), RAID 4, RAID 5, RAID 6, and RAID 10, among other levels and modes. MD arrays can also be stacked to form nested arrays such as RAID 1+0 and RAID 5+1.
This article explains how to manage software RAID with the user-space tool mdadm, along with problems commonly encountered in use and their solutions. After the machine boots, cat /proc/mdstat shows whether the kernel has loaded the MD driver, cat /proc/devices shows whether the md block device is registered, and lsmod shows whether the MD modules have been loaded into the system.

# cat /proc/mdstat
Personalities :
unused devices: <none>
# cat /proc/devices | grep md
1 ramdisk
9 md
254 mdp
# mdadm --version
mdadm - v2.5.4 - 13 October 2006

2. Managing software RAID arrays with mdadm
mdadm is a standalone program that can perform all software RAID management functions. It has seven main modes of use:
Create
Create a new array from unused devices; a metadata block is written to each device
Assemble
Assemble the block devices that previously belonged to an array back into an array
Build
Create or assemble an array that does not use metadata; no metadata block is stored on any device
Manage
Manage the devices in an array, such as adding a hot spare, or marking a disk as failed and then removing it from the array
Misc
Report or modify information about devices in an array, such as querying the status of the array or of a device
Grow
Change the capacity used on each device in the array, or the number of devices in the array
Monitor
Monitor one or more arrays and report specified events
If the MD driver is compiled into the kernel, then when the driver runs at boot it automatically looks for partitions of type fd (Linux raid autodetect). For that reason, fdisk is usually used to partition the hd or sd disks and then set the partition type to fd.

# fdisk /dev/hdc
The number of cylinders for this disk is set to 25232.
There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs
(e.g., DOS FDISK, OS/2 FDISK)
Command (m for help): n
Command action
e extended
p primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-25232, default 1):
Using default value 1
Last cylinder or size or sizeM or sizeK (1-25232, default 25232):
Using default value 25232
Command (m for help): t
Selected partition 1
Hex code (type L to list codes): fd
Changed system type of partition 1 to fd (Linux raid autodetect)
Command (m for help): w
The partition table has been altered!
Calling ioctl() to re-read partition table.
WARNING: Re-reading the partition table failed with error 16: Device or resource busy.
The kernel still uses the old table.
The new table will be used at the next reboot.
Syncing disks.

If the MD driver is loaded as a module, the arrays must be started by a user-level script at boot. For example, on Fedora Core the /etc/rc.d/rc.sysinit file contains the commands that start software RAID arrays: if the configuration file mdadm.conf exists, the script calls mdadm to check the options in it and then starts the arrays.

echo "raidautorun /dev/md0" | nash --quiet
if [ -f /etc/mdadm.conf ]; then
/sbin/mdadm -A -s
fi

Here -A assembles an existing array, and -s scans mdadm.conf for the configuration information.
To stop an array manually:

# mdadm -S /dev/md0

Create a new array
mdadm uses the --create option (or its abbreviation -C) to create a new array, writing key identifying information about the array as metadata to a reserved area of each underlying device.
--level (or its abbreviation -l) sets the RAID level of the array.
--chunk (or its abbreviation -c) sets the size of each stripe unit in kilobytes; the default is 64 KB in the mdadm version used here. The stripe unit size has a large effect on array read/write performance under different loads.
--raid-devices (or its abbreviation -n) sets the number of active devices in the array.
--spare-devices (or its abbreviation -x) sets the number of hot spares in the array. When a disk in the array fails, the MD kernel driver automatically brings a hot spare into the array and rebuilds the data of the failed disk onto it.
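
As a rough illustration of how these options combine, the sketch below builds an mdadm --create command line and prints it instead of running it. The function name build_create_cmd and its argument order are made up for this example; the device-count check assumes that mdadm wants exactly as many member devices as active plus spare devices.

```shell
#!/bin/sh
# Dry-run helper (hypothetical, not part of mdadm): prints the command only.
# Usage: build_create_cmd ARRAY LEVEL ACTIVE SPARES DEVICE...
build_create_cmd() {
    array=$1; level=$2; active=$3; spares=$4
    shift 4
    # Assumption: the device list must contain active + spares entries.
    if [ "$#" -ne $((active + spares)) ]; then
        echo "expected $((active + spares)) devices, got $#" >&2
        return 1
    fi
    cmd="mdadm --create $array --level=$level --raid-devices=$active"
    if [ "$spares" -gt 0 ]; then
        cmd="$cmd --spare-devices=$spares"
    fi
    echo "$cmd $*"
}

# RAID 5 with five active disks and one hot spare.
build_create_cmd /dev/md0 5 5 1 \
    /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1 /dev/sdg1 /dev/sdb1
```

Printing the command first makes it easy to review the device list before anything is written to disk.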

Create a RAID 0 device:

mdadm --create /dev/md0 --level=0 --chunk=32 --raid-devices=3 /dev/sdb1 /dev/sdc1 /dev/sdd1

Create a RAID 1 device:

mdadm --create /dev/md0 --level=1 --chunk=128 --raid-devices=2 --spare-devices=1 /dev/sdb1 /dev/sdc1 /dev/sdd1

Create a RAID 5 device:

mdadm --create /dev/md0 --level=5 --raid-devices=5 /dev/sd[c-g]1 --spare-devices=1 /dev/sdb1

Create a RAID 10 device:

mdadm -C /dev/md0 -l10 -n6 /dev/sd[b-g] -x1 /dev/sdh

Create a RAID 1+0 device (a RAID 0 stripe over three RAID 1 mirrors):

mdadm -C /dev/md0 -l1 -n2 /dev/sdb /dev/sdc
mdadm -C /dev/md1 -l1 -n2 /dev/sdd /dev/sde
mdadm -C /dev/md2 -l1 -n2 /dev/sdf /dev/sdg
mdadm -C /dev/md3 -l0 -n3 /dev/md0 /dev/md1 /dev/md2

The initialization time depends on the performance of the array's devices and on the read/write load from applications. Use cat /proc/mdstat to check the current resync speed and the estimated completion time of the array.

# cat /proc/mdstat
Personalities : [raid10]
md0 : active raid10 sdh[6](S) sdg[5] sdf[4] sde[3] sdd[2] sdc[1] sdb[0]
3145536 blocks 64K chunks 2 near-copies [6/6] [UUUUUU]
[===>...........] resync = 15.3% (483072/3145536) finish=0.3min speed=120768K/sec
unused devices: <none>
# cat /proc/mdstat
Personalities : [raid10]
md0 : active raid10 sdh[6](S) sdg[5] sdf[4] sde[3] sdd[2] sdc[1] sdb[0]
3145536 blocks 64K chunks 2 near-copies [6/6] [UUUUUU]
unused devices: <none>
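
For scripts that need to wait for initialization to finish, the progress line can be picked out of /proc/mdstat. The following is a minimal sketch, assuming the line layout shown above; the function name mdstat_progress is hypothetical, and it only handles resync and recovery lines.

```shell
#!/bin/sh
# Print "<array> <action> <percent> <estimated finish>" for every array
# that is currently resyncing or recovering. Reads mdstat text on stdin.
mdstat_progress() {
    awk '
        /^md[0-9]+ :/ { array = $1 }          # remember the current array
        /(resync|recovery) *=/ {
            for (i = 1; i <= NF; i++) {
                if ($i == "resync" || $i == "recovery") action = $i
                if ($i ~ /%$/) pct = $i
                if ($i ~ /^finish=/) eta = substr($i, 8)
            }
            print array, action, pct, eta
        }
    '
}

# Sample input copied from the output above; on a live system use:
#   mdstat_progress < /proc/mdstat
mdstat_progress <<'EOF'
md0 : active raid10 sdh[6](S) sdg[5] sdf[4] sde[3] sdd[2] sdc[1] sdb[0]
      3145536 blocks 64K chunks 2 near-copies [6/6] [UUUUUU]
      [===>...........]  resync = 15.3% (483072/3145536) finish=0.3min speed=120768K/sec
EOF
```

For the sample input this prints md0 resync 15.3% 0.3min; an idle array produces no output.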

Using the array:
An MD device can be read and written directly like any ordinary block device, or formatted with a file system.

# mke2fs -j /dev/md0
# mkdir -p /mnt/md-test
# mount /dev/md0 /mnt/md-test

Stopping a running array:
If no file system, storage application, or higher-level device is using the array, you can stop it with --stop (or its abbreviation -S). If the command fails with a "Device or resource busy" error, /dev/md0 is still in use by something above it and cannot be stopped yet. Stop whatever is using it first (here, by unmounting the file system), which also keeps the data on the array consistent.

# ./mdadm --stop /dev/md0
mdadm: fail to stop array /dev/md0: Device or resource busy
# umount /dev/md0
# ./mdadm --stop /dev/md0

mdadm: stopped /dev/md0

Assembling a previously created array:
The --assemble mode (or its abbreviation -A) checks the metadata on the underlying devices and then assembles them into an active array. If we already know which devices make up the array, we can name them explicitly when starting it.

# ./mdadm -A /dev/md0 /dev/sd[b-h]
mdadm: /dev/md0 has been started with 6 drives and 1 spare.

If a configuration file (/etc/mdadm.conf) exists, use the command mdadm -As /dev/md0. mdadm first reads the device information in mdadm.conf, then reads the metadata from each device and checks that it is consistent with the array information; if it is, the array is started. If /etc/mdadm.conf has not been set up and you do not know which disks the array consists of, use --examine (or its abbreviation -E) to check whether a block device carries array metadata.

# ./mdadm -E /dev/sdi
mdadm: No md superblock detected on /dev/sdi.
# ./mdadm -E /dev/sdb
/dev/sdb:
Magic : a92b4efc
Version : 00.90.00
UUID : 0cabc5e5:842d4baa:e3f6261b:a17a477a
Creation Time : Sun Aug 22 17:49:53 1999
Raid Level : raid10
Used Dev Size : 1048512 (1024.11 MiB 1073.68 MB)
Array Size : 3145536 (3.00 GiB 3.22 GB)
Raid Devices : 6
Total Devices : 7
Preferred Minor : 0
Update Time : Sun Aug 22 18:05:56 1999
State : clean
Active Devices : 6
Working Devices : 7
Failed Devices : 0
Spare Devices : 1
Checksum : 2f056516 - correct
Events : 0.4
Layout : near=2, far=1
Chunk Size : 64K
Number Major Minor RaidDevice State
this 0 8 16 0 active sync /dev/sdb
0 0 8 16 0 active sync /dev/sdb
1 1 8 32 1 active sync /dev/sdc
2 2 8 48 2 active sync /dev/sdd
3 3 8 64 3 active sync /dev/sde
4 4 8 80 4 active sync /dev/sdf
5 5 8 96 5 active sync /dev/sdg
6 6 8 112 6 spare /dev/sdh

The output above gives the array's unique identifier (UUID) and the names of its member devices. You can then assemble the array with the command shown earlier, or assemble it by UUID; mdadm automatically skips devices whose metadata does not match (such as /dev/sda and /dev/sda1).

# ./mdadm -Av --uuid=0cabc5e5:842d4baa:e3f6261b:a17a477a /dev/md0 /dev/sd*
mdadm: looking for devices for /dev/md0
mdadm: no recogniseable superblock on /dev/sda
mdadm: /dev/sda has wrong uuid.
mdadm: no recogniseable superblock on /dev/sda1
mdadm: /dev/sda1 has wrong uuid.
mdadm: no RAID superblock on /dev/sdi
mdadm: /dev/sdi has wrong uuid.
mdadm: /dev/sdi1 has wrong uuid.
mdadm: no RAID superblock on /dev/sdj
mdadm: /dev/sdj has wrong uuid.
mdadm: /dev/sdj1 has wrong uuid.
mdadm: no RAID superblock on /dev/sdk
mdadm: /dev/sdk has wrong uuid.
mdadm: /dev/sdk1 has wrong uuid.
mdadm: /dev/sdb is identified as a member of /dev/md0, slot 0.
mdadm: /dev/sdc is identified as a member of /dev/md0, slot 1.
mdadm: /dev/sdd is identified as a member of /dev/md0, slot 2.
mdadm: /dev/sde is identified as a member of /dev/md0, slot 3.
mdadm: /dev/sdf is identified as a member of /dev/md0, slot 4.
mdadm: /dev/sdg is identified as a member of /dev/md0, slot 5.
mdadm: /dev/sdh is identified as a member of /dev/md0, slot 6.
mdadm: added /dev/sdc to /dev/md0 as 1
mdadm: added /dev/sdd to /dev/md0 as 2
mdadm: added /dev/sde to /dev/md0 as 3
mdadm: added /dev/sdf to /dev/md0 as 4
mdadm: added /dev/sdg to /dev/md0 as 5
mdadm: added /dev/sdh to /dev/md0 as 6
mdadm: added /dev/sdb to /dev/md0 as 0
mdadm: /dev/md0 has been started with 6 drives and 1 spare.
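
When the UUID has to be fed into a script rather than copied by hand, it can be cut out of the --examine output. This is only a sketch: the helper name examine_uuid is invented here, and it assumes the 0.90 metadata layout shown in this article, where the field is labelled plain "UUID".

```shell
#!/bin/sh
# Read `mdadm -E <device>` output on stdin and print the array UUID.
# Assumes a line of the form "UUID : <value>" (0.90 metadata, as above).
examine_uuid() {
    awk '$1 == "UUID" && $2 == ":" { print $3 }'
}

# Sample lines copied from the --examine output earlier in the article.
sample='/dev/sdb:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 0cabc5e5:842d4baa:e3f6261b:a17a477a
     Raid Level : raid10'

uuid=$(printf '%s\n' "$sample" | examine_uuid)

# Print the assemble command for review instead of running it.
echo "mdadm -A --uuid=$uuid /dev/md0 /dev/sd[b-h]"
```

On a live system the sample text would be replaced by something like mdadm -E /dev/sdb | examine_uuid.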

Configuration file:
/etc/mdadm.conf is the default configuration file. It mainly serves to keep track of the software RAID configuration, and in particular to configure monitoring and event reporting. The assemble command can also take --config (or its short form -c) to specify a different configuration file. A configuration file is usually created as follows:

# echo DEVICE /dev/sdc1 /dev/sdb1 /dev/sdd1 > /etc/mdadm.conf
# mdadm --detail --scan >> /etc/mdadm.conf

When an array is started from a configuration file, mdadm looks up the devices and arrays listed in the file and then starts every array that can run. If an array device name is given, only that array is started.

# ./mdadm -As
mdadm: /dev/md1 has been started with 3 drives.
mdadm: /dev/md0 has been started with 6 drives and 1 spare.
# cat /proc/mdstat
Personalities : [raid0] [raid10]
md0 : active raid10 sdb[0] sdh[6](S) sdg[5] sdf[4] sde[3] sdd[2] sdc[1]
3145536 blocks 64K chunks 2 near-copies [6/6] [UUUUUU]
md1 : active raid0 sdi1[0] sdk1[2] sdj1[1]
7337664 blocks 32k chunks
unused devices: <none>
# ./mdadm -S /dev/md0 /dev/md1
mdadm: stopped /dev/md0
mdadm: stopped /dev/md1
# ./mdadm -As /dev/md0
mdadm: /dev/md0 has been started with 6 drives and 1 spare.
# cat /proc/mdstat
Personalities : [raid0] [raid10]
md0 : active raid10 sdb[0] sdh[6](S) sdg[5] sdf[4] sde[3] sdd[2] sdc[1]
3145536 blocks 64K chunks 2 near-copies [6/6] [UUUUUU]
unused devices: <none>

Querying the status of an array
cat /proc/mdstat shows the status of all running arrays. The first line for each array gives the MD device name; active or inactive, indicating whether the array can be read and written; the RAID level; and then the member block devices. The number in square brackets [] after each device is its index in the array, (S) marks a hot spare, and (F) marks a failed disk. The second line gives the array size in kilobytes, the chunk size, and the layout, which differs between RAID levels. [6/6] with [UUUUUU] means the array has six devices and all six are working normally, while [6/5] with [_UUUUU] means five of the six are working normally and the device at the position marked by the underscore has failed.

# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [raid1]
md0 : active raid5 sdh[6](S) sdg[5] sdf[4] sde[3] sdd[2] sdc[1] sdb[0]
5242560 blocks level 5, 64k chunk, algorithm 2 [6/6] [UUUUUU]
unused devices: <none>
# ./mdadm /dev/md0 -f /dev/sdh /dev/sdb
mdadm: set /dev/sdh faulty in /dev/md0
mdadm: set /dev/sdb faulty in /dev/md0
# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [raid1]
md0 : active raid5 sdh[6](F) sdg[5] sdf[4] sde[3] sdd[2] sdc[1] sdb[7](F)
5242560 blocks level 5, 64k chunk, algorithm 2 [6/5] [_UUUUU]
unused devices: <none>
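
The [configured/working] pair described above is enough to spot a degraded array from a script. A minimal sketch, assuming the mdstat layout shown in this article; mdstat_degraded is a hypothetical helper name.

```shell
#!/bin/sh
# Print "<array> degraded <configured>/<working>" for every array whose
# working device count is lower than its configured device count.
mdstat_degraded() {
    awk '
        /^md[0-9]+ :/ { array = $1 }
        match($0, /\[[0-9]+\/[0-9]+\]/) {
            counts = substr($0, RSTART + 1, RLENGTH - 2)   # e.g. "6/5"
            split(counts, n, "/")
            if (n[2] + 0 < n[1] + 0) print array, "degraded", counts
        }
    '
}

# Sample input copied from the degraded RAID 5 output above; on a live
# system use: mdstat_degraded < /proc/mdstat
mdstat_degraded <<'EOF'
md0 : active raid5 sdh[6](F) sdg[5] sdf[4] sde[3] sdd[2] sdc[1] sdb[7](F)
      5242560 blocks level 5, 64k chunk, algorithm 2 [6/5] [_UUUUU]
EOF
```

For the sample this prints md0 degraded 6/5. The device indices such as sdb[7] are not mistaken for the status field because they contain no slash.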

The mdadm command can also show brief information (--query, or its abbreviation -Q) and detailed information (--detail, or its abbreviation -D) about a given array. The detailed information includes the RAID version, creation time, RAID level, array capacity, used device size, number of devices, superblock status, update time, UUID, the state of each device, the RAID algorithm and layout, and the chunk size. Device states include active, sync, spare, faulty, rebuilding, and removed.

# ./mdadm --query /dev/md0
/dev/md0: 2.100GiB raid10 6 devices, 1 spare. Use mdadm --detail for more detail.
# ./mdadm --detail /dev/md0
/dev/md0:
Version : 00.90.03
Creation Time : Sun Aug 22 17:49:53 1999
Raid Level : raid10
Array Size : 3145536 (3.00 GiB 3.22 GB)
Used Dev Size : 1048512 (1024.11 MiB 1073.68 MB)
Raid Devices : 6
Total Devices : 7
Preferred Minor : 0
Persistence : Superblock is persistent
Update Time : Sun Aug 22 21:55:02 1999
State : clean
Active Devices : 6
Working Devices : 7
Failed Devices : 0
Spare Devices : 1
Layout : near=2, far=1
Chunk Size : 64K
UUID : 0cabc5e5:842d4baa:e3f6261b:a17a477a
Events : 0.122
Number Major Minor RaidDevice State
0 8 16 0 active sync /dev/sdb
1 8 32 1 active sync /dev/sdc
2 8 48 2 active sync /dev/sdd
3 8 64 3 active sync /dev/sde
4 8 80 4 active sync /dev/sdf
5 8 96 5 active sync /dev/sdg
6 8 112 - spare /dev/sdh
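
The Array Size in the output above follows from Used Dev Size and the RAID level. As a worked check, the sketch below reproduces the figures reported in this article; usable_kib is a hypothetical helper that assumes equally sized members and ignores chunk rounding and metadata overhead.

```shell
#!/bin/sh
# usable_kib LEVEL NDEVICES DEVICE_KIB [COPIES]
# Approximate usable capacity in KiB for equally sized member devices.
usable_kib() {
    level=$1; n=$2; size=$3; copies=${4:-2}
    case $level in
        0)  echo $((n * size)) ;;            # striping: all capacity usable
        1)  echo "$size" ;;                  # mirroring: one device's worth
        5)  echo $(((n - 1) * size)) ;;      # one device's worth of parity
        6)  echo $(((n - 2) * size)) ;;      # two devices' worth of parity
        10) echo $((n * size / copies)) ;;   # near-copies layout
        *)  echo "unsupported level: $level" >&2; return 1 ;;
    esac
}

usable_kib 10 6 1048512   # RAID 10 above: 6 devices, 2 near-copies
usable_kib 5 6 1048512    # RAID 5 above: 6 devices
```

The two calls print 3145536 and 5242560, matching the Array Size of the RAID 10 array and the block count of the RAID 5 array shown in /proc/mdstat.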

Managing the array
In manage mode, mdadm can add disks to and remove disks from a running array. It is commonly used to mark failed disks, add spare disks, and remove failed disks from the array. Use --fail (or its abbreviation -f) to mark a disk as faulty.

# ./mdadm /dev/md0 --fail /dev/sdb
mdadm: set /dev/sdb faulty in /dev/md0

Once a disk has been marked as faulty, use --remove (or its abbreviation -r) to remove it from the array. A device that is still in use by the array cannot be removed.

# ./mdadm /dev/md0 --remove /dev/sdb
mdadm: hot removed /dev/sdb
# ./mdadm /dev/md0 --remove /dev/sde
mdadm: hot remove failed for /dev/sde: Device or resource busy
mdadm: hot remove failed for /dev/sde: Device or resource busy

If the array has a spare disk, the data of the failed disk is rebuilt onto the spare automatically:

# ./mdadm -f /dev/md0 /dev/sdb ; cat /proc/mdstat
mdadm: set /dev/sdb faulty in /dev/md0
Personalities : [raid0] [raid10]
md0 : active raid10 sdh[6] sdb[7](F) sdc[0] sdg[5] sdf[4] sde[3] sdd[2]
3145536 blocks 64K chunks 2 near-copies [6/5] [U_UUUU]
[=======>........] recovery = 35.6% (373888/1048512) finish=0.1min speed=93472K/sec
unused devices: <none>

If the array has no hot spare, use --add (or its abbreviation -a) to add one:

# ./mdadm /dev/md0 --add /dev/sdh
mdadm: added /dev/sdh
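
Taken together, swapping out a disk is the fail, remove, add sequence shown in this section. The sketch below only prints the three commands in that order so they can be reviewed before being run; replace_disk_plan is a made-up helper name, not an mdadm feature.

```shell
#!/bin/sh
# replace_disk_plan ARRAY FAILED_DEVICE NEW_DEVICE
# Prints the manage-mode commands for replacing a disk, in order.
replace_disk_plan() {
    array=$1; bad=$2; new=$3
    echo "mdadm $array --fail $bad"      # mark the disk as faulty
    echo "mdadm $array --remove $bad"    # then detach it from the array
    echo "mdadm $array --add $new"       # finally add the replacement
}

replace_disk_plan /dev/md0 /dev/sdb /dev/sdh
```

The order matters because, as noted above, a device that the array is still using cannot be removed.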

Monitoring the array
mdadm can monitor RAID arrays. The monitor periodically checks whether any of the specified events has occurred and handles it according to the configuration. For example, it can email the administrator when a disk in the array has a problem, invoke a callback program that replaces the disk automatically, and record every monitored event in the system log. The events currently supported by mdadm include RebuildStarted, RebuildNN (NN is 20, 40, 60, or 80), RebuildFinished, Fail, FailSpare, SpareActive, NewArray, DegradedArray, MoveSpare, SparesMissing, and TestMessage.
The command below configures the mdadm monitor to poll the MD device every 300 seconds. When an error occurs in the array, it sends an email to the specified user, runs the event handler, and records the event in the system log. The --daemonise option (or its short form -f) keeps the program running in the background. Mail is sent through the sendmail program, so if the address is on an external network, first test that mail can actually be delivered.

# ./mdadm --monitor --mail=[email protected] --program=/root/md.sh \
--syslog --delay=300 /dev/md0 --daemonise
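
The article does not show the contents of /root/md.sh. As a sketch of what such a handler might look like: mdadm runs the program with the event name, the MD device, and, for some events, the affected component device as arguments. The function name format_md_event, the severity labels, and the md-event log tag are all assumptions made for this example.

```shell
#!/bin/sh
# Hypothetical event handler for mdadm --monitor --program=...
# Arguments supplied by mdadm: EVENT MD_DEVICE [COMPONENT_DEVICE]
format_md_event() {
    event=$1; array=$2; component=${3:-}
    case $event in
        Fail|FailSpare|DegradedArray|SparesMissing) severity=warning ;;
        *) severity=info ;;
    esac
    if [ -n "$component" ]; then
        echo "$severity: $event on $array ($component)"
    else
        echo "$severity: $event on $array"
    fi
}

# When invoked by mdadm with arguments, send the message to syslog.
if [ "$#" -ge 2 ]; then
    format_md_event "$@" | logger -t md-event
fi
```

For example, format_md_event Fail /dev/md0 /dev/sdb produces the line "warning: Fail on /dev/md0 (/dev/sdb)".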