[North Asia Data Recovery] StorNext file system data recovery case on a Kunteng series storage server

Time:2022-5-14

Server data recovery environment:
Kunteng series storage;
9 disk cabinets, each with 24 hard disks;
8 cabinets store data and 1 cabinet stores metadata;
The metadata cabinet holds 24 146 GB hard disks: 8 RAID 1 groups, 1 RAID 10 group of 4 disks, and 4 global hot spare disks;
The data cabinets hold 192 hard disks in total: 32 RAID 5 groups of 6 disks each, divided into 2 storage systems.
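The disk counts above can be cross-checked with a few lines of arithmetic. This is only a sanity check; the variable names are illustrative, and the 2-disk size of each RAID 1 group is an assumption inferred from the totals.

```python
# Cross-check the cabinet layout described above.
DISKS_PER_CABINET = 24

# Metadata cabinet: 8 RAID 1 groups (assumed 2 disks each),
# one 4-disk RAID 10 group, and 4 global hot spares.
metadata_disks = 8 * 2 + 1 * 4 + 4

# Data cabinets: 8 cabinets, organised into 6-disk RAID 5 groups.
data_disks = 8 * DISKS_PER_CABINET
raid5_groups = data_disks // 6

print(metadata_disks, data_disks, raid5_groups)
```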

Fault:
Two hard disks in one RAID group of one data storage system went offline one after the other. The RAID group failed, and the whole storage system crashed and became unusable. The administrator contacted the North Asia Data Recovery Center to recover the data.
The storage and file system architecture is roughly as follows:
[Figure: storage and file system architecture]
Note: meta_Lun (metadata volume), _Lun (user data volume)

Data recovery process:
1. To prevent secondary damage to the original disks from misoperation during recovery, first back up the original storage environment.
Number and label the six disks in the failed RAID group, remove them from the storage cabinet, connect them to the backup server at the North Asia Data Recovery Center, and take a full image of each disk.
Back up the remaining healthy RAID groups at the storage level: connect the North Asia dedicated backup server to the storage device with fiber-optic cable, configure the Kunteng storage management interface so that the backup server and the storage device can communicate, and use software to image the LUNs in each RAID group.
During the backup, the North Asia engineers found that one of the failed hard disks in the failed RAID group had a large number of bad sectors and could not be imaged normally. They opened the disk, replaced the firmware, and repaired it with PC-3000 tools, after which the disk was imaged successfully.
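The idea of imaging a disk while tolerating unreadable areas can be sketched as follows. This is a generic illustration, not the procedure used by PC-3000 or by the engineers in this case; the function name `image_disk`, the 512-byte sector size, and the chunk size are all assumptions.

```python
SECTOR = 512  # assumed sector size


def image_disk(src, dst, total_sectors, chunk_sectors=128):
    """Copy src to dst, zero-filling sectors that cannot be read.

    src and dst are seekable binary file objects. Returns the list of
    unreadable sector numbers so they can be retried later with
    specialised tools.
    """
    bad = []
    sector = 0
    while sector < total_sectors:
        count = min(chunk_sectors, total_sectors - sector)
        try:
            src.seek(sector * SECTOR)
            data = src.read(count * SECTOR)
            if len(data) != count * SECTOR:
                raise OSError("short read")
            dst.seek(sector * SECTOR)
            dst.write(data)
        except OSError:
            # The chunk contains at least one bad sector:
            # retry it one sector at a time.
            for s in range(sector, sector + count):
                try:
                    src.seek(s * SECTOR)
                    data = src.read(SECTOR)
                    if len(data) != SECTOR:
                        raise OSError("short read")
                except OSError:
                    data = b"\x00" * SECTOR
                    bad.append(s)
                dst.seek(s * SECTOR)
                dst.write(data)
        sector += count
    return bad
```

Recording the unreadable sectors, rather than aborting on the first error, keeps the image aligned with the original disk so that later RAID analysis can still address data by offset.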

[Figure: partial image files]

2. Data analysis.
Analyze the failed RAID group and obtain its RAID parameters. Using these parameters, the North Asia engineers virtually reassembled the RAID array and exported each LUN in the group to an image file. The analysis also showed that the seriously damaged hard disk was the disk that had dropped offline.
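Virtual reassembly of a RAID 5 group relies on the fact that any single missing member can be recomputed as the XOR of the remaining members in the same stripe row. The sketch below illustrates this under stated assumptions: a left-symmetric parity rotation and a caller-supplied stripe unit size. The article does not give the actual layout or unit size of the Kunteng arrays, and the function names are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


def rebuild_missing(units):
    """Given one stripe row with exactly one None entry, recompute it."""
    missing = units.index(None)
    fixed = list(units)
    fixed[missing] = xor_blocks([u for u in units if u is not None])
    return fixed


def read_lun(disks, unit_size, rows):
    """Assemble LUN data from member images (None marks a missing member).

    Assumes a left-symmetric layout: parity starts on the last disk and
    moves one disk to the left per row; data units start on the disk
    right after the parity disk and wrap around.
    """
    n = len(disks)
    out = bytearray()
    for row in range(rows):
        off = row * unit_size
        units = [None if d is None else d[off:off + unit_size] for d in disks]
        if None in units:
            units = rebuild_missing(units)
        parity = (n - 1 - row) % n
        for k in range(1, n):
            out += units[(parity + k) % n]
    return bytes(out)
```

In practice the member order, unit size, and rotation are what the analysis step has to determine; with a wrong guess the assembled LUN will not contain consistent file system structures.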
Log in to the management interface of the Kunteng storage device and collect the basic volume information of the StorNext file system.
[Figure: volume information in the storage management interface]

Continue by analyzing the meta volume and the data volumes in the StorNext file system. In this case, the StorNext file system contains two data volumes, and each complete data volume is built from LUNs in several RAID groups. The North Asia engineers analyzed these LUNs, worked out the rule by which the LUNs are combined, and virtually reassembled each complete data volume.
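To illustrate what a combination rule between LUNs can look like, the sketch below maps an offset in a logical data volume to a LUN and an offset inside that LUN. It assumes plain round-robin striping with a fixed stripe size; the rule actually found in this case is not given in the article, so `locate`, `read_volume`, and `stripe_bytes` are hypothetical.

```python
def locate(volume_offset, lun_count, stripe_bytes):
    """Map a byte offset in the logical volume to (lun_index, lun_offset).

    Assumes round-robin striping across the LUNs.
    """
    stripe_no, within = divmod(volume_offset, stripe_bytes)
    lun_round, lun_index = divmod(stripe_no, lun_count)
    return lun_index, lun_round * stripe_bytes + within


def read_volume(luns, offset, length, stripe_bytes):
    """Read `length` bytes from the virtual volume built over LUN images."""
    out = bytearray()
    while length > 0:
        idx, lun_off = locate(offset, len(luns), stripe_bytes)
        # Never read past the end of the current stripe.
        take = min(length, stripe_bytes - lun_off % stripe_bytes)
        out += luns[idx][lun_off:lun_off + take]
        offset += take
        length -= take
    return bytes(out)


# Two tiny "LUNs" striped in 4-byte units.
luns = [b"AAAABBBB", b"aaaabbbb"]
print(read_volume(luns, 0, 16, 4))
```

Working through a mapping function like this avoids copying the LUN images into one large file: the volume exists only as a calculation over the existing images.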


The engineers then analyzed the node information and directory entry information in the meta volume, as well as the mapping between the meta volume and the data volumes. Because one meta volume manages multiple data volumes, the North Asia engineers worked out the indexing algorithm from the meta volume to the data volumes.

[Figure: file node]

[Figure: directory block]

3. The analysis yields all the information required for data recovery. The North Asia engineers wrote a program that scans the meta volume for node and directory entry information, parses the directory entries and nodes to rebuild the complete directory structure of the file system, parses the pointer information in each node, and records all of this in a database.
[Figure: file information]
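The scan-and-record step can be sketched as below. The on-disk format of StorNext metadata is not described in the article, so the record layouts here are entirely made up for illustration: fixed 64-byte blocks, the signatures `NODE` and `DENT`, and a single extent pointer per node. Only the overall flow (scan, parse, store in a database, rebuild paths) reflects the text.

```python
import sqlite3
import struct

BLOCK = 64
NODE_MAGIC = b"NODE"  # hypothetical signature
DENT_MAGIC = b"DENT"  # hypothetical signature
# magic, node id, data volume index, file size, extent start, extent length
NODE_FMT = "<4sIIQQQ28x"
# magic, parent node id, node id, name (NUL padded)
DENT_FMT = "<4sII52s"


def scan_meta(meta, db):
    """Scan a meta volume image and record nodes and directory entries."""
    db.execute("CREATE TABLE IF NOT EXISTS nodes (id INTEGER PRIMARY KEY, "
               "volume INTEGER, size INTEGER, ext_start INTEGER, ext_len INTEGER)")
    db.execute("CREATE TABLE IF NOT EXISTS dirents "
               "(parent INTEGER, node INTEGER, name TEXT)")
    for off in range(0, len(meta) - BLOCK + 1, BLOCK):
        block = meta[off:off + BLOCK]
        if block[:4] == NODE_MAGIC:
            _, nid, vol, size, start, length = struct.unpack(NODE_FMT, block)
            db.execute("INSERT OR REPLACE INTO nodes VALUES (?,?,?,?,?)",
                       (nid, vol, size, start, length))
        elif block[:4] == DENT_MAGIC:
            _, parent, nid, raw = struct.unpack(DENT_FMT, block)
            name = raw.rstrip(b"\x00").decode("utf-8", "replace")
            db.execute("INSERT INTO dirents VALUES (?,?,?)", (parent, nid, name))
    db.commit()


def full_path(db, node_id, root_id=1):
    """Rebuild a path by walking directory entries up to the root."""
    parts, seen = [], set()
    while node_id != root_id and node_id not in seen:
        seen.add(node_id)
        row = db.execute("SELECT parent, name FROM dirents WHERE node = ?",
                         (node_id,)).fetchone()
        if row is None:
            parts.append("#orphan")  # no parent entry was found
            break
        parts.append(row[1])
        node_id = row[0]
    return "/" + "/".join(reversed(parts))
```

Storing the parsed records in a database separates analysis from extraction: the slow scan runs once, and the extraction program can be rerun or resumed without touching the meta volume again.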

4. The North Asia engineers wrote a file extraction program that reads the database and extracts the data according to the parsed information and the aggregation algorithm between the two data volumes.
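A minimal sketch of such an extraction loop is shown below. It assumes a hypothetical `extents` table with one row per file fragment (path, sequence number, data volume index, start offset, length) and treats each data volume as a bytes-like object; none of these names come from the article.

```python
import os
import sqlite3


def extract_all(db, volumes, out_dir):
    """Write every file listed in the database under out_dir.

    Fragments are appended in sequence order. Returns the number of
    files written.
    """
    rows = db.execute(
        "SELECT path, volume, start, length FROM extents ORDER BY path, seq")
    written = set()
    for path, vol, start, length in rows:
        target = os.path.join(out_dir, path.lstrip("/"))
        os.makedirs(os.path.dirname(target) or ".", exist_ok=True)
        # Truncate on the first fragment of a file, append afterwards.
        mode = "ab" if target in written else "wb"
        with open(target, mode) as fh:
            fh.write(volumes[vol][start:start + length])
        written.add(target)
    return len(written)
```

For volumes of realistic size, `volumes` would typically be memory-mapped image files or a virtual volume reader rather than in-memory byte strings.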

Data validation & handover:
The recovered data was randomly sampled and tested, and no problems were found. All files were then extracted to local storage, and the data was handed over once the extraction was confirmed.
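Random sampling of recovered files can be partly automated. The sketch below picks a random sample and checks that each file starts with the signature expected for its extension. This is only one possible check and is not described in the article; the function name `spot_check` and the small signature table are assumptions, and files of unknown type still need to be opened manually.

```python
import os
import random

# A few well-known file signatures, keyed by extension.
SIGNATURES = {
    ".pdf": b"%PDF",
    ".png": b"\x89PNG",
    ".zip": b"PK\x03\x04",
    ".jpg": b"\xff\xd8\xff",
}


def spot_check(root, sample_size, seed=None):
    """Return (sampled paths, suspicious paths) for a random file sample."""
    paths = [os.path.join(d, f) for d, _, files in os.walk(root) for f in files]
    rng = random.Random(seed)
    sample = rng.sample(paths, min(sample_size, len(paths)))
    suspicious = []
    for p in sample:
        expected = SIGNATURES.get(os.path.splitext(p)[1].lower())
        if expected is None:
            continue  # unknown type: needs manual inspection
        with open(p, "rb") as fh:
            if fh.read(len(expected)) != expected:
                suspicious.append(p)
    return sample, suspicious
```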