Six data desensitization schemes that big tech companies use to guard against "insiders" who leak data

Time:2021-6-14

For the past few days I kept getting strange phone calls at home: "Hey bro, you're XXX, right? This is the XXX high-end men's private club...". What the heck? I was stunned at first, then cursed right back at them. Then I dropped the tough-guy face, turned around, and put on a slightly ingratiating smile: "Honey, listen to me, I really didn't do anything, you have to believe me!"

Smack~

After rubbing my face and thinking it over, I concluded that some unscrupulous website must have sold my personal information. These days we are all practically running naked on the Internet, and personal information no longer belongs to the individual. This kind of thing hardly surprises anyone anymore. There are many reasons it happens, and an "insider" is often one of them.

As developers, what we can do is try our best to keep user data from leaking. Today, let's talk about a technique Internet companies use internally to prevent private data from leaking: data desensitization.

What is data desensitization

What is data desensitization? Also called data masking or de-privatization, it is a technical means of transforming or modifying sensitive data, such as mobile phone numbers and bank card numbers, according to given desensitization rules and policies, so that the sensitive data cannot be used directly in an untrusted environment.

Governments, the medical industry, financial institutions, and mobile operators were among the earliest to apply data desensitization, because they hold users' most critical private data and the consequences of a leak are immeasurable.

Data desensitization is quite common in everyday life. For example, on a Taobao order details page, the merchant's account information is masked with *. This is one form of data desensitization.

Data desensitization is divided into static data desensitization (SDM) and dynamic data desensitization (DDM):

Static data desensitization

Static data desensitization (SDM): suitable for extracting data from the production environment, desensitizing it, and then distributing it for testing, development, training, data analysis, and similar scenarios.

Sometimes we need to copy data from the production environment into a test or development database in order to troubleshoot problems or analyze data. For security reasons, however, sensitive data must not be stored in a non-production environment. In that case, the sensitive data has to be desensitized as it leaves production and only then used in the non-production environment.

This way, the desensitized data is isolated from the production environment, which meets the business need while keeping production data safe.

For example, a user's real name, mobile phone number, ID number, and bank card number can be desensitized through replacement, invalidation, shuffling, symmetric encryption, and so on.
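
The static flow described above can be sketched roughly as follows. This is only an illustration using two SQLite connections to stand in for the production and test databases; the `users` table, its columns, and the helper names `mask_phone` and `export_desensitized` are all assumptions made for the example, not part of any real system.

```python
import sqlite3


def mask_phone(phone: str) -> str:
    """Keep the first 3 and last 4 characters, hide the rest with '*'."""
    if len(phone) < 8:
        return "*" * len(phone)
    return phone[:3] + "*" * (len(phone) - 7) + phone[-4:]


def mask_name(name: str) -> str:
    """Keep only the first character of the name."""
    return name[:1] + "*" * max(len(name) - 1, 0)


def export_desensitized(prod: sqlite3.Connection, test: sqlite3.Connection) -> None:
    """Copy the hypothetical users table from prod to test, masking on the way."""
    test.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, phone TEXT)"
    )
    rows = prod.execute("SELECT id, name, phone FROM users")
    test.executemany(
        "INSERT INTO users VALUES (?, ?, ?)",
        ((uid, mask_name(name), mask_phone(phone)) for uid, name, phone in rows),
    )
    test.commit()
```

The point of the sketch is that the masking happens during the copy, so the real values never reach the non-production database.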

Dynamic data desensitization

Dynamic data desensitization (DDM): generally used in the production environment, it desensitizes sensitive data in real time as the data is accessed. The same sensitive data sometimes needs different levels of desensitization depending on who is reading it; for example, different roles and permissions trigger different desensitization schemes.

Note: while erasing the sensitive content, we also need to preserve the original data characteristics, business rules, and relationships between data, so that development, testing, and data analysis are not affected by desensitization and the data stays consistent and valid before and after it. In a word: desensitize however you like, just don't get in the way of my using the data.
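
As a rough sketch of the role-based idea, the snippet below picks a masking function at read time according to the caller's role. The role names (`admin`, `support`) and the policy table are hypothetical; a real system would take them from its own permission model.

```python
def mask_middle(value: str, keep_head: int = 3, keep_tail: int = 4) -> str:
    """Hide everything except the first keep_head and last keep_tail characters."""
    hidden = len(value) - keep_head - keep_tail
    if hidden <= 0:
        return "*" * len(value)
    return value[:keep_head] + "*" * hidden + value[len(value) - keep_tail:]


def mask_all(value: str) -> str:
    """Hide the whole value."""
    return "*" * len(value)


# Hypothetical policy: which view of a phone number each role is allowed to see.
PHONE_POLICY = {
    "admin": lambda value: value,  # full value
    "support": mask_middle,        # partially masked
}


def read_phone(phone: str, role: str) -> str:
    """Return the phone number as the given role may see it (unknown roles see nothing)."""
    return PHONE_POLICY.get(role, mask_all)(phone)
```

The stored value is never changed; only the view returned to each caller differs, which is what distinguishes this from the static approach.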

Data desensitization scheme

A data desensitization system lets you define and write desensitization rules for different business scenarios, and it can desensitize a sensitive field in a database table without the data ever being written to disk.

There are many ways to desensitize data. The six schemes below are demonstrated one by one on the same sample data.

1. Invalidation

In the invalidation scheme, sensitive field values are truncated, encrypted, or hidden so that they no longer have any use value. Usually special characters (such as *) are substituted for the real value. This way of hiding sensitive data is simple, but the drawback is that users cannot tell the format of the original data. To obtain the complete information, the user has to be authorized to query it.

For example, we mask a real ID number with *, keeping only the leading "220724" and the trailing "3523" and hiding the digits in between. Very simple.
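
A minimal sketch of this kind of masking is shown below. The function name `invalidate` and its parameters are made up for the example, and the 18-digit ID number in the usage note is invented purely for illustration.

```python
def invalidate(value: str, keep_head: int = 6, keep_tail: int = 4) -> str:
    """Replace the middle of a value with '*', keeping a short head and tail."""
    hidden = len(value) - keep_head - keep_tail
    if hidden <= 0:
        # Too short to keep anything safely: hide the whole value.
        return "*" * len(value)
    return value[:keep_head] + "*" * hidden + value[len(value) - keep_tail:]
```

Calling `invalidate("220724199001013523")` on this invented ID number keeps the first six and last four digits and replaces the eight digits in between with asterisks.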

2. Random value

Random value replacement changes sensitive data by turning letters into random letters, digits into random digits, and text into randomly chosen text. The advantage of this scheme is that it preserves the original data format to a certain extent, and users often do not notice that the data has been changed.

Here the name and idnumber fields are desensitized by randomization. Randomizing names is a little special: it needs a dictionary of surnames and given names to draw from.
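
The idea can be sketched as follows. The tiny `SURNAMES` and `GIVEN_NAMES` lists are placeholders standing in for the dictionary data mentioned above, and both function names are assumptions made for the example.

```python
import random
import string

# Placeholder dictionaries; a real system would load much larger ones.
SURNAMES = ["Li", "Wang", "Zhang", "Liu"]
GIVEN_NAMES = ["Wei", "Fang", "Min", "Jie"]


def randomize_chars(value: str, rng: random.Random) -> str:
    """Swap each digit or ASCII letter for a random one of the same kind."""
    out = []
    for ch in value:
        if ch in string.digits:
            out.append(rng.choice(string.digits))
        elif ch in string.ascii_uppercase:
            out.append(rng.choice(string.ascii_uppercase))
        elif ch in string.ascii_lowercase:
            out.append(rng.choice(string.ascii_lowercase))
        else:
            out.append(ch)  # keep separators and other characters as they are
    return "".join(out)


def random_name(rng: random.Random) -> str:
    """Build a fake name from the surname and given-name dictionaries."""
    return f"{rng.choice(SURNAMES)} {rng.choice(GIVEN_NAMES)}"
```

Because every character keeps its class and position, the result still passes simple format checks such as length or "digits only", which is why this scheme is hard to spot.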

3. Data replacement

Data replacement is similar to the invalidation method above. The difference is that instead of masking with special characters, a fixed dummy value replaces the true value. For example, we set every mobile phone number to "13651300000".
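
A small sketch of the replacement scheme, assuming records are plain dictionaries. The `PLACEHOLDERS` table and the `replace_fields` name are invented for the example; only the dummy phone number comes from the text above.

```python
# Dummy value per sensitive field; fields not listed are left untouched.
PLACEHOLDERS = {"phone": "13651300000"}


def replace_fields(record: dict, placeholders: dict = PLACEHOLDERS) -> dict:
    """Return a copy of the record with sensitive fields set to fixed dummy values."""
    return {
        field: placeholders.get(field, value)
        for field, value in record.items()
    }
```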

4. Symmetric encryption

Symmetric encryption is a special, reversible desensitization method. Sensitive data is encrypted with a key and an algorithm, and the ciphertext keeps the same format as the original data as far as logical rules are concerned. The original data can be recovered by decrypting with the key, so the security of the key itself needs attention.
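
To illustrate what "reversible and format-preserving" means, here is a toy sketch that encrypts a digit string into another digit string of the same length using a small Feistel construction built on HMAC-SHA256. It is an assumption-laden teaching example, not a vetted cipher and not the scheme any particular product uses; for real data, a reviewed implementation of a standard such as NIST FF1 would be the safer choice. The function names and the round count are made up for the example.

```python
import hashlib
import hmac

ROUNDS = 8  # kept even so the two halves end up in their original positions


def _round_value(key: bytes, round_no: int, half: str) -> int:
    """Keyed pseudo-random integer derived from the round number and one half."""
    digest = hmac.new(key, f"{round_no}:{half}".encode(), hashlib.sha256).digest()
    return int.from_bytes(digest, "big")


def encrypt_digits(key: bytes, digits: str) -> str:
    """Encrypt a digit string into a digit string of the same length."""
    if len(digits) < 2 or any(ch not in "0123456789" for ch in digits):
        raise ValueError("expected at least two ASCII digits")
    u = len(digits) // 2
    v = len(digits) - u
    a, b = digits[:u], digits[u:]
    for i in range(ROUNDS):
        m = u if i % 2 == 0 else v
        c = (int(a) + _round_value(key, i, b)) % 10 ** m
        a, b = b, str(c).zfill(m)
    return a + b


def decrypt_digits(key: bytes, digits: str) -> str:
    """Invert encrypt_digits using the same key."""
    u = len(digits) // 2
    v = len(digits) - u
    a, b = digits[:u], digits[u:]
    for i in reversed(range(ROUNDS)):
        m = u if i % 2 == 0 else v
        c = (int(b) - _round_value(key, i, a)) % 10 ** m
        a, b = str(c).zfill(m), a
    return a + b
```

With the same key, `decrypt_digits(key, encrypt_digits(key, phone))` returns the original phone number, while anyone without the key sees only another plausible-looking digit string.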

5. Average value

The average value scheme is often used in statistical scenarios. For numerical data, we first compute the mean and then distribute the desensitized values randomly around that mean, keeping the sum of the data unchanged.

After the price field is processed with the average value scheme, the total of the field remains unchanged, but the desensitized values all lie around the mean of 60.
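
One way to get values that scatter around the mean while keeping the total exactly the same is sketched below for integer amounts. The function name, the `spread` parameter, and the pairing trick are assumptions for the example rather than a prescribed algorithm.

```python
import random


def average_desensitize(values: list, spread: int, rng: random.Random) -> list:
    """Replace integer values with numbers near their mean, preserving the sum."""
    if not values:
        return []
    n = len(values)
    base, remainder = divmod(sum(values), n)
    # Start every entry at the (integer) mean, spreading any remainder one by one.
    result = [base + 1 if i < remainder else base for i in range(n)]
    # Add a random delta to one entry and subtract it from its neighbour,
    # so each pair moves away from the mean without changing the total.
    for i in range(0, n - 1, 2):
        delta = rng.randint(-spread, spread)
        result[i] += delta
        result[i + 1] -= delta
    rng.shuffle(result)
    return result
```

For a hypothetical price column such as `[30, 45, 60, 75, 90]` with `spread=5`, every output value falls between 55 and 65 and the values still add up to 300. With an odd number of rows, one entry stays exactly at the mean.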

6. Offset and rounding

This scheme changes numeric data by a random shift and then rounds it, which keeps the data safe while preserving the approximate truth of its range. Compared with the previous schemes, the result is closer to the real data, which makes it very useful in big data analysis scenarios.

For example, the create_time value 2020-12-08 15:12:25 becomes 2018-01-02 15:00:00.
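
For a timestamp field, the scheme can be sketched as a random shift by whole days followed by truncation to the hour. The function name and the `max_days` window are assumptions for the example; the example in the text shifts the date much further, so the window size is clearly a tunable choice.

```python
import random
from datetime import datetime, timedelta


def offset_and_round(ts: datetime, max_days: int, rng: random.Random) -> datetime:
    """Shift a timestamp by up to max_days in either direction, then drop minutes and seconds."""
    shifted = ts + timedelta(days=rng.randint(-max_days, max_days))
    return shifted.replace(minute=0, second=0, microsecond=0)
```

Applied to `datetime(2020, 12, 8, 15, 12, 25)`, the result always lands on 15:00:00 of some day inside the chosen window, so hour-of-day statistics survive while the exact moment does not.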

In practice, desensitization rules often combine several of these schemes to achieve a higher level of security.
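
Combining schemes can be as simple as a per-field rule table. The sketch below is hypothetical: the field names, the `RULES` mapping, and the individual rules are chosen only to show three different schemes applied to one record.

```python
# One rule per sensitive field; each rule is a function from old value to new value.
RULES = {
    "name": lambda v: v[:1] + "*" * max(len(v) - 1, 0),  # invalidation
    "phone": lambda v: "13651300000",                    # data replacement
    "price": lambda v: v // 10 * 10,                     # rounding down to the nearest 10
}


def desensitize(record: dict, rules: dict = RULES) -> dict:
    """Apply the matching rule to each field; fields without a rule pass through."""
    return {
        field: rules[field](value) if field in rules else value
        for field, value in record.items()
    }
```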

Summary

Whether static or dynamic, the ultimate goal of desensitization is to prevent private data from being abused inside the organization and from leaving the organization without being desensitized. As programmers, not leaking data is a matter of basic integrity.

This work is released under a CC license. Reprints must credit the author and link to this article.
