Author's preface
Over the past 10 years, I have taken part in designing risk control systems at three companies in different fields, and I have studied every part of such a system from front to back. Even so, I still feel that I have only just got one foot in the door.
Most products have a clear goal from the start: it is obvious what an order, payment, or account system needs to do, and there are plenty of competing products to refer to. A risk control system is quite different. You cannot fully know what problems you will face in the future, so every feature has to be built with caution; one careless step in the wrong direction may force you to tear everything down and start over at some later stage.
Security requirements are usually short of R&D resources, so teams often end up in an awkward position at some point: the problem cannot be solved with the system as it stands, and reworking it costs a great deal of time and communication.
So here I will share some of the pitfalls I have run into, so that those preparing to build a risk control system know what to expect.
Business security risk control design 101: information collection
Business risk control mainly includes four things:
1. Get enough data
2. Build an analysis platform flexible enough to analyze the data
3. Output risk events so that risks can be blocked
4. Quantify the value of risk interception and continuously analyze cases to optimize the strategy
Data collection is close to being the deciding factor in whether a risk control system succeeds or fails. For reasons of space, this article focuses on that point. There are three things to consider:
1. The more detailed the data, the better
Take account security as an example. If you can get basic login and registration data, you can analyze it by frequency and by the characteristics of each login or registration.
If you can also get the context of the login and registration behavior, such as which pages were visited before and after login, you can add analysis dimensions from the browsing path, such as time spent on each page and whether pages that must be visited were actually visited.
If you can further get the user's operation data, such as mouse movement and keyboard input, you can add analysis dimensions from the operation itself. For example, was the password typed, deleted, and retyped several times? Were the account name and password pasted in directly?
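As a rough illustration of how each extra layer of data adds analysis dimensions, the sketch below derives one frequency feature from basic login events and a few operation features from keyboard events. The event shapes (`ip`, `action`, `type`, `key`) are assumptions made for this example, not a format the article defines.

```python
from collections import Counter


def logins_per_ip(events):
    """Frequency dimension: count login attempts per source IP."""
    return Counter(e["ip"] for e in events if e["action"] == "login")


def password_input_features(key_events):
    """Operation dimension: summarize how the password field was filled in.

    key_events is a hypothetical list of front-end events such as
    {"type": "key", "key": "a"} or {"type": "paste"}.
    """
    keys = [k for k in key_events if k["type"] == "key"]
    deletes = sum(1 for k in keys if k["key"] == "Backspace")
    return {
        "typed_keys": len(keys) - deletes,
        "deletes": deletes,
        "pasted": any(k["type"] == "paste" for k in key_events),
    }
```

Neither feature is a verdict on its own; they are inputs that a strategy can combine with others.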
2. Establish a standard log format
Once you have confirmed what data you can get, start establishing a standard log format.
Common logs such as login, registration, ordering, password changes, and changes to bound credentials should each be given a standard format, with careful attention to unifying field names. For example, if the password and user name fields are named differently in different logs, later analysis and strategy definition become very troublesome.
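One possible way to enforce unified field names is to define a single event record and map legacy names onto it at the point of ingestion. The sketch below is only an illustration: the `RiskEvent` fields and the `FIELD_ALIASES` table are hypothetical and would come from your own agreed format.

```python
from dataclasses import dataclass, field

# Hypothetical legacy names seen in different business logs, mapped to one standard name.
FIELD_ALIASES = {
    "username": "user_name",
    "uname": "user_name",
    "client_ip": "ip",
    "ua": "user_agent",
}


@dataclass
class RiskEvent:
    event_type: str          # e.g. "login", "register", "order"
    user_name: str
    ip: str
    timestamp: float         # seconds since the epoch
    user_agent: str = ""
    referer: str = ""
    extra: dict = field(default_factory=dict)  # business-specific fields


def normalize(raw: dict) -> RiskEvent:
    """Rename aliased fields and move unknown ones into `extra`."""
    renamed = {FIELD_ALIASES.get(k, k): v for k, v in raw.items()}
    known = set(RiskEvent.__dataclass_fields__) - {"extra"}
    core = {k: v for k, v in renamed.items() if k in known}
    extra = {k: v for k, v in renamed.items() if k not in known and k != "extra"}
    return RiskEvent(**core, extra=extra)
```

A record that lacks a required field fails at construction time with a `TypeError`, which surfaces format problems early instead of during later analysis.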
3. Data quality
Are all the necessary fields available?
Information that risk control cares about, such as the IP address, User-Agent, and Referer, is often overlooked, yet without it many strategies cannot be implemented. So draw up a clear list of required information at the very start of collection. If you compromise here, rework will be needed later, and R&D may well ignore the request.
Is the data accurate?
A common case: you need the user's access IP, but what you receive is the IP of an intranet server; or you want the user name, but a uid is passed instead. Avoiding this takes a lot of communication and confirmation up front. If you only discover after going live that the data is wrong, you will be left at a loss.
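Both questions above, whether the fields are present and whether they are accurate, can be partly checked automatically before launch. The following is a minimal sketch using Python's standard `ipaddress` module; the field list and the "all digits means uid" heuristic are assumptions for illustration and would need adjusting to your own data.

```python
import ipaddress

# Hypothetical list of fields agreed with R&D at the start of collection.
REQUIRED_FIELDS = ("event_type", "user_name", "ip", "user_agent", "referer")


def check_event(event: dict) -> list:
    """Return a list of data quality problems found in one event."""
    problems = []
    for name in REQUIRED_FIELDS:
        # A field counts as missing when the key is absent or None;
        # an empty Referer is legitimate for direct visits.
        if event.get(name) is None:
            problems.append(f"missing:{name}")

    ip = event.get("ip")
    if ip is not None:
        try:
            # Private and reserved ranges usually mean a proxy or
            # intranet server address was logged instead of the user's.
            if ipaddress.ip_address(ip).is_private:
                problems.append("ip:not_public")
        except ValueError:
            problems.append("ip:invalid")

    user_name = event.get("user_name")
    if user_name is not None and str(user_name).isdigit():
        # Heuristic only: an all-digit value may be a uid rather than a name.
        problems.append("user_name:looks_like_uid")
    return problems
```

Running such a check against a sample of real traffic during integration testing is one way to catch these mistakes before they reach production strategies.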
There are two ways to collect data: active mode and passive mode.
1. Active mode
In active mode, risk control reads the data itself from databases and logs.
This approach has relatively poor real-time performance and makes it hard to add new information, but it needs little cooperation from R&D. It suits situations where you would rather be self-reliant.
Of course, some mature companies have their own message bus, and risk control can subscribe to it in real time and use it as a data source for analysis, but such companies are a minority.
2. Passive mode
In passive mode, risk control provides an interface to R&D, and the business systems push messages to it in the standard format.
The cooperation cycle is long, but it yields high-quality information that follows the standard, so it is a fairly common way to build a risk control system.
Pitfalls
If messages come from multiple data sources, their arrival order must be considered
For example, the login log is sent by the public service, website access data is taken from the access log, and user operation data is sent from page JavaScript or an SDK, so the three arrive at different times.
Analysis and judgment should only be made once all the messages are confirmed to be in place. Otherwise, if a real-time strategy assumes that every login must be accompanied by keyboard or click events on the page, and the two kinds of data arrive at different times, the result may be a large number of mistaken bans and an incident.
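One way to wait until all messages are in place is to buffer them per session and only release a session to the strategy once every expected source has reported, or once a waiting period has passed. The sketch below assumes three source names and a shared `session_id`; both are hypothetical, as is the five-second default.

```python
REQUIRED_SOURCES = frozenset({"login", "access", "behavior"})


class SessionJoiner:
    """Hold messages per session until every required source has arrived."""

    def __init__(self, wait_seconds=5.0):
        self.wait_seconds = wait_seconds
        self._pending = {}  # session_id -> {"first_seen": float, "sources": dict}

    def add(self, session_id, source, payload, now):
        """Record one message; return the joined data once it is complete."""
        entry = self._pending.setdefault(
            session_id, {"first_seen": now, "sources": {}}
        )
        entry["sources"][source] = payload
        if REQUIRED_SOURCES.issubset(entry["sources"]):
            return self._pending.pop(session_id)["sources"]
        return None

    def expire(self, now):
        """Return sessions that waited too long and are still incomplete."""
        late = [
            sid for sid, entry in self._pending.items()
            if now - entry["first_seen"] >= self.wait_seconds
        ]
        return {sid: self._pending.pop(sid)["sources"] for sid in late}
```

Sessions returned by `expire` are incomplete by definition, so a cautious design would route them to a softer action such as review rather than an automatic ban.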
The quality of the collected data must be monitored regularly
Collected data can become inaccurate because of architecture changes, code updates, and all sorts of odd reasons. If this is not caught in time, it can cause errors throughout the subsequent analysis.
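A simple form of such monitoring is to compute, for each batch of events, the share of records in which each important field is empty, and to raise an alert when that share jumps. The sketch below is illustrative only; the field names and the 5% threshold are assumptions.

```python
def missing_rates(events, fields):
    """Fraction of events in which each field is absent or empty."""
    total = len(events)
    if total == 0:
        return {name: 0.0 for name in fields}
    return {
        name: sum(1 for e in events if not e.get(name)) / total
        for name in fields
    }


def fields_over_threshold(events, fields, max_missing=0.05):
    """Names of fields whose missing rate exceeds the allowed maximum."""
    rates = missing_rates(events, fields)
    return sorted(name for name, rate in rates.items() if rate > max_missing)
```

Run on a schedule, for example over each hour of collected events, a check like this can show that a code update has silently dropped a field before the strategies that depend on it start to misfire.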
Choose a stable business point for collection wherever possible. For example, collect login logs once at the public login service; if a problem comes up later, there is only one place to look.
If instead you collect at the front ends that call the login service, such as the web and mobile clients, the amount of work to change multiplies whenever there is a problem, and logs from newly added business entry points may go uncovered.
About technology selection:
A message queue is a must. A RESTful interface can only cope with business logs, such as logins, which arrive at a few per second at most. If you later want to collect page access behavior, at thousands of messages per second, you have to use a queue.
Among open source options, RabbitMQ or Kafka are worth considering; both are reasonably stable.
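The reason a queue helps at this volume is that producers only append a message and return, while the analysis side drains messages in batches at its own pace. The sketch below shows that pattern with Python's in-process `queue.Queue`, which is only a stand-in used to keep the example self-contained; it is not the RabbitMQ or Kafka client API, and the batch size and message shape are assumptions.

```python
import queue


def drain_batch(q, max_items):
    """Take up to max_items messages off the queue without blocking."""
    batch = []
    while len(batch) < max_items:
        try:
            batch.append(q.get_nowait())
        except queue.Empty:
            break
    return batch


# Producers (business services, page scripts) only enqueue and move on.
events = queue.Queue(maxsize=10000)
for i in range(2500):
    events.put_nowait({"event_type": "page_view", "seq": i})

# The consumer pulls a bounded batch, analyzes it, then comes back for more.
first = drain_batch(events, 1000)
```

With a real broker the same split applies: collection points publish and return immediately, and the risk control consumers control their own batch size and pace.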
About log storage:
ELK is a good choice and provides basic query capability for the later analysis platform.
Information collection is often the hardest step in implementing risk control, but it is also the most important. Coverage, quality, and timeliness all determine whether the project succeeds or fails.
There are many details that this article could not cover, and much of the difficulty in building such a system comes from the pressure of communication.
If you have questions about any of this, you are welcome to leave a message and discuss them with us. If you are interested in what comes next, please encourage the editor, and we will publish the following chapters as soon as possible.
About the author
Liu Ming, co-founder and chief product technology officer of Ma’an Technology Co., Ltd
With more than 6 years of experience in risk control and product work, he previously worked at NetEase, where he was responsible for account system security for World of Warcraft in China. He now leads the Internet business risk control team at Huaan technology, providing customers with risk control services including the flagship products warden and red. Q.