Hello everyone, I’m Xu Zhenwen. Today’s topic is “Tencent game big data service application practice based on Flink + servicemesh”, which is mainly divided into the following four parts:
- Introduction to background and Solution Framework
- Real time big data computing onedata
- Data interface service onefun
- Microservice & servicemesh
1、 Introduction to background and solution framework
1. Offline data operation and real-time data operation
We have been doing game data operations for a long time. Since 2013 we have run offline analysis of game data and applied the results to the operation of our games.
However, the data at that time had one defect: most of it was offline. Data generated today would only be pushed online the next day, after the calculation had finished. As a result, real-time data, real-time intervention with players, and real-time operations all worked poorly. For example, if I win a prize today but only receive the gift pack tomorrow, that is a frustrating experience for a player.
What we advocate now is “what I see is what I want” or “what I want, I want right away”. So since 2016, game data operations as a whole have gradually shifted from offline to real time. Offline data remains indispensable, though, because offline calculations, accumulated values and data calibration are all very valuable.
Real-time data mainly complements the game operation experience. For example, after finishing a match or completing a task, a player can immediately receive the corresponding reward or guidance on what to play next. This kind of timely stimulation and intervention gives users a better game experience.
This is true not only of games but of other fields as well. So when we built this system we combined offline with real time, while moving mainly toward real time. Big data in general is also heading toward real time as far as possible.
2. Application scenarios
1) In-game task system
The first scenario is the in-game task system, which you have probably seen. Take a “chicken dinner” battle royale game: how many rounds did you finish today, and did you share the result? Other activities build player records (“resumes”), and these are all real-time, especially those that need a full calculation or are shared to other communities. In the past, the reward would only arrive the day after the task was completed. Now every task can be acted on in real time.
The task system is a particularly important part of a game. It should not be seen as just a way to get players to complete tasks so that we can collect money. In fact, the task system gives players good guidance so that they get a better experience in the game.
2) Real-time ranking
Another very important application scenario is the in-game ranking list. For example, the star and King tiers in Honor of Kings are in fact a form of ranking. Our rankings can be more specific, such as today’s combat power ranking or today’s matchmaking ranking, both of which are real-time rankings computed globally. We also have a snapshot function: for example, when a snapshot is taken at 0:00, we can immediately give rewards to the players in that snapshot.
These two, the task system and the ranking list, are typical applications of real-time computing. We will introduce others later.
3. Game demand for data
Let’s talk about why such a platform exists. When we first did data operations, we developed in a siloed, hand-crafted workshop style. After receiving a requirement, we would do a resource review, data access and big data coding. After coding and data development, we would apply for online resources, release and verify, then develop the service interface on top of the computed data, and then develop the pages and online systems. Once all of that was released we did online monitoring, and finally resource recovery.
This worked fine in the early days. Why does it no longer fit? Mainly because the process is too long. Our requirements for game operations are now very high. For example, we need to bring in data mining capabilities: after the real-time big data calculation finishes, we combine the real-time user profile with the offline profile, recommend the tasks that suit each player, and then guide the player to complete them.
In this situation the old approach has a high threshold, because every step has to be done separately. The cost is high, data reusability is poor, mistakes are easy to make, and nothing accumulates. After each project, the code is thrown into a pile. At best, next time I remember that the code exists and borrow from it a little, but that kind of reuse is basically manual.
Therefore, we want a platform that integrates project creation, resource allocation, service development, online testing, independent deployment, service launch, online monitoring, effect analysis, resource recovery and project closure into a one-stop service.
Here we borrow the idea of DevOps: development and operation should be completed by the same person, with a system to support that. Once a service lives on the platform, the data it has calculated, such as the real-time number of logins or kills, can be reused, and these indicators can be shared by later services.
With such a platform, developers only need to focus on their development logic; the other two parts, release and online operation, are guaranteed by the platform. So we want a platform that unifies data calculation and interface services. Through data standardization and a unified data dictionary, different data applications can be built on top. This is our first goal.
This is how we all work now. The first principle is to follow the guiding ideas of DevOps. At Tencent in particular, the number of data services is very large: for example, we built 50,000 to 60,000 marketing services last year. Without a platform to support and manage these services, relying on people alone would be very costly.
4. Three modernizations: DevOps for big data applications
Our approach is the three modernizations, which realize the DevOps idea for big data applications:
- Standardization: process specification, data development specification and development framework;
- Automation: resource allocation, release online, monitoring deployment (this is indispensable in Devops);
- Integration: data development, data interface development, test release, operation and maintenance monitoring.
Therefore, we divide a big data application system into three parts: big data development, data service interface development, and the pages and clients that sit behind the interfaces. All of this development needs to be supported by a complete development process.
In this way we can provide one-stop data development and application services for all kinds of data application scenarios: unified activity management, data indicator calculation and development management, and automated production and management of data application interfaces.
Such a system can guarantee all of this, and we have also split it up sensibly. Do not mix big data computing and interfaces together. They must be decoupled, and this is a critical point.
5. Overall architecture of data service platform
1) Computing and storage
Take a look at this architecture; I think it is a useful reference. If you want to build a data service platform internally, the basic idea is the same. For the underlying IaaS you can use Tencent Cloud, Alibaba Cloud or another cloud service directly.
We mainly work on the upper layers. When building the system internally we do not concern ourselves with the computing and storage at the bottom; it is better to outsource that part. IaaS has matured to the point where a MySQL or Redis database, or Kafka, Pulsar, Flink and Storm, can be purchased directly on the cloud.
Our internal storage includes tredis and tspider, which are essentially upgraded versions of Redis and MySQL. If you are building your own platform, I suggest you do not need to pay much attention to this layer.
2) Service scheduling
The core of the system is the service scheduling layer in the middle. One part is a unified scheduling API: services in the upper layer send requests down, and they are scheduled in a unified way. The other part is workflow development, for which a scheduling system is indispensable; here we use a DAG scheduling engine. With it we can combine offline tasks, real-time tasks, real-time + offline, and offline + function interface services to handle more complex real-time data application scenarios.
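The DAG idea can be illustrated with a minimal sketch. It is not the platform's scheduling engine: the task names are hypothetical, and Python's standard `graphlib` stands in for the real scheduler simply to show how dependencies between offline, real-time and function tasks determine an execution order.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks it depends on.
# All task names are hypothetical examples, not real platform jobs.
dag = {
    "offline_profile": set(),
    "realtime_metrics": set(),
    "merge_profiles": {"offline_profile", "realtime_metrics"},
    "ranking_function": {"merge_profiles"},
}

def run_order(graph):
    """Return one valid execution order; raises CycleError if the graph has a cycle."""
    return list(TopologicalSorter(graph).static_order())

order = run_order(dag)
print(order)
```

A real scheduler would also run independent tasks in parallel and handle retries; the sketch only covers ordering.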
For example, in our current real-time ranking list, when the real-time computing task is submitted to Flink, a URL is passed to Flink along with it. Flink then sends all qualifying data to that URL. The URL is actually a function service, which sorts the data in Redis and finally produces the ranking list.
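As a rough sketch of what such a ranking function service does, the example below keeps the best score per player, returns the top entries, and can freeze a snapshot for reward settlement. It is an assumption-laden illustration: an in-memory dictionary stands in for the Redis sorted set, and the class and method names are hypothetical.

```python
import heapq

class RankingBoard:
    """In-memory stand-in for a Redis sorted set (hypothetical helper)."""

    def __init__(self):
        self.scores = {}  # player id -> best score seen so far

    def ingest(self, records):
        """Handle one batch of qualifying (player, score) records pushed by the stream job."""
        for player, score in records:
            if score > self.scores.get(player, float("-inf")):
                self.scores[player] = score

    def top(self, n):
        """Return the n highest-scoring (player, score) pairs, best first."""
        return heapq.nlargest(n, self.scores.items(), key=lambda kv: (kv[1], kv[0]))

    def snapshot(self):
        """Freeze the current ranking, for example at 0:00, for reward settlement."""
        return dict(self.scores)

board = RankingBoard()
board.ingest([("alice", 1200), ("bob", 900)])
board.ingest([("bob", 1500), ("carol", 1100)])
midnight = board.snapshot()          # later updates do not change this copy
board.ingest([("carol", 2000)])
print(board.top(2))
```

In production the sorting would live in Redis so that many function instances can share one ranking.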
Below the scheduling layer, the schedulers can be extended horizontally: a scheduler for Storm, a scheduler for Flink, a scheduler for Spark, and so on. In this area we can build up our own algorithm library, organized by scenario. For example, Flink SQL is wrapped in a package: SQL is passed in, and a pre-built jar package runs the calculation. Simple data triggering and rule evaluation can also be placed in this algorithm library.
The scheduler itself is not directly tied to business scenarios, but the algorithm library must be. At the lower level there is also a file channel, used for things like distributing jar packages. Here Tencent uses COS to transfer data and submit jar packages.
There is also a command pipeline, which targets machines. For example, submitting a Flink task must go through the command pipeline: it pulls the jar package down onto a machine and then submits the task to the Flink cluster. The data pipeline plays a similar role.
3) Various management functions
Other important parts, shown in the green section on the right, include operation monitoring, cluster management and system management (user permission management, business management, scenario management, menu configuration management, and so on), as well as the message center and help documents. These are necessary for the system as a whole.
There is also component management, covering big data components, functions and service binaries, all of which are managed here in a unified way.
Data assets are managed here too. The data indicators we generate through Flink or Storm, together with the labels and classifications we give them after calculation, are all treated as data assets.
The most important part is data table management. Whether we use Flink or Storm, the calculation always ends up anchored to a data table. The rest is straightforward.
There are also data reports: how much data is calculated every day, how many calculations succeed, how many tasks are running, and how many new tasks are added. All of this is available, including the release history of our versions.
There is also an external management console, which can be adapted to the business scenario. As you will see when we demonstrate it, our menu is relatively simple. Data access brings data from the source into Kafka or Pulsar. Data indicators are then calculated on the accessed data tables. For example, special-purpose jar packages handle mixed calculations across multiple tables; a series of such packages has been built for fixed scenarios.
After these tasks are finished, all big data results are exposed through external service APIs. Take the question of whether a game task has been completed: given a user ID, the API tells us whether that user’s task is done. Application scenarios like this can work directly with the API.
That is the whole process. It will become clearer as we go into the details.
2、 Real time big data computing onedata
1. Data development process
This is our overall data application process:
The game server first uploads data to the log server (the data access part), and the log server then forwards the data to Kafka or Pulsar, that is, to the message queue.
Next comes the data table, which is a description of the data. Indicators and data are developed against these table descriptions. We support three types of development here. The first is SQL. The second is a framework we have packaged, in which you fill in your own custom code and write the Flink program online.
The third is brand-new code written locally and then submitted to the system for testing. As mentioned, big data computing and data interfaces must be decoupled, and the way we decouple them is storage. We use Redis for storage, in a form that combines Redis with SSDs: RocksDB is added so that hot data is held in Redis while RocksDB persists the data to SSD. Read/write performance is therefore very good, and the whole disk serves as the data store, unlike ordinary Redis, which with big data has to use memory as its storage.
Once the computed data is in storage, the rest is simple. We provide two kinds of query service. One is the rule interface: for a calculated indicator, an interface can be generated with a few clicks. The other works on data written to the storage medium: I define the SQL or query method myself and then process the data to generate the interface.
Another way is to point the data in Flink or Storm directly at a function interface on our side. In the ranking example just mentioned, we provide an interface; after processing in Flink, the data is pushed to the function interface, which performs secondary processing on it.
That is the whole processing flow. What we described earlier is a comprehensive, managed and configurable big data processing service built on Flink and Storm. It mainly consumes data from Kafka, with Pulsar now used in small amounts.
This lowers the threshold for data development. Many people do not need to understand Flink or Storm: as long as they can write SQL or some simple logic functions, they can do big data development.
2. Unified data calculation
We went through some optimization along the way. Originally every calculation task was written as its own jar package and then went through editing, packaging, development and publishing. Later we split the work into three scenarios. One is SQL: whatever can be expressed in SQL, we express in SQL, and a single jar package executes the submitted SQL.
Another is the online WebIDE, which handles function logic. In Storm, for example, you would expose the bolt and the spout; after writing these two functions and setting the parallelism, you submit the job to run. Here, though, we implement it on Flink.
The third is scenario configuration: our customized jar packages are scheduled in a unified way and executed according to the scheduling logic.
3. Data computing service system
This is the flow of the whole onedata computing system. It supports three types: self-developed SQL, Flink SQL, and jar packages.
How did our self-developed SQL come about? We first used Storm, but Storm SQL was very inefficient. So we parse the SQL ourselves with a SQL parser and turn it into functions. After SQL is submitted, we translate it directly into Java bytecode and hand the bytecode to Storm for calculation.
We carried this method over to Flink. We will discuss the difference between the two approaches later. In terms of flexibility, our self-developed SQL is better than Flink SQL.
This is because we are building a platform. We cannot simply hand Flink SQL over and let it run, because we need statistics on the execution of the whole business logic: how much data the SQL processed, how many records were correct or erroneous, and any attenuation.
That is the basic flow. On top of it we build basic scenarios such as real-time statistics, for example PV and UV, which can be calculated with a standalone jar package by configuring a table. There are also real-time indicator services: kill counts, accumulated gold coins, number of games played, or the number of bottom-lane games in Honor of Kings can all serve as real-time indicators.
There is also a rule trigger service, which calls an interface when data in a table meets certain conditions, as well as real-time rankings and some customized services.
1) Self-developed SQL
Let’s look at how the self-developed SQL works. To avoid Hive-style deep function call stacks, we use the syntax abstraction produced by the SQL parser to generate a single function, so that many nested calls are no longer needed.
This is the function generation process. The generated code is one piece of code that carries the calculation logic. A single function does the work without going through a function call stack, so efficiency improves greatly: single-core throughput went from 80,000 to 200,000.
During processing, we compile the SQL into bytecode. After Flink consumes the data, it converts each record into the row form that the SQL function operates on. The row is then passed to the compiled class for execution, and the result is output.
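The general technique of turning a query into one flat generated function can be sketched as follows. This is only an analogy under stated assumptions: the talk describes generating Java bytecode from parsed SQL, whereas the sketch generates Python source for a single hypothetical query shape (`SELECT key, op(value) ... GROUP BY key`), and a plain dictionary stands in for external storage.

```python
def compile_aggregate(key_field, value_field, op):
    """Generate one flat update function for a GROUP BY aggregate (hypothetical helper)."""
    combine = {"max": "max(old, v)", "min": "min(old, v)", "sum": "old + v"}[op]
    # The whole calculation is emitted as a single function body, so running it
    # does not walk an expression tree or a chain of operator calls per row.
    source = (
        "def update(state, row):\n"
        f"    k = row[{key_field!r}]\n"
        f"    v = row[{value_field!r}]\n"
        "    if k in state:\n"
        "        old = state[k]\n"
        f"        state[k] = {combine}\n"
        "    else:\n"
        "        state[k] = v\n"
        "    return state\n"
    )
    namespace = {}
    exec(compile(source, "<generated>", "exec"), namespace)
    return namespace["update"]

# Roughly: SELECT user_id, MAX(level) FROM rows GROUP BY user_id
update = compile_aggregate("user_id", "level", "max")
state = {}  # stands in for external storage
for row in [
    {"user_id": "u1", "level": 3},
    {"user_id": "u1", "level": 7},
    {"user_id": "u2", "level": 5},
]:
    update(state, row)
print(state)
```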
The two kinds of SQL suit different scenarios. Flink SQL keeps state: to track a maximum value, for example, it must hold each user’s maximum in memory. Our self-developed SQL consists of functions we write ourselves, and it stores data in third-party storage such as tredis. Each step only reads and writes data and does not need to hold much in memory.
The state is therefore persisted in real time. Even if a job goes down, it can resume immediately, so calculations over 10 GB or 100 GB of data are not a problem. With Flink SQL, by contrast, the state must be held in memory and can only be recovered from a checkpoint after a failure.
Those are the application scenarios for the two kinds of SQL.
There is more we can do with our SQL. Our data is persisted in storage. Suppose two indicators are configured on the same table and the same dimension in storage, for example both keyed by QQ number. Can we finish the calculation in one pass? That is, consume the data once, calculate both indicators, and store once.
This situation is very common in big data computing, and we now handle it at the platform level. For example, one indicator counts logins and another tracks the highest level. The calculation logic differs, but the consumed data table is the same, the aggregation dimension is the same, and the aggregation key is the same. The data can then be consumed once and both indicators calculated together, which greatly reduces storage and computing costs.
Across all our games we now calculate more than 11,000 indicators over more than 2,600 storage dimensions, and the actual saving in computing and storage is about 60%.
This works for two SQL statements or many more. It is normal for us to calculate more than ten indicators on one table. That used to mean consuming the data more than ten times; now it is consumed only once. All of this is transparent to users. If user A configures an indicator on a dimension of this table and user B configures another on the same dimension, we consume once, calculate twice, store once, and both users get the same data they would have had otherwise.
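The “consume once, calculate several indicators, store once” idea can be sketched as below. Everything here is an illustrative assumption: the indicator names, the table name, and the dictionary used as storage are hypothetical, and the real platform does this inside its Flink jobs.

```python
from collections import defaultdict

def count(old, _value):
    """Reducer for a counting indicator such as number of logins."""
    return (old or 0) + 1

def maximum(old, value):
    """Reducer for a maximum indicator such as highest level."""
    return value if old is None else max(old, value)

# Hypothetical indicator definitions: (name, table, dimension, field, reducer).
INDICATORS = [
    ("login_count", "login_log", "qq", None, count),
    ("max_level", "login_log", "qq", "level", maximum),
]

def build_plan(indicators):
    """Group indicators that share the same table and aggregation dimension."""
    plan = defaultdict(list)
    for name, table, dimension, field, reducer in indicators:
        plan[(table, dimension)].append((name, field, reducer))
    return plan

def consume(plan, table, rows, store):
    """Read the rows once per (table, dimension) group and update all its indicators."""
    passes = 0
    for (plan_table, dimension), group in plan.items():
        if plan_table != table:
            continue
        passes += 1
        for row in rows:
            record = store.setdefault(row[dimension], {})  # one stored record per key
            for name, field, reducer in group:
                value = row[field] if field else None
                record[name] = reducer(record.get(name), value)
    return passes

rows = [
    {"qq": "10001", "level": 3},
    {"qq": "10001", "level": 8},
    {"qq": "10002", "level": 2},
]
store = {}
passes = consume(build_plan(INDICATORS), "login_log", rows, store)
print(passes, store)
```

Two indicators are configured, but the rows are read in a single pass and each key has a single stored record.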
3) Online real-time programming
- There is no need to build a local development environment;
- Online development test;
- Strict input and output management;
- Standardized input and output;
- One stop development test release monitoring.
Now for the online real-time programming just mentioned. It is very troublesome for developers to build a local Flink cluster for development and debugging, so we provide a test environment. The surrounding framework code is fixed and cannot be modified: consuming data, the processing harness, and finally writing the results into storage.
In this way we cover logic that is simple but still needs function code: more complex than SQL, yet simpler than developing your own jar package. You write code online, then submit and test it directly to see the output. The advantage is that the data reporting logic and the statistics logic are already packaged in, so you only need to take care of the business logic.
4. Flink feature application
- Time characteristic: monitoring based on event time watermark can reduce the amount of calculation and improve the accuracy;
- Asynchronous IO: improve throughput, ensure sequence and consistency.
When we first built this on Storm, we had to work with both the time the data was generated and the time it entered the message queue, using the timestamp in each message, and every message had to be compared. With Flink and watermarks, this part of the calculation can be reduced.
The actual test results are also quite good. On Storm we measured single-core QPS, including read/write and processing, with five threads per core. On Flink the same measure reaches 10,000, including the storage I/O cost of Redis.
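The event-time watermark idea can be sketched in a few lines. This is a simplified illustration, not Flink's API: the class name, window size and allowed delay are all assumptions. The watermark is taken as the largest event time seen minus an allowed delay, and a window is emitted once the watermark passes its end.

```python
class TumblingCounter:
    """Counts events per event-time window and closes windows by watermark (hypothetical)."""

    def __init__(self, window_size, max_delay):
        self.window_size = window_size
        self.max_delay = max_delay
        self.max_event_time = 0
        self.open_windows = {}  # window start -> event count
        self.closed = []        # (window start, count) in emission order
        self.dropped = 0        # events that arrived after their window closed

    def _watermark(self):
        return self.max_event_time - self.max_delay

    def on_event(self, event_time):
        start = event_time - event_time % self.window_size
        if start + self.window_size <= self._watermark():
            self.dropped += 1   # too late: its window has already been emitted
            return
        self.open_windows[start] = self.open_windows.get(start, 0) + 1
        self.max_event_time = max(self.max_event_time, event_time)
        # Emit every window whose end is at or before the new watermark.
        for window_start in sorted(self.open_windows):
            if window_start + self.window_size <= self._watermark():
                self.closed.append((window_start, self.open_windows.pop(window_start)))

counter = TumblingCounter(window_size=10, max_delay=5)
for t in [1, 4, 12, 9, 17, 3, 26]:   # 9 is late but tolerated, 3 is too late
    counter.on_event(t)
print(counter.closed, counter.dropped, counter.open_windows)
```

Because a window is finalized once, out-of-order events inside the allowed delay are still counted, and there is no need to compare every message against every other.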
On the other side, we fetch data from Redis, calculate the maximum and minimum values, and write them back to Redis when the calculation is done. All of this used to be synchronous, and the problem with synchronous I/O is low performance. So we are now changing it to asynchronous I/O. One constraint remains: the processing of each record must stay sequential. The value must first be read from Redis, then calculated, then written back, and only after that is the next record processed.
We make optimizations like this. Flink has features that preserve the consistency of our data while improving efficiency.
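The constraint that read, compute and write must stay sequential while I/O is asynchronous can be illustrated with a small sketch. It uses Python's `asyncio` rather than Flink's async I/O operator, an in-memory fake in place of Redis, and a per-key lock as one assumed way of keeping each key's read-compute-write atomic; all names are hypothetical.

```python
import asyncio

class FakeStore:
    """Async stand-in for a key-value store such as Redis (hypothetical)."""

    def __init__(self):
        self.data = {}

    async def get(self, key):
        await asyncio.sleep(0)  # yield control, as a real network call would
        return self.data.get(key)

    async def set(self, key, value):
        await asyncio.sleep(0)
        self.data[key] = value

async def update_max(store, locks, key, value):
    """Read, compare and write back under a per-key lock."""
    lock = locks.setdefault(key, asyncio.Lock())
    async with lock:  # other keys proceed concurrently; the same key is serialized
        current = await store.get(key)
        if current is None or value > current:
            await store.set(key, value)

async def run(events):
    store, locks = FakeStore(), {}
    await asyncio.gather(*(update_max(store, locks, k, v) for k, v in events))
    return store.data

result = asyncio.run(run([("u1", 5), ("u2", 9), ("u1", 8), ("u1", 2)]))
print(result)
```

Without the lock, all three updates for `u1` could read an empty value before any of them writes, and the last writer would win regardless of which value is largest.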
5. Unified big data development service – service case
Let me introduce some more cases. If you play League of Legends, the task system there was designed by us; next time you do one of its tasks, you can think of me. Other cases include Tianlong Babu, CF, the LBS Glory War Zone in Honor of Kings (real-time big data calculation + LBS data ranking), the daily activities in Honor of Kings (real-time data + interface + rule calculation), and showing which friends are online in real time and can be matched with you.
3、 Data interface service onefun
1. Export of data application
Next we introduce the function service. We ran into some problems here too. Once data is in storage, if the storage were opened up directly for anyone to use at will, storage load and management would become serious problems. So we adopted a solution similar to FaaS. We manage the metadata of what is stored in the database, and interfaces are configured against it. If you want to use my DB, I check your request against the maximum QPS of that DB, and you can only use the amount that has been approved.
For example, suppose the maximum QPS of my DB is only 100,000. If you apply for 110,000, the application cannot be granted; I can only ask the DB team to expand the Redis capacity and provide it to you after the expansion. This is why metadata management of our indicator data has to be connected to the interfaces.
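The admission check described above can be sketched as a simple quota ledger. The class, method and interface names are hypothetical; only the 100,000 and 110,000 figures come from the example in the talk.

```python
class QuotaError(Exception):
    """Raised when an application asks for more QPS than the storage has left."""

class StorageQuota:
    """Tracks granted QPS against the maximum QPS of one storage instance (hypothetical)."""

    def __init__(self, max_qps):
        self.max_qps = max_qps
        self.granted = {}  # interface name -> granted QPS

    def remaining(self):
        return self.max_qps - sum(self.granted.values())

    def apply(self, interface, qps):
        """Grant the request only if it fits within the remaining capacity."""
        if qps > self.remaining():
            raise QuotaError(
                f"{interface} asked for {qps} QPS, only {self.remaining()} available; "
                "expand the storage first"
            )
        self.granted[interface] = self.granted.get(interface, 0) + qps

    def expand(self, extra_qps):
        """Record a capacity expansion of the underlying storage."""
        self.max_qps += extra_qps

db = StorageQuota(max_qps=100_000)
try:
    db.apply("ranking_api", 110_000)   # exceeds the maximum, so it is rejected
    rejected = False
except QuotaError:
    rejected = True
db.expand(50_000)                      # after the storage is expanded...
db.apply("ranking_api", 110_000)       # ...the same request can be granted
print(rejected, db.remaining())
```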
2. Onefun: an all in one function execution engine
3. Online golang function execution engine based on SSA
Here we focus on golang. Our implementation is based on the SSA form of the Go language itself. We have written an executor: you submit the golang code you wrote, and it is loaded into the executor.
Code we write can also be accumulated as a function library and placed inside, so that it can be called at execution time. The syntax of code written this way is exactly the same as golang.
When we execute code here, each run is assigned a coroutine, and each coroutine is executed under a sandbox mechanism controlled through an external context. This gives us web-based golang development. It is a bit like a scripting language such as Lua: you write code online and submit it for execution directly.
4. Online function service engine based on V8 engine
5. Integrated function execution engine — function as a service
This is our online function writing process:
The lower right corner is the function code editing area. Once the code is written, click test in the black box on the left, and the output is written there. This greatly expands the development capability of our data platform. Previously we would finish the golang code, debug it, and then send it to the online environment for testing. Now the process is highly standardized. Take data sources: we can stipulate that you may only import data sources you have applied for and may not import data sources arbitrarily, and I can see which data sources you import and at what QPS.
- Reduce the start-up cost;
- Faster deployment pipeline;
- Faster development speed;
- Higher system security;
- Adapt to microservice architecture;
- Automatic expansion capability.
This is our one-stop approach. After developing a function we submit it directly, and we can see real-time reports in Prometheus + Grafana.
6. Case introduction
This is a typical application. During calculation, Flink filters the data and then makes a remote call, which executes the function code. In most cases one developer can complete both the big data development and the function interface development, and with that the whole activity is done. The threshold for developing an activity is much lower. This is the real realization of DevOps: one developer can complete the whole process alone.
4、 Microservice & servicemesh
1. Microservice is the only way for data application
The above covers the implementation principles and mechanisms of onedata and onefun. How do we apply them internally? We are promoting this system vigorously across our games.
Interfaces are where this matters most. When it comes to microservices, on the big data side we can control resources and tasks with YARN or k8s, but the real service governance happens at the interface level. We currently have around 3,500 interfaces, with about 50 new ones added every week.
So we took this into account. Originally our services were developed one by one with no governance. Services are still added and developed one by one, and sometimes several are even turned into a single service, but now they all come under governance.
Many people advocate microservices, but without a platform to govern them, microservices become a disaster. They bring convenience, and they bring problems too. In our scenario microservices fit very well: every interface can be a service, which makes it a natural microservice.
2. Integrated service governance design
But governing this many microservices is a big problem, so we put a lot of effort into building a microservice governance system. It starts at project registration: the project is registered in the microservice center together with its APIs. When the services are published to the cluster, they actively register with our registry service, Consul.
However, registration is not mainly used for locating the service. Once a service is known to us, we run health checks and collect its status through Prometheus. As soon as a service registers, we can detect it and start collecting its status, which mainly feeds real-time reports and alarms.
So first of all we have a guarantee on the stability and health of the services. In addition, once the service information is registered in Consul, there is a service gateway in front. We use Envoy for this, and internally we also use it as a sidecar, which will be introduced later.
After registration, Envoy loads all the registered services and handles routing for them. We also report all logs, including the gateway logs, to the log center, which produces offline reports and real-time alarm monitoring.
We also built configuration on top of Consul: real-time controls, including those for servers, are configured through Consul. Once a change is configured, it is picked up immediately through a watch and then executed.
That is the basic service governance. Our service governance has since been upgraded and is better than this, but the basic principle is the same.
3. Unified control of north-south traffic + east-west traffic
In addition, we have implemented a control layer for Envoy, which we call service governance. It mainly handles traffic management: load-balancing strategies, routing strategies, circuit breaking, timeout control, fault injection and so on.
Configuration managed in Consul is sent to our agent, and the agent pushes it to Envoy through the Istio interface and the k8s API. Both the API gateway and the sidecar are Envoy, so we can write to their xDS interface through Istio and deliver all the configuration information that way.
This system can control the whole cluster, with unified control of north-south and east-west traffic. We plan to open source it in the future; for now it is mainly used internally. We have also built a graphical configuration tool: all Envoy and Istio configuration, from the YAML through Istio, has been turned into a UI so that everything can be controlled in one place.
Moreover, since we developed the agent ourselves, it supports multiple clusters: any number of k8s clusters can be supported as long as they join. We also manage the API gateway.
There is also sidecar management in the service mesh. The function interfaces and rule interfaces mentioned earlier are the servers it manages.
There is also Chaos Mesh functionality, which we are still studying and plan to bring into this system.
4. Full link traffic analysis based on servicemesh
This is an analysis we built on the service mesh. At the macro level we can already see how much pressure our interfaces put on the DB, but monitoring pressure from inbound traffic alone is not enough. With the service mesh we can capture both inbound and outbound traffic and collect detailed statistics on it. From those statistics a snapshot is generated. Comparing one snapshot with the next gives the amount of incoming traffic and the pressure it puts on each downstream.
This is the full diagram. We ran several test cases, and between two of them we can calculate the downstream pressure from the traffic analysis. This analysis is very valuable for deciding when to scale downstream resources up or down.
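The snapshot comparison can be sketched as a difference of traffic counters. The counter names and numbers below are made up for illustration, and the “downstream calls per inbound request” ratio is one assumed way of expressing downstream pressure, not necessarily the metric the platform reports.

```python
def diff_snapshots(before, after):
    """Return how much each cumulative traffic counter grew between two snapshots."""
    return {name: after.get(name, 0) - before.get(name, 0) for name in after}

def downstream_pressure(before, after, inbound_key="inbound"):
    """Downstream calls generated per inbound request between two snapshots."""
    delta = diff_snapshots(before, after)
    inbound = delta.get(inbound_key, 0)
    if inbound <= 0:
        return {}  # no new inbound traffic, nothing to attribute
    return {
        name: calls / inbound
        for name, calls in delta.items()
        if name != inbound_key
    }

# Hypothetical cumulative request counters collected from the sidecars.
snapshot_1 = {"inbound": 1000, "redis": 2500, "mysql": 100}
snapshot_2 = {"inbound": 3000, "redis": 8500, "mysql": 300}

pressure = downstream_pressure(snapshot_1, snapshot_2)
print(pressure)
```

With ratios like these, an expected rise in inbound traffic can be translated into the expected load on each downstream before deciding whether to scale it.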
5. Case introduction
Finally, here are some cases implemented with this system: game data review (career review), the task system and the ranking list.
Q & A
Q1: How is servicemesh deployed? What are the main problems it solves?
The servicemesh implementation we currently use is Istio, version 1.3.6. This version does not support deployment on physical machines, so we deploy it in k8s. There are two deployment methods: install directly with the istioctl command, or generate YAML files and apply them with kubectl.
The main problem the servicemesh architecture solves is governance of east-west traffic inside the cluster. The sidecar can also act as a protocol proxy and hide the technology stack of the services behind it: those services can be developed in any language, while traffic management and routing stay under unified control.
Q2: Can you introduce the microservice governance architecture?
In my opinion, microservice governance can be divided into two categories:
- Service instance governance: under the current k8s architecture this is basically handled by k8s, including service instance publishing, upgrading, scaling, service registration and discovery;
- Service traffic governance, which is what is commonly called service governance: this is mainly implemented by the microservice gateway and the service mesh. The gateway governs traffic between the inside and outside of the cluster, and the service mesh governs traffic within the cluster.
Q3: What technical background do developers need?
Big data developers only need to know SQL and basic statistics to use our system.
Q4: For real-time computing, do you have any suggestions on choosing between Flink and Spark?
We also used Spark for large-scale real-time computing in 2015 and 2016. However, the version at that time was still weak at real-time computing: even with 500 ms batches, data would accumulate, so real-time performance suffered. Spark is good at iterative and algorithmic computation, so if real-time requirements are not strict and there are algorithmic requirements, Spark is still a good choice.
Flink was designed around a stream processing model from the start, so it suits scenarios with high real-time requirements. In our internal tests, Flink’s stream computing throughput was much better than that of Storm and Spark. Flink’s window mechanism is also very useful for window calculations in real-time computing. We therefore recommend Flink for general real-time computing and for scenarios with high real-time requirements.
Q5: Is there a server-side storage scenario for game replay data?
There are two ways to implement game replay. One is to record the playback and transmit it, which is very expensive but simple and timely. The other is to send the control commands back to the server and reconstruct the scene in another service, which costs relatively little but is more complicated to use.
Q6: In the replay scenario, what protocol does the client use to send data to the server?
It is usually the game’s own proprietary protocol.