Abstract: With data volumes growing exponentially and conventional deep learning limited to regular data types, extending deep learning to new business scenarios is becoming harder. Graph neural networks can make more accurate predictions, provide personalized services for each user, and enable precision marketing, which also makes them a technical breakthrough for the second transformation of Internet enterprises.
During annual online shopping festivals such as the 618 promotion, personalized and accurate marketing is a powerful tool for e-commerce platforms. How to pick, out of a huge catalog, the products that consumers are most likely to buy has therefore become a technical focus for many e-commerce platforms. Behind all of this is AI.
As a relatively mature AI technology, deep learning has been widely used in industrial production and enterprise development, and in the past it served as an outlet for the Internet dividend. However, with data volumes growing exponentially and deep learning limited to regular data types, extending it to new business scenarios is becoming harder.
As a result, the market has begun to focus on graph neural network (GNN) technology. Graph neural networks can make more accurate predictions, provide personalized services for each user, and enable precision marketing, which also makes them a technical breakthrough for the second transformation of Internet enterprises.
At present, Huawei Cloud graph neural network is greatly improving overall computing efficiency by drawing on the efficient neural network training of ModelArts, making GNN applications such as product recommendation more mature.
Industry applications of graph neural networks
At present, mainstream deep learning still centers on CNNs, RNNs, and similar techniques (used for image recognition, text mining, and other fields). However, traditional deep learning (CNN, RNN) cannot effectively handle graph-structured, relational data such as that found in finance, gene and protein networks, social networks, and product recommendation. To extend deep learning to more relational scenarios, graph neural network (GNN) technology, which performs higher-order learning directly on graph data, can achieve better results.
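To make the difference from grid-shaped or sequential input concrete, the sketch below runs one round of neighbor aggregation, the basic step most GNNs build on. It is a minimal illustration only: the toy graph, the node names, and the `aggregate_mean` helper are assumptions made for this example and do not come from any Huawei Cloud API.

```python
# Illustrative only: a toy user-item graph stored as an adjacency list.
# Node names and feature values are made up for the example.
graph = {
    "user_a": ["item_1", "item_2"],
    "user_b": ["item_2"],
    "item_1": ["user_a"],
    "item_2": ["user_a", "user_b"],
}
features = {
    "user_a": [1.0, 0.0],
    "user_b": [0.0, 1.0],
    "item_1": [2.0, 2.0],
    "item_2": [4.0, 0.0],
}

def aggregate_mean(graph, features):
    """One message-passing round: average each node's vector with its neighbours' vectors."""
    updated = {}
    for node, neighbours in graph.items():
        vectors = [features[node]] + [features[n] for n in neighbours]
        dim = len(features[node])
        updated[node] = [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
    return updated

hidden = aggregate_mean(graph, features)
```

After one round, each node's vector already mixes in information from its direct neighbors; stacking further rounds reaches nodes that are more hops away, which is the "higher-order" learning mentioned above.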
Take the knowledge graph as an example: as an application of graph neural networks, it is better known than the technology itself. It appears in many everyday scenarios, such as semantic search engines, intelligent customer service, and life assistants. A knowledge graph built with graph neural networks can support services such as video and live-stream subtitling, content moderation, intelligent customer service, insurance claims, medical graphs, and knowledge elimination. With a knowledge graph, industry-specific knowledge can be customized into a graph network to analyze industry information and help enterprises transform and upgrade.
In the future, artificial intelligence will operate more like the human brain. With graph neural networks, artificial intelligence begins to perceive and understand the world rather than merely fitting statistics. How to fully tap the application value of graph deep learning, and how to bring high-dimensional sparse data into real application scenarios, will be key to the reshuffling among similar enterprises over the next decade.
ModelArts 2.0 marks the launch of Huawei Cloud graph neural network
At last year's Huawei Cloud Full Connection Conference, Huawei Cloud launched ModelArts 2.0, a one-stop AI development and management platform. It also announced a breakthrough in graph deep learning and officially launched Huawei Cloud graph neural network.
ModelArts 2.0 released more than ten new features and services covering the whole life cycle of an AI model, including intelligent data filtering, intelligent data annotation, intelligent data analysis, automatic multi-model search, the ModelArts SDK, graph neural network, reinforcement learning, model evaluation and diagnosis, model compression and conversion, automatic hard-case discovery, and online learning. Huawei Cloud ModelArts is clearly playing a long game. The launch of graph neural network is a breakthrough for ModelArts toward causal reasoning in deep learning, and it is also an indispensable link in achieving automated AI.
Huawei Cloud graph neural network is a new GNN technology built jointly on the GES graph engine and ModelArts. The new architecture runs a distributed graph computing platform and a deep learning computing platform side by side, enabling analysis with large-scale graph neural networks.
According to the architect of Huawei Cloud graph neural network, the design principle of the Huawei Cloud GNN framework is clear responsibilities and a unified architecture. For a single algorithm, sparse operations such as data preprocessing and neighborhood sampling are pushed down to the graph engine. The deep learning layer focuses on operator optimization, the various GNN algorithm frameworks are unified, and common operators are reused.
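The division of responsibilities described above can be sketched as follows. This is a hypothetical illustration rather than the GES or ModelArts API: `ToyGraphEngine`, `sample_neighbors`, and `build_training_batch` are names invented for this example. The point is only that sampling happens next to the graph data, and the learning layer receives compact batches.

```python
import random

class ToyGraphEngine:
    """Hypothetical stand-in for the graph engine layer; not the GES API."""

    def __init__(self, adjacency):
        self.adjacency = adjacency

    def sample_neighbors(self, node, fanout, rng):
        """Return at most `fanout` neighbours of `node`, sampled without replacement."""
        neighbours = self.adjacency.get(node, [])
        if len(neighbours) <= fanout:
            return list(neighbours)
        return rng.sample(neighbours, fanout)

def build_training_batch(engine, seed_nodes, fanout, rng):
    """Learning-layer side: it only sees compact (node -> sampled neighbours) pairs."""
    return {node: engine.sample_neighbors(node, fanout, rng) for node in seed_nodes}

engine = ToyGraphEngine({0: [1, 2, 3, 4], 1: [0], 2: [0, 3], 3: [0, 2], 4: [0]})
batch = build_training_batch(engine, seed_nodes=[0, 2], fanout=2, rng=random.Random(7))
```

Because the learning layer never touches the full adjacency structure, the same training code could in principle sit on top of a much larger, distributed graph store.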
Large-scale graph processing on a distributed graph computing platform
In enterprise-grade graph deep learning, the scale of a graph can reach 10 billion or even 100 billion, depending on business demand. A mature graph deep learning system therefore hands the computation over such very large graphs to an independent distributed graph computing platform.
At present, most graph neural network frameworks handle only static graphs, because they treat GNN algorithms as offline computing tasks. Offline data is invariant (static), and every computation has to load the complete data set again, so this approach is unsuitable for dynamic graphs. Graph data itself, however, changes frequently. A dynamic algorithm has to traverse the graph while the system is running, pass the data from memory to the deep learning layer, and then feed the results back into the modeling process. This overhead is barely noticeable on small graphs, but it becomes a serious performance problem on billion-scale graphs, where traversal time can grow exponentially and even cause downtime.
Huawei's approach to dynamic graphs is to use its self-developed GES graph engine to maintain graph data, so that data can be added, deleted, and modified dynamically. This also reduces end-to-end time, especially for large graphs. The processing of dynamic graphs can be optimized further. For example, changes to the data in a dynamic graph can be treated as incremental data. Ideally, an incremental algorithm analyzes only the increment, instead of repeating neighborhood sampling, random walks, gradient computation, and other operations over the full data set. Research on incremental graph neural network algorithms is still at the frontier and has not yet produced a complete theory.
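One simple way to picture the incremental idea is sketched below. It rests on an assumption made for this illustration: in a k-layer message-passing model, a node's output depends only on its k-hop neighborhood, so after an edge is inserted only nodes within k hops of its endpoints need to be recomputed. The `affected_nodes` helper is hypothetical and is not Huawei Cloud's incremental algorithm.

```python
from collections import deque

def affected_nodes(adjacency, new_edge, hops):
    """Nodes within `hops` steps of either endpoint of a newly inserted edge."""
    u, v = new_edge
    # Apply the increment to the stored graph first.
    adjacency.setdefault(u, set()).add(v)
    adjacency.setdefault(v, set()).add(u)
    # Multi-source breadth-first search starting from both endpoints.
    seen = {u: 0, v: 0}
    queue = deque([u, v])
    while queue:
        node = queue.popleft()
        if seen[node] == hops:
            continue  # do not expand beyond the hop limit
        for neighbour in adjacency[node]:
            if neighbour not in seen:
                seen[neighbour] = seen[node] + 1
                queue.append(neighbour)
    return set(seen)

adjacency = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
dirty = affected_nodes(adjacency, new_edge=(1, 6), hops=1)
```

Here only three of the six nodes are marked for recomputation; on a very large graph the affected set is typically a tiny fraction of the whole, which is where the savings would come from.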
At present, the GES graph engine offers more than 20 graph algorithms for common scenarios, such as PageRank, implemented according to the needs of industries and enterprises, along with a large number of graph optimization algorithms. It can complete queries on 100-million-scale graphs within seconds. Its application scenarios cover urban industrial production, pipeline monitoring, product recommendation, social recommendation, project analysis, enterprise insight, knowledge graphs, financial risk control, enterprise IT applications, relationship mining, and other fields. It supports basic queries such as vertex lookup, edge search, and attribute filtering, as well as functions such as query storage.
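For readers unfamiliar with PageRank, the following is a minimal textbook power-iteration sketch. It is not the GES implementation; the `pagerank` function, its parameters, and the four-node example graph are assumptions chosen for illustration.

```python
def pagerank(out_links, damping=0.85, iterations=50):
    """Plain power-iteration PageRank over a dict of node -> list of out-neighbours."""
    nodes = list(out_links)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without out-links is spread evenly over all nodes.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        new_rank = {node: (1.0 - damping) / n + damping * dangling / n for node in nodes}
        for node in nodes:
            targets = out_links[node]
            for target in targets:
                # Each node splits its rank equally among the nodes it links to.
                new_rank[target] += damping * rank[node] / len(targets)
        rank = new_rank
    return rank

# Every link target must also appear as a key in the dict.
ranks = pagerank({"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]})
```

In this toy graph, node `c` receives links from three other nodes and ends up with the highest score, while `d`, which nothing links to, keeps only the baseline share.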
Take the Pixie algorithm as an example. As provided by Huawei Cloud, Pixie builds multiple data sources into a single graph and configures the corresponding schema, vertex and edge attributes, and weights on this heterogeneous graph. It is a real-time recommendation algorithm that overcomes the problem of acquiring and fusing heterogeneous graph data, supports combined recommendation from multiple query nodes, and can serve composite, time-varying, and diverse recommendation scenarios. With large amounts of data, it adapts to dynamic changes without training a model in advance, and it delivers good real-time recommendations with strong scalability.
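The general idea of recommending without a pre-trained model can be illustrated with random walks that restart at the query nodes and count how often each candidate is visited. The sketch below is a simplified illustration of that family of techniques, not the Pixie implementation in GES; `walk_recommend`, its parameters, and the toy user-item graph are all assumptions.

```python
import random
from collections import Counter

def walk_recommend(adjacency, query_nodes, candidates, steps, restart_prob, rng):
    """Count candidate visits made by random walks that restart at the query nodes."""
    visits = Counter()
    for query in query_nodes:
        current = query
        for _ in range(steps):
            neighbours = adjacency[current]
            if not neighbours or rng.random() < restart_prob:
                current = query  # jump back to the query node
                continue
            current = rng.choice(neighbours)
            if current in candidates and current not in query_nodes:
                visits[current] += 1
    return visits

adjacency = {
    "u1": ["i1", "i2"],
    "u2": ["i2", "i3"],
    "i1": ["u1"],
    "i2": ["u1", "u2"],
    "i3": ["u2"],
}
items = {"i1", "i2", "i3"}
scores = walk_recommend(adjacency, ["u1"], items, steps=2000,
                        restart_prob=0.3, rng=random.Random(0))
```

Calling `scores.most_common(2)` would give the two most-visited items for the query. Because the walk reads the graph as it is at query time, newly added vertices and edges influence the result immediately, with no retraining step.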
A new framework that solves the high-frequency interaction between graph algorithms and deep learning
Improving data processing efficiency and unifying the algorithm framework on top of the existing graph engine are the key difficulties in developing a graph neural network platform today. Traversing graph data and repeatedly interacting with the deep learning layer greatly reduce processing efficiency, which is one of the bottlenecks of graph deep learning.
Therefore, for graph deep learning to achieve a breakthrough in performance, a new GNN framework has to be designed. The following describes the Huawei Cloud graph neural network framework, as authorized by AI Frontline.
(1) A new GNN framework based on the graph engine: It builds on the efficient neural network training operators in ModelArts and combines them with the existing high-performance graph computing framework of GES. Using the high concurrency and low latency of the graph engine, it parallelizes the GNN training process, including the estimation of transition probabilities on edges, the sampling of vertex neighborhoods, and the construction of negative samples. The system provides a dynamic scheduler so that these local operations run highly in parallel, which greatly improves overall system throughput.
(2) Unification of GNN algorithm frameworks: A single architecture implements both unsupervised large-scale graph embedding (such as DeepWalk and node2vec) and semi-supervised graph convolution (such as GCN and GraphSAGE), which reduces the maintenance cost of the system.
Figure: graph embedding and graph convolution calculation based on unified GNN architecture
(3) Integration of GNN and graph data management: Enterprise-grade GNN applications are usually not one-off computations, and their data sets are large, so the data must be maintained and managed. Existing GNN frameworks usually do not take this into account. Users have to build a separate database for maintenance and export the entire data set for each computation, which not only consumes a lot of resources but also introduces problems such as data inconsistency. GES uses the property graph model and Gremlin, the de facto standard graph query language, to manage and maintain distributed graph data. When training is needed, the required operators are invoked in place inside the graph engine and executed concurrently, which reduces end-to-end performance loss.
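As an illustration of one of the local operations listed in point (1), the sketch below computes the transition probabilities used by node2vec's second-order random walk, following the published node2vec rule with return parameter `p` and in-out parameter `q`. The function name and the toy graph are assumptions for this example, and the code says nothing about how GES schedules such work internally.

```python
def node2vec_transition(adjacency, previous, current, p, q):
    """Normalised second-order transition probabilities used by node2vec walks."""
    weights = {}
    for candidate in adjacency[current]:
        if candidate == previous:
            weights[candidate] = 1.0 / p   # return to where the walk came from
        elif candidate in adjacency[previous]:
            weights[candidate] = 1.0       # stay close to the previous node
        else:
            weights[candidate] = 1.0 / q   # move further away
    total = sum(weights.values())
    return {node: weight / total for node, weight in weights.items()}

adjacency = {"a": {"b", "c"}, "b": {"a", "c", "d"}, "c": {"a", "b"}, "d": {"b"}}
probs = node2vec_transition(adjacency, previous="a", current="b", p=2.0, q=0.5)
```

Each (previous, current) pair is computed independently of every other pair, which is why this kind of operation lends itself to the high degree of parallelism described above.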
On the same platform, the R&D team compared the performance of this product with several open-source implementations in data preprocessing and various sampling modes (source: Huawei Cloud internal data).
Figure: (1) performance comparison with open-source versions on the same platform in data preprocessing and various sampling modes; (2) system scalability test results
Huawei Cloud graph neural network greatly improves the overall computing efficiency of GNNs by combining the efficient neural network training of ModelArts with the high-performance graph computing of GES. Taking the node2vec algorithm as an example, on the PPI dataset Huawei Cloud graph neural network completes the whole process from sampling to training in 2 minutes, 20 times faster than traditional open-source implementations.
Trade-off between precision and resources
In terms of model accuracy, Huawei Cloud graph neural network lets users adjust model precision through parameter settings and train graph neural network algorithms on either CPU or GPU.
Because of the particular nature of graph data, CPU training is not inferior to GPU training in performance or results for most types of data. For graph embedding and graph convolution algorithms, Huawei Cloud graph neural network uses different optimizations to reduce resource usage and improve computing performance. The graph embedding algorithms use parallel acceleration and storage design to optimize positive and negative sampling. The graph convolution part, because of its high complexity, focuses on accelerating the matrix operations, that is, the mathematical transformation between one layer and the next. In the future, Huawei Cloud will consider further improving GNN computing performance with a hybrid hardware architecture based on its own artificial intelligence chips.
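The layer-to-layer transformation mentioned above can be illustrated with the standard graph convolution rule ReLU(D^-1/2 (A + I) D^-1/2 H W). The sketch below is a plain dense implementation on Python lists, written for readability; it is an assumption-based illustration, not Huawei Cloud's optimized operator, and a production system would use sparse matrices.

```python
import math

def gcn_layer(adjacency, features, weight):
    """One graph-convolution layer: ReLU(D^-1/2 (A + I) D^-1/2 H W) on dense lists."""
    n = len(adjacency)
    # Add self-loops, then compute degrees of the augmented graph.
    a_hat = [[adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    degree = [sum(row) for row in a_hat]
    norm = [[a_hat[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
            for i in range(n)]
    # Aggregate neighbour features, then apply the layer weights.
    aggregated = [[sum(norm[i][k] * features[k][f] for k in range(n))
                   for f in range(len(features[0]))] for i in range(n)]
    projected = [[sum(aggregated[i][f] * weight[f][o] for f in range(len(weight)))
                  for o in range(len(weight[0]))] for i in range(n)]
    return [[max(0.0, value) for value in row] for row in projected]

# A three-node path graph 0 - 1 - 2 with one feature per node.
adjacency = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
features = [[1.0], [2.0], [3.0]]
weight = [[1.0]]
hidden = gcn_layer(adjacency, features, weight)
```

Most entries of the normalized adjacency matrix are zero on real graphs, so the cost of this step is dominated by sparse matrix products, which is the part worth accelerating.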
Life cycle management for Huawei Cloud graph neural network models relies on ModelArts, Huawei Cloud's one-stop AI development and management platform. A trained model can be deployed with one click, and the whole life cycle of data, algorithm, model, and inference can be viewed through the lineage diagram provided by the platform.
It will still take some time for the industry to apply graph neural networks at scale. However, the launch of Huawei Cloud graph neural network gives the developers who follow both theoretical experience and a practical basis for relational scenarios in social networks, finance, genetics, image semantics, and other fields. So far, Huawei Cloud graph neural network has published several papers at international academic conferences on machine learning and data mining, and it won the "Zijin Longpan Award" at the China Artificial Intelligence Summit in 2019.
Graph neural networks are not only a step toward real intelligence in AI, but also the point at which AI begins to handle the relational data that deep learning has struggled with. From now on, artificial intelligence can understand and learn the complex relationships of the world. I believe it will appear in our lives in more and more forms. The most visible one is the promotions run by online e-commerce platforms.
Huawei Cloud's 618 promotion is under way, and the AI development platform ModelArts has prepared a 10% discount package for users. If you are interested in graph neural networks or AI development, go for it!