OpenAI scales k8s to 7,500 nodes to support machine learning; graph diffusion network improves traffic flow prediction accuracy

Time: 2021-07-20

The Developer Community Technology Weekly is back. Let’s take a look at the important news that developers should pay attention to this week.

  • TReCS: a new framework for text-to-image generation
  • OpenAI scales k8s to 7,500 nodes to support machine learning
  • Apache ECharts 5 officially released
  • WebRTC becomes an official W3C and IETF standard
  • “Chang’an Chain”, China’s first independent and controllable blockchain technology system, released
  • Jingdong open-sources FaceX-Zoo, a PyTorch face recognition toolkit
  • AAAI 2021: graph diffusion network improves traffic flow prediction accuracy
  • AAAI 2021: using the confusion relationship between labels to improve text classification

Industry news

1. Google Research launches TReCS, a new framework for text-to-image generation

Google researchers have developed a new framework for generating images from text, namely TReCS (Tag-Retrieve-Compose-Synthesize). By improving how image elements are evoked and how mouse traces indicate where they belong, it significantly enhances the image generation process. According to the report, the system was trained on more than 25 billion examples and has the potential to handle 103 languages. It aligns the mouse trace with the text description and creates a visual label for each phrase provided. The framework uses controllable mouse traces as fine-grained visual grounding to generate high-quality images from the user’s narration. A tagger predicts the object label for each word in the phrase.

2. OpenAI scales k8s to 7,500 nodes to support machine learning

To meet the needs of large models such as GPT-3, CLIP, and DALL·E, as well as rapid small-scale iterative research such as scaling laws for neural language models, OpenAI has scaled its Kubernetes (k8s) infrastructure cluster to 7,500 nodes. According to OpenAI’s description, a large machine learning job usually places a single pod on each node, and the clusters it deploys have full bisection bandwidth. Therefore, although there are many nodes, the load on the scheduler is relatively low, and scheduling pressure arises only when a new job creates hundreds of pods at once. OpenAI also describes in detail the key work involved in scaling the cluster: using alias-based IP addressing to solve networking for a large number of nodes, running etcd and the API servers on dedicated nodes to spread the load, tracking down out-of-memory (OOM) problems encountered when collecting metrics with Prometheus and Grafana, designing cluster health checks, and allocating cluster resources fairly among teams.

3. Apache ECharts 5 officially released

On January 28, Apache ECharts 5 was officially released, bringing 15 new features across 5 modules. The details are as follows:

  • Dynamic storytelling (new dynamically sorted bar and line charts, multi-dimensional display, finer-grained custom series animation).
  • Visual design (a greatly improved default theme, clearer labels, a time axis, a customizable tooltip, a fully upgraded gauge chart, and rounded corners for sector shapes).
  • Interaction (optimized state management, new visual effects, and greatly improved performance).
  • Development experience (enhanced dataset data transform capabilities, simpler language pack replacement, and a codebase rewritten in TypeScript, which brings many exciting features).
  • Accessibility (continued attention to accessible design, with new decal patterns that distinguish series without relying on color).

4. WebRTC becomes an official W3C and IETF standard

After years of development, WebRTC is now supported by many popular web browsers. The latest news is that the World Wide Web Consortium (W3C) and the Internet Engineering Task Force (IETF) have just announced that Web Real-Time Communication (WebRTC) has become a standard for audio and video transmission on the web. At a technical level, the framework allows developers to easily add audio and video chat to their projects. Alissa Cooper, chair of the IETF, commented: “IP-based audio and video communication technology has completely changed the way people all over the world communicate. Integrating these technologies into the web platform greatly expands their scope of use. Thanks to the close cooperation between IETF and W3C, WebRTC technology has been formally standardized today.” It is reported that, now that the WebRTC standardization work is complete, any software project that wants to implement similar functionality will have a clear set of guidelines to follow, ensuring that the functionality is implemented correctly and meets the various requirements.

5. “Chang’an Chain”, China’s first independent and controllable blockchain technology system, released

According to CCTV News, “Chang’an Chain”, China’s first independent and controllable blockchain software and hardware technology system, was officially released in Beijing, and the first batch of application scenarios, including supply chain finance and carbon trading, was launched.

It is reported that “Chang’an Chain” is modular, supports on-demand customization, and makes data “usable but not visible”. It builds a sharing mechanism that supports the trusted storage and sharing of data throughout the whole process of transaction, circulation, and statistics. At present, both the software and the hardware of “Chang’an Chain” have been developed entirely independently.

6. Jingdong open-sources FaceX-Zoo, a PyTorch face recognition toolkit

Jingdong has open-sourced the FaceX-Zoo framework. Built on a highly modular and scalable design, FaceX-Zoo provides a training module with a variety of supervision heads and backbone networks to achieve state-of-the-art face recognition. With the toolkit, users can evaluate models on most popular benchmarks by making simple configuration changes. It also includes a simple but full-featured face SDK for validating trained models and building preliminary applications. Rather than trying to include a large number of existing techniques, the toolkit is designed to be easy to extend and upgrade. The Jingdong developers said that they plan to add more modules to FaceX-Zoo in the future, such as face analysis and face lighting, to add more backbone architectures and supervision heads, and to improve training efficiency through distributed data parallelism and mixed-precision training.

Academic frontier

1. AAAI 2021: graph diffusion network improves the accuracy of traffic flow prediction

As an important problem in intelligent transportation, urban traffic prediction aims to accurately predict traffic conditions in different regions of a city, so as to better manage traffic and congestion between regions and safeguard public safety. This article introduces a new urban traffic flow prediction model based on a spatial-temporal graph diffusion network. The paper, “Traffic Flow Forecasting with Spatial-Temporal Graph Diffusion Network”, jointly published by Jingdong Digital Silicon Valley R&D Laboratory, Jingdong City, and South China University of Technology, has been accepted by AAAI 2021 (CCF class A), a top conference in the field of artificial intelligence.

*Link to the paper: http://urban-computing.com/pdf/AAAI2021TrafficFlow.pdf
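
As a rough illustration of what “diffusion” over a road graph means, the sketch below applies a generic K-step random-walk diffusion to per-region traffic readings. It is not the model from the paper: the adjacency matrix, the weights `theta`, the readings, and the function names are all hypothetical, chosen only to show how each region’s signal mixes with that of its neighbours.

```python
def transition_matrix(adj):
    """Row-normalise an adjacency matrix into a random-walk transition matrix."""
    result = []
    for row in adj:
        total = sum(row)
        # Isolated nodes keep an all-zero row instead of dividing by zero.
        result.append([v / total if total else 0.0 for v in row])
    return result


def diffuse(adj, signal, theta):
    """Return sum_k theta[k] * P^k * signal, where P is the transition matrix."""
    p = transition_matrix(adj)
    current = list(signal)  # P^0 * signal
    out = [0.0] * len(signal)
    for weight in theta:
        out = [o + weight * c for o, c in zip(out, current)]
        # Advance one diffusion step: current <- P * current
        current = [sum(p_ij * c for p_ij, c in zip(row, current)) for row in p]
    return out


# Hypothetical 3-region road graph: region 1 is connected to regions 0 and 2.
adj = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
flow = [10.0, 0.0, 20.0]  # made-up traffic readings per region
smoothed = diffuse(adj, flow, theta=[0.5, 0.5])
```

With these made-up numbers, region 1 has no traffic of its own but picks up a share from both neighbours, which is the kind of spatial dependency a diffusion-based model is meant to capture.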

2. AAAI 2021: using the confusion relationship between labels to improve text classification

This paper focuses on making full use of label information. Unlike traditional label smoothing or label embedding methods, it takes the input into account when using label information, because the input affects the overlap and dependency between labels. The method is also model-agnostic: it can further improve the results of different models and can be applied flexibly. Ultimately, modeling the labels more comprehensively and using them fully achieves better results at a lower cost.

*Link to the paper: https://arxiv.org/abs/2012.04987
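
To make the contrast concrete, here is a minimal sketch. Standard label smoothing spreads a fixed amount of probability over all classes regardless of the input, whereas an input-aware target shifts probability toward the labels that resemble the given input. The similarity scores, the weight `alpha`, and the function names are illustrative assumptions, not necessarily the paper’s exact formulation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def label_smoothing(true_idx, num_classes, eps=0.1):
    """Input-independent soft target: (1 - eps) * one_hot + eps / num_classes."""
    return [(1 - eps) * (1.0 if i == true_idx else 0.0) + eps / num_classes
            for i in range(num_classes)]


def input_aware_target(true_idx, similarities, alpha=4.0):
    """Soft target that depends on the input.

    `similarities` are hypothetical scores between the input text and each
    label; `alpha` (also hypothetical) controls how strongly the true label
    dominates the resulting distribution.
    """
    confusion = softmax(similarities)
    return softmax([alpha * (1.0 if i == true_idx else 0.0) + c
                    for i, c in enumerate(confusion)])


# Made-up example with the labels [sports, finance, politics].
smooth = label_smoothing(0, 3)
aware = input_aware_target(0, similarities=[2.0, 1.5, -1.0])
```

In this toy example, label smoothing gives “finance” and “politics” the same probability, while the input-aware target gives more to “finance” because the made-up input is more similar to it.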

* The above information comes from the Internet and was compiled by the Jingdong Technology Developer official account. It does not represent the position of Jingdong Technology Developer.*

