Baidu’s semantic understanding platform Wenxin (ERNIE) recently received a major upgrade. It adds a new text entity extraction task, upgrades the network for custom text classification, and brings a series of improvements to training capability and deployment options, all aimed at making NLP model development more efficient. Developers can try Wenxin’s new features on EasyDL, the zero-barrier AI development platform.
ERNIE is a semantic understanding technology and platform built on Baidu’s deep learning platform. It brings together advanced pre-trained models, a comprehensive set of NLP algorithms, an end-to-end development kit, and platform services, giving enterprises and developers a complete set of NLP customization and application capabilities.
ERNIE website: http://wenxin.baidu.com
To turn ERNIE’s world-class technical breakthroughs into a driving force for enterprises and deliver more value in industrial applications, Wenxin also offers a simple, efficient set of NLP development capabilities through the EasyDL platform.
EasyDL is a zero-barrier AI development platform launched by Baidu Brain. Built on PaddlePaddle, Baidu’s self-developed deep learning platform, and combined with industry-leading engineering services, EasyDL covers two technical directions, computer vision and natural language processing, and supports the whole workflow in one place: intelligent annotation, model training, and service deployment. Developers do not need to know the details of the algorithms and can customize a model in as little as 5 minutes.
Figure 1: Overview of EasyDL text processing development services
To date, peak call volume for the Wenxin-powered EasyDL text processing capabilities has exceeded one million, serving more than 1,000 partners across finance, security, the cultural and creative industries, and other fields, and helping many enterprises take a key step in their intelligent transformation.
The main Wenxin upgrades to EasyDL text processing are as follows:
New text entity extraction task
Text entity extraction is a core task in text mining and information extraction. It extracts specific factual information from massive information sources and is an important foundation for AI applications such as information retrieval, intelligent question answering, and intelligent dialogue.
Wenxin’s newly launched text entity extraction task identifies named entities in text quickly and effectively, for example extracting company entities and transaction information from financial texts. To make this capability easier to use, Wenxin also provides a set of supporting development services that help developers build such models more conveniently.
· Online intelligent annotation, saving cost: To ease data preparation, Wenxin has released a data annotation tool for the text entity extraction task. It lets annotators select and label spans directly in the text, which gives a better annotation experience and higher annotation efficiency.
Figure 2: schematic diagram of intelligent annotation for text entity extraction
· Two training schemes, flexible choice: The training scheme can be chosen according to the amount of data. With a small dataset (fewer than 1,000 items), the “high-precision” algorithm gives better results; with enough data, the “high-performance” algorithm trains a model in a short time and delivers fast prediction.
· Preset models and networks, easy development: In the model configuration of the platform’s Professional Edition, you can select the better-performing ERNIE pre-trained model and its corresponding preset network, and modifying the network code is supported. This makes model customization more flexible at the source-code level and gives experts much more room for creativity.
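To make the span-style annotation described above more concrete, here is a minimal sketch of how spans marked directly in a text could be turned into per-character BIO tags, a common encoding for entity extraction training data. The article does not say which format the platform uses internally, so the BIO encoding, the function name spans_to_bio, the sample sentence, and the labels ORG and AMOUNT are all illustrative assumptions.

```python
from typing import List, Tuple

# One annotated span: (start offset, end offset exclusive, entity label).
Span = Tuple[int, int, str]


def spans_to_bio(text: str, spans: List[Span]) -> List[str]:
    """Convert character-offset spans into one BIO tag per character.

    Hypothetical helper for illustration; not part of any Baidu API.
    """
    tags = ["O"] * len(text)
    for start, end, label in sorted(spans):
        if not (0 <= start < end <= len(text)):
            raise ValueError(f"span out of range: {(start, end, label)}")
        if any(tag != "O" for tag in tags[start:end]):
            raise ValueError(f"overlapping span: {(start, end, label)}")
        tags[start] = f"B-{label}"          # first character of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"          # remaining characters of the entity
    return tags


# Made-up financial sentence with two hand-labeled spans (assumed labels).
text = "Acme Corp bought 500 shares"
spans = [(0, 9, "ORG"), (17, 20, "AMOUNT")]
tags = spans_to_bio(text, spans)

# Show each labeled span next to the text it covers.
for start, end, label in spans:
    print(label, repr(text[start:end]))
```

Storing annotations as offsets keeps the original text untouched, and the range and overlap checks catch the most common labeling mistakes before training.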
Upgraded text classification (single-label task)
Text classification automatically assigns category labels to text content. In news recommendation, for example, “The Lakers beat the Heat 4-2 to win their 17th championship” belongs to the sports category, while “National Bureau of Statistics: CPI rose 3.3% year-on-year in April” belongs to the economy category.
This release upgrades the model network for text classification (single-label task) and also provides two training schemes, high-precision and high-performance. On the public datasets (classification tasks) provided by the platform, the “high-precision” algorithm reaches more than 90% accuracy; with more than 10,000 items of data, the “high-performance” algorithm trains very quickly (usually in about 15 minutes). Choosing the right training scheme lets model training achieve twice the result with half the effort.
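The guidance on choosing between the two training schemes can be written down as a simple rule of thumb. The sketch below is only an illustration: the function name choose_training_scheme is hypothetical, the platform exposes this choice as a training option rather than as code, and the cutoff of 1,000 items is the small-data figure the article quotes for the high-precision scheme.

```python
def choose_training_scheme(num_samples: int, small_data_limit: int = 1000) -> str:
    """Suggest a training scheme from the size of the labeled dataset.

    Hypothetical helper: below `small_data_limit` items the article
    recommends the high-precision scheme; otherwise high-performance.
    """
    if num_samples < 0:
        raise ValueError("num_samples must be non-negative")
    if num_samples < small_data_limit:
        return "high-precision"
    return "high-performance"


# Example dataset sizes (made up for illustration).
for size in (300, 1000, 12000):
    print(size, "->", choose_training_scheme(size))
```

For dataset sizes between the two figures the article mentions (1,000 and 10,000 items), the source gives no firm guidance, so treating everything from 1,000 upward as "enough data" is an assumption of this sketch.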
More efficient development capabilities
· Multi-machine training added to speed up model training
The platform expands multi-machine training for text processing: it supports training on V100 and P40 GPUs across multiple compute nodes, which speeds up model training. The platform also gives each user 50 hours of free training, so it can be tried at no cost.
· Private server deployment can be applied for directly, for faster deployment
EasyDL text processing already integrated several deployment methods, including elastically scalable public cloud API deployment, a general device-side SDK, and SDKs adapted to dedicated hardware. This update adds local private server deployment for models: users can quickly apply for and obtain the deployment packages they need, which gives enterprises more options for putting AI into production.
Figure 3: private server deployment portal
· New model selling and purchasing feature, saving cost and improving efficiency
Users can publish a trained “sentiment tendency analysis” high-precision model to the AI market, where other users can purchase and retrain it, opening a new paradigm of model trading. You can be either a publisher or a buyer of a model. A purchased model can be retrained and deployed directly, which greatly reduces development cost and yields predictably high-precision results.
Figure 4: schematic diagram of AI model market
Those are the new capabilities in this upgrade. To make NLP development simple in the deep learning era, try Wenxin on the EasyDL platform!
Going forward, ERNIE will continue to draw on Baidu’s leading technical strength to further unlock the enabling power of AI technology, reach NLP developers in a simpler and more inclusive way, and help them create greater business value.