Artificial intelligence is the science of simulating human cognitive abilities with machines. At present, its most widely used applications, computer vision and voice interaction, rely on supervised deep learning, and training such models depends heavily on human-labeled data.
Reported figures suggest that a newly developed computer vision algorithm needs tens of thousands to hundreds of thousands of labeled images for training, that developing a new function requires nearly 10,000 annotated images, and that even routine optimization of an existing algorithm needs thousands of images.
These massive training data sets are the product of the joint efforts of countless annotators. As the famous science fiction writer Liu Cixin put it, “in today’s artificial intelligence, there is as much intelligence as there is human labor behind it.”
However, as the deployment of artificial intelligence accelerates, this heavy reliance on human labor is exposing a number of drawbacks.
First, the commercialization of AI places new demands on the data annotation industry. To meet deployment requirements and address the specific pain points of vertical scenarios, AI products need massive amounts of high-quality annotated data, which in turn quietly raises management and labor costs for data service providers.
In addition, growing data demand places new requirements on service providers’ delivery capacity, and shortfalls can easily trigger chain reactions such as project delays.
To solve these problems, AI can be applied to data annotation and quality inspection themselves. This human-machine collaboration effectively improves labeling efficiency and lets AI play a full role within the data annotation industry.
1. AI pre-labeling
In speech transcription projects, annotators must listen carefully to the pronunciation of each word, determine its meaning, and transcribe it. This places high demands on their dictation skills and on their ability to concentrate during long periods of multitasking.
When AI is applied at this stage, the speech data is first preprocessed with speech recognition, text transcription, and natural language understanding. Manual proofreading follows the automatic annotation, which both reduces the difficulty of labeling and effectively improves labeling efficiency.
Take Manfu Technology’s voice annotation tool as an example. With pre-labeling support, the tool automatically recognizes and transcribes the voice data, and the annotator only needs to make minor corrections to the pre-annotation result. Compared with traditional manual transcription, AI assistance can double labeling efficiency, allowing more projects to be completed with less manpower.
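The pre-label-then-proofread flow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Manfu Technology’s tool: the `Utterance` class, the `pre_label` and `review` functions, and the `fake_recognizer` stand-in are all hypothetical names invented for the example, and a real system would call an actual speech recognizer instead.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Utterance:
    audio_id: str
    pre_label: str = ""                 # machine-generated draft transcript
    final_label: Optional[str] = None   # transcript after human review


def pre_label(items: List[Utterance],
              recognize: Callable[[str], str]) -> List[Utterance]:
    """Fill in a draft transcript for every utterance using a recognizer."""
    for item in items:
        item.pre_label = recognize(item.audio_id)
    return items


def review(item: Utterance, correction: Optional[str] = None) -> Utterance:
    """Accept the draft as-is, or replace it with the annotator's correction."""
    item.final_label = correction if correction is not None else item.pre_label
    return item


def fake_recognizer(audio_id: str) -> str:
    """Hypothetical stand-in for a real speech recognizer."""
    drafts = {"a1": "turn on the light", "a2": "play some music"}
    return drafts.get(audio_id, "")


batch = pre_label([Utterance("a1"), Utterance("a2")], fake_recognizer)
review(batch[0])                               # draft accepted unchanged
review(batch[1], correction="play some jazz")  # annotator fixes one word
```

The point of the design is that the annotator’s work shrinks from typing every transcript to confirming or editing a draft, while the original draft is kept alongside the final label for later comparison.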
2. AI quality inspection
A complete annotation workflow passes through multiple stages, such as labeling, review, and quality inspection. Quality inspection plays a key role: it finds and remedies defects and improves the overall quality of the labeled data.
At present, data quality inspection is mostly manual: duplicate and unqualified samples in labeled data sets are found by random spot checks. However, manual spot checking falls well short in accuracy and timeliness. Erroneous samples are easily overlooked, and the cost of repeated inspection is hard to bear.
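To make the weakness of spot checking concrete, the sketch below computes how likely a random sample is to contain none of the bad items. It assumes a simplified model that is not from the source: samples are drawn uniformly without replacement, and an inspector always recognizes a bad sample when one is drawn. The function name `miss_probability` is illustrative.

```python
from math import comb


def miss_probability(total: int, bad: int, checked: int) -> float:
    """Probability that a random spot check of `checked` samples, drawn
    without replacement from `total`, contains none of the `bad` samples."""
    if not 0 <= bad <= total or not 0 <= checked <= total:
        raise ValueError("bad and checked must be between 0 and total")
    return comb(total - bad, checked) / comb(total, checked)


# 10 labeled samples, 1 of them wrong, 5 inspected at random.
p = miss_probability(total=10, bad=1, checked=5)
```

In this toy case `p` is 0.5: even when half of the data is inspected, the single wrong sample is missed half the time.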
Introducing AI into the quality inspection process can effectively solve these problems. Compared with manual inspection, machine inspection is more efficient and more consistent in execution, and it can cover all of the data, effectively uncovering problems and improving data quality.
According to Manfu Technology’s own tests, AI-assisted quality inspection improves data accuracy by more than 5% compared with traditional manual sampling inspection.
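As a rough illustration of full-coverage machine inspection, the sketch below scans every sample for the two defect types mentioned earlier, duplicates and unqualified samples. It uses simple deterministic rules (content hashing and a label whitelist) as a stand-in for a trained model, and the sample layout `(id, content_bytes, label)` and both function names are assumptions made for the example.

```python
import hashlib
from collections import defaultdict


def find_duplicates(samples):
    """Group sample ids whose content is byte-for-byte identical."""
    groups = defaultdict(list)
    for sample_id, content, _label in samples:
        digest = hashlib.sha256(content).hexdigest()
        groups[digest].append(sample_id)
    return [ids for ids in groups.values() if len(ids) > 1]


def find_unqualified(samples, allowed_labels):
    """Return ids whose label is empty or outside the allowed label set."""
    return [sample_id for sample_id, _content, label in samples
            if not label or label not in allowed_labels]


samples = [
    ("img1", b"\x01\x02", "cat"),
    ("img2", b"\x01\x02", "cat"),   # same bytes as img1
    ("img3", b"\x03\x04", ""),      # missing label
    ("img4", b"\x05\x06", "car"),   # label not in the schema
]
duplicates = find_duplicates(samples)
unqualified = find_unqualified(samples, allowed_labels={"cat", "dog"})
```

Because each check is a single pass over the data set, it can be run on every sample and repeated after each round of corrections at little extra cost, which is the advantage over manual sampling that the text describes.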
At present, labeling and quality inspection are the two stages where AI assistance works best. In the future, AI assistance can be introduced across the whole workflow, from creating the annotation scheme to delivery, so that AI fully feeds back into the data annotation industry and improves both efficiency and quality.