Manga translation and lettering AI: University of Tokyo team's paper accepted at AAAI 2021

Date: 2021-04-28

Summary: A study on the automatic translation of manga text has drawn wide attention. The Mantra team, founded by two PhD graduates of the University of Tokyo, published a paper that has been accepted at AAAI 2021. The Mantra project aims to provide automatic machine translation tools for Japanese manga.

Source: HyperAI

Keywords: machine translation, emotion recognition, manga AI


Recently, the paper Towards Fully Automated Manga Translation (https://arxiv.org/abs/2012.14271), jointly released by the University of Tokyo's Mantra team, Yahoo! Japan, and other institutions, has drawn attention from both the academic community and the anime/manga fan community.


As shown in the figure: the leftmost page is the original Japanese version, while the English version (middle) and Chinese version (rightmost) are generated automatically

The Mantra team's system automatically recognizes dialogue, onomatopoeia, labels, and other text in a manga page, distinguishes which character is speaking, and links the text to its context. Finally, the translated text accurately replaces the original and is lettered back into the speech-bubble area.

With a translation tool like this, scanlation teams and manga fans alike should be delighted.

Paper published, dataset released, and commercialization underway

On the research side, the paper has been accepted at AAAI 2021. The team has also built a translation evaluation dataset consisting of five manga in different genres (fantasy, romance, battle, mystery, and slice of life).

OpenMantra manga translation evaluation dataset

Paper address: https://arxiv.org/abs/2012.14271

Data format: annotated JSON file and original images (a loading sketch follows below)

Data content: 1,593 sentences, 848 scenes, 214 manga pages

Data size: 36.8 MB

Last updated: December 7, 2020

Download address: https://hyper.ai/datasets/14137
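Since the dataset ships as an annotation JSON file plus the original page images, loading it is straightforward. The sketch below shows one way to iterate over such annotations; note that the directory layout and field names used here (pages, image, text, text_ja, x/y/w/h) are hypothetical placeholders, not the dataset's documented schema, so adjust them after inspecting the download.

```python
import json
from pathlib import Path

# Minimal loader sketch. The layout and JSON keys below are assumed
# placeholders -- check the actual annotation file before relying on them.
DATASET_DIR = Path("openmantra")

def iter_text_boxes(annotation_path: Path):
    """Yield (page image path, Japanese text, bounding box) triples."""
    with annotation_path.open(encoding="utf-8") as f:
        books = json.load(f)
    for book in books:
        for page in book["pages"]:
            image_path = DATASET_DIR / "images" / page["image"]
            for box in page.get("text", []):
                bbox = (box["x"], box["y"], box["w"], box["h"])
                yield image_path, box["text_ja"], bbox

if __name__ == "__main__":
    for image_path, text_ja, bbox in iter_text_boxes(DATASET_DIR / "annotation.json"):
        print(image_path.name, bbox, text_ja)
```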

On the product side, Mantra plans to launch its automatic translation engine as a packaged service: it will provide automatic translation and distribution services for publishers, as well as services for individual users.

Below are a few sample translations, selected from Mantra's official Twitter, of the Japanese manga "Men Around Us", a comedic and heartwarming strip whose premise is the personification of the digital devices we use every day.


The original Japanese pages of "Men Around Us" and the automatically machine-translated Chinese and English versions

Recognition, translation, and lettering are all indispensable

The specific implementation steps are explained in detail by the Mantra research team in the paper Towards Fully Automated Manga Translation.

The first step is to locate the text

The first step toward automatic manga translation is extracting the text regions.


However, manga is a special case: dialogue from different characters, onomatopoeia, labels, and more all appear within a single panel, and manga artists use speech bubbles, varied fonts, and exaggerated lettering to give the text different effects.


Recognizing hand-drawn and irregular characters in manga is a major difficulty

The research team found that, because of the varied fonts and hand-drawn styles in manga, even state-of-the-art OCR systems (such as the Google Cloud Vision API) perform poorly on manga pages.

Therefore, the team developed a text recognition module optimized for manga, which handles irregular characters by first detecting text lines and then recognizing the characters within each line.
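In pipeline form, such a manga-specific OCR module can be organized as two stages. The skeleton below is only an illustration of the detect-then-recognize structure described above; detect_text_lines and recognize_line are placeholders for trained models, not the team's actual implementation.

```python
from dataclasses import dataclass
from typing import List

from PIL import Image

@dataclass
class TextLine:
    box: tuple          # (left, top, right, bottom) in pixels
    text: str = ""

def detect_text_lines(page: Image.Image) -> List[TextLine]:
    """Placeholder detector: return candidate text-line boxes on the page."""
    raise NotImplementedError("plug in a trained text-line detector here")

def recognize_line(crop: Image.Image) -> str:
    """Placeholder recognizer: transcribe the characters in one text line."""
    raise NotImplementedError("plug in a trained line recognizer here")

def ocr_page(page: Image.Image) -> List[TextLine]:
    """Two-stage OCR: detect lines first, then read each line's characters."""
    lines = detect_text_lines(page)
    for line in lines:
        crop = page.crop(line.box)        # cut out the detected line
        line.text = recognize_line(crop)  # then transcribe it
    return lines
```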

The second step is content recognition

In manga, the most common text is dialogue between characters, and a single line of dialogue is often split across multiple speech bubbles.

This requires the system to accurately distinguish speakers, keep subjects coherent across bubbles, and avoid repetition, which places higher demands on machine translation.


The scene classification, text ordering, and emotion recognition process

This step relies on context awareness and emotion recognition. For context awareness, the Mantra team combined text grouping, text reading order, and visual semantic extraction to achieve multimodal context-aware translation.
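As a rough illustration of how scene-level context can feed into translation, the sketch below translates the bubbles of one scene in reading order, passing the earlier bubbles along as context. The translate_with_context function is a placeholder for any context-aware MT model; the paper's multimodal model additionally uses visual features, which are omitted here.

```python
from typing import List, Sequence

def translate_with_context(source: str, context: Sequence[str]) -> str:
    """Placeholder for a context-aware machine translation call."""
    raise NotImplementedError("plug in a context-aware MT model here")

def translate_scene(bubbles_in_reading_order: List[str]) -> List[str]:
    """Translate each bubble, conditioning on the bubbles already seen."""
    translated: List[str] = []
    for i, bubble in enumerate(bubbles_in_reading_order):
        context = bubbles_in_reading_order[:i]   # earlier dialogue in the scene
        translated.append(translate_with_context(bubble, context))
    return translated
```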

The third step is automatic lettering (text embedding)

As an automatic engine, Mantra not only distinguishes speakers and links context, but also handles the most time-consuming and labor-intensive part of manga translation: lettering the translated text back into the page.


In the lettering process, the original text region is first erased (inpainted), and the translated text is then drawn back in. Because Japanese, Chinese, and English characters differ in shape, spelling, combination, and line-breaking, this step is particularly difficult.
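A minimal sketch of this "erase, then re-letter" idea, built from off-the-shelf tools (OpenCV inpainting plus Pillow text drawing), is shown below. The paper's own inpainting model and layout logic are far more sophisticated; the font path and box coordinates here are illustrative assumptions.

```python
import cv2
import numpy as np
from PIL import Image, ImageDraw, ImageFont

def erase_and_letter(page_bgr: np.ndarray, box: tuple, new_text: str,
                     font_path: str = "NotoSansCJK-Regular.ttc") -> np.ndarray:
    """Erase the original text inside `box`, then draw `new_text` there."""
    x, y, w, h = box  # assumed (left, top, width, height) in pixels

    # 1) Erase: mask the original text region and inpaint over it.
    mask = np.zeros(page_bgr.shape[:2], dtype=np.uint8)
    mask[y:y + h, x:x + w] = 255
    cleaned = cv2.inpaint(page_bgr, mask, 3, cv2.INPAINT_TELEA)

    # 2) Re-letter: draw the translated text into the now-blank bubble.
    pil_page = Image.fromarray(cv2.cvtColor(cleaned, cv2.COLOR_BGR2RGB))
    draw = ImageDraw.Draw(pil_page)
    font = ImageFont.truetype(font_path, size=max(12, h // 4))  # assumed font file
    draw.text((x + 4, y + 4), new_text, fill="black", font=font)

    return cv2.cvtColor(np.array(pil_page), cv2.COLOR_RGB2BGR)
```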

The full processing chain in this step is: page matching → text box detection → pixel statistics of speech bubbles → splitting connected bubbles → cross-language alignment → text recognition → context extraction. A skeleton of this chain is sketched below.
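The skeleton only pins down the order of the stages; every function is a stub standing in for the corresponding method described in the paper.

```python
from functools import reduce

# Placeholder stages -- only the composition order is meaningful here.
def match_pages(data): raise NotImplementedError
def detect_text_boxes(data): raise NotImplementedError
def count_bubble_pixels(data): raise NotImplementedError
def split_connected_bubbles(data): raise NotImplementedError
def align_across_languages(data): raise NotImplementedError
def recognize_text(data): raise NotImplementedError
def extract_context(data): raise NotImplementedError

STAGES = [
    match_pages,             # pair source pages with their translated pages
    detect_text_boxes,
    count_bubble_pixels,     # pixel statistics of the speech bubbles
    split_connected_bubbles,
    align_across_languages,  # match text boxes between language versions
    recognize_text,
    extract_context,
]

def run_pipeline(raw_page_pairs):
    """Feed the raw page pairs through every stage, in order."""
    return reduce(lambda data, stage: stage(data), STAGES, raw_page_pairs)
```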

Experiments: dataset and model testing

In the experimental section of the paper, the Mantra team notes that no multilingual manga dataset previously existed, so they created the OpenMantra (open source) and PubManga datasets. OpenMantra is used to evaluate machine translation and contains 1,593 sentences, 848 scenes, and 214 manga pages; the team had professional translators render the dataset into English and Chinese.

OpenMantra manga translation evaluation dataset (details and download link as listed above)

The PubManga dataset is used to evaluate the constructed corpus and contains annotations for: 1) bounding boxes of text and frames; 2) Japanese and English texts (character sequences); 3) the reading order of frames and text.

To train the model, the team prepared 842,097 pairs of Japanese-English manga pages, containing 3,979,205 pairs of Japanese-English sentences in total. The specific method can be read in the paper. The final evaluation of the model was done manually: the Mantra team invited five professional Japanese-English translators to grade the sentences using a professional translation assessment procedure.

Behind the project: interesting minds working together

The paper has now been accepted at AAAI 2021, and productization is also steadily advancing. From the Mantra team's Twitter, we can see that many manga titles have already been machine-translated with Mantra.

This gem of a project was completed by two PhDs from the University of Tokyo: CEO Shonosuke Ishiwatari and CTO Ryota Hinami, who graduated from the University of Tokyo and founded Mantra in 2020.


Mantra CEO Shonosuke Ishiwatari (left) and CTO Ryota Hinami (right)

CEO Shonosuke Ishiwatari graduated from the Department of Information Science at the University of Tokyo in 2019. His research and development work focuses on natural language processing, including machine translation and dictionary generation, and he is the second author of this paper.

It is worth mentioning that Ishiwatari has rich research experience: he visited CMU and also interned for half a year at Microsoft Research Asia (MSRA) in Beijing during 2016-2017, where he worked on Natural Language Computing (NLC) research in the group of Shujie Liu, a principal researcher at MSRA.

CTO Ryota Hinami entered the university in the same year as Ishiwatari and focuses on image recognition. During the same 2016-2017 period, he interned alongside Ishiwatari at Microsoft Research Asia.

This pair of partners with complementary skills has completed most of Mantra's work, enviable both in output and in results.

To learn more about Mantra, you can read the paper (https://arxiv.org/abs/2012.14271), visit the project website (https://mantra.co.jp/), or download the dataset (https://hyper.ai/datasets/14137) and explore further.