What are the application scenarios of AI technology in music products?


Automatic annotation, smooth transitions, music authentication, AI creation: when AI technology is applied to the music industry, it brings convenience and more choices to people's cultural and entertainment lives, and that is an exciting thing in itself.

With the emergence of deep learning algorithms and the maturing of big data and 5G technology, artificial intelligence (AI) has gradually become part of production and daily life, playing a role in education, medical care, government affairs, urban management, and other areas.

As AI technology is researched and applied more deeply in the music industry, music AI is no longer a novelty, and many impressive new applications and products have appeared.

Based on my understanding of music technology and products, this article briefly surveys the application scenarios of AI technology in music products.

1、 Automatic annotation

Once a platform's music library reaches a certain order of magnitude, relying on traditional manual labeling becomes expensive and is also heavily influenced by the annotators' subjective judgment. Automatic tagging technology has therefore attracted wide attention. It can replace manual tagging to save cost, and it can also evaluate the content of music more objectively. As a result, it can be extended to music recommendation on streaming platforms.

For example, Spotify and KKBOX both use deep learning to make recommendations. KKBOX takes data such as audio, lyrics, and user-generated annotations and comments as input, and judges whether a piece of music meets the recommendation conditions along multiple dimensions such as music, scene, and emotion. Much like KKBOX's recommendation dimensions, a general automatic tagging function also tags along dimensions such as music, response, scene, instrumental music, and emotion.

(Example: music tags on my company's platform)

On the subject of tagging, I have come across some not very well-informed complaints online. For example, people complain when they see that automatic audio annotation has labeled a single song with both “joy” and “sadness” at the same time.

Before explaining why this happens, let us briefly introduce three machine learning concepts: the classifier, the single-label multi-class task, and the multi-label multi-class task.

In short, a classifier is trained on known input and output data, and then classifies unknown input data or outputs a value for it. For a classifier model, there are two or more possible predicted results (if only one result were possible, the outcome would already be known and no classification model would be needed). If the number of possible results is 2, the task is called binary classification; if it is greater than 2, it is a multi-class classification task. For emotion, there are multiple possible results, such as excitement, cheerfulness, quietness, and sadness, so emotion classification is a multi-class task.

If we treat the emotion model as a single-label multi-class task, then “joy” and “sadness” can never appear at the same time, because each song receives exactly one label. “Joy” and “sadness” can appear together only in a multi-label multi-class task.
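The difference between the two task types can be sketched in a few lines of Python. This is only an illustration, not any platform's actual model: the label names, the raw scores, and the 0.5 threshold are assumptions chosen for the example. A single-label model typically applies a softmax and keeps the top class, while a multi-label model scores each class independently and keeps every class above a threshold.

```python
import math

# Illustrative label set; a real tagging system defines its own.
EMOTIONS = ["excitement", "joy", "quietness", "sadness"]

def single_label(scores):
    """Single-label multi-class: softmax over all classes, keep only the top one."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]  # subtract peak for numerical stability
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return [EMOTIONS[best]]

def multi_label(scores, threshold=0.5):
    """Multi-label multi-class: one independent sigmoid per class, keep all above threshold."""
    probs = [1.0 / (1.0 + math.exp(-s)) for s in scores]
    return [name for name, p in zip(EMOTIONS, probs) if p >= threshold]

scores = [-2.0, 1.5, -1.0, 1.2]  # made-up raw model outputs (logits)
print(single_label(scores))  # ['joy']
print(multi_label(scores))   # ['joy', 'sadness']
```

With the same scores, the single-label reading reports only “joy”, while the multi-label reading reports both “joy” and “sadness”.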

Is it wrong for “joy” and “sadness” to appear at the same time? Not necessarily!

Music processing based on deep learning is generally done segment by segment: a track is divided into multiple segments, and a label is predicted for each segment. If the mood of a song fluctuates, for example shifting from “joy” to “sadness”, then this result is entirely possible. In practice, many songs really do carry multiple emotion labels that seem mutually exclusive.
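As a rough sketch of how segment-level predictions could turn into several song-level tags, the snippet below keeps every label that covers a large enough share of the segments. The function name `song_labels`, the 25% cutoff, and the sample predictions are all hypothetical; a production system may aggregate quite differently.

```python
from collections import Counter

def song_labels(segment_labels, min_share=0.25):
    """Keep every label predicted for at least `min_share` of the segments."""
    counts = Counter(segment_labels)
    total = len(segment_labels)
    return sorted(label for label, n in counts.items() if n / total >= min_share)

# Hypothetical per-segment predictions for one song, in playback order.
segments = ["joy", "joy", "joy", "quietness",
            "sadness", "sadness", "sadness", "sadness"]
print(song_labels(segments))  # ['joy', 'sadness']
```

A song that starts out joyful and ends sad therefore receives both tags, while a label seen in only one stray segment is dropped.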

2、 Smooth transition

Smooth transition is a “cool” feature that has emerged in recent years.

Simply put, when one song is about to finish playing, the next song is blended in seamlessly. This kind of smooth transition keeps the change between songs from feeling abrupt to the listener.

This feature also relies on deep learning.

The principle is to take the tail segment of a song, together with the head segments of other songs that can transition smoothly from it, as training samples. The trained model can then predict, for the tail segment it is given, which head segment it can transition into. When the player reaches the tail of a song, the model supplies the next segment for a smooth transition.
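The selection step can be sketched as ranking candidate head segments against the current tail segment. Note the substitution: where the article describes a trained deep learning model, this sketch plugs in plain cosine similarity over small hand-made feature vectors as a stand-in scoring function. The feature layout and the song ids are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def pick_next(tail_features, candidate_heads, score=cosine):
    """Return the id of the song whose head segment scores highest against the tail.

    `score` stands in for the trained model; any callable taking
    (tail, head) and returning a number can be passed instead.
    """
    return max(candidate_heads,
               key=lambda song_id: score(tail_features, candidate_heads[song_id]))

# Hypothetical features per segment: (normalized tempo, energy, brightness)
tail = [0.60, 0.80, 0.40]
heads = {
    "song_a": [0.20, 0.10, 0.90],
    "song_b": [0.62, 0.75, 0.45],
    "song_c": [0.90, 0.30, 0.10],
}
print(pick_next(tail, heads))  # song_b
```

Keeping the scoring function as a parameter means the similarity stand-in could later be swapped for a learned model without changing the player-side logic.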

3、 Music authentication

Music infringement on the Internet has always existed, but it is often difficult for music copyright owners to protect their rights and interests on the Internet.

The Internet holds a huge amount of content in complex forms. For example, music may appear only as the background track of a video, which makes it far too difficult to find and identify manually.

In this regard, AI technology can now monitor in real time whether songs are being infringed in videos, live streams, or broadcast programs.

The principle is to extract the key features of the songs in the copyright library and store them in a clustered database, then extract the features of the audio to be checked and quickly search the database for similar data using big data technology.
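The store-then-look-up flow can be sketched as follows. This is a deliberately simplified stand-in for real audio fingerprinting: it assumes each audio frame has already been reduced to one dominant-frequency bin, uses short runs of those bins as fingerprints, and replaces the clustered database with an in-memory dictionary. All names and data are hypothetical.

```python
from collections import Counter, defaultdict

def fingerprints(peaks, width=3):
    """Turn every run of `width` consecutive peak values into one fingerprint tuple."""
    return {tuple(peaks[i:i + width]) for i in range(len(peaks) - width + 1)}

def build_index(library):
    """Map each fingerprint to the set of copyrighted song ids that contain it."""
    index = defaultdict(set)
    for song_id, peaks in library.items():
        for fp in fingerprints(peaks):
            index[fp].add(song_id)
    return index

def best_match(index, query_peaks, min_hits=2):
    """Return the song sharing the most fingerprints with the query, or None."""
    hits = Counter()
    for fp in fingerprints(query_peaks):
        for song_id in index.get(fp, ()):
            hits[song_id] += 1
    if not hits:
        return None
    song_id, count = hits.most_common(1)[0]
    return song_id if count >= min_hits else None

# Hypothetical data: one dominant-frequency bin per audio frame.
library = {
    "song_x": [12, 40, 33, 7, 19, 40, 12, 5],
    "song_y": [3, 3, 28, 14, 9, 31, 22, 6],
}
index = build_index(library)
clip = [33, 7, 19, 40, 12]  # excerpt used as background music in a video
print(best_match(index, clip))          # song_x
print(best_match(index, [1, 2, 3, 4]))  # None
```

Because the lookup works on short overlapping runs, an excerpt can match even when it covers only part of the original song.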

At present, besides my own company, ACRCloud is a representative example of a company with similar technology.

4、 AI creation

AI has also entered music creation itself, and the industry now offers many AI music creation tools, such as Amper Music, AIVA, Jukedeck, Ecrett Music, Melody, Orb Composer, and others.

At the company level, Sony, Google, Baidu, and OpenAI, a non-profit artificial intelligence organization, have all experimented with AI composition.

In 2016, Sony used software called “Flow Machines” to create a Beatles-style melody, which composer Benoît Carré then turned into a complete pop song, “Daddy’s Car”.

In 2018, Microsoft announced that the fourth generation of Xiaoice (Xiaobing) had joined the competition in the virtual singer market, and it “sang” the song “Invisible Wings”.

The AI composer “AIVA”, developed by AIVA Technologies, created the rock track “On the Edge” and worked with singer Taryn Southern to create the pop song “Love Sick”.

In China, my company's products support music recognition (identifying the musical elements in a work) as well as AI-assisted lyric writing and composition, and they have already been commercially licensed and applied.

In practical terms, AI composition tools can assist human creators in their work.

For example, British music producer Alex da Kid used the machine learning music generation algorithms on IBM's Watson cognitive computing platform to create the single “Not Easy”. Singer Taryn Southern created “Break Free” with the tool developed by Amper Music, and worked with AIVA to create the pop song “Love Sick”. These works were all popular for a time.

As more and more AI music creation tools appear and serve as assistants that help musicians create more high-quality works, the creative ability of AI composers is gradually gaining recognition.

5、 Conclusion

When AI meets music, music is infused with more and more vitality. The tide of intelligence is coming. AI + music, the future is worth looking forward to!