How to download your favorite music in Python

Time:2020-2-10

Preface

The text and pictures of the article are from the Internet, only for learning and communication, and do not have any commercial use. The copyright belongs to the original author. If you have any questions, please contact us in time for handling.

PS: if you need Python learning materials, you can click the link below to get them by yourself

http://note.youdao.com/noteshare?id=3054cce4add8a909e784ad934f956cef

Music is the adjustment of life. At present, many music can only be played but not downloaded. How can we be reconciled to being technicians?

Knowledge points:

  1. requests

  2. regular expression

 

Development environment:

  1. Version: Anaconda 5.2.0 (Python 3.6.5)

  2. Editor: pycharm

 

Third party Library:

  1. requests

  2. parsel

Webpage analysis

Target site:http://music.taihe.com/search?key=%E9%99%88%E7%B2%92

Analyze the real address of music

Choose a song to showTake a walktake as an example

在这里插入图片描述

Open the developer tool, select network – > media – > refresh the web page to get the real address of the music

But the obtained address can not be read in the source code, it must be Baidu music that hides it. Generally, there are two situations. The first is to use JavaScript to splice or encrypt the request connection, and the second is to hide the data. Because we don’t know what happened. So we can only slowly analyze the requested data. 在这里插入图片描述 在这里插入图片描述 After analysis, we can see that the real music address exists in this APIhttp://musicapi.taihe.com/v1/restserver/ting?method=baidu.ting.song.playAAC&format=jsonp&callback=jQuery17206453751179783578_1544942124991&songid=243093242&from=web&_=1544942128336

And we request that the API return a JSON data (that is, the dictionary data type of Python). As long as we use the rules of the dictionary, we can extract all our data.

URL splicing to get all data

First we got the real address of the music, and then we analyzed the URL of the real address to expect the secret of downloading all the music. 在这里插入图片描述 在这里插入图片描述 A closer look at the URL reveals that,?HinderfromParameters and_Requests that do not affect data even if they do not exist.

AndsongidIt’s the only songidfromThe parameter actually indicates which platform is coming from

So when we download music, we only need to get the songs in bulksongidYou can download all the songs.

 

Get singid in batch

在这里插入图片描述 Using developer tools, you can view the source code of the web pagesongidIf we analyze a singer page’surlYou will find that it can be constructed as well.

At this point, the whole page analysis is over.

Realization effect

在这里插入图片描述 在这里插入图片描述

Complete code

import re
 import requests
 ​
 ​
 def get_songid():
     "" "get songid of music" ""
     url = 'http://music.taihe.com/artist/2517'
     response = requests.get(url=url)
     html = response.text
     sids = re.findall(r'href="/song/(\d+)"', html)
     return sids
 ​
 ​
 def get_music_url(songid):
     "" "get download link" ""
     api_url = f'http://musicapi.taihe.com/v1/restserver/ting?method=baidu.ting.song.playAAC&format=jsonp&songid={songid}&from=web'
     response = requests.get(api_url.format(songid=songid))
     data = response.json()
     print(data)
     try:
         music_name = data['songinfo']['title']
         music_url = data['bitrate']['file_link']
         return music_name, music_url
     except Exception as e:
         print(e)
 ​
 ​
 def download_music(music_name, music_url):
     "" "Download Music" ""
     response = requests.get(music_url)
     content = response.content
     save_file(music_name+'.mp3', content)
 ​
 ​
 def save_file(filename, content):
     "" "save music" ""
     with open(file=filename, mode="wb") as f:
         f.write(content)
 ​
 ​
 if __name__ == "__main__":
     for song_id in get_songid():
         music_name, music_url = get_music_url(song_id)
         download_music(music_name, music_url)