Today we introduce bilibili_api, a Python library that wraps the API of Bilibili (Station B) for fetching its data.
The available data include:
Videos (video module)
Users (user module)
Dynamics (dynamic module)
This time, as a demo, we use the “Running Man” 10th anniversary video and fetch its danmaku (bullet comments, also called the barrage).
Comparison
As the saying goes, without comparison there is no harm, just like comparing a student from Harbin Institute of Technology with a student from Zhejiang University.
This is how the danmaku used to be fetched:
1. Find the danmaku data endpoint
https://comment.bilibili.com/… (a fixed URL prefix + the video's cid + .xml)
2. Fetch the data with the requests module
3. Parse the data with XPath
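For comparison, steps 2 and 3 of the old approach could look roughly like the sketch below. It is not the author's original code. It assumes the endpoint returns XML with one <d> element per danmaku, uses the standard library's ElementTree (which supports only a subset of XPath) instead of a full XPath library, and takes the endpoint URL as a parameter because the address is abbreviated above. The sample XML is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up sample in the assumed layout: one <d> element per danmaku
SAMPLE_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<i>
  <d p="12.5,1,25,16777215,1594000000,0,abc123,1">Happy 10th anniversary</d>
  <d p="30.0,1,25,16777215,1594000100,0,def456,2">hahaha</d>
</i>"""


def fetch_danmaku_xml(url):
    """Step 2: download the raw XML bytes from the danmaku endpoint."""
    import requests  # third-party; imported here so parsing works without it

    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return response.content


def parse_danmaku_xml(xml_bytes):
    """Step 3: collect the text of every <d> element with an XPath-style query."""
    root = ET.fromstring(xml_bytes)
    return [d.text or "" for d in root.findall(".//d")]


for text in parse_danmaku_xml(SAMPLE_XML):
    print(text)
```

Even in this short form, the old approach needs the video's cid, a download step, and knowledge of the XML layout, which is the part the library hides.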
Next, it's time for the real thing.
With bilibili_api's wrappers, fetching the danmaku data takes just one line of code:
danmu = video_info.get_danmaku()
Fetching the video's basic information and its comments is just as convenient.
basic_info = video_info.get_video_info()
comments = video_info.get_comments()
Quick start
Next, we use bilibili_api to fetch the danmaku of the “Running Man” 10th anniversary special and draw a word cloud from it.
Link to the video:
https://www.bilibili.com/vide…
Bilibili videos have both an AV number and a BV number. Since the site's revision, the BV number is shown directly in the link. One of the two must be provided.
The bvid is Bilibili's new unique video identifier. It consists of 12 digits and letters and is case sensitive. Include the leading “BV” when passing it in,
for example “BV1gC4y1h722”.
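Because the bvid is case sensitive and must keep its prefix, a quick shape check before calling the library can catch copy-and-paste mistakes. The sketch below only encodes the description given here (12 characters in total, starting with “BV”, followed by digits and letters). The name looks_like_bvid is hypothetical, and the check says nothing about whether the video actually exists.

```python
import re

# Assumption: a bvid is "BV" followed by 10 ASCII letters or digits (12 characters in total)
BVID_PATTERN = re.compile(r"BV[0-9A-Za-z]{10}")


def looks_like_bvid(value):
    """Return True if value has the shape of a bvid as described in the text."""
    return BVID_PATTERN.fullmatch(value) is not None


print(looks_like_bvid("BV1gC4y1h722"))  # True
print(looks_like_bvid("1gC4y1h722"))    # False: the "BV" prefix is missing
```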
1) Installation
The library wraps Bilibili's data API and depends on the requests module.
Install it with pip:
pip install bilibili_api
1) Import module
from bilibili_api import Verify
from bilibili_api.video import VideoInfo
from bilibili_api.video import Danmaku
VideoInfo class: fetches video information (danmaku, comments, coin count, play count, etc.)
Danmaku class: represents a danmaku and is used to fetch and send danmaku
Verify class: optional. Some video information is only available after logging in (i.e. SESSDATA is required).
Both SESSDATA and CSRF are needed for user actions such as liking a video or giving coins.
For details on how to obtain SESSDATA and CSRF, see the following link:
https://github.com/Passkou/bi… (wiki page: how to obtain SESSDATA and CSRF, using Chrome as an example)
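Since SESSDATA and CSRF are login credentials, one option is to keep them out of the script and read them from environment variables. This is only a sketch: the variable names BILI_SESSDATA and BILI_CSRF and the helper load_credentials are hypothetical and not part of bilibili_api.

```python
import os


def load_credentials(env=os.environ):
    """Return keyword arguments for Verify, or an empty dict if either value is unset."""
    sessdata = env.get("BILI_SESSDATA")  # hypothetical variable name
    csrf = env.get("BILI_CSRF")          # hypothetical variable name
    if sessdata and csrf:
        return {"sessdata": sessdata, "csrf": csrf}
    return {}


# Usage (requires bilibili_api to be installed):
# from bilibili_api import Verify
# verify = Verify(**load_credentials())
```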
2) Fetch the danmaku data
Create a VideoInfo object and pass in two parameters:
bvid="BV1gC4y1h722" (the video's BV number)
verify=verify (credentials built from SESSDATA and CSRF, used when fetching the danmaku)
The returned danmaku data is a list of Danmaku objects. Loop over it and print each item's text.
Here is the code:
verify = Verify(sessdata="your SESSDATA", csrf="your CSRF")
video_info = VideoInfo(bvid="BV1gC4y1h722", verify=verify)
danmu = video_info.get_danmaku()
for i in danmu:
    print(i.text)
3) Draw the word cloud
The word cloud is drawn with jieba (word segmentation) and wordcloud.
Parameters such as the background color, the background image (mask), and the font can be passed to the WordCloud object.
Here is the code:
from wordcloud import WordCloud
import matplotlib.pyplot as plt

# background_Image, random_color_func and words_str are not defined in this excerpt
wc = WordCloud(
    background_color='white',
    mask=background_Image,
    font_path=r'./SourceHanSerifCN-Medium.otf',
    color_func=random_color_func,
    random_state=50,
)
word_cloud = wc.generate(words_str)  # generate the word cloud
word_cloud.to_file("rm.jpg")  # save the image
# Show the word cloud image
plt.imshow(word_cloud)
plt.axis('off')
plt.show()
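The snippet above passes a ready-made string to wc.generate without showing how it is built. A minimal sketch of that step follows, assuming the danmaku texts are segmented with jieba.lcut and joined with spaces so that wordcloud can split them into words. The function name build_words_str and its cut parameter are hypothetical, added so the segmenter can be swapped out; jieba is imported only when no other segmenter is given.

```python
def build_words_str(texts, cut=None):
    """Segment each text and join all tokens into one space-separated string."""
    if cut is None:
        import jieba  # third-party: pip install jieba

        cut = jieba.lcut
    tokens = []
    for text in texts:
        # Drop whitespace-only tokens so the result uses single-space separators
        tokens.extend(token.strip() for token in cut(text) if token.strip())
    return " ".join(tokens)


# Usage with the danmaku list fetched earlier (needs jieba installed):
# words_str = build_words_str(d.text for d in danmu)
```

Passing a different cut function, for example str.split, is enough for text that is already separated by spaces.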
4) Final result
In the word cloud, the most prominent phrases are “Happy 10th anniversary”, “RM 10th anniversary”, and “hahahahahaha”.
Summary
With the bilibili_api module you can quickly access Bilibili's video and user data. What you do with the data once you have it is up to your imagination~
Download the source code for this article: https://alltodata.cowtransfer…