What does a dynamic word cloud of Master Ma's Lightning Five-Whip look like? No worse than hip-hop



The following article is from farnast, author F





The headlines in November belong to Ma Baoguo.

A 69-year-old comrade was sneak-attacked by young people who showed no martial ethics.



Look how they bullied the old comrade.

Still, Master Ma holds to benevolence, righteousness and morality: one flick of the hand and out comes a five-whip.



Hahaha, so in this issue we will use Python to make a dynamic word cloud of the Lightning Five-Whip for Master Ma Baoguo.

The word cloud data comes from Bilibili (station B), and the clouds are drawn with the stylecloud word cloud library.



The approach mainly follows an open source project on Baidu AI Studio and uses PaddleSeg to segment the portrait.

Young F shows no martial ethics. So behave yourself ("mouse tail juice", as Master Ma's accent renders it).



Danmaku (bullet comment) data acquisition

Instead of crawling Bilibili directly, the third-party library bilibili_api is used.

This is a library written in Python that wraps various Bilibili APIs, covering video, audio, live streaming, dynamics (feed posts), columns, users, bangumi, and more.

Address: https://passkou.com/bilibili_api/docs/


Using the following two methods of the video module, you can obtain the video's danmaku for every day in November.



First, you need the values of SESSDATA and CSRF (bili_jct).

In Chrome, both can be found among the cookies for the bilibili.com domain.



Sorted by view count, the top-ranked video is chosen as the source of the danmaku. I didn't expect Master Ma to have been popular for so long, mouse tail juice.



Open the top-ranked video, then read its BV number, BV1HJ411L7DP, from the browser's address bar.

The code for obtaining the danmaku is as follows.

from bilibili_api import video, Verify
import datetime

# Credentials copied from the browser cookies
verify = Verify("your sessdata value", "your bili_jct value")

# Get the list of dates that have historical danmaku
days = video.get_history_danmaku_index(bvid="BV1HJ411L7DP", verify=verify)

# Fetch the danmaku for each date and append the text to a file
with open('danmu.txt', 'a', encoding='utf-8') as f:
    for day in days:
        # day is a 'YYYY-MM-DD' string; convert it to a datetime.date
        danmus = video.get_danmaku(bvid="BV1HJ411L7DP", verify=verify, date=datetime.date(*map(int, day.split('-'))))
        for danmu in danmus:
            f.write(danmu.text + '\n')
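
The `date=` argument above is built by splitting each 'YYYY-MM-DD' string and unpacking the parts into `datetime.date`. As a small standalone sketch of that step, and of narrowing a date list down to a single month, the snippet below works on made-up sample dates. The helper names `parse_day` and `days_in_month` and the year 2020 are illustrative assumptions and are not part of bilibili_api.

```python
import datetime


def parse_day(day):
    """Convert a 'YYYY-MM-DD' string into a datetime.date."""
    return datetime.date(*map(int, day.split('-')))


def days_in_month(days, year, month):
    """Keep only the date strings that fall in the given year and month."""
    selected = []
    for day in days:
        parsed = parse_day(day)
        if (parsed.year, parsed.month) == (year, month):
            selected.append(day)
    return selected


# Hypothetical sample of what a date index might look like
sample_days = ["2020-10-31", "2020-11-01", "2020-11-15", "2020-12-01"]
november = days_in_month(sample_days, 2020, 11)
print(november)
```

Filtering the date list before the download loop would keep the number of requests down if only one month of danmaku is wanted.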


Get results.



I was careless; I didn't dodge.

jieba is used to segment the danmaku text into words.

import re
import jieba

def get_text_content(text_file_path):
    """Read the danmaku text file, clean it, and return space-separated words."""
    text_content = ''
    with open(text_file_path, encoding='utf-8') as file:
        text_content = file.read()
    # Data cleaning: keep only the Chinese characters, letters and digits
    text_content_find = re.findall('[\u4e00-\u9fa5a-zA-Z0-9]+', text_content, re.S)
    # Join the kept runs into one string, then segment it with jieba (accurate mode)
    text_content = ' '.join(jieba.cut(''.join(text_content_find), cut_all=False))
    return text_content

text_content = get_text_content('danmu.txt')
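
To see what the cleaning step keeps before jieba is involved, here is a minimal sketch that applies the same character class to a made-up danmaku line. The helper name `keep_cjk_alnum` and the sample string are assumptions for illustration only.

```python
import re

# Same character class as in get_text_content: CJK ideographs, letters, digits
PATTERN = re.compile('[\u4e00-\u9fa5a-zA-Z0-9]+')


def keep_cjk_alnum(text):
    """Return the runs of Chinese characters, letters and digits found in text."""
    return PATTERN.findall(text)


sample = "马老师,发生甚么事了? 666!! hello~"
runs = keep_cjk_alnum(sample)   # punctuation and spaces act as separators
cleaned = ''.join(runs)         # the string that would be handed to jieba
print(runs)
print(cleaned)
```

Punctuation, whitespace and emoji are dropped, so only word-like material reaches the segmenter.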


Next, pick a source video of Ma Baoguo; an HD version is available on Bilibili.

Address: https://www.bilibili.com/video/BV1JV41117hq


Following information found online, run the code below to download the video from Bilibili.

from bilibili_api import video, Verify
import requests
import urllib3

# The requests below use verify=False, so silence the resulting TLS warnings
urllib3.disable_warnings()

verify = Verify("your sessdata value", "your bili_jct value")

# Get the download addresses (DASH video and audio streams)
download_url = video.get_download_url(bvid="BV1JV41117hq", verify=verify)

baseurl = 'https://www.bilibili.com/video/BV1JV41117hq'
title = 'ma Baoguo'

CHUNK = 1024 * 1024  # bytes requested per Range request


def download_stream(url, filename, headers):
    # Download url piece by piece with HTTP Range requests and write it to filename
    begin = 0
    with open(filename, 'wb') as fp:
        while True:
            headers.update({'Range': 'bytes=%d-%d' % (begin, begin + CHUNK - 1)})
            res = requests.get(url=url, headers=headers, verify=False)
            # 416: the requested range starts past the end of the file
            if res.status_code == 416:
                break
            fp.write(res.content)
            # Stop on a non-partial response or on a short (final) chunk
            if res.status_code != 206 or len(res.content) < CHUNK:
                break
            begin += CHUNK


def get_video():
    headers = {
        'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.198 Safari/537.36',
        'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
        'Accept-Encoding': 'gzip, deflate, br',
        'Accept-Language': 'zh-CN,zh;q=0.9,en;q=0.8'
    }
    headers.update({'Referer': baseurl})

    # Video stream
    download_stream(download_url["dash"]["video"][0]['baseUrl'], "./" + title + ".flv", headers)
    print('video download completed')

    # Audio stream
    download_stream(download_url["dash"]["audio"][0]['baseUrl'], "./" + title + ".mp3", headers)
    print('audio download completed')


get_video()


Remember to fill in the values of SESSDATA and CSRF (bili_jct).
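
The download works by asking for the file one slice at a time through the HTTP `Range` header. As a rough sketch of how those header values line up, the generator below lists the ranges needed for a file of known size. The name `byte_ranges` and the 2,500,000-byte example size are assumptions for illustration; the download code itself does not know the size in advance and simply stops when the server runs out of data.

```python
def byte_ranges(total_size, chunk_size=1024 * 1024):
    """Yield Range header values that cover total_size bytes in chunk_size slices."""
    start = 0
    while start < total_size:
        # Range ends are inclusive, hence the "- 1"
        end = min(start + chunk_size, total_size) - 1
        yield 'bytes=%d-%d' % (start, end)
        start = end + 1


ranges = list(byte_ranges(2_500_000))
for value in ranges:
    print(value)
```

Note that every value starts with `bytes=`; a header without that unit prefix is not a valid byte range.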


PaddleSeg portrait segmentation

This part is based on a project on Baidu AI Studio. Project address:



First, download PaddleSeg and install its dependencies.

#Download paddleseg
git clone https://hub.fastgit.org/PaddlePaddle/PaddleSeg.git

cd PaddleSeg/

#Install required dependencies
pip install -r requirements.txt


Downloading from GitHub is usually rather slow, so an acceleration mirror can be used instead.

With fastgit.org in the URL here, the download speed can jump from tens of KB per second to several MB per second.

# Create the working folders (-p also creates work/ if it does not exist yet)
mkdir -p work/videos
mkdir -p work/texts
mkdir -p work/mp4_img
mkdir -p work/mp4_img_mask
mkdir -p work/mp4_img_analysis


Create new folders for storing related files.

Here, you can place the previously crawled video and audio in videos.

First, split the source video into frames, that is, save every frame of the video as an image.

import os
import cv2

def transform_video_to_image(video_file_path, img_path):
    """Save every frame of the video as a picture and return the video's fps."""
    video_capture = cv2.VideoCapture(video_file_path)
    fps = video_capture.get(cv2.CAP_PROP_FPS)
    count = 0
    while True:
        ret, frame = video_capture.read()
        if ret:
            cv2.imwrite(img_path + '%d.jpg' % count, frame)
            count += 1
        else:
            # No more frames to read
            break
    video_capture.release()

    # Write the names of the saved frames to img_list.txt
    filename_list = os.listdir(img_path)
    with open(os.path.join(img_path, 'img_list.txt'), 'w', encoding='utf-8') as file:
        for filename in filename_list:
            file.write(filename + '\n')

    print('video pictures saved successfully, %d in total' % count)
    return fps

input_video = 'work/videos/Master_Ma.mp4'
fps = transform_video_to_image(input_video, 'work/mp4_img/')


A total of 564 pictures were obtained.
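
One detail worth keeping in mind with frames named `0.jpg`, `1.jpg`, ...: directory listings come back in no guaranteed order, and a plain string sort puts `10.jpg` before `2.jpg`. If the frame order matters for a later step, a numeric sort avoids that. The sketch below is an illustration only; `frame_index` and `sorted_frames` are hypothetical helpers, not part of the original project.

```python
def frame_index(filename):
    """Return the integer frame number encoded in a name such as '12.jpg'."""
    return int(filename.split('.')[0])


def sorted_frames(filenames):
    """Keep the .jpg frames and order them by frame number, not alphabetically."""
    frames = [name for name in filenames if name.endswith('.jpg')]
    return sorted(frames, key=frame_index)


# Hypothetical directory listing in arbitrary order
sample = ['10.jpg', '2.jpg', 'img_list.txt', '0.jpg', '1.jpg']
ordered = sorted_frames(sample)
print(ordered)
```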



Then use paddleseg to segment all the video images and generate mask images.

# Generate the mask result images
python your_path/PaddleSeg/pdseg/vis.py \
       --cfg your_path/work/humanseg.yaml \
       --vis_dir your_path/work/mp4_img_mask


The model is used for prediction; the humanseg.yaml file is provided by the author and can be used for portrait segmentation.

The pretrained model is deeplabv3p_xception65_humanseg. Download it, unzip it, and place it in PaddleSeg/pretrained_model.

Because the pretrained model is large, it is not hosted on the network disk. You can download it directly from the following link.

# Download the pretrained model deeplabv3p_xception65_humanseg


Remember to change the path information in the humanseg.yaml file to your own path.



Run the three-line command above, and 564 mask files will eventually be generated.
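
Since the next step expects one mask per frame, it can be worth checking that none are missing before generating the word clouds. The following is a minimal sketch, assuming the masks are named `0.png`, `1.png`, ... as the word cloud code expects; `missing_masks` is a hypothetical helper and the demo runs against a temporary folder rather than the real `work/mp4_img_mask`.

```python
import os
import tempfile


def missing_masks(mask_dir, frame_count):
    """Return the frame numbers whose '<n>.png' mask is absent from mask_dir."""
    existing = set(os.listdir(mask_dir))
    return [i for i in range(frame_count) if '%d.png' % i not in existing]


# Demo on a throwaway folder containing masks 0, 1 and 3 out of 5 frames
with tempfile.TemporaryDirectory() as demo_dir:
    for i in (0, 1, 3):
        open(os.path.join(demo_dir, '%d.png' % i), 'wb').close()
    missing = missing_masks(demo_dir, 5)

print(missing)
```

For the real run, the call would be something like `missing_masks('work/mp4_img_mask', 564)`, and an empty list means every frame has a mask.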



Word cloud generation

Use the stylecloud word cloud library to generate the word clouds, with the Founder Lanting Kan Hei font (方正兰亭刊黑).

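import os
import stylecloud

def create_wordcloud():
    for i in range(564):
        # Mask image of frame i and the output path of its word cloud
        file_name = os.path.join("mp4_img_mask/", str(i) + '.png')
        result = os.path.join("work/mp4_img_analysis/", 'result' + str(i) + '.png')
        # Assumed call: mask_img is the parameter added to the modified stylecloud
        # (it does not exist in the stock library); text_content comes from jieba above
        stylecloud.gen_stylecloud(text=text_content,
                                  font_path='方正兰亭刊黑.TTF',
                                  output_name=result,
                                  mask_img=file_name)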


Because the stylecloud library does not support custom mask images, Xiao F modified its code.

A mask_img parameter was added to gen_stylecloud, and it is ultimately passed on to the gen_mask_array function.
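
The modified stylecloud code itself is not shown in the article. Purely as an illustration of the idea: the wordcloud library that stylecloud builds on treats pure white (255) mask pixels as blocked and draws words everywhere else, so a portrait mask has to end up with the person non-white and the background white. The toy sketch below does that conversion on nested lists; `to_wordcloud_mask` is a hypothetical name, and the input is assumed to be a binary mask where nonzero means "person". A real implementation would work on image arrays instead.

```python
def to_wordcloud_mask(person_mask):
    """Map person pixels (nonzero) to 0 (drawable) and background to 255 (blocked)."""
    return [[0 if pixel else 255 for pixel in row] for row in person_mask]


# Tiny 3x4 example: the person occupies the two middle columns of the top rows
person_mask = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
wc_mask = to_wordcloud_mask(person_mask)
for row in wc_mask:
    print(row)
```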



In this way, the mask picture can be transformed into a word cloud picture!



Combine these word cloud images into a video.

import os
import cv2

def combine_image_to_video(comb_path, output_file_path, fps=30, is_print=False):
    """Merge the word cloud images in comb_path into one video."""
    fourcc = cv2.VideoWriter_fourcc(*'mp4v')

    file_items = [item for item in os.listdir(comb_path) if item.endswith('.png')]
    file_len = len(file_items)
    # print(comb_path, file_items)
    if file_len > 0:
        # Use the first image to determine the frame size
        temp_img = cv2.imread(os.path.join(comb_path, file_items[0]))
        img_height, img_width, _ = temp_img.shape

        out = cv2.VideoWriter(output_file_path, fourcc, fps, (img_width, img_height))

        # Write the frames in numeric order: result0.png, result1.png, ...
        for i in range(file_len):
            pic_name = os.path.join(comb_path, 'result' + str(i) + ".png")
            if is_print:
                print(i + 1, '/', file_len, ' ', pic_name)
            img = cv2.imread(pic_name)
            out.write(img)
        out.release()

combine_image_to_video('work/mp4_img_analysis/', 'work/mp4_analysis.mp4', 30)
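
The call above writes the frames at a fixed 30 fps, while `transform_video_to_image` returned the frame rate of the source video. If the two differ, the word cloud video ends up shorter or longer than the original, and audio added later will drift. A quick sanity check of the arithmetic is sketched below; `clip_duration` is a hypothetical helper and the 25 fps figure is only an example, since the article does not state the source frame rate.

```python
def clip_duration(frame_count, fps):
    """Return the playback length in seconds of frame_count frames shown at fps."""
    if fps <= 0:
        raise ValueError('fps must be positive')
    return frame_count / fps


seconds_at_30 = clip_duration(564, 30)   # the hard-coded rate used above
seconds_at_25 = clip_duration(564, 25)   # hypothetical source frame rate
print(round(seconds_at_30, 2), round(seconds_at_25, 2))
```

Passing the `fps` value returned by `transform_video_to_image` instead of the literal 30 is one way to keep the two videos the same length.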


Use ffmpeg to further process the video: cropping + overlaying.

# Crop the video
ffmpeg -i mp4_analysis_result.mp4 -vf crop=iw:ih/2:0:ih/5 output.mp4

# Overlay the original video on the word cloud video
ffmpeg -i output.mp4 -i videos/Master_Ma.mp4 -filter_complex "[1:v]scale=500:270[v1];[0:v][v1]overlay=1490:10" -s 1920x1080 -c:v libx264 merge.mp4

# Add audio
ffmpeg -i merge.mp4 -i videos/Master_Ma.mp4 -c:v copy -c:a copy work/mp4_analysis_result2.mp4 -y

# Generate a GIF
ffmpeg -ss 00:00:22 -t 3 -i merge.mp4 -r 15 a.gif
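
The crop filter takes its arguments in the order width:height:x:y, so `crop=iw:ih/2:0:ih/5` keeps the full width, half the height, and starts one fifth of the way down from the top. The sketch below works out that rectangle for an assumed 1920x1080 input; `crop_rect` is a hypothetical helper, and it uses integer division as an approximation, whereas ffmpeg evaluates the expressions itself and may adjust the result slightly.

```python
def crop_rect(iw, ih):
    """Approximate the rectangle selected by crop=iw:ih/2:0:ih/5."""
    return {
        'w': iw,        # full input width
        'h': ih // 2,   # half the input height
        'x': 0,         # start at the left edge
        'y': ih // 5,   # start one fifth of the way down
    }


rect = crop_rect(1920, 1080)   # assumed input size
print(rect)
```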


As for installing and using ffmpeg, please look it up on Baidu yourself ~