Use Python to pick out the “amazing” uploaders on Bilibili (station B)!



Recently, the New Year's Eve party of Bilibili (station B) swept across the major video websites thanks to its unique creativity. It had a very positive impact on the company, and the share price soared. Presumably, everyone now regrets not buying Bilibili stock earlier.




However, what we are going to discuss today is not the new year’s Eve Party of station B, but the core resource of station B: “amazing people”. The inspiration of this article comes from a question on the hot list of Zhihu






Data acquisition




A total of 859 answers to the above question were collected, and the data in this article come from them. Many of the answers include a link that contains the uploader's ID, as shown in the following figure:



We can crawl the space IDs of the uploaders from these links. However, since not every answer contains such an ID, we also extract the text set in bold and use the uploader names found there as a supplement to the data.



The answer above is a typical case: it refers to the very popular student uploader who received birthday greetings from Cook. Part of the code for extracting the data is as follows:


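import re
import time
from selenium import webdriver
from selenium.webdriver.common.by import By

# Start crawling data
driver = webdriver.Chrome()
url = ''  # address of the question page (left empty in the original)
driver.get(url)
for i in range(1000):
    # Scroll to the bottom so that the page loads more answers
    js = "var q=document.documentElement.scrollTop=10000000"
    driver.execute_script(js)
    time.sleep(0.5)  # assumed short pause so new answers can load

# Organize data
all_html = [k.get_property('innerHTML')
            for k in driver.find_elements(By.CLASS_NAME, 'AnswerItem')]
all_text = ''.join(all_html)
pat = r'/\d+'  # a slash followed by digits, e.g. the ID part of a space link
spaces = list(set(re.findall(pat, all_text)))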
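
The pattern `/\d+` above matches any slash followed by digits, so it may also pick up numbers that are not uploader IDs. As a minimal sketch, assuming that uploader links have the form `space.bilibili.com/<digits>` and that bold text is marked with `<b>` or `<strong>` tags, the IDs and the bold names could be extracted more selectively as follows. The sample HTML and the helper names `extract_space_ids` and `extract_bold_names` are hypothetical.

```python
import re
from html.parser import HTMLParser

# Assumption: uploader links look like space.bilibili.com/<digits>
SPACE_LINK = re.compile(r'space\.bilibili\.com/(\d+)')


class BoldTextParser(HTMLParser):
    """Collects the text found inside <b> or <strong> tags."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.names = []

    def handle_starttag(self, tag, attrs):
        if tag in ('b', 'strong'):
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in ('b', 'strong') and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0 and data.strip():
            self.names.append(data.strip())


def extract_space_ids(html):
    """Return the unique space IDs in the HTML, sorted numerically."""
    return sorted(set(SPACE_LINK.findall(html)), key=int)


def extract_bold_names(html):
    """Return every piece of bold text in the HTML, in document order."""
    parser = BoldTextParser()
    parser.feed(html)
    parser.close()
    return parser.names


# Hypothetical sample standing in for the crawled answers
sample_html = (
    '<div class="AnswerItem"><p><b>Uploader One</b> is great: '
    '<a href="https://space.bilibili.com/12345">space</a></p></div>'
    '<div class="AnswerItem"><p>See <b>Uploader Two</b>, answer /987, '
    '<a href="https://space.bilibili.com/12345">again</a> and '
    '<a href="https://space.bilibili.com/678">another</a></p></div>'
)
space_ids = extract_space_ids(sample_html)
bold_names = extract_bold_names(sample_html)
print(space_ids)
print(bold_names)
```

Names collected from bold text still have to be matched to a space ID by hand or by search, which is why they only serve as a supplement.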


Now that we have the IDs of these “amazing” uploaders, the next step is to crawl their personal spaces on Bilibili to get more detailed information:



Above is the Bilibili personal space of the “famous scientist” Handmade Geng. From it we can get the number of fans and the main type of his videos (I always thought it would be science and technology and did not expect it to be life; Bilibili's categorization is something else), as well as the average number of plays, danmaku (bullet comments) and comments across all videos, which serve as the basis for the subsequent ranking. Part of the code is as follows:


import json
import pandas as pd
import requests

# cookie and header (request cookies and headers) are not shown in the original
columns = ['name', 'fans', 'face', 'main_type', 'total_video',
           'total_play', 'total_comment']
rows = []
for space in spaces:
    space_id = space.replace('/', '')
    # Profile request: name, fan count, avatar and number of videos
    url = '{}&jsonp=jsonp&article=true'.format(space_id)
    html = requests.get(url=url, cookies=cookie, headers=header).content
    data = json.loads(html.decode('utf-8'))['data']
    this_name = data['card']['name']
    this_fans = data['card']['fans']
    this_face = data['card']['face']
    this_video = int(data['archive_count'])
    total_page = int((this_video - 1) / 30) + 1  # 30 videos per page
    video_list = []  # reset for every uploader
    for j in range(total_page):
        url = '{}&ps=30&tid=0&pn={}&keyword=&order=click&jsonp=jsonp'.format(space_id, j + 1)
        html = requests.get(url=url, cookies=cookie, headers=header).content
        data = json.loads(html.decode('utf-8'))
        if j == 0:
            # The per-type video counts only need to be read once
            type_list = data['data']['list']['tlist']
        video_list += data['data']['list']['vlist']
    # Main type = the type with the most videos
    type_count = {t['name']: int(t['count']) for t in type_list.values()}
    this_type = max(type_count, key=type_count.get)
    # '--' marks a missing count and is skipped
    this_play = sum(v['play'] for v in video_list if v['play'] != '--')
    this_comment = sum(v['comment'] for v in video_list if v['comment'] != '--')
    # Assumption: the row is filled in from the column names defined above
    rows.append({'name': this_name, 'fans': this_fans, 'face': this_face,
                 'main_type': this_type, 'total_video': this_video,
                 'total_play': this_play, 'total_comment': this_comment})
upstat = pd.DataFrame(rows, columns=columns)
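
The per-uploader aggregation in the loop above can be checked without any network access. The following is a small sketch on made-up data, assuming the same response shape as in the listing (`tlist` as a dict of `{'name', 'count'}` entries, `vlist` as a list of videos where `'--'` marks a missing count). The helper names and the sample values are hypothetical.

```python
import math

PAGE_SIZE = 30  # videos per page, as in the request above (ps=30)


def page_count(total_videos, page_size=PAGE_SIZE):
    """Number of pages to request; at least one, like int((n - 1) / 30) + 1."""
    return max(1, math.ceil(total_videos / page_size))


def main_type(tlist):
    """Name of the video type with the largest count."""
    counts = {t['name']: int(t['count']) for t in tlist.values()}
    return max(counts, key=counts.get)


def total_of(vlist, field):
    """Sum a numeric field over all videos, skipping the '--' placeholder."""
    return sum(v[field] for v in vlist if v[field] != '--')


# Made-up sample data in the assumed response shape
tlist = {
    '160': {'name': 'Life', 'count': 42},
    '36': {'name': 'Technology', 'count': 7},
}
vlist = [
    {'play': 1000, 'comment': 10},
    {'play': '--', 'comment': 5},
    {'play': 250, 'comment': '--'},
]

print(page_count(31))            # 31 videos need 2 pages
print(main_type(tlist))          # Life
print(total_of(vlist, 'play'))   # 1250
```

Dividing these totals by the number of videos gives the per-video averages used for ranking.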


In the end, we obtained information on more than 200 “amazing” Bilibili uploaders. An overview of the data is as follows:








With these data in hand, let's first look at the distribution of the main video types published by these “amazing” uploaders:



Because Bilibili's life category is all-inclusive, both Handmade Geng and Li Ziqi are classified under life, which is rather surreal when you think about it. As a result, this category contains the most uploaders. Technology and digital also account for a large share, which confirms that Bilibili is an excellent learning website. If you are interested, you can refer to another article: “Do you believe that you can learn programming by visiting Bilibili?”


The remaining video types, including games, movies and TV, can be collectively referred to as entertainment. From here on, the video types are grouped into science and technology, life, and entertainment, so as to find the most “amazing” uploaders in each group.
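
The grouping described here amounts to a lookup table from main video type to one of three broad groups. As a rough sketch, with hypothetical type names and a made-up list of main types (the real names come from the crawled data), the distribution before and after grouping could be counted like this:

```python
from collections import Counter

# Hypothetical mapping from a main video type to one of three broad groups
GROUP_OF = {
    'Technology': 'technology',
    'Digital': 'technology',
    'Life': 'life',
    'Games': 'entertainment',
    'Movies': 'entertainment',
    'TV': 'entertainment',
}


def broad_group(main_type, default='entertainment'):
    """Map a main type to its broad group; unknown types fall back to the default."""
    return GROUP_OF.get(main_type, default)


# Made-up main types, one per uploader
main_types = ['Life', 'Life', 'Technology', 'Games',
              'Digital', 'Life', 'Movies', 'Dance']

type_distribution = Counter(main_types)
group_distribution = Counter(broad_group(t) for t in main_types)
print(type_distribution.most_common())
print(group_distribution.most_common())
```

Treating every unlisted type as entertainment is a design choice of this sketch. A stricter version would raise an error for unknown types so that no category is grouped silently.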


Before starting the official ranking, we first use Python to stitch the avatars of these uploaders into the following picture. See how many of them look familiar to you:



The code is as follows:


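import cv2
import matplotlib.image as mpimg
import matplotlib.pyplot as plt
import numpy as np
from urllib import request

for i in range(upstat.shape[0]):
    loc = 'd:/Crawler/amazing/' + upstat['name'][i] + '.jpg'
    # Uncomment on the first run to download the avatar
    # request.urlretrieve(upstat['face'][i], loc)
    img = mpimg.imread(loc)[:, :, 0:3]  # keep the RGB channels only
    img = cv2.resize(img, (500, 500), interpolation=cv2.INTER_CUBIC)
    # Assumption: each row of 20 avatars is built up with np.hstack
    if i % 20 == 0:
        row_img = img  # start a new row
    else:
        row_img = np.hstack((row_img, img))
    if i == 19:
        all_img = row_img  # first completed row
    elif i % 20 == 19:
        all_img = np.vstack((all_img, row_img))  # add each further completed row
plt.imshow(all_img)
plt.axis('off')
plt.savefig('head.png', dpi=1000)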
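
A loop that only stacks a row once it holds 20 avatars leaves out the last row whenever the number of uploaders is not a multiple of 20. The following sketch shows one way to keep every avatar by padding the final row with black tiles. It uses synthetic tiles instead of downloaded images, and the helper name `make_mosaic` is hypothetical.

```python
import numpy as np


def make_mosaic(tiles, per_row=20):
    """Stack equally sized H x W x 3 tiles into a grid.

    The last row is padded with black tiles so that no tile is dropped.
    """
    tiles = list(tiles)
    if not tiles:
        raise ValueError("no tiles to stitch")
    blank = np.zeros_like(tiles[0])
    remainder = len(tiles) % per_row
    if remainder:
        tiles += [blank] * (per_row - remainder)
    rows = [np.hstack(tiles[k:k + per_row])
            for k in range(0, len(tiles), per_row)]
    return np.vstack(rows)


# 45 synthetic 10x10 RGB tiles stand in for the resized avatars
tiles = [np.full((10, 10, 3), k, dtype=np.uint8) for k in range(45)]
mosaic = make_mosaic(tiles, per_row=20)
print(mosaic.shape)  # (30, 200, 3): 3 rows of 20 tiles, each 10 pixels wide
```

The resulting array can be passed to `plt.imshow` and saved in the same way as in the listing above.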





Comprehensive ranking




The next step is bolder: we pluck up the courage to rank these uploaders. Combining the number of fans with the average numbers of danmaku, plays and comments per video, we obtain a comprehensive index. We hereby declare that this ranking is for entertainment only; if you insist on digging into it further, awsl.


First of all, let's take a look at the top 10 uploaders:



I was only recently recommended Wizard Finance, who is on the list, and I suggest you go and have a look: he really brings complicated financial knowledge down to earth. Huanong Brothers and Jing Hanqing are also on the list. Now let's look at ranks 11 to 20:



Xu Da Sao, Li Ziqi and Handmade Geng all appear in this list. If there is ever a chance, I hope someone can arrange a collaboration between them. I have already planned the process: Handmade Geng provides Li Ziqi with post-modern tools; Li Ziqi uses Geng's artifacts to make the hottest pepper in the world; Xu Da Sao then eats it in one bite; and finally Handmade Geng uses one of his own inventions to relieve the discomfort the pepper causes Xu Da Sao.




Ranking by category




After the comprehensive ranking, all the uploaders are ranked separately within technology, life and entertainment. The top 10 of each category are as follows:
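
Ranking within each category is a group-then-sort step. As a minimal sketch, assuming each uploader record already carries a broad group and a comprehensive score (the field names `group` and `score` and the sample records are hypothetical), the top entries per group could be selected like this:

```python
from collections import defaultdict


def top_n_by_group(records, n=10):
    """Return the n highest-scoring records of every group."""
    groups = defaultdict(list)
    for record in records:
        groups[record['group']].append(record)
    return {
        group: sorted(members, key=lambda r: r['score'], reverse=True)[:n]
        for group, members in groups.items()
    }


# Made-up records
records = [
    {'name': 'U1', 'group': 'life', 'score': 0.9},
    {'name': 'U2', 'group': 'life', 'score': 0.4},
    {'name': 'U3', 'group': 'technology', 'score': 0.7},
    {'name': 'U4', 'group': 'life', 'score': 0.6},
    {'name': 'U5', 'group': 'entertainment', 'score': 0.5},
]

top2 = top_n_by_group(records, n=2)
for group, members in top2.items():
    print(group, [m['name'] for m in members])
```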





With the per-category rankings, you can pick what to watch according to your own preferences. I believe that after watching, your imagination will be much wilder. After a while, you can try publishing your own videos on Bilibili and become a well-known (amazing) uploader with a double-digit number of fans.


Finally, Handmade Geng's most-played video on Bilibili serves as the end of this article. The video reflects this article's theme of “amazing” uploaders. I hope you can try it in person. If you can still use your limbs to write down how it felt afterwards, you are welcome to share it with us.
