[Crawler + visualization] Crawling epidemic data with Python and displaying it visually

Time:2022-5-22

Knowledge points

  1. Basic process of a crawler
  2. json
  3. requests: sending network requests from the crawler
  4. pandas: table processing / saving data
  5. pyecharts: visualization

Development environment

  • Python 3.8, a relatively stable version, installed via the Anaconda distribution; Jupyter Notebook is well suited to writing data analysis code
  • PyCharm, a professional code editor, whose releases are versioned by year and month

 


Complete crawler code

Import modules

import requests  # module for sending network requests
import json
import pprint  # module for formatted output
import pandas as pd  # a very important module in data analysis

 

Analyze the website

First, find the target data to crawl

https://news.qq.com/zt2020/page/feiyan.htm#/


Find the URL where the data is located

Send request

url = 'https://view.inews.qq.com/g2/getOnsInfo?name=disease_h5&_=1638361138568'
response = requests.get(url, verify=False)

 

Get data

json_data = response.json()['data']

 

Parse data

json_data = json.loads(json_data)
china_data = json_data['areaTree'][0]['children']  # list of per-region records
data_set = []
for i in china_data:
    data_dict = {}
    # Region name
    data_dict['province'] = i['name']
    # Currently confirmed cases
    data_dict['nowConfirm'] = i['total']['nowConfirm']
    # Death toll
    data_dict['dead'] = i['total']['dead']
    # Number cured
    data_dict['heal'] = i['total']['heal']
    # Mortality rate
    data_dict['deadRate'] = i['total']['deadRate']
    # Cure rate
    data_dict['healRate'] = i['total']['healRate']
    data_set.append(data_dict)
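
The two-step decoding above (response.json() followed by json.loads) is needed because, as this article uses the interface, the 'data' field is itself a JSON-encoded string rather than a nested object. A minimal sketch of that pattern, using a made-up payload: the "ret" field, the region name, and all numbers are hypothetical and only mimic the shape the code above relies on.

```python
import json

# Hypothetical payload: the outer JSON has a "data" field whose value
# is a JSON *string*, not a nested object. All values are illustrative.
inner = {
    "areaTree": [
        {
            "name": "China",
            "children": [
                {"name": "ExampleProvince",
                 "total": {"nowConfirm": 12, "dead": 1, "heal": 30}},
            ],
        }
    ]
}
raw_body = json.dumps({"ret": 0, "data": json.dumps(inner)})

outer = json.loads(raw_body)            # first decode: the outer envelope
assert isinstance(outer["data"], str)   # the payload is still a string here
decoded = json.loads(outer["data"])     # second decode: the actual data

provinces = decoded["areaTree"][0]["children"]
rows = [{"province": p["name"], **p["total"]} for p in provinces]
print(rows)
```

If the second json.loads were skipped, indexing with ['areaTree'] would fail, because a str cannot be indexed by a string key.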

 

Save data

df = pd.DataFrame(data_set)
df.to_csv('data.csv')
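
Called with defaults as above, to_csv also writes the DataFrame's row index as an unnamed first column. A small sketch of the difference, written to in-memory buffers so no file is created; the rows are illustrative, not real figures.

```python
import io
import pandas as pd

# Illustrative rows only; not real epidemic figures.
sample = pd.DataFrame([
    {"province": "A", "nowConfirm": 5, "dead": 0},
    {"province": "B", "nowConfirm": 9, "dead": 1},
])

with_index = io.StringIO()
sample.to_csv(with_index)                  # default: index is written as the first column

without_index = io.StringIO()
sample.to_csv(without_index, index=False)  # omit the index column

header_with = with_index.getvalue().splitlines()[0]
header_without = without_index.getvalue().splitlines()[0]
print(header_with)
print(header_without)
```

When saving to a real file that will be opened in Excel, one common choice is df.to_csv('data.csv', index=False, encoding='utf-8-sig'), since the byte order mark helps Excel recognize UTF-8 text such as Chinese province names.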

 

Data visualization

Import module

from pyecharts import options as opts
from pyecharts.charts import Bar,Line,Pie,Map,Grid

 

Read data

df2 = df.sort_values(by=['nowConfirm'],ascending=False)[:9]
df2
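
The line above sorts by currently confirmed cases in descending order and keeps the first nine rows. A sketch of the same idea on a tiny made-up table (names and numbers are illustrative), alongside DataFrame.nlargest, which is an alternative way to get the top rows and is not what the article itself uses:

```python
import pandas as pd

# Illustrative numbers only.
demo = pd.DataFrame({
    "province": ["A", "B", "C", "D"],
    "nowConfirm": [3, 10, 7, 1],
})

# Same pattern as the article: sort descending, then slice the first rows.
top2_sorted = demo.sort_values(by=["nowConfirm"], ascending=False)[:2]

# Alternative: ask pandas directly for the rows with the largest values.
top2_nlargest = demo.nlargest(2, "nowConfirm")

print(top2_sorted["province"].tolist())
print(top2_nlargest["province"].tolist())
```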

 

Mortality and cure rate

line = (
    Line()
    .add_xaxis(list(df['province'].values))
    .add_yaxis("Cure rate", df['healRate'].values.tolist())
    .add_yaxis("Mortality rate", df['deadRate'].values.tolist())
    .set_global_opts(
        title_opts=opts.TitleOpts(title="Mortality and cure rate"),
    )
)
line.render_notebook()

 

Number of confirmed cases and deaths by Region

bar = (
    Bar()
    .add_xaxis(list(df['province'].values)[:6])
    .add_yaxis("Deaths", df['dead'].values.tolist()[:6])
    .add_yaxis("Cured", df['heal'].values.tolist()[:6])
    .set_global_opts(
        title_opts=opts.TitleOpts(title="Number of confirmed cases and deaths by region"),
        datazoom_opts=[opts.DataZoomOpts()],
    )
)
bar.render_notebook()

 
