Crawling the product data of the Amoy data platform with Python shows how hot the wig market is

Time:2021-7-27

Preface

The text and pictures in this article come from the internet and are for learning and communication only, with no commercial purpose. If you have any concerns, please contact us promptly so we can handle them.

Recently, I found a good data website called "Amoy data" (taosj.com). It provides merchant data from Taobao, including store name, category, list price, average transaction price, sales volume, and sales amount.

A classmate told me about this website, so let's start crawling.

Project objectives

Crawl the industry data for wigs on Taobao. I picked wigs casually at the time; viewing any other category requires a paid account.

Maybe it’s fate. You know what programmers need

Target URL

https://www.taosj.com/industry/index.html#/data/hotitems/?cid=50023283&brand=&type=&pcid=

 

Environment

Python 3.6

PyCharm

Crawler code

Import required tools

import requests
import csv

 

Analyze the web page: open the developer tools (F12), copy a piece of the data you need, and search for it to find where that data is located.

Find the required URL and parameters under the Headers tab of that request

page = 1  # page number to request
# pageNo={} is the placeholder that .format(page) fills in
url = 'https://www.taosj.com/data/industry/hotitems/list?cid=50023283&brand=&type=ALL&date=1596211200000&pageNo={}&pageSize=10&orderType=desc&orderField='.format(page)

headers = {
    'Host': 'www.taosj.com',
    'Referer': 'https://www.taosj.com/industry/index.html',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36',
}

response = requests.get(url=url, headers=headers)
html_data = response.json()  # the endpoint returns JSON, not HTML
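The query string in the URL above is easier to read and change when it is kept as a dictionary. Here is a minimal sketch that rebuilds the same URL with `urllib.parse.urlencode` and decodes the `date` value, which looks like a Unix timestamp in milliseconds. The helper names `build_url` and `decode_date`, and the reading of `date` as a time in UTC+8, are my assumptions; the site does not document them.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

BASE = 'https://www.taosj.com/data/industry/hotitems/list'
BEIJING = timezone(timedelta(hours=8))  # assumption: the site works in UTC+8


def build_url(page, cid='50023283', date_ms=1596211200000):
    """Hypothetical helper: assemble the list URL for one page."""
    params = {
        'cid': cid,
        'brand': '',
        'type': 'ALL',
        'date': date_ms,
        'pageNo': page,
        'pageSize': 10,
        'orderType': 'desc',
        'orderField': '',
    }
    return BASE + '?' + urlencode(params)


def decode_date(date_ms):
    """Interpret the date parameter as milliseconds since the Unix epoch."""
    return datetime.fromtimestamp(date_ms / 1000, tz=BEIJING)


print(build_url(1))
print(decode_date(1596211200000).isoformat())
```

Under that assumption the `date` value corresponds to midnight on 1 August 2020, so changing the month would mean computing a new timestamp rather than editing the number by hand.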

 

Extract the relevant data from the JSON response

lis = html_data['data']['list']
for li in lis:
    # Build the product page link from the item id
    tb_url = 'https://detail.tmall.com/item.htm?id={}'.format(li['id'])
    # JSON key names are as given in this article; check them against the actual response
    dit = {
        'title': li['title'],
        'brand': li['brand'],
        'store name': li['shop'],
        'category': li['nextcatname'],
        'list price': li['oriprice'],
        'average transaction price': li['price'],
        'sales volume': li['offer30'],
        'sales amount': li['price30'],
        'Taobao address': tb_url,
    }
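To try the extraction step without hitting the site, the same mapping can be run on a hand-made response. This is only a sketch: the sample payload is invented for illustration (a real response will carry more fields and different values), and `extract_row` is a hypothetical helper. It uses `dict.get` so that a missing field becomes an empty string instead of raising a `KeyError`.

```python
def extract_row(li):
    """Hypothetical helper: map one item of data['list'] to a CSV row."""
    return {
        'title': li.get('title', ''),
        'brand': li.get('brand', ''),
        'store name': li.get('shop', ''),
        'category': li.get('nextcatname', ''),
        'list price': li.get('oriprice', ''),
        'average transaction price': li.get('price', ''),
        'sales volume': li.get('offer30', ''),
        'sales amount': li.get('price30', ''),
        'Taobao address': 'https://detail.tmall.com/item.htm?id={}'.format(li.get('id', '')),
    }


# Invented sample data, shaped like the response described above
sample = {'data': {'list': [
    {'id': 123, 'title': 'sample wig', 'shop': 'sample shop', 'price': 99.0},
]}}

rows = [extract_row(li) for li in sample['data']['list']]
print(rows[0])
```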

 

Save data

f = open('panning data.csv', mode='a', encoding='utf-8-sig', newline='')
# The dictionary keys written later must match these field names exactly
csv_writer = csv.DictWriter(f, fieldnames=['title', 'brand', 'store name', 'category', 'list price', 'average transaction price', 'sales volume', 'sales amount', 'Taobao address'])
csv_writer.writeheader()
print(dit)  # in the complete code, csv_writer.writerow(dit) is called inside the loop
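One thing worth knowing about `csv.DictWriter`: a row may leave fields out (they are written as empty cells), but a key that is not in `fieldnames` raises a `ValueError`. The sketch below shows both cases, writing to an in-memory `io.StringIO` buffer instead of a real file; the sample row is made up.

```python
import csv
import io

FIELDNAMES = ['title', 'brand', 'store name', 'category', 'list price',
              'average transaction price', 'sales volume', 'sales amount',
              'Taobao address']

buffer = io.StringIO()  # stands in for the real CSV file
writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES)
writer.writeheader()

# Missing fields are allowed and are written as empty cells
writer.writerow({'title': 'sample wig', 'store name': 'sample shop'})

# A key that is not listed in FIELDNAMES is rejected
rejected = False
try:
    writer.writerow({'shop name': 'wrong key'})
except ValueError as err:
    rejected = True
    print('rejected:', err)

print(buffer.getvalue())
```

This is why the keys of `dit` have to be spelled exactly like the entries in `fieldnames`, including capitalization and spaces.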

 

Results

[Two screenshots of the crawl results appeared here in the original post.]

Complete code

import requests
import csv

# Append to the CSV file; utf-8-sig helps Excel detect the encoding
with open('panning data.csv', mode='a', encoding='utf-8-sig', newline='') as f:
    csv_writer = csv.DictWriter(f, fieldnames=['title', 'brand', 'store name', 'category', 'list price', 'average transaction price', 'sales volume', 'sales amount', 'Taobao address'])
    csv_writer.writeheader()

    # Copy the request headers from the developer tools and remember to add your own Cookie
    headers = {
        'Host': 'www.taosj.com',
        'Referer': 'https://www.taosj.com/industry/index.html',
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36',
    }

    for page in range(1, 51):
        url = 'https://www.taosj.com/data/industry/hotitems/list?cid=50023282&brand=&type=ALL&date=1596211200000&pageNo={}&pageSize=10&orderType=desc&orderField=amount&searchKey='.format(page)

        response = requests.get(url=url, headers=headers)
        html_data = response.json()

        lis = html_data['data']['list']
        for li in lis:
            tb_url = 'https://detail.tmall.com/item.htm?id={}'.format(li['id'])
            # JSON key names are as given in this article; check them against the actual response
            dit = {
                'title': li['title'],
                'brand': li['brand'],
                'store name': li['shop'],
                'category': li['nextcatname'],
                'list price': li['oriprice'],
                'average transaction price': li['price'],
                'sales volume': li['offer30'],
                'sales amount': li['price30'],
                'Taobao address': tb_url,
            }
            csv_writer.writerow(dit)
            print(dit)
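The complete code always requests 50 pages. As a small variation (my own suggestion, not part of the original script), the loop could stop as soon as a page comes back empty and pause between requests. In the sketch below, `crawl` and `fake_fetch` are hypothetical names; `fake_fetch` stands in for the real request so the loop can be tried offline.

```python
import time


def crawl(fetch_page, max_pages=50, delay=0.0):
    """Collect items page by page, stopping at the first empty page."""
    rows = []
    for page in range(1, max_pages + 1):
        items = fetch_page(page)
        if not items:
            break  # nothing left to crawl
        rows.extend(items)
        time.sleep(delay)  # pause between requests
    return rows


def fake_fetch(page):
    """Stand-in for the real request: pretends only three pages exist."""
    return [{'id': page}] if page <= 3 else []


result = crawl(fake_fetch)
print(len(result))
```

With the real site, `fetch_page` would wrap the `requests.get(...)` call and return `html_data['data']['list']`.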