A Golang-based movie site with a built-in crawler

Time: 2021-9-22

Go Movies

A movie site implemented with golang + redis (a basic crawler). There is no admin backend. Demo site: https://go-movies.hzz.cool/ (supports access and playback on mobile devices).

A built-in automatic crawler basically covers everyday movie-watching needs.

GitHub address

https://github.com/hezhizheng/go-movies

Home page preview


Installation and usage

# Download
git clone https://github.com/hezhizheng/go-movies

# Enter the directory
cd go-movies

# Generate the configuration file (redis db10 is used by default; edit the settings in app.go as needed)
cp ./config/app.go.backup ./config/app.go

# Start (the first start automatically launches the crawler task)
go run main.go
# or, with the bee tool installed
bee run

# If installing the dependencies fails, use a proxy
export GOPROXY=https://goproxy.io,direct
# or
export GOPROXY=https://goproxy.cn,direct

Then visit
http://127.0.0.1:8899

Start the crawler

  • Visit http://127.0.0.1:8899/movies-spider directly (this only starts the scheduled task, which then crawls periodically)
    • A scheduled crawler is built in and starts at 1 a.m. by default (the cron.timing_spider expression in the configuration file can be changed)
  • Resource usage: about 10% CPU and about 40 MB of memory on Windows
  • With a normal network connection, a full crawl takes about 21 minutes (some resources fail to crawl)

Tools

Note

# To modify the static files in /static or the templates in /views/hero, install the corresponding package first and then run the matching compile command below. For more usage, see each project's official README.md

# https://github.com/rakyll/statik
statik -src=xxxPath/go_movies/static -f 

# https://github.com/shiyanhui/hero
hero -source="./views/hero"

Compile an executable (cross-platform)

# Usage reference: https://github.com/mitchellh/gox
# The generated file can be run directly on Linux
gox -osarch="linux/amd64"
......
  • Download compiled files for win64 and linux64 (please compile them yourself)

Make sure redis is running. db10 is used by default. After a successful startup the crawler runs automatically; you can also start it yourself by visiting http://127.0.0.1:8899/movies-spider
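
Starting the crawler by hand amounts to an HTTP GET against the /movies-spider path. The following is a minimal sketch using only the Go standard library. `spiderURL` and `triggerSpider` are hypothetical helpers, the base URL is the default local address from this article, and since the endpoint's response is not documented here, only the HTTP status code is reported.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"strings"
	"time"
)

// spiderURL appends the /movies-spider path to the site's base URL.
// Hypothetical helper, not part of go-movies.
func spiderURL(base string) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	u.Path = strings.TrimRight(u.Path, "/") + "/movies-spider"
	return u.String(), nil
}

// triggerSpider sends a GET request to the crawler endpoint and returns the
// HTTP status code of the response.
func triggerSpider(base string) (int, error) {
	target, err := spiderURL(base)
	if err != nil {
		return 0, err
	}
	client := &http.Client{Timeout: 10 * time.Second}
	resp, err := client.Get(target)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, nil
}

func main() {
	status, err := triggerSpider("http://127.0.0.1:8899")
	if err != nil {
		fmt.Println("request failed (is go-movies running?):", err)
		return
	}
	fmt.Println("crawler endpoint answered with HTTP status", status)
}
```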

Micro cloud (recommended) + proxyee-down (the original go version is already under development…)


Docker deployment (skip this step if you use docker compose)

# Pull the redis image (skip if you already have it)
sudo docker pull redis:latest

# Start the redis container
# Map ports as needed: -p host_port:container_port
sudo docker run -itd --name redis-test -p 6379:6379 redis

# Change the redis connection address in app.go to the container name
"addr":"redis-test"

# Compile go-movies
gox -osarch="linux/amd64"

# Build the image
sudo docker build -t go-movies-docker-scratch .

# Start the container
sudo docker run --link redis-test:redis -p 8899:8899 -d go-movies-docker-scratch

One-command start with docker compose

# Change the redis connection address in app.go to the container name, which must match the one in docker-compose.yml
"addr":"redis-test"

# Compile go-movies
gox -osarch="linux/amd64"

# Run
sudo docker-compose up -d

Open a browser and visit http://127.0.0.1:8899 to see the site.

The directory structure follows beego's conventions.

TODO

Other

There are many Go fundamentals I don't understand yet; I will study them gradually when I have the energy. The code is rough, so please bear with me.

This work is licensed under a CC license. Reprints must credit the author and link to this article.

hezhizheng