To summarize: the program itself runs without problems.
It crawls the body of 30 websites and saves the result as a JSON file, test.json. A later crawl is saved to a new file.
A separate Python file, check.py, compares the hash values of the two files. If they differ, it sends an email to my mailbox.
check.py contains an os.system('scrapy XXXX') call.
Here is the question. I want to run it on a schedule. On Windows, for example, I set up a scheduled task that runs it every 5 minutes, and that works fine.
The same setup has a problem on my VPS, which runs CentOS 6.
For example, if I run the script by its path from some other directory:
python Documents/check_web/check.py
Scrapy 1.1.1 - no active project

Unknown command: crawl

Use "scrapy" to see available commands
0s 10s
Traceback (most recent call last):
  File "Documents/check_web/check.py", line 35, in <module>
    f1 = open("./test.json", "rb")
IOError: [Errno 2] No such file or directory: './test.json'
check.py is roughly as follows:
def getJson():
    os.system('scrapy crawl check_web_sprider')
    time.sleep(10)

def getHash(f):
    line = f.readline()
    hash = hashlib.md5()
    while (line):
        hash.update(line)
        line = f.readline()
    return hash.hexdigest()

def IsHashEqual(f1, f2):
    str1 = getHash(f1)
    str2 = getHash(f2)
    return str1 == str2

if __name__ == '__main__':
    f1 = open("./test.json", "rb")
    f2 = open("./test1.json", "rb")
    if (IsHashEqual(f1, f2) is False):
        def _format_addr(s):
            name, addr = parseaddr(s)
            return formataddr(( \
                Header(name, 'utf-8').encode(), \
                addr.encode('utf-8') if isinstance(addr, unicode) else addr))
My question is why. When I run it by path like this, it prints:
Scrapy 1.1.1 - no active project
Unknown command: crawl
But if I go to the directory that contains check.py and run it from there, there is no problem, and the crawler works as well.
Sorry that this is not written very clearly. I hope someone can understand it and knows how to solve it.
check.py is in the Scrapy project directory.
The crawl command has to be run inside the project directory. Also, when the script is run from another path, ./test.json is resolved against the current working directory, so it is saved there and looked up there, not next to check.py.
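To make the working-directory point concrete, here is a small sketch showing that the same relative path string refers to different files depending on where the process is running. The `resolve` helper and the temporary directory are made up for illustration and are not part of check.py.

```python
import os
import tempfile


def resolve(relative_path):
    """Return the absolute path that open(relative_path) would use right now."""
    return os.path.abspath(relative_path)


start = os.getcwd()
# Hypothetical stand-in for "some other directory" the script is launched from.
other = os.path.realpath(tempfile.mkdtemp())

first = resolve("./test.json")   # resolved against the starting directory
os.chdir(other)
second = resolve("./test.json")  # same string, now resolved against `other`
os.chdir(start)                  # restore the original working directory

print(first)
print(second)
```

The two printed paths differ even though the string "./test.json" never changed, which is why the open() call fails when check.py is launched from elsewhere.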
First, get the absolute path of the script's directory:
app_path = os.path.dirname(os.path.realpath(__file__))
To open a file:
f1 = open(os.path.join(app_path, "test.json"), "rb")
For running scrapy, I suggest using subprocess instead of os.system. Try adding cwd=app_path to specify the working directory.
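A minimal sketch of that suggestion, assuming check.py sits in the Scrapy project directory and the spider is named check_web_sprider as in the question. The helper names `run_in` and `get_json` are made up here for illustration.

```python
import os
import subprocess


def run_in(cmd, workdir):
    """Run cmd (a list of arguments) with workdir as its working directory.

    subprocess.call waits for the command to finish and returns its exit code.
    """
    return subprocess.call(cmd, cwd=workdir)


def get_json(app_path, spider="check_web_sprider"):
    """Run `scrapy crawl <spider>` from inside app_path (assumed project dir)."""
    return run_in(["scrapy", "crawl", spider], app_path)


# Intended use inside check.py (not executed here, since it needs scrapy):
#   app_path = os.path.dirname(os.path.realpath(__file__))
#   get_json(app_path)
#   f1 = open(os.path.join(app_path, "test.json"), "rb")
```

Because the child process starts in app_path, scrapy should find the project there and write relative output paths there, regardless of where check.py was launched from.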
Running check.py from inside the project directory generates test.json and test1.json in the same directory as check.py, and then they can be compared. I did not actually know why. I'll try the absolute path first.