Installing tesserocr on Windows 10
First, download Tesseract, which provides the underlying OCR engine for tesserocr. The Windows installers are listed at https://github.com/UB-Mannheim/tesseract/wiki ; choose the one that matches your system. A stable build without "dev" in its name is a good choice, for example tesseract-ocr-setup-3.05.02-20180621.exe. Run the installer and accept the defaults, but remember to check "Additional language data (download)" and tick the languages you may need, such as Simplified Chinese, Traditional Chinese, and the math module. You do not need to select all of them.
Downloading the tessdata files takes a while, so be patient. If you are able to use a proxy, the download will be much faster.
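To confirm that the installer finished correctly, a quick check from Python can help. The sketch below assumes the Tesseract executable is named "tesseract" and is reachable through PATH; the installer may not add it to PATH for you, in which case the function returns None even though Tesseract is installed. The function name tesseract_version is hypothetical, chosen for this example.

```python
import shutil
import subprocess


def tesseract_version(executable="tesseract"):
    """Return the first line printed by `<executable> --version`, or None if it is not on PATH."""
    path = shutil.which(executable)
    if path is None:
        return None
    # Some Tesseract versions print the version to stderr, so merge both streams.
    result = subprocess.run(
        [path, "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    lines = result.stdout.strip().splitlines()
    return lines[0] if lines else None


print(tesseract_version() or "tesseract not found on PATH")
```

If this prints "tesseract not found on PATH", adding the installation directory to the PATH environment variable and opening a new terminal is usually enough.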
Install the tesserocr library for Python
Install from a prebuilt WHL file. Download the tesserocr WHL file from https://github.com/simonflueckiger/tesserocr-windows_build/releases , choosing the one that matches your local environment. For example, my system is 64-bit Windows with Python 3.5. After downloading, cd to the directory that contains the WHL file and run "pip install tesserocr-2.2.2-cp35-cp35m-win_amd64.whl". The installation is straightforward.
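Picking the right WHL file comes down to matching two parts of its name: the Python tag (cp35 in the example above) and the platform tag (win_amd64). The sketch below prints both for the interpreter that runs it. It assumes CPython on Windows; the helper name wheel_tags is hypothetical, and the ABI part of the file name (cp35m) is not derived here.

```python
import struct
import sys


def wheel_tags():
    """Return (python_tag, platform_tag) for the running CPython interpreter on Windows."""
    python_tag = "cp{}{}".format(sys.version_info.major, sys.version_info.minor)
    # Pointer size tells us whether this interpreter is a 32-bit or 64-bit build.
    bits = struct.calcsize("P") * 8
    platform_tag = "win_amd64" if bits == 64 else "win32"
    return python_tag, platform_tag


python_tag, platform_tag = wheel_tags()
print("Look for a file name containing:", python_tag, "and", platform_tag)
```

Note that the platform tag follows the interpreter, not the operating system: a 32-bit Python on 64-bit Windows still needs a win32 wheel.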
If tesserocr fails with a UnicodeDecodeError, the pytesseract module can be used instead. Install it with:
pip install pytesseract
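As a rough illustration of the pytesseract route, the sketch below reads an image and returns the recognized text. It assumes pytesseract and Pillow are installed and that Tesseract itself is already set up; the function name ocr_with_pytesseract and its tesseract_cmd parameter are hypothetical names for this example. The imports sit inside the function so that merely defining it does not fail on a machine where the libraries are missing.

```python
def ocr_with_pytesseract(image_path, tesseract_cmd=None):
    """Return the text pytesseract recognizes in the image at image_path."""
    # Imported lazily so the definition itself works without the libraries installed.
    import pytesseract
    from PIL import Image

    if tesseract_cmd:
        # Point pytesseract at tesseract.exe when it is not on PATH.
        pytesseract.pytesseract.tesseract_cmd = tesseract_cmd
    with Image.open(image_path) as image:
        return pytesseract.image_to_string(image)
```

A call might look like ocr_with_pytesseract('image.jpg'), or with tesseract_cmd set to the full path of tesseract.exe when the executable is not on PATH.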
The first run rarely goes smoothly. I expect most people will hit the same pitfall I did. The error usually looks like this:
Traceback (most recent call last):
  File "G:\pythonSources\my12306/obtain_message\test.py", line 4, in <module>
    print(tesserocr.image_to_text(image))
  File "tesserocr.pyx", line 2400, in tesserocr._tesserocr.image_to_text
RuntimeError: Failed to init API, possibly an invalid tessdata path: "a local path"
A simple, crude fix is to copy the tessdata folder from the Tesseract-OCR installation directory to the path shown in the error message. I tested this myself and it works.
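The copy can be done by hand in Explorer, but it can also be scripted. The following is a minimal sketch, not part of tesserocr: copy_tessdata is a hypothetical helper, and both directory arguments are placeholders you would replace with your own Tesseract-OCR tessdata folder and the location named in the error message.

```python
import os
import shutil


def copy_tessdata(src_dir, dst_dir):
    """Recursively copy all files from src_dir into dst_dir and return how many were copied."""
    copied = 0
    for root, _dirs, files in os.walk(src_dir):
        # Recreate the same sub-folder layout (e.g. configs) under dst_dir.
        rel = os.path.relpath(root, src_dir)
        target = dst_dir if rel == "." else os.path.join(dst_dir, rel)
        os.makedirs(target, exist_ok=True)
        for name in files:
            shutil.copy2(os.path.join(root, name), os.path.join(target, name))
            copied += 1
    return copied


# Example call with placeholder paths (assumptions, adjust to your machine):
# copy_tessdata(r"<Tesseract-OCR install dir>\tessdata", r"<path from the error>\tessdata")
```

Existing files with the same name in the destination are overwritten, so running it twice is harmless.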
import tesserocr
from PIL import Image

image = Image.open('image.jpg')
print(tesserocr.image_to_text(image))