Examples of implementing OCR (Optical Character Recognition) with Tesseract in Python
- Install tesseract-ocr using the command for your platform:
  - On Ubuntu
    sudo apt-get install tesseract-ocr
  - On Mac
    brew install tesseract
  - On Windows, download and run the Tesseract installer
- Install the Python binding for Tesseract, pytesseract, using this pip command:
  pip install pytesseract
- Install the Python image processing library, Pillow, using this pip command:
  pip install pillow
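
Once these three pieces are installed, a quick check and a first OCR call might look like the sketch below. The function names `check_ocr_setup` and `ocr_image` are hypothetical helpers introduced here for illustration, and the check assumes the `tesseract` executable is on your PATH (on Windows the installer may not add it automatically).

```python
import importlib.util
import shutil


def check_ocr_setup():
    """Report which parts of the OCR toolchain can be found on this machine."""
    return {
        # The tesseract command-line program that pytesseract calls.
        "tesseract": shutil.which("tesseract") is not None,
        # The Python wrapper installed with `pip install pytesseract`.
        "pytesseract": importlib.util.find_spec("pytesseract") is not None,
        # Pillow is imported under the package name PIL.
        "pillow": importlib.util.find_spec("PIL") is not None,
    }


def ocr_image(image_path, lang="eng"):
    """Return the text Tesseract recognizes in a single image file."""
    # Imported inside the function so that check_ocr_setup() can still run
    # when these packages have not been installed yet.
    import pytesseract
    from PIL import Image

    with Image.open(image_path) as img:
        return pytesseract.image_to_string(img, lang=lang)


if __name__ == "__main__":
    for name, found in check_ocr_setup().items():
        print(f"{name}: {'found' if found else 'missing'}")
```

If every entry is reported as found, `ocr_image("scan.png")` (with `scan.png` standing in for any image of your own) should return the recognized text as a string.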
For working with PDF files:
- Install ImageMagick using this command:
  - On Ubuntu
    sudo apt-get install imagemagick
  - For other platforms, download the installer for your platform
- Install the Python binding for ImageMagick, Wand, using this pip command:
  pip install wand
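
With these packages in place, one possible way to OCR a PDF is to render each page to an image with Wand and pass it to pytesseract. This is a minimal sketch, not a definitive implementation: `pdf_to_text` is a hypothetical helper, the 300 DPI default is an assumption, and ImageMagick typically also needs Ghostscript installed before it can read PDF files.

```python
import io


def pdf_to_text(pdf_path, resolution=300, lang="eng"):
    """Render each page of a PDF to PNG and return the OCR text per page."""
    # Imported inside the function so the module loads even when the
    # optional PDF dependencies are not installed.
    import pytesseract
    from PIL import Image as PilImage
    from wand.image import Image as WandImage

    page_texts = []
    # resolution is the rendering density in DPI; higher values usually
    # help recognition but make rendering slower.
    with WandImage(filename=pdf_path, resolution=resolution) as pdf:
        for page in pdf.sequence:
            # Copy the page into its own image so it can be exported as PNG.
            with WandImage(image=page) as page_image:
                png_bytes = page_image.make_blob("png")
            # Hand the rendered page to Pillow, then to Tesseract.
            with PilImage.open(io.BytesIO(png_bytes)) as pil_page:
                page_texts.append(pytesseract.image_to_string(pil_page, lang=lang))
    return page_texts
```

Calling `pdf_to_text("document.pdf")` (with a file of your own) returns a list with one string per page, which can be joined or written out as needed.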