Extract text with OCR for all image types in python using pytesseract

What is OCR?

Optical Character Recognition(OCR) is the process of electronically extracting text from images or any documents like PDF and reusing it in a variety of ways such as full text searches.

In this blog, we will see, how to use 'Python-tesseract', an OCR tool for python.

pytesseract:

It will recognize and read the text present in images. It can read all image types - png, jpeg, gif, tiff, bmp etc. It’s widely used to process everything from scanned documents.

Installation:

$ sudo pip install pytesseract

Requirements:

* Requires python 2.5 or later versions.
* And requires Python Imaging Library(PIL).

Usage:

From the shell:

$ ./pytesseract.py test.png 

Above command prints the recognized text from image 'test.png'.

$ ./pytesseract.py -l eng test-english.jpg

Above command recognizes english text.

In Python Script:

import Image
from tesseract import image_to_string

print image_to_string(Image.open('test.png'))
print image_to_string(Image.open('test-english.jpg'), lang='eng')
    By Posted On
SENIOR DEVELOPER at MICROPYRAMID

Need any Help in your Project?Let's Talk

Latest Comments
Related Articles
How to customize the admin actions in list pages of Django admin? Chaitanya Kattineni

Django by default provides automatic admin interface, that reads metadata from your models to provide a beautiful model interface where trusted users can manage content ...

Continue Reading...
What's great about Django girls to inspire women into programming Vinisha Naladala

Django girls is a non-profit organization, that helps women to learn Django programming language and to inspire them into programming. They are organizing workshops all ...

Continue Reading...
Django Raw Sql Queries Vamsi Popuri

When your model query API don't go well or you want more performance, you can use raw sql queries in django. The Django Object Relational ...

Continue Reading...

Subscribe To our news letter

Free news and articles on Django, Python, Machine Learning, Amazon Web Services, DevOps, Salesforce, ReactJS, AngularJS, React Native.
*We don't provide your email contact details to any third parties