Extract text with OCR for all image types in Python using pytesseract

What is OCR?

Optical Character Recognition (OCR) is the process of electronically extracting text from images or from documents such as PDFs, so that the text can be reused in a variety of ways, for example in full-text search.

In this post, we will see how to use Python-tesseract (pytesseract), an OCR tool for Python.
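To make the full-text search use case concrete, here is a minimal sketch of a word index built over text that has already been extracted. The file names and the sample text are hypothetical stand-ins for OCR output, and build_index and search are illustrative helpers, not part of any OCR library.

```python
from collections import defaultdict
import re


def build_index(pages):
    """Map each lowercase word to the set of page names that contain it."""
    index = defaultdict(set)
    for name, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(name)
    return index


def search(index, query):
    """Return the pages that contain every word of the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical OCR output, keyed by the image it came from
pages = {
    "invoice.png": "Invoice number 1042 total due",
    "letter.png": "Dear customer, your invoice is attached",
}
index = build_index(pages)
print(sorted(search(index, "invoice")))
print(sorted(search(index, "invoice total")))
```

A real search setup would more likely hand the extracted text to a dedicated search engine; the point here is only that OCR output is ordinary text once it has been extracted.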

pytesseract:

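pytesseract is a Python wrapper around the Tesseract OCR engine. It recognizes and reads the text embedded in images, and it can read the image types supported by the Python Imaging Library, including png, jpeg, gif, tiff and bmp. It is widely used to process scanned documents.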

Installation:

$ sudo pip install pytesseract

Requirements:

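* Python 2.5 or later.
* The Python Imaging Library (PIL).
* The Tesseract OCR engine itself, installed separately so that the tesseract command is available. pytesseract only wraps it.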
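The requirements can be checked from Python before running any OCR. The sketch below assumes Python 3 and a hypothetical helper name, check_ocr_requirements. It also looks for the tesseract executable, because pytesseract calls the Tesseract engine rather than doing the recognition itself.

```python
import importlib.util
import shutil


def check_ocr_requirements():
    """Report which pieces of the OCR stack can be found on this machine."""
    return {
        # The Python wrapper installed with pip
        "pytesseract": importlib.util.find_spec("pytesseract") is not None,
        # PIL (or its fork Pillow) is importable under the name "PIL"
        "PIL": importlib.util.find_spec("PIL") is not None,
        # The OCR engine itself, looked up on the PATH
        "tesseract binary": shutil.which("tesseract") is not None,
    }


for name, found in check_ocr_requirements().items():
    print("%-18s %s" % (name, "found" if found else "missing"))
```

Anything reported as missing needs to be installed before the usage examples will run.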

Usage:

From the shell:

$ ./pytesseract.py test.png 

The above command prints the text recognized in the image 'test.png'.

$ ./pytesseract.py -l eng test-english.jpg

The above command recognizes English text, using the -l option to select the language.

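In a Python script:

# Pillow, the maintained fork of PIL, provides the Image module
from PIL import Image
from pytesseract import image_to_string

# Print the text recognized in the image
print(image_to_string(Image.open('test.png')))
# Same call, with the language set explicitly to English
print(image_to_string(Image.open('test-english.jpg'), lang='eng'))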
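The same call extends naturally to a whole folder of scans. The following is a minimal sketch under some assumptions: Python 3, and the helper names find_images, default_ocr and ocr_folder, which are hypothetical and not part of pytesseract. The extension list mirrors the image types mentioned earlier. pytesseract is imported lazily, so the file-listing helper can be used and tested without it.

```python
import os

# Extensions for the image types listed earlier in this post
IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg", ".gif", ".tif", ".tiff", ".bmp")


def find_images(folder):
    """Return sorted paths of files in `folder` that have an image extension."""
    return sorted(
        os.path.join(folder, name)
        for name in os.listdir(folder)
        if name.lower().endswith(IMAGE_EXTENSIONS)
    )


def default_ocr(path, lang=None):
    """Run pytesseract on one file; the imports happen only when this is called."""
    from PIL import Image
    from pytesseract import image_to_string

    return image_to_string(Image.open(path), lang=lang)


def ocr_folder(folder, lang=None, ocr=default_ocr):
    """Map each image path in `folder` to its recognized text."""
    return {path: ocr(path, lang=lang) for path in find_images(folder)}
```

With pytesseract and Tesseract installed, ocr_folder('scans', lang='eng') would return a dictionary of recognized text keyed by file path, where 'scans' is a placeholder folder name. The ocr parameter exists so that a different recognizer, or a stub in tests, can be swapped in.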


Posted on 10 July 2015 by Micropyramid
