A Comprehensive Guide to Optical Character Recognition with Python


OCR, which stands for Optical Character Recognition, is a technology that converts images of text, such as scanned documents and photos, into machine-readable text that your application can search, edit, and store.

Here’s how it works: first, the scanner does its thing, seeing light areas as background and dark areas as text. Then comes the cleanup, fixing any alignment quirks and smoothing out imperfections. Next up is recognizing different scripts so it can handle languages like a champ.

When it comes to understanding text, OCR relies on two main techniques: pattern matching, where it compares characters in the scanned image to stored ones, and feature extraction, where it breaks characters down into lines, loops, and intersections.

After all that heavy lifting, it’s time for the postprocessing. This turns the text into a computer-friendly file. Some OCR systems can even create fancy PDFs with both the original and the cleaned-up versions of the scanned document.


The functioning of OCR software involves several key steps:

Image Acquisition

✅ Utilizing a scanner to read documents and convert them into binary data.
✅ Analyzing the scanned image, where light areas are identified as background and dark areas as text.
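The light-versus-dark split described above can be sketched as a simple threshold. This is a minimal illustration in plain Python, not how any particular OCR engine does it: the `binarize` function, the fixed threshold of 128, and the sample pixel values are all assumptions made for the example.

```python
def binarize(gray, threshold=128):
    """Map a grayscale grid (0-255) to 0 for background and 1 for text."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


# A tiny 3x4 "scan": bright paper with one dark vertical stroke
scan = [
    [250, 30, 245, 240],
    [248, 25, 251, 239],
    [252, 40, 249, 244],
]
binary = binarize(scan)
for row in binary:
    print(row)
```

Real scans usually have uneven lighting, so a fixed threshold is only a starting point; adaptive thresholds are a common refinement.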

Preprocessing

✅ Cleaning the image by applying techniques such as deskewing to fix alignment issues.
✅ Despeckling, which involves removing digital image spots and smoothing text image edges.
✅ Refining the appearance of boxes and lines in the image.
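Despeckling is often done with a median filter. The sketch below is a plain-Python illustration of that idea on a binary grid; the `despeckle` function and the sample image are hypothetical, and a median filter is only one of several possible despeckling techniques.

```python
from statistics import median


def despeckle(image):
    """Apply a 3x3 median filter to a binary grid; border pixels are left unchanged."""
    height, width = len(image), len(image[0])
    cleaned = [row[:] for row in image]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Collect the pixel and its eight neighbours
            window = [image[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            cleaned[y][x] = int(median(window))
    return cleaned


noisy = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],  # isolated speck
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
cleaned = despeckle(noisy)
```

A filter this aggressive can also erode very thin strokes, so the window size is normally tuned to the scan resolution.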

Script Recognition for Multi-Language OCR Technology

✅ Recognizing various scripts to enable multi-language OCR capabilities.

Text Recognition

✅ OCR software employs two primary algorithms for text recognition: pattern matching and feature extraction.
✅ Pattern matching involves comparing isolated character images (glyphs) with stored glyphs of similar font and scale.
✅ Feature extraction breaks down glyphs into lines, loops, and intersections, using these features to find the best match.
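Pattern matching can be illustrated with a toy example that compares a glyph against stored templates pixel by pixel. The 3x3 templates, the `match_score` and `recognize` helpers, and the scoring rule are assumptions made for this sketch; production engines use far richer representations.

```python
# Hypothetical 3x3 glyph templates, one string per pixel row
TEMPLATES = {
    "I": ["010", "010", "010"],
    "L": ["100", "100", "111"],
    "T": ["111", "010", "010"],
}


def match_score(glyph, template):
    """Count the pixels where glyph and template agree."""
    return sum(
        g == t
        for glyph_row, template_row in zip(glyph, template)
        for g, t in zip(glyph_row, template_row)
    )


def recognize(glyph):
    """Return the template character with the highest pixel agreement."""
    return max(TEMPLATES, key=lambda ch: match_score(glyph, TEMPLATES[ch]))


# An "L" with one noisy pixel flipped in the top-right corner
noisy_l = ["101", "100", "111"]
print(recognize(noisy_l))
```

Because the match is pixel-based, it only works when the glyph and the templates share font and scale, which is the limitation that feature extraction tries to overcome.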

Postprocessing

✅ Converting the extracted text data into a computerized file.
✅ Some OCR systems can generate annotated PDF files that include both the original and processed versions of the scanned document.

Types of OCR

Data scientists group OCR technologies according to their use and application. The main types are described below.

Simple OCR Software

☑️ Utilizes pattern-matching algorithms with stored font and text image templates.
☑️ Limited by the challenge of accommodating countless font and handwriting styles.

Intelligent Character Recognition (ICR) Software

☑️ Modern OCR systems employ ICR technology, utilizing machine learning and neural networks.
☑️ Analyzes text at multiple levels, considering attributes like curves, lines, intersections, and loops.

Intelligent Word Recognition

☑️ Operates on principles similar to ICR but processes entire word images instead of individual characters.

Optical Mark Recognition

☑️ Identifies marks and symbols in a document, such as filled checkboxes, logos, and watermarks.


Creation of Image Reading Modules

Building image reading modules is an essential stage, and it requires an understanding of the document's layout and patterns. In this step, labels are assigned to the various text elements, which lays the groundwork for parsing the text that Tesseract extracts. The module below parses the OCR output of an Indian PAN card.

import os
import re


def pan_read_data(text, id_type="PAN"):
    # id_type was undefined in the original listing; the "PAN" default is an assumption
    name = None
    fname = None
    pan = None

    # Strip every line and drop the empty ones
    text1 = [lin.strip() for lin in text.split('\n')]
    text1 = list(filter(None, text1))

    # Find the first line that matches the pattern stored in the "regex"
    # environment variable; the fields of interest come after it
    lineno = 0
    for idx, wordline in enumerate(text1):
        if re.search(os.getenv("regex"), wordline):
            lineno = idx
            break
    text0 = text1[lineno + 1:]

    try:
        # Cleaning first names: swap digits that OCR confuses with letters
        name = text0[4].strip()
        name = name.replace("8", "B")
        name = name.replace("0", "D")
        name = name.replace("6", "G")
        name = name.replace("1", "I")
        # Replace a non-letter character followed by spaces with a single space
        name = re.sub('[^a-zA-Z] +', ' ', name)

        # Cleaning Father's name
        fname = text0[6].strip()
        fname = fname.replace("8", "S")
        fname = fname.replace("0", "O")
        fname = fname.replace("6", "G")
        fname = fname.replace("1", "I")
        fname = fname.replace("\"", "A")
        fname = re.sub('[^a-zA-Z] +', ' ', fname)

        # Cleaning PAN Card details: the number sits on the line after the one
        # that matches the pattern stored in the "regex2" environment variable
        text0 = findword(text1, os.getenv("regex2"))
        pan = text0[0].strip()
        pan = pan.replace(" ", "")
        pan = pan.replace("\"", "")
        pan = pan.replace(";", "")
        pan = pan.replace("%", "L")
    except Exception:
        # Fields that could not be parsed stay as None
        pass

    data = {}
    data['Name'] = name
    data['Father Name'] = fname
    data['PAN'] = pan
    data['ID Type'] = id_type
    return data


def findword(textlist, wordstring):
    # Return the lines that follow the first line containing a word that
    # matches wordstring; return the list unchanged when nothing matches
    for lineno, wordline in enumerate(textlist):
        if any(re.search(wordstring, w) for w in wordline.split()):
            return textlist[lineno + 1:]
    return textlist
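The chains of `replace` calls used for name cleanup can also be written as a single translation table. The sketch below is one possible way to do it; the `DIGIT_TO_LETTER` mapping and the `clean_name` helper are hypothetical names, and the mapping itself is an assumption that should be tuned against real OCR output.

```python
# Hypothetical mapping of digits that OCR commonly confuses with letters
DIGIT_TO_LETTER = str.maketrans({"8": "S", "0": "O", "6": "G", "1": "I"})


def clean_name(raw):
    """Strip surrounding whitespace and swap look-alike digits for letters."""
    return raw.strip().translate(DIGIT_TO_LETTER)


print(clean_name("  MOHAN S1N6H "))
```

Keeping the mapping in one place makes it easier to see, and adjust, which substitutions are applied to each field.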

Libraries Required

🔸 Pytesseract: A Python wrapper for the Tesseract OCR engine, used to extract text from images.
🔸 cv2: The Python module for OpenCV, used for image processing.
🔸 ftfy: Fixes text for you.
🔸 NumPy: Fundamental package for array computing.
🔸 os: Provides functions for interacting with the operating system.
🔸 re: Used to work with regular expressions.
🔸 PIL: The Python Imaging Library (installed through the Pillow package), used to open and manipulate images.

1. Image Extraction

Pre-processing for photo extraction involves three steps: defining the initial variables, converting the image to grayscale for better readability, and defining the function that extracts the photo from the image file.
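To make the grayscale step concrete, here is a plain-Python sketch of the standard luminance formula. The `to_grayscale` function and the sample pixels are assumptions for illustration; in practice this step would typically be a single OpenCV call, `cv2.cvtColor` with `cv2.COLOR_BGR2GRAY`, which applies the same weights.

```python
def to_grayscale(rgb_image):
    """Convert rows of (R, G, B) tuples to rounded luminance values (0-255)."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row]
        for row in rgb_image
    ]


# Four sample pixels: white, black, pure red, pure blue
pixels = [
    [(255, 255, 255), (0, 0, 0)],
    [(255, 0, 0), (0, 0, 255)],
]
gray = to_grayscale(pixels)
for row in gray:
    print(row)
```

Green carries the largest weight because the eye is most sensitive to it, which is why pure red and pure blue come out fairly dark.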

2. Extracting Text with Tesseract and Refining the Text

Text extraction in Python is handled by Python-tesseract (pytesseract), an OCR tool that wraps the Tesseract engine and recognizes the text embedded in images. When run as a script, Python-tesseract prints the recognized text instead of writing it to a file. The snippet below runs OCR on an uploaded image, repairs the text with ftfy, and passes it to pan_read_data.

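import ftfy
import pytesseract
from PIL import Image

# request.FILES is presumably a Django request holding the uploaded image
filename = request.FILES.get('image')
text = pytesseract.image_to_string(Image.open(filename), lang='eng')

# Save the raw OCR output, then read it back for cleanup
with open('output.txt', 'w', encoding='utf-8') as text_output:
    text_output.write(text)
with open('output.txt', 'r', encoding='utf-8') as file:
    text = file.read()

# Repair encoding glitches before parsing the fields
text = ftfy.fix_text(text)
text = ftfy.fix_encoding(text)
data = pan_read_data(text)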
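Since the parsed PAN value can be missing or misread, it may be worth validating it before use. The sketch below assumes the standard PAN layout of five letters, four digits, and one letter; `PAN_PATTERN` and `is_valid_pan` are hypothetical names and are not part of the original module.

```python
import re

# Assumed PAN layout: five letters, four digits, one letter (e.g. ABCDE1234F)
PAN_PATTERN = re.compile(r"[A-Z]{5}[0-9]{4}[A-Z]")


def is_valid_pan(value):
    """Return True when value matches the assumed PAN layout exactly."""
    return bool(value) and PAN_PATTERN.fullmatch(value) is not None


print(is_valid_pan("ABCDE1234F"))
print(is_valid_pan("ABCDE12345"))
```

A failed check is a useful signal to retry OCR with different pre-processing or to route the document for manual review.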

Results

The pipeline returns the extracted details (name, father's name, PAN, and ID type) as a dictionary. Adapting the modules to other documents is straightforward, since only the parsing rules change; the text recognition itself is always handled by Tesseract OCR.

Conclusion

In conclusion, Optical Character Recognition (OCR) with Python is a powerful way to turn scanned documents and images into usable data. The OCR process involves acquiring images, preprocessing them to enhance readability, recognizing various scripts, employing text recognition algorithms, and postprocessing to convert the extracted text into a digital format.

The creation of image reading modules, facilitated by Tesseract OCR, plays a crucial role in extracting specific text elements. The modules are adaptable, providing flexibility in understanding document design and patterns.

Key libraries like Pytesseract, cv2, ftfy, NumPy, os, re, and PIL are instrumental in the image extraction process. The blog guides developers through image extraction, text extraction with Tesseract, and refining the text using Python scripts. The results showcase successful image information generation.

Shubham G

Software Engineer

Shubham is a full-stack developer with 3.5+ years of experience in technologies like ReactJS, Redux, Python, Django, and UI frameworks. His expertise lies in building interactive and responsive web applications and in writing efficient, optimized, DRY code. He enjoys learning about new technologies.
