Optical Character Recognition (OCR) MasterClass in Python

Rating: 4.3 (202 reviews)

Learn OCR in Python using OpenCV, Pytesseract, Pillow and Machine Learning

Created by Raj Chhabria

Last updated 1/2023

English

English [Auto]

What you'll learn

Learn the Pillow library in Python for working with image data and performing image manipulation steps.
Use OpenCV for image preprocessing in Python.
Learn Pytesseract, an Optical Character Recognition (OCR) tool for Python that reads and recognizes text in images, license plates, and similar inputs.
Use Machine Learning for different OCR use cases and build ML models that perform OCR with over 90% accuracy (around 91.64% on the handwritten-digit example).
Build OCR projects such as License Plate Detection and reading text from images.

Course content

5 sections • 22 lectures • 2h 2m total length

Introduction to the Course • 8:32
Explore the fundamentals of optical character recognition in Python, its history, and its practical uses; get an overview of OpenCV, Tesseract, preprocessing, and the license plate detection project.
Install the required libraries • 3:33
Learn how to install the essential OCR libraries for Python (Pillow, opencv-python, and pytesseract) and add Tesseract to your system PATH.
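
After installation, a quick check along these lines can confirm the setup. This is a minimal sketch using only the standard library; the helper name check_ocr_environment is hypothetical, and it assumes the Tesseract executable is called tesseract on your PATH.

```python
import importlib.util
import shutil

# Import name -> pip package name for the three libraries the lecture lists.
REQUIRED_MODULES = {
    "PIL": "Pillow",
    "cv2": "opencv-python",
    "pytesseract": "pytesseract",
}


def check_ocr_environment():
    """Return a dict mapping each prerequisite to True (found) or False (missing)."""
    report = {}
    for module_name, pip_name in REQUIRED_MODULES.items():
        # find_spec returns None when a top-level module cannot be imported.
        report[pip_name] = importlib.util.find_spec(module_name) is not None
    # pytesseract is only a wrapper: the tesseract executable itself must be on PATH.
    report["tesseract binary"] = shutil.which("tesseract") is not None
    return report


environment_report = check_ocr_environment()
for name, found in environment_report.items():
    print(f"{name}: {'found' if found else 'missing'}")
```

Anything reported as missing can then be installed with pip (for the Python packages) or your system's package manager (for Tesseract itself).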

Opening and Viewing an image • 2:50
Open and view images with the Pillow library in Python: load test image.jpg from the same folder with Image.open, then inspect the image object's mode (RGB) and size.
Obtaining information about opened image • 2:27
Learn to inspect an opened image with PIL, retrieving the file name, format, mode, and size (width and height) from the image object.
Rotate and Resize • 3:47
Learn to rotate and resize images in Python using Pillow, rotating by 90 and 180 degrees and resizing to 400x250 and 200x200.
Crop an image using pillow • 2:56
Learn to crop images in Python with Pillow using the Image.crop method. Open an image, inspect its size, set left, top, right, and bottom coordinates, and produce a cropped result.
Add text on an Image using pillow • 2:40
Learn to add text to an image with Pillow in Python by importing Image and ImageDraw, creating a draw object, and calling draw.text at specified coordinates with a color.
Add Padding to image with pillow • 3:06
Learn to add padding to an image in Python using Pillow by setting left, right, top, and bottom values and pasting the image onto a larger padded canvas.
Blur an image using pillow • 4:03
Blur images in Python using Pillow with box blur, Gaussian blur, and simple blur. Adjust blur intensity through the radius or a convolution kernel, using Image.filter and ImageFilter.
Concatenate images using Pillow • 6:15
Concatenate two images in Pillow using Python by stacking them horizontally or vertically, resizing when needed, and pasting them into a single combined image for OCR workflows.
Save an Image • 2:55
Learn to save images with Pillow's save method after rotating or other processing, writing intermediate and final results as JPG files in the same directory.
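
The Pillow operations covered in these lectures can be sketched roughly as follows. This is an illustrative example, not the course's own code: it builds a blank 400x250 image instead of opening the lecture's test image, and it saves to a temporary directory rather than the working folder so that it runs anywhere.

```python
import os
import tempfile

from PIL import Image, ImageDraw, ImageFilter

# Synthetic stand-in for the lecture's test image (assumption: any RGB image works).
img = Image.new("RGB", (400, 250), "white")
print(img.mode, img.size)  # mode and (width, height)

# Rotate and resize. expand=True grows the canvas so nothing is cut off.
rotated = img.rotate(90, expand=True)
resized = img.resize((200, 200))

# Crop takes a (left, top, right, bottom) box.
cropped = img.crop((50, 25, 350, 225))

# Draw text at an (x, y) position with a fill color, using Pillow's default font.
annotated = img.copy()
ImageDraw.Draw(annotated).text((10, 10), "Hello OCR", fill=(255, 0, 0))

# Padding: paste the image onto a larger canvas at an offset.
left, top, right, bottom = 20, 10, 20, 10
padded = Image.new(
    "RGB", (img.width + left + right, img.height + top + bottom), "black"
)
padded.paste(img, (left, top))

# Blurs: the radius controls how strong the effect is.
box_blurred = annotated.filter(ImageFilter.BoxBlur(2))
gaussian_blurred = annotated.filter(ImageFilter.GaussianBlur(radius=2))
simple_blurred = annotated.filter(ImageFilter.BLUR)

# Horizontal concatenation: a canvas wide enough for both images.
side_by_side = Image.new(
    "RGB", (img.width + resized.width, max(img.height, resized.height)), "white"
)
side_by_side.paste(img, (0, 0))
side_by_side.paste(resized, (img.width, 0))

# Save as JPG and reopen to confirm the file was written.
with tempfile.TemporaryDirectory() as tmp_dir:
    out_path = os.path.join(tmp_dir, "rotated.jpg")
    rotated.save(out_path)
    with Image.open(out_path) as reopened:
        saved_format, saved_size = reopened.format, reopened.size
```

Vertical concatenation follows the same pattern, summing the heights and taking the larger width for the canvas.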

Opening an Image with OpenCV • 4:57
Open an image with OpenCV in Python using cv2.imread, inspect the pixel data as a NumPy array, and display the image with matplotlib for OCR preprocessing.
Invert an Image • 4:19
Learn how to invert an image using cv2.bitwise_not, turning black text on a white background into white text on a black background.
Binarization • 5:14
Learn to binarize a color image to black and white in OpenCV by converting to grayscale and applying thresholding, then save and display the binary image.
Erosion and Dilation • 6:42
Master erosion and dilation as preprocessing steps for optical character recognition in Python, including image inversion, thinning fonts via erosion, and thickening fonts via dilation.

Image to Text • 2:35
Learn to extract and display text from images in Python using pytesseract and Pillow: load an image, apply image_to_string, store the result in a text variable, and print it.
Getting Boxes Around Text • 6:29
Detect important words in images and draw bounding boxes around them using cv2 and Python, using image_to_data with dict output to segment text on invoices.
Text Template Matching • 7:38
Learn text template matching in an OCR workflow using Python, building a date regex to locate and draw bounding boxes around dates in images.
License Plate Detection • 14:31
Detect license plates in images with OpenCV and imutils: convert to grayscale, reduce noise, detect edges, find the top contours, crop the plate, and perform OCR with image_to_string in English.

Introduction to OCR using Machine Learning • 6:48
Explore how machine learning enables OCR by training models on labeled data, focusing on supervised learning to classify characters and digits, with practical examples of OCR tasks.
KNN Machine Learning Algorithm • 10:22
Learn the k-nearest neighbours algorithm, a non-parametric, lazy supervised learner for classification (and regression) that uses Euclidean distance to classify new data, in a Python Jupyter notebook.
OCR using Machine Learning Code Implementation • 9:34
Train a handwritten-digit OCR model in Python using OpenCV and k-nearest neighbours by preprocessing an image, splitting it into 20×20 blocks, training on digits 0–9, and reaching around 91.64% accuracy.
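
To make the idea concrete, here is a small NumPy-only sketch of the same approach: split a sheet of 20×20 blocks into flat feature vectors, then classify with k-nearest neighbours using Euclidean distance. It is not the course's implementation, which uses OpenCV's KNN on a real digit image; the two-class synthetic sheet, the helper names, and k=3 are all assumptions made for illustration.

```python
import numpy as np


def split_into_blocks(sheet, size=20):
    """Cut a 2-D sheet into size x size blocks, each flattened to one row."""
    rows, cols = sheet.shape[0] // size, sheet.shape[1] // size
    blocks = sheet[: rows * size, : cols * size].reshape(rows, size, cols, size)
    return blocks.swapaxes(1, 2).reshape(rows * cols, size * size)


def knn_predict(train_x, train_y, query_x, k=3):
    """Label each query row by majority vote among its k nearest training rows."""
    # Pairwise Euclidean distances, shape (n_query, n_train).
    diffs = query_x[:, None, :] - train_x[None, :, :]
    distances = np.sqrt((diffs ** 2).sum(axis=2))
    nearest = np.argsort(distances, axis=1)[:, :k]
    votes = train_y[nearest]
    return np.array([np.bincount(row).argmax() for row in votes])


# Synthetic sheet: one row of 30 blocks per class, plus Gaussian noise.
rng = np.random.default_rng(0)
pattern_a = np.zeros((20, 20))
pattern_a[:, :10] = 1.0  # class 0: left half bright
pattern_b = np.zeros((20, 20))
pattern_b[:10, :] = 1.0  # class 1: top half bright
sheet = np.vstack([np.tile(pattern_a, (1, 30)), np.tile(pattern_b, (1, 30))])
sheet = sheet + rng.normal(0.0, 0.2, sheet.shape)

# 60 samples of 400 features; first 20 per class train, last 10 per class test.
samples = split_into_blocks(sheet).reshape(2, 30, 400)
train_x = samples[:, :20].reshape(-1, 400)
test_x = samples[:, 20:].reshape(-1, 400)
train_y = np.repeat([0, 1], 20)
test_y = np.repeat([0, 1], 10)

predictions = knn_predict(train_x, train_y, test_x, k=3)
accuracy = float((predictions == test_y).mean())
print(f"accuracy: {accuracy:.2%}")
```

KNN stores the training data and defers all computation to prediction time, which is why it is described as a lazy learner.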

Requirements

Basic understanding of Python Programming Language.

Description

Welcome to the course "Optical Character Recognition (OCR) MasterClass in Python"

Optical character recognition (OCR) technology is a business solution for automating data extraction from printed or written text in a scanned document or image file. It converts the text into a machine-readable form that can then be used for data processing such as editing or searching.

BENEFITS OF OCR:

Reduce costs
Accelerate workflows
Automate document routing and content processing
Centralize and secure data (no fires, break-ins or documents lost in the back vaults)
Improve service by ensuring employees have the most up-to-date and accurate information

Some Key Learning Outcomes of this course are:

Recognize text in images using OpenCV and Pytesseract.
Learn to work with image data and manipulate it using the Pillow library in Python.
Build projects such as License Plate Detection and extracting dates and other important information from images, using the concepts discussed in this course.
Learn how Machine Learning can be useful in certain OCR problems.
Cover the fundamentals of Machine Learning required for getting accurate OCR results.
Build Machine Learning models with text recognition accuracy above 90%.
Learn image preprocessing techniques such as grayscaling, binarization, erosion, and dilation, which help improve image quality for better OCR results.
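
As an illustration of the date-extraction project listed above, the sketch below filters OCR words with a regular expression and collects their bounding boxes. The dictionary mirrors the keys returned by pytesseract.image_to_data with output_type=pytesseract.Output.DICT, but the sample values are made up, the helper name find_date_boxes is hypothetical, and the date pattern (day/month/year style with / or - separators) is an assumption rather than the course's regex.

```python
import re

# Assumed date shape such as 12/01/2023 or 1-2-23; adjust to the documents at hand.
DATE_PATTERN = re.compile(r"^\d{1,2}[/-]\d{1,2}[/-]\d{2,4}$")


def find_date_boxes(ocr_data, min_conf=0):
    """Return (left, top, width, height) for every OCR word that looks like a date."""
    boxes = []
    for i, word in enumerate(ocr_data["text"]):
        # Tesseract reports conf as -1 for entries that are not recognized words.
        if float(ocr_data["conf"][i]) < min_conf:
            continue
        if DATE_PATTERN.match(word.strip()):
            boxes.append(
                (
                    ocr_data["left"][i],
                    ocr_data["top"][i],
                    ocr_data["width"][i],
                    ocr_data["height"][i],
                )
            )
    return boxes


# Made-up sample in the shape of image_to_data's dict output (subset of keys).
sample_ocr_data = {
    "text": ["", "Invoice", "Date:", "12/01/2023", "Total", "01-02-23"],
    "conf": ["-1", "96", "95", "91", "94", "88"],
    "left": [0, 10, 10, 70, 10, 70],
    "top": [0, 10, 40, 40, 70, 70],
    "width": [300, 80, 50, 95, 45, 80],
    "height": [100, 20, 20, 20, 20, 20],
}

date_boxes = find_date_boxes(sample_ocr_data)
print(date_boxes)
```

Each returned box can then be drawn on the image, for example with cv2.rectangle, to highlight the matched dates.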

Who this course is for:

Python developers who are curious about Optical Character Recognition (OCR).
People from a Data Science or Machine Learning background who want to add OCR as a new skill on their resume.
Anyone who wants to learn about OCR.

Course content

Introduction • 2 lectures • 12min

Python Pillow (PIL Fork) • 9 lectures • 31min

Preprocess Images for Text OCR using OpenCV • 4 lectures • 21min

Pytesseract • 4 lectures • 31min

OCR using Machine Learning • 3 lectures • 27min