Computer Vision - OCR using Python
What you'll learn
- A quick starter on OCR Architecture, Commercial Solutions and Use Cases in Industry
- Learn to implement OCR - Text Detection with OpenCV and Deep Learning Models
- Use Tesseract and EasyOCR to implement OCR - Text Recognition
- Work with OCR - Text Labelling using spaCy and Regular Expressions
- Use OpenCV and Tesseract to apply Noise Removal Techniques including Thresholding, Rescaling, Dilation, Erosion and Deskewing
- Learn to develop web-based applications - Business Card Recognition and KYC Digitization for OCR using Flask
- Build OCR Solutions for Invoice Processing with Text Labelling and XML output & Vehicle Number Plate Recognition
- Executable Code of CTPN and EAST Model implementation for Text Detection and Text Recognition
- Learn to train Deep Learning Models of CTPN and EAST on ICDAR dataset
- Understand the Image Basics and apply it for Image Processing
- Basic Programming skills in Python
Become an expert in extracting text from images and scanned documents with the help of Computer Vision, OpenCV and Deep Learning concepts, and develop yourself into a Computer Vision - Optical Character Recognition (OCR) Specialist.
Top 3 reasons why this course, Computer Vision: OCR using Python, stands out among other courses:
- Inclusion of 5 in-demand Computer Vision projects, each explained through a detailed code walkthrough, with code that works seamlessly
- Dedicated In-Course Support, provided within 24 hours for any issues faced
- Comprehensive Coverage, inclusive of theory and practical implementation of 2 Deep Learning-based Text Detection models (CTPN and EAST)
In this course, we cover the complete life cycle of OCR, which starts with Text Detection in an image using OpenCV and Deep Learning models, moves on to Text Recognition with Tesseract, and finishes with Text Labelling through spaCy and Regular Expressions. Enroll in this course and become specialized in Computer Vision - OCR. Here is a summary of the key topics we will be learning and the projects that we will design in the course:
In this course we help students perform all the basic operations and install the required packages and software before they begin their learning journey. This includes installing the right Python package, installing PyCharm together with suggestions for setting it up to run the first project, installing the basic packages, and tips and tricks for issues commonly faced. After that, we also provide a complete guide to installing Tesseract on both Windows and Ubuntu.
Text Detection Techniques for OCR
Researchers approach Text Detection with several techniques. The simplest is to let the Text Recognition tool perform Text Detection as well, although that relies heavily on a single piece of software; EasyOCR, which we cover in the course, is an example. Moving on to more advanced techniques, Text Detection can be done with one of the most popular Deep Learning approaches, the Connectionist Text Proposal Network (CTPN), which accurately localizes text lines in natural images using convolutional feature maps from VGG16 connected by a recurrent neural network. The course describes this in detail and provides a pre-trained text detection model along with a detailed explanation of how to train it on your own dataset.
Our next Text Detection model is EAST (Efficient and Accurate Scene Text Detector), a Deep Learning-based algorithm that detects text with a single fully-convolutional neural network. EAST makes use of ResNet-50 as its base model, and along with a pre-trained model, the course provides a detailed explanation of how to train it on your own dataset.
For students who face restrictions in running code or installing PyCharm locally on their machines, we also provide a Google Colab version for executing the Text Detection code for both CTPN and EAST.
In this course we also cover the complete architecture of OCR in detail, so that students understand not only the concept but also how to design various OCR-based solutions for industrial use.
Image Processing Concepts
In Image Processing, we cover Pixels, Kernels and Feature Maps along with OpenCV to deepen the understanding of OCR concepts. Thereafter, we explain the pre-processing techniques used to optimize Text Detection, which include Binarization, Thresholding and Rescaling, as well as Noise Removal Techniques like Morphology, Dilation, Erosion, Blurring, Orientation, De-skewing, Borders & Perspective Transformation.
While explaining image concepts, we also cover how to perform image segmentation, which gives better control and more refined processing over small sections of an image.
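To make the Thresholding, Dilation and Erosion steps above concrete, here is a minimal sketch in plain Python that treats an image as a nested list of grayscale values. It is an illustration only and not the course code: the function names, the cutoff of 128 and the 3x3 neighbourhood are assumptions chosen for the example. In practice these operations are usually done with OpenCV calls such as cv2.threshold, cv2.dilate and cv2.erode.

```python
def threshold(gray, cutoff=128):
    """Binarize a grayscale image: pixels >= cutoff become 255, the rest 0."""
    return [[255 if px >= cutoff else 0 for px in row] for row in gray]


def _morph(binary, pick):
    """Replace each pixel with pick() of its 3x3 neighbourhood (clipped at the borders)."""
    h, w = len(binary), len(binary[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            neighbours = [
                binary[ny][nx]
                for ny in range(max(0, y - 1), min(h, y + 2))
                for nx in range(max(0, x - 1), min(w, x + 2))
            ]
            row.append(pick(neighbours))
        out.append(row)
    return out


def dilate(binary):
    """Grow white regions: a pixel turns white if any neighbour is white."""
    return _morph(binary, max)


def erode(binary):
    """Shrink white regions: a pixel stays white only if all neighbours are white."""
    return _morph(binary, min)


# A 5x5 toy image with one bright pixel in the centre (hypothetical data).
gray = [
    [10, 10, 10, 10, 10],
    [10, 10, 10, 10, 10],
    [10, 10, 200, 10, 10],
    [10, 10, 10, 10, 10],
    [10, 10, 10, 10, 10],
]

binary = threshold(gray)
grown = dilate(binary)          # the single white pixel grows into a 3x3 block
closed = erode(grown)           # dilation then erosion ("closing") restores the original shape
opened = dilate(erode(binary))  # erosion then dilation ("opening") removes the isolated pixel
```

Opening is the variant typically used for removing small specks of noise, while closing fills small gaps inside characters; which one helps depends on the scan.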
Text Recognition for OCR
For text recognition we make use of the Tesseract software and the pytesseract package, and we then update our existing CTPN and EAST code to perform text recognition along with text detection. This merges the two steps into one so they can be used in an optimized manner. In this course, we also explain in detail the various options available in pytesseract for converting the text in images into text format.
For this section as well, we provide both local and Google Colab versions of the CTPN and EAST code, for execution in PyCharm and Colab.
Named Entity Recognition (NER)
In this section we cover the next step in OCR processing after Text Recognition, which is Text Labelling, known in Machine Learning terms as NER. We cover two methods for performing text labelling in the course.
The first is the Regular Expression (RegEx) technique, in which we identify patterns in the text and use those patterns to label the extracted text. The second technique uses the pre-trained Deep Learning models provided by spaCy. These models have already learned to recognize common entity types, and by making use of them we are able to categorize our text into different classes like Date, Name, Country, etc. and then use that as the output of OCR.
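The RegEx technique described above can be sketched as follows. This is a minimal illustration using Python's standard re module; the label names, the patterns and the sample text are hypothetical and would need tuning for real documents.

```python
import re

# Hypothetical patterns for illustration; real documents need patterns tuned to their layout.
PATTERNS = {
    "DATE": re.compile(r"\b\d{1,2}[/-]\d{1,2}[/-]\d{2,4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "AMOUNT": re.compile(r"(?:USD|\$)\s?\d+(?:,\d{3})*(?:\.\d{2})?"),
}


def label_text(text):
    """Return (label, matched_text) pairs in order of appearance in the text."""
    found = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            found.append((match.start(), label, match.group()))
    return [(label, value) for _, label, value in sorted(found)]


# Sample text standing in for the output of the Text Recognition step.
ocr_text = "Invoice date: 12/03/2021 Total: $1,250.00 Contact: billing@example.com"
labels = label_text(ocr_text)
print(labels)
# [('DATE', '12/03/2021'), ('AMOUNT', '$1,250.00'), ('EMAIL', 'billing@example.com')]
```

Patterns work well for fields with a fixed shape such as dates and amounts; free-form fields such as person or country names are where a pre-trained spaCy model is the better fit.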
Training Deep Learning Model – CTPN and EAST
In the Model Training section of this course, we explain in detail the steps involved in using your own data for more refined Text Detection, with the help of Transfer Learning for both the CTPN and EAST models. In this section we make use of Google Colab to train both models on the SROIE dataset (a text detection dataset of scanned receipts that is available for training).
At the end of this course, we have chosen 5 in-demand Computer Vision projects, and we explain each of them first through a Project Overview session and then through a detailed Code Walkthrough. The projects are:
- Number Plate Recognition - In this project we identify the number plate in the input image, use Tesseract to recognize the text, and then store the identified number plates in a CSV file with the help of a Pandas DataFrame.
- Invoice Processing with Text Labelling - In this project we take scanned invoice documents as input, convert them into text using OCR techniques, and then perform text labelling using a spaCy model and regular expressions.
- Invoice Processing with XML Output - In this project we take scanned invoice documents as input, convert them into text using Tesseract, and then store the data in a structured XML format, so that the code can be integrated with other applications.
- Business Card Recognition - This project is an independent Flask-based web application in which users upload a business card via the web browser, and the result is displayed on the web page itself with the detected text and its recognized values.
- KYC Digitization - KYC Digitization is another Flask-based web application, in which users upload identification documents. This project makes use of the CTPN Deep Learning technique to identify the text present in the input image, and the output image with the detected text is displayed along with the recognized text for each block. The recognized text on the right-hand side is editable, so the user can choose which values to keep and can also update label names live on the web page itself. Once all values are filtered and labelled, you can download them in Excel format to your local machine.
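As an illustration of the XML output step in the invoice project above, here is a minimal sketch using Python's standard xml.etree.ElementTree module. It is not the course code: the function name, the root tag and the field names are hypothetical, and the dictionary stands in for text that has already been recognized and labelled.

```python
import xml.etree.ElementTree as ET


def fields_to_xml(fields, root_tag="invoice"):
    """Build an XML string from a dict mapping label -> recognized text."""
    root = ET.Element(root_tag)
    for label, value in fields.items():
        child = ET.SubElement(root, label)
        child.text = value  # special characters are escaped when serialized
    return ET.tostring(root, encoding="unicode")


# Hypothetical field names and values, standing in for labelled OCR output.
fields = {"number": "INV-001", "date": "12/03/2021", "total": "1250.00"}
xml_text = fields_to_xml(fields)
print(xml_text)
# <invoice><number>INV-001</number><date>12/03/2021</date><total>1250.00</total></invoice>
```

Building the tree with ElementTree rather than concatenating strings keeps the output well-formed even when the recognized text contains characters such as & or <.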
Who this course is for:
- Beginners to Computer Vision
- OCR Engineer
- OCR Specialist
- Machine Learning Professionals
- Anyone looking to become more employable as a Computer Vision Expert
Machine Learning and Deep Learning Architect with 18+ years of IT experience in developing algorithms and machine learning solutions across Finance, Healthcare, Retail and Travel domains. Most recently, I have worked as Technical Architect for Deep Learning and now I have started my own startup in the field of Artificial Intelligence.