*** NEW PREVIEW VIDEOS: All lectures in Section 2 and Section 3 enabled for preview! Check it out!
NEW: Project 4 Network Configuration Parser added to the course***
Python Regular Expressions is a hands-on course that teaches you everything you need to know about regular expressions using the Python language. Master cutting-edge pattern matching, data extraction, and data cleanup skills in this course.
Regular expressions are a powerful text processing tool for log mining, data parsing, cleanup, and preparation. The power and elegance of regular expressions allow you to do complex data extraction and cleanup with very few lines of code.
Over 60% of the effort in big data projects is spent on data cleanup and preparation. Data can come from a variety of sources, including internal databases, log files, sensor-generated data, Twitter, Facebook, and so forth. Having access to a powerful regular expression tool will open up a lot of opportunities in how you look at your data and what you can do with it.
This course contains more than 25 hands-on exercises, practical tips, quizzes, and four projects to apply the new skills you learn in this class. In the first project, we will extract useful information from unstructured text data produced by the Robocopy tool; in the second project, we will work on a large data set generated by sensors; and in the third project, we will look into health care systems that deal with electronic medical records. ***NEW*** A fourth project has been added on parsing network interface configuration.
These exercises will demonstrate that with regular expressions you can implement complex parsing with only a few lines of code.
As a bonus, you will receive a Python interactive tool with source code for learning regular expressions faster.
This course uses the free Anaconda Python distribution for development and exercises. It is an all-video course with quizzes, full source code, and a downloadable list of the data and patterns used.
Where can you use regular expressions? Class structure overview and teaching methodology
**UPDATE: Please Read Lecture 3 article for how to run interactive tool on newer Anaconda distributions with PyQt5**
Setup development environment from scratch:
Anaconda Python Environment, Course Notebook files and Learn Regex GUI Tool setup, Data Setup and Overview of Python Regular Expression Quick Reference Guide
Regular Expression Terminology used in this course
Coding tips to correctly handle patterns using raw strings
Shows different ways in which you can find a match for a pattern in text
How to get all matches for a pattern, and different ways of retrieving the matches
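As a rough illustration of the matching functions these lectures refer to, the sketch below uses the standard `re` module. The sample text and pattern are made up for this example and are not taken from the course materials.

```python
import re

# Hypothetical sample text, not from the course data
text = "Order 66 shipped on day 12, order 7 on day 30"
pattern = r"\d+"  # one or more digits

# re.match succeeds only if the pattern matches at the start of the text
at_start = re.match(pattern, text)

# re.search scans forward and returns the first match found anywhere
first = re.search(pattern, text)

# re.findall returns every non-overlapping match as a list of strings
all_numbers = re.findall(pattern, text)

# re.finditer yields match objects, which also carry the match position
positions = [(m.group(), m.start()) for m in re.finditer(pattern, text)]

print(at_start)       # None, because the text starts with a letter
print(first.group())  # 66
print(all_numbers)    # ['66', '12', '7', '30']
print(positions)
```

`findall` is convenient when only the matched text is needed; `finditer` is the usual choice when the position or the individual groups of each match matter.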
How to break a pattern into sub-patterns using groups. How are groups beneficial?
Shows different ways of doing find-and-replace using patterns and custom replacement logic
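The following sketch shows one way groups and find-and-replace can work together. The date format, the group names, and the helper `next_year` are assumptions chosen for illustration only.

```python
import re

# Hypothetical sample text with ISO-style dates
text = "Released 2021-03-15, patched 2021-04-02"

# Named groups split the pattern into sub-patterns
date_pattern = r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})"

# Each group can be read back individually from the match object
m = re.search(date_pattern, text)
parts = (m.group("year"), m.group("month"), m.group("day"))

# Replacement template that refers to the groups by name
reordered = re.sub(date_pattern, r"\g<day>/\g<month>/\g<year>", text)

# Custom replacement logic: re.sub calls this function once per match
def next_year(match):
    year = int(match.group("year")) + 1
    return f"{year}-{match.group('month')}-{match.group('day')}"

bumped = re.sub(date_pattern, next_year, text)

print(parts)      # ('2021', '03', '15')
print(reordered)  # Released 15/03/2021, patched 02/04/2021
print(bumped)     # Released 2022-03-15, patched 2022-04-02
```

A template string is enough when the replacement only rearranges captured text; a function is needed when the replacement has to be computed.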
Shows the differences between a Python string and a raw string, and why raw strings are better for patterns
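A small sketch of why raw strings matter for patterns, using the word-boundary token `\b` as the example. The sample text is made up.

```python
import re

# In a normal string literal, "\b" is the backspace character (one character)
normal = "\bcat\b"

# In a raw string the backslash is kept, so the regex engine receives \b,
# which it interprets as a word boundary
raw = r"\bcat\b"

text = "the cat sat"

print(len(normal), len(raw))         # 5 7
print(re.search(normal, text))       # None: the text has no backspace characters
print(re.search(raw, text).group())  # cat
```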
Let's start with the simplest building block: single-character patterns and how to use set-based matching
How to find matches using NOT operation
How to specify a range of characters using the shortcut available in the regular expression language
How to specify different ranges of characters, use of wildcard, escape special characters and use of control characters
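The single-character building blocks above can be sketched as follows. The sample text is invented for this example.

```python
import re

# Hypothetical sample text
text = "gray grey gr3y gr.y"

# Set: exactly one character out of those listed
set_match = re.findall(r"gr[ae]y", text)      # ['gray', 'grey']

# NOT operation: ^ at the start of a set negates it
not_letter = re.findall(r"gr[^a-z]y", text)   # ['gr3y', 'gr.y']

# Range: 0-9 is shorthand for listing every digit
digit = re.findall(r"gr[0-9]y", text)         # ['gr3y']

# Wildcard: an unescaped dot matches any character except a newline
any_char = re.findall(r"gr.y", text)          # all four words

# Escaping: \. matches a literal dot only
literal_dot = re.findall(r"gr\.y", text)      # ['gr.y']

print(set_match, not_letter, digit, any_char, literal_dot)
```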
How to provide instructions to the regular expression engine to fine-tune the criteria for matching
How to write concise patterns using ready-made shortcuts known as character classes
How to specify repetition, upper and lower bounds, optional expressions
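A brief sketch combining a character class shortcut with quantifiers. The phone-number-like text is a made-up example, not course data.

```python
import re

# Hypothetical sample text
text = "Call 555-1234 or 5551234, ext 42"

# \d is the shortcut for a digit; {3} and {4} are exact repetition counts;
# -? makes the hyphen optional
phones = re.findall(r"\d{3}-?\d{4}", text)

# {1,2} sets a lower and an upper bound; \b keeps the match to whole numbers
short_numbers = re.findall(r"\b\d{1,2}\b", text)

print(phones)         # ['555-1234', '5551234']
print(short_numbers)  # ['42']
```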
How to insert comments inside a regular expression pattern
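One common way to comment a pattern in Python is the `re.VERBOSE` flag, sketched below. The time format and group names are assumptions for illustration.

```python
import re

# With re.VERBOSE, whitespace in the pattern is ignored and
# everything after an unescaped # is treated as a comment
time_pattern = re.compile(
    r"""
    (?P<hour>\d{2})    # two-digit hour
    :                  # literal separator
    (?P<minute>\d{2})  # two-digit minute
    """,
    re.VERBOSE,
)

m = time_pattern.search("Backup started at 09:45 today")
print(m.group("hour"), m.group("minute"))  # 09 45

# Inline alternative that works without any flag: (?#...) is a comment group
inline = re.search(r"\d{2}(?#hour):\d{2}(?#minute)", "Backup started at 09:45 today")
print(inline.group())  # 09:45
```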
If...then...else style branching in regular expressions
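Python's `re` module supports branching with the `(?(id)yes|no)` construct. The sketch below uses a deliberately simplistic email pattern, similar to the one in the Python documentation; it is an illustration of the syntax, not a recommended email validator.

```python
import re

# If group 1 (the opening "<") matched, require a closing ">";
# otherwise require the end of the string
pattern = re.compile(r"(<)?(\w+@\w+(?:\.\w+)+)(?(1)>|$)")

candidates = [
    "<user@host.com>",  # both brackets: accepted
    "user@host.com",    # no brackets: accepted
    "<user@host.com",   # missing closing bracket: rejected
    "user@host.com>",   # stray closing bracket: rejected
]

results = {s: bool(pattern.match(s)) for s in candidates}
for s, ok in results.items():
    print(s, ok)
```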
Summary of what we learned so far
Regular Expression Engine Overview
How the engine works internally to find a match
What order does the engine use to find a match?
How does the engine handle quantifiers? What do greedy, lazy, and backtracking mean?
Lazy concept, example and demo
How does engine perform exhaustive search to find the right match?
Greedy hands-on demo
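The difference between greedy and lazy quantifiers can be seen in a short sketch. The HTML-like snippet is a made-up example.

```python
import re

# Hypothetical sample text
html = "<b>bold</b> and <i>italic</i>"

# Greedy: .+ grabs as much as it can, then backtracks only until
# the final ">" fits, so the match runs from the first "<" to the last ">"
greedy = re.findall(r"<.+>", html)

# Lazy: .+? takes as little as possible, stopping at the first ">"
lazy = re.findall(r"<.+?>", html)

print(greedy)  # ['<b>bold</b> and <i>italic</i>']
print(lazy)    # ['<b>', '</b>', '<i>', '</i>']
```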
Groups: what are they? How can you use them? Different types of groups
Non-capturing groups, which turn off capturing for a group
How to refer back to a match inside the same pattern? How to replace matching text?
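A sketch of non-capturing groups and backreferences, using invented sample text. The doubled-word cleanup is a common textbook use of `\1`, not necessarily the example used in the lectures.

```python
import re

# Non-capturing group: (?:...) groups the alternation without capturing it,
# so findall returns only the captured name
names = re.findall(r"(?:Mr|Ms)\. (\w+)", "Mr. Smith and Ms. Jones")

# Backreference: \1 must match the same text that group 1 matched
text = "this is is a test test of of the parser"
doubled = re.findall(r"\b(\w+)\s+\1\b", text)

# Replace each run of a repeated word with a single copy of the word
cleaned = re.sub(r"\b(\w+)(?:\s+\1\b)+", r"\1", text)

print(names)    # ['Smith', 'Jones']
print(doubled)  # ['is', 'test', 'of']
print(cleaned)  # this is a test of the parser
```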
What are lookahead and lookbehind? Where are they used and what capabilities do they provide?
Go forward only if a certain precondition is met
Go forward only if a certain precondition is met before the current character
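Lookahead and lookbehind check a condition without consuming any text. A minimal sketch, with made-up sample text:

```python
import re

# Hypothetical sample text
text = "price: $40, discount: 15%, tax: $3"

# Lookahead: digits only if they are followed by a percent sign
percents = re.findall(r"\d+(?=%)", text)

# Lookbehind: digits only if they are preceded by a dollar sign
dollars = re.findall(r"(?<=\$)\d+", text)

# Negative lookahead: whole numbers that are NOT followed by a percent sign
not_percent = re.findall(r"\b\d+\b(?!%)", text)

print(percents)     # ['15']
print(dollars)      # ['40', '3']
print(not_percent)  # ['40', '3']
```

In each case the `%` or `$` is tested but is not part of the returned match.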
Overview of the project, how the Robocopy log is structured, and what information the solution is going to extract
Strategy for solving the problem with regular expression, review of the solution and demo
What can cause performance issues in patterns?
What type of patterns to watch for performance issues
Why is the performance degrading? What is happening behind the scenes?
Various options for addressing the performance issue
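One well-known source of performance problems is a quantifier nested inside another quantifier, which can force the engine to try an exponential number of ways to split the input before it gives up. The sketch below is an assumption about the kind of pattern these lectures cover; the input is kept short so the slow case still finishes quickly.

```python
import re
import time

# 18 "a" characters followed by a "b" that makes the overall match fail
text = "a" * 18 + "b"

# Nested quantifiers: many different ways to divide the run of "a"s
nested = re.compile(r"(a+)+$")

# Flat rewrite that accepts the same strings with a single quantifier
flat = re.compile(r"a+$")

def timed(pattern, s):
    """Return the match result and the elapsed time in seconds."""
    start = time.perf_counter()
    result = pattern.match(s)
    return result, time.perf_counter() - start

nested_result, nested_seconds = timed(nested, text)
flat_result, flat_seconds = timed(flat, text)

# Both fail to match; the nested version typically takes far longer,
# and its cost grows rapidly with each extra "a" in the input
print(nested_result, f"{nested_seconds:.4f}s")
print(flat_result, f"{flat_seconds:.6f}s")
```

Removing the nesting is only one option; making the pattern more specific so that alternatives cannot overlap is another.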
Hands-on overview of all the concepts, solution with demo
What is the difference between compiled regular expression objects and module methods? When to use compiled objects?
Demo of comparison between module and compiled object methods
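A minimal sketch of the two calling styles. The log lines and the `id=` format are invented for illustration.

```python
import re

# Hypothetical log lines
lines = ["id=17 ok", "id=4 failed", "no id here"]

# Module-level function: the pattern string is passed on every call
# (the re module keeps an internal cache of recently compiled patterns)
ids_module = []
for line in lines:
    m = re.search(r"id=(\d+)", line)
    if m:
        ids_module.append(m.group(1))

# Compiled object: compile once, then call methods on the pattern object
id_pattern = re.compile(r"id=(\d+)")
ids_compiled = []
for line in lines:
    m = id_pattern.search(line)
    if m:
        ids_compiled.append(m.group(1))

# Compiled methods also accept a start position, which module functions do not
from_offset = id_pattern.search("id=1 id=2", 1)

print(ids_module)            # ['17', '4']
print(ids_compiled)          # ['17', '4']
print(from_offset.group(1))  # 2
```

A compiled object is usually worth it when the same pattern is applied many times, for example once per line of a large file.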
Introduction to sensors and how they collect and store data
How to use regular expressions to extract the data and convert it to JSON. Solution walk-through and demo
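The general idea of extracting fields with named groups and converting them to JSON can be sketched as follows. The sensor line format, the field names, and the values are all hypothetical; the actual project data may look quite different.

```python
import json
import re

# Hypothetical sensor output, invented for this sketch
raw = """\
2021-06-01T10:00:00 sensor=T1 temp=21.5C humidity=40%
2021-06-01T10:05:00 sensor=T2 temp=19.0C humidity=45%
"""

# Named groups become the keys of each JSON record
reading = re.compile(
    r"(?P<timestamp>\S+)\s+sensor=(?P<sensor>\w+)\s+"
    r"temp=(?P<temp_c>[\d.]+)C\s+humidity=(?P<humidity_pct>\d+)%"
)

records = []
for m in reading.finditer(raw):
    record = m.groupdict()
    # Captured values are strings; convert numeric fields explicitly
    record["temp_c"] = float(record["temp_c"])
    record["humidity_pct"] = int(record["humidity_pct"])
    records.append(record)

as_json = json.dumps(records, indent=2)
print(as_json)
```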
Summary of how we use regular expressions
In this project, we will use regular expressions to extract information from hospital medical report files sent as an HTML payload
Chandra Lingam spent 15 years at Intel, developing and managing systems that handled hundreds of terabytes of worldwide factory data. Chandra is an expert on Amazon Web Services, mission-critical systems, and machine learning. He has a Master's degree in Computer Science from ASU and a Bachelor's degree in Computer Science from Thiagarajar College of Engineering, Madurai.
Chandra is the author of popular iOS educational apps Geometry Test, Math Stripes and Arithmetic Test.