*** NEW PREVIEW VIDEOS: All lectures in Section 2 are now available for preview.
NEW: Project 4, Network Configuration Parser, added to the course. ***
Learning Regular Expression with .Net is a hands-on course that teaches you everything you need to know about regular expressions.
Regular expressions are a powerful text processing tool for log mining, data parsing, cleanup and preparation. Their power and flexibility allow you to accomplish a lot with very few lines of code.
Over 60% of the effort in big data projects is spent on data cleanup and preparation.
Data can come from a variety of sources, including internal databases, log files, sensor-generated data, Twitter, Facebook and so forth. Having access to a powerful regular expression tool opens up many opportunities in how you look at your data and what you can do with it.
This course contains more than 25 hands-on exercises, practical tips, quizzes and four projects to apply the new skills you learn in this class. In the first project, we will extract useful information from the unstructured text output of the Robocopy tool. In the second project, we will work on a large data set generated by sensors. In the third project, we will look into health care systems that deal with electronic medical records. NEW: A fourth project on parsing network interface configuration has been added.
These exercises will demonstrate that with regular expressions you can implement complex parsing with only a few lines of code.
As a bonus, you will receive an interactive tool for learning regular expressions faster. Source code for the tool is included.
This course uses the free Visual Studio Community Edition for development and exercises. The course is delivered entirely as video lectures, with quizzes, full source code, and a downloadable list of the data and patterns used.
Set up the Visual Studio Community Edition development environment
Download a handy regular expression reference guide
Set up the solution, code examples and data required for this course.
Introduce .Net Regex Validation Capability and tips for writing patterns and test cases
Overview of .Net Regex match functionality and how to get precise details on a match for a pattern in text
Overview of .Net Regex Group functionality and how to parse a text using groups.
Use .Net Regex to replace matching text with custom output generated by a user-defined method
Why verbatim strings are a recommended programming practice when patterns are hardcoded in code.
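The idea of replacing each match with output computed by your own method can be sketched as follows. This is a minimal illustration in Python's `re` module, not the course's .Net code; in .Net the equivalent role is played by a `MatchEvaluator` delegate passed to `Regex.Replace`. The function name, sample text, and pattern are all hypothetical.

```python
import re

def to_fahrenheit(match):
    """Hypothetical callback: convert a captured Celsius reading to Fahrenheit."""
    celsius = float(match.group(1))
    return f"{celsius * 9 / 5 + 32:.1f}F"

# Sample text is made up for illustration.
text = "Inlet 21.5C, outlet 30C"

# The callback is invoked once per match; its return value replaces the match.
converted = re.sub(r"(\d+(?:\.\d+)?)C", to_fahrenheit, text)
print(converted)  # Inlet 70.7F, outlet 86.0F
```

The regex engine finds the text, and ordinary code decides what to put in its place.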
Summary of .Net Regex capabilities that were covered in this lecture
Learn how to test patterns and observe behavior without writing code!
We will discuss the simplest of patterns: single character patterns.
Download the PDF in resources. It contains all the patterns and text that will be used in this lecture.
Continue discussion on single character patterns: Or, And and Set based search
How to look for characters that do not match a set (negation), and how to specify a range of characters without typing them all manually.
Learn about wildcards and when not to use them, how to handle special characters, and how to specify control characters like tab, new line and so forth
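A short sketch of ranges and negated sets, written in Python's `re` module as a stand-in for .Net (the bracket syntax shown here is the same in both). The sample text is hypothetical.

```python
import re

# Sample text is made up for illustration.
text = "Order A-17 shipped to Bay 4"

# A range: [0-9] stands for every digit without listing them all.
digits = re.findall(r"[0-9]", text)

# A negated set: ^ right after [ means "any character NOT in this set".
not_alnum_or_space = re.findall(r"[^A-Za-z0-9 ]", text)

print(digits)              # ['1', '7', '4']
print(not_alnum_or_space)  # ['-']
```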
We will discuss how to test the concepts discussed in this lecture
Demo. We will discuss how to verify single character patterns and case insensitive search
Demo. We will discuss how to verify set based comparison, negation, and literal string search
Let's look at why we need anchors. Anchors let you specify conditions that must be met for a match, for example: match only whole words, match at the beginning or end of a line, and so forth
Let's discuss, with an example, how to search at the beginning of a string or line. Tips on how to handle text that contains embedded new lines.
Let's discuss, with an example, how to match at the end of a string or line. Why special handling is needed on Windows. Tips on how to handle text that contains embedded new lines.
Demo. We will do interactive testing to verify anchors, multi-line handling and the steps to correctly handle Windows carriage return and new line characters
Let's discuss how to use shortcuts to define a set of characters. These are known as character classes in regex
Discuss more character classes: white space and Unicode categories, which match a predefined set of characters. Unicode categories are a very powerful built-in capability of Regex that allows you to write very compact and feature-rich regular expression patterns.
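The anchor behavior described above can be tried out with a small sketch. It uses Python's `re` module rather than .Net, on the assumption that the two behave alike for this case: in multi-line mode `$` matches just before `\n`, so on Windows-style `\r\n` text the `\r` ends up inside the match unless the pattern allows for it. The log lines are hypothetical.

```python
import re

# Hypothetical log text with Windows line endings (\r\n).
text = "ERROR disk full\r\nINFO copy done\r\nERROR access denied\r\n"

# Without multi-line mode, ^ matches only at the very start of the string.
first_only = re.findall(r"^ERROR", text)
# With multi-line mode, ^ also matches at the start of every line.
per_line = re.findall(r"^ERROR", text, re.MULTILINE)

# $ matches before \n, so the carriage return leaks into the capture.
naive = re.findall(r"^ERROR (.*)$", text, re.MULTILINE)
# Allowing an optional \r before $ keeps the capture clean.
fixed = re.findall(r"^ERROR (.*?)\r?$", text, re.MULTILINE)

print(first_only)  # ['ERROR']
print(per_line)    # ['ERROR', 'ERROR']
print(naive)       # ['disk full\r', 'access denied\r']
print(fixed)       # ['disk full', 'access denied']
```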
Let's look at quantifiers and how to specify optional, required, and frequency of occurrence for an expression.
Let's review some of the concepts we discussed on quantifiers with a demo
Learn how you can add comments to a regular expression and different commenting options.
Let's discuss how to specify more complex conditional evaluation with IF THEN ELSE capability of regex.
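As a rough illustration of IF THEN ELSE in a pattern, the sketch below accepts a number that is either bare or fully wrapped in parentheses. It is written in Python's `re` module; the `(?(1)yes|no)` form used here is also accepted by .Net. The helper name and test strings are hypothetical.

```python
import re

# Group 1 optionally captures an opening parenthesis.
# (?(1)\)) means: IF group 1 matched THEN require a closing parenthesis,
# ELSE (no branch given) require nothing.
pattern = re.compile(r"(\()?\d+(?(1)\))")

def is_balanced_number(s):
    """Hypothetical helper: True when the whole string fits the pattern."""
    return pattern.fullmatch(s) is not None

for sample in ["42", "(42)", "(42", "42)"]:
    print(sample, is_balanced_number(sample))
```

Only the first two samples are accepted; the unbalanced ones are rejected without writing two separate patterns.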
Let's review all the concepts we discussed in this lecture on regular expression language and next steps
I will introduce you to the .Net regular expression engine and some of its key capabilities. We will also take a quick look at five key points that we will be covering in this lecture.
Download the PDF document in the resources. It contains the pattern and text that we will use in this lecture.
Let's look at the basic building block with a concrete example. The regex engine compares one character at a time to determine what to do next.
Let's review how regex engine scans the pattern and text and in what order. I will use concrete examples to walk through scenarios
Let's go through one more example on how left to right works when there are multiple paths to take
I will introduce Greedy, Lazy and Backtracking in this lecture. There is a question at the end for you to think through how greedy would work
Let's do a step by step walk through on Greedy with a concrete example and how backtracking is used to find a match.
Let's review what Lazy means in regex and how backtracking is done differently compared to Greedy
Let's review the important concepts of Greedy and Lazy with a hands-on demo
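The difference between Greedy and Lazy is easy to see on a small input. This sketch uses Python's `re` module as a stand-in for .Net; the `+` and `+?` quantifiers mean the same thing in both. The sample text is hypothetical.

```python
import re

# Sample text is made up for illustration.
text = "<b>bold</b> and <i>italic</i>"

# Greedy: .+ grabs as much as possible, then backtracks to the LAST '>'.
greedy = re.findall(r"<.+>", text)

# Lazy: .+? grabs as little as possible, stopping at the FIRST '>' each time.
lazy = re.findall(r"<.+?>", text)

print(greedy)  # ['<b>bold</b> and <i>italic</i>']
print(lazy)    # ['<b>', '</b>', '<i>', '</i>']
```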
Let's look at another example of backtracking and how regular expression does a thorough evaluation of all paths to find a match
In this lecture you will learn about the concept of groups. Groups help you in a variety of ways, from simplifying patterns to automatically capturing values for sub-patterns.
Let's discuss two different ways in which groups can be accessed from code and what are the advantages and disadvantages.
Let's discuss how to turn off group capture and when you would want to do this.
Let's look at a concrete example of the type of pattern where we can safely disable group capture, and different ways of turning off group capture
Learn about backreferences and how to refer to a group that is already captured in the pattern. Let's discuss when and why this is useful.
Let's review backreference with a demo on how to detect duplicate words. In addition, we will also use substitution to remove the duplicate word
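A minimal sketch of duplicate-word detection with a backreference, followed by a substitution that removes the repeat. It is written in Python's `re` module, not .Net; the pattern itself carries over, but note that .Net replacement strings refer to the group as `$1` where Python uses `\1`. The sample sentence is hypothetical.

```python
import re

# \1 inside the pattern means "the same text that group 1 just captured".
duplicate_word = re.compile(r"\b(\w+)\s+\1\b", re.IGNORECASE)

# Sample text is made up for illustration.
text = "This is is a test of the the parser"

# Detect: list each word that appears twice in a row.
dupes = [m.group(1) for m in duplicate_word.finditer(text)]

# Fix: replace "word word" with a single copy of the captured word.
cleaned = duplicate_word.sub(r"\1", text)

print(dupes)    # ['is', 'the']
print(cleaned)  # This is a test of the parser
```

The leading `\b` keeps the "is" inside "This" from being treated as the first half of a duplicate.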
Look ahead and look behind allow you to combine two different pieces of logic in one pattern. They also help you solve more complex problems
Learn about positive look ahead with a concrete example
Let's do a hands-on demo on positive lookahead and how to use interactive tool to visually verify what the look ahead pattern is doing
Let's review negative look ahead, how and when to use it with a concrete example
Let's discuss positive lookbehind and when to use it
Let's discuss negative lookbehind and where to use it
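The four lookaround forms can be compared side by side on one input. This is a sketch in Python's `re` module as a stand-in for .Net; the lookaround syntax is the same, though Python only allows fixed-width lookbehind, which is all these patterns need. The sample text is hypothetical.

```python
import re

# Sample text is made up for illustration.
text = "cost $40, weight 12kg, qty 7, refund -$5"

# Positive look ahead: digits that ARE followed by "kg" (kg is not consumed).
followed_by_kg = re.findall(r"\d+(?=kg)", text)

# Negative look ahead: digits NOT followed by "kg". The extra \d alternative
# stops the engine from backtracking to a partial number such as "1" in "12kg".
not_followed_by_kg = re.findall(r"\d+(?!\d|kg)", text)

# Positive look behind: digits that ARE preceded by "$".
after_dollar = re.findall(r"(?<=\$)\d+", text)

# Negative look behind: digits NOT preceded by "$" (or by another digit,
# which again prevents matching the tail of a number).
not_after_dollar = re.findall(r"(?<![$\d])\d+", text)

print(followed_by_kg)      # ['12']
print(not_followed_by_kg)  # ['40', '7', '5']
print(after_dollar)        # ['40', '5']
print(not_after_dollar)    # ['12', '7']
```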
Summary of all the key concepts we learned in this lecture. This lecture covered key points that will greatly improve your understanding of regular expression engine
We will apply all the concepts we learned by parsing an unstructured free form text file. Let's review sample log files generated by robocopy tool containing successful transfer and error. These log files will be parsed using regular expression.
Let's define the scope of the problem and what we want to accomplish in this project.
Let's review the patterns we want to define and how to build a solution using regular expression. In this lecture, we will look at the code to confirm how we can offload all the text parsing work to regular expression engine and collect the results back.
Includes practical tips on writing smaller patterns instead of one giant catch all pattern
When building complex patterns, we want to interactively and gradually build the pattern.
Review everything we did in the project and what we learned out of it.
I want to introduce you to performance considerations you need to be aware of when using regular expression. We will study the issues and discuss techniques to address potential issues.
Let's discuss greedy capture and backtracking with a road trip analogy
What can we do to fix excessive backtracking? Let's expand our analogy to fix the problem
Let's discuss, with the road trip analogy, how Lazy mode would work.
Concrete example of a regular expression that degrades in performance with every additional character. Shows why partial matches and non-matches have the worst performance.
Let's discuss why the pattern exhibits exponential run time. What is going on behind the scenes?
Let's do a hands-on demo to visually see the exponential delay and review of techniques to address them. Demonstration of fixes to performance issues!
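One well-known pattern shape that shows this kind of slowdown is a quantifier nested inside another quantifier. The sketch below is an assumption about the kind of example meant here, not the course's own pattern, and it uses Python's `re` module rather than .Net. The input almost matches and then fails at the last character, which forces a backtracking engine to try every way of splitting the run of `a` characters. Input sizes are kept small so the script finishes quickly; timings vary by machine, so no expected numbers are shown.

```python
import re
import time

# Nested quantifiers: many different ways to divide "aaaa..." among the groups.
slow = re.compile(r"^(a+)+$")
# Matches exactly the same strings, but there is only one way to do it.
fast = re.compile(r"^a+$")

def time_search(pattern, text):
    """Return (match result, elapsed seconds) for a single search."""
    start = time.perf_counter()
    result = pattern.search(text)
    return result, time.perf_counter() - start

for n in (14, 16, 18):
    text = "a" * n + "!"  # near match: fails only at the final character
    slow_result, slow_secs = time_search(slow, text)
    fast_result, fast_secs = time_search(fast, text)
    print(f"n={n}: nested {slow_secs:.4f}s, simple {fast_secs:.6f}s")
```

Each extra character roughly doubles the work for the nested pattern, while the simplified pattern stays flat.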
How to use Timeout capability in .Net Regex to limit time taken by a search. Insurance policy against potential uncertainties in input when used in production.
Hands-on code demo of timeout capability
Let's discuss steps performed to prepare a pattern and execute it.
We will compare and contrast static method versus instance method based interaction with Regex object model.
Overview of instance based interaction with Regex object model
Hands-on demo with code to review performance characteristics of static methods and scenarios under which the performance degrades.
Hands-on demo with code to review performance characteristics of instance methods and scenarios under which the performance degrades.
We will conclude with recommendations on when to use static versus instance methods as well as compile to assembly option.
Review of RightToLeft text parsing option and when this would be useful.
Hands on demo with code to explain how right to left works
Let's review how sensors typically collect data and how that data is organized. Sensors are low cost monitors that produce a steady stream of data. We will look at a data center example that collects temperature and humidity every 30 seconds. We will read one year's worth of sensor data, then parse and prepare the data using regular expressions.
In this project, we will take advantage of instance based invocation as well as static based invocation. Includes tips on how to dynamically determine parameter names from pattern and avoid hard-coding of names when iterating through match and groups.
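The tip about reading parameter names from the pattern instead of hard-coding them can be sketched like this. The example is in Python's `re` module, where the names come from the compiled pattern's `groupindex`; in .Net the comparable call is `Regex.GetGroupNames()`, and named groups are written `(?<name>...)` rather than `(?P<name>...)`. The line format, group names, and readings are hypothetical, not the course's data.

```python
import re

# Hypothetical sensor line format: "<date> <time>,temp=<value>,humidity=<value>"
LINE = re.compile(
    r"(?P<timestamp>\S+ \S+),"
    r"temp=(?P<temperature>-?\d+(?:\.\d+)?),"
    r"humidity=(?P<humidity>\d+)"
)

# Ask the compiled pattern for its group names, ordered by group number,
# so the loop below never mentions a name directly.
group_names = sorted(LINE.groupindex, key=LINE.groupindex.get)

# Made-up readings; the last line is malformed on purpose.
lines = [
    "2015-01-01 00:00:00,temp=21.5,humidity=40",
    "2015-01-01 00:00:30,temp=21.7,humidity=41",
    "corrupted line",
]

records = []
for line in lines:
    match = LINE.match(line)
    if match:  # skip lines that do not fit the pattern
        records.append({name: match.group(name) for name in group_names})

print(group_names)
print(records)
```

If a group is added to or renamed in the pattern, the loop picks it up without any code change.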
Review the capabilities implemented and what we learned with this project
In this project, we will use regular expressions to extract information from hospital medical report files sent as an HTML payload
Download the PDF associated with this lecture for the problem description
Chandra Lingam spent 15 years at Intel, developing and managing systems that handled hundreds of terabytes of worldwide factory data. Chandra is an expert on Amazon Web Services, mission critical systems and machine learning. He has a Master's degree in Computer Science from ASU and a Bachelor's degree in Computer Science from Thiagarajar College of Engineering, Madurai.
Chandra is the author of popular iOS educational apps Geometry Test, Math Stripes and Arithmetic Test.