Learning Regular Expressions with .NET

Learn how to use regular expressions for your big data parsing, cleanup and preparation needs with C# .NET
5.0 (6 ratings)
81 students enrolled
$19
$30
37% off
  • Lectures 78
  • Length 4 hours
  • Skill Level All Levels
  • Languages English
  • Includes Lifetime access
    30 day money back guarantee!
    Available on iOS and Android
    Certificate of Completion

About This Course

Published 6/2016 English

Course Description

NEW PREVIEW VIDEOS: All lectures in Section 2 are now available for preview. Check it out!

NEW: Project 4, Network Configuration Parser, has been added to the course.

Learning Regular Expressions with .NET is a hands-on course that teaches you what you need to know about regular expressions.

Regular expressions are a powerful text processing tool for log mining, data parsing, cleanup and preparation. Their power and flexibility allow you to accomplish a lot with very few lines of code.

Over 60% of the effort in big data projects is spent on data cleanup and preparation.
Data can come from a variety of sources, including internal databases, log files, sensor-generated data, Twitter, Facebook and so forth. Having access to a powerful regular expression tool opens up a lot of opportunities in how you look at your data and what you can do with it.

This course contains more than 25 hands-on exercises, practical tips, quizzes and four projects in which you apply the new skills you learned in this class. In the first project, we will extract useful information from unstructured text data produced by the Robocopy tool. In the second project, we will work on a large data set generated by sensors. In the third project, we will look into health care systems that deal with electronic medical records. New: a fourth project on parsing network interface configuration has been added.

These exercises will demonstrate that with regular expressions you can implement complex parsing with only a few lines of code.
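
As a rough illustration of the "few lines of code" idea, here is a minimal C# sketch. It is not taken from the course materials: the input line format, the class name FewLinesDemo and the group names (timestamp, temp, humidity) are all hypothetical, chosen only for this example. It uses named groups so that one pattern both validates a line and extracts its fields.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FewLinesDemo
{
    // Hypothetical line format, invented for this example:
    //   2016-06-01 12:00:30 temp=21.5 humidity=40
    // Verbatim strings (@"...") keep the backslashes readable.
    private const string Pattern =
        @"^(?<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) " +
        @"temp=(?<temp>-?\d+(?:\.\d+)?) " +
        @"humidity=(?<humidity>\d+(?:\.\d+)?)$";

    // Returns the captured fields joined by " | ", or "no match"
    // when the line does not fit the assumed format.
    public static string Describe(string line)
    {
        Match m = Regex.Match(line, Pattern);
        if (!m.Success)
        {
            return "no match";
        }
        return m.Groups["timestamp"].Value + " | " +
               m.Groups["temp"].Value + " | " +
               m.Groups["humidity"].Value;
    }

    public static void Main()
    {
        Console.WriteLine(Describe("2016-06-01 12:00:30 temp=21.5 humidity=40"));
        Console.WriteLine(Describe("not a sensor line"));
    }
}
```

The static Regex.Match call keeps the sketch short. For production input, a Regex instance constructed with a match timeout may be a safer choice.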

As a bonus, you will receive an interactive tool for learning regular expressions faster. Source code for the tool is included.

This course uses the free Visual Studio Community Edition for development and exercises. The course is delivered entirely as video lectures, with quizzes, full source code, and a downloadable list of the data and patterns used.

What are the requirements?

  • All material and software instructions are covered in the housekeeping lecture.
  • Familiarity with a programming language

What am I going to get from this course?

  • Confidently use regular expressions as a powerful text processing tool for data parsing, cleanup and preparation
  • Minimize effort spent on custom development for data cleanup
  • Gain practical tips to let regular expressions do the bulk of the work for data preparation
  • Understand potential performance issues and techniques to address them

What is the target audience?

  • This course is intended for data scientists, software developers and database engineers

Curriculum

Section 1: Introduction
03:13

Introduction and Course Structure Overview. 

01:16

Set up the Visual Studio Community Edition development environment

01:44

Download a handy regular expression reference guide

06:37

Set up the solution, code examples and data required for this course.

Section 2: .Net Regular Expression Capabilities Hands-on
01:13

We will review terminology used in the class.

04:38

Introduce .Net Regex Validation Capability and tips for writing patterns and test cases

02:50

Overview of .Net Regex match functionality and how to get precise details on a match for a pattern in text

03:58

Overview of .Net Regex Group functionality and how to parse a text using groups.

02:38

Use .Net Regex to replace a matching text with another pattern or string

02:45

Use .Net Regex to replace a matching text with a custom output generated by a user-defined method

01:31

How to split a text using patterns. Example derived from MSDN split guide.

02:22

Why verbatim strings are a recommended programming practice when patterns are hardcoded in code.

Regular Expression .Net Object Model
4 questions
00:33

Summary of .Net Regex capabilities that were covered in this lecture

Section 3: Learn Regex Interactive Tool Hands-on
05:57

Learn how to test patterns and observe behavior without writing code!

Section 4: Regular Expression Language
02:52

We will discuss the simplest of patterns: single character patterns.

Download the PDF in resources.  It contains all the patterns and text that will be used in this lecture.

02:12

Continue discussion on single character patterns: Or, And and Set based search

03:32

How to look for characters that do not match a set (negation). How to specify a range of characters without typing them all manually.

02:55

Learn about wildcards and when not to use them, how to handle special characters, and how to specify control characters like tab, new line and so forth

00:29

We will discuss how to test the concepts discussed in this lecture

00:44

Demo. We will discuss how to verify single character patterns and case insensitive search

00:48

Demo. We will discuss how to verify set based comparison, negation, and literal string search

01:58

Let's look at why we need anchors. Anchors help you specify conditions that must be met for a match, for example: match only words, match at the beginning or end of a line, and so forth

04:12

Let's discuss, with an example, how to search at the beginning of a string or line. Tips on how to handle text that contains embedded new lines.

02:49

Let's discuss, with an example, how to match towards the end of a string or line. Why special handling is needed on Windows. Tips on how to handle text that contains embedded new lines.

01:22

Demo. We will do interactive testing to verify anchors, multi-line handling, and the steps to correctly handle Windows carriage return and new line characters

02:21

Let's discuss how to use shortcuts to define a set of characters. These are known as character classes in regex

03:09

Discuss more character classes: white space and Unicode categories, which match a predefined set of characters. Unicode categories are a very powerful built-in capability of Regex that allows you to write very compact and feature-rich regular expression patterns.

05:24

Let's look at quantifiers and how to specify whether an expression is optional or required and how often it may occur.

00:33

Let's review some of the concepts we discussed on quantifiers with a demo

02:09

Learn how you can add comments to a regular expression and the different commenting options.

06:18

Let's discuss how to specify more complex conditional evaluation with the IF THEN ELSE capability of regex.

Regular Expression Language
7 questions
00:36

Let's review all the concepts we discussed in this lecture on regular expression language and next steps

Section 5: Regular Expression - Five Key Points
02:04

I will introduce you to the .Net regular expression engine and some of its key capabilities. We will also take a quick look at the five key points that we will be covering in this lecture.

Download the PDF document in the resources. It contains the pattern and text that we will use in this lecture.

03:39

Let's look at the basic building block with a concrete example. The regex engine compares one character at a time to determine what to do next.

06:58

Let's review how the regex engine scans the pattern and text, and in what order. I will use concrete examples to walk through scenarios

05:07

Let's go through one more example on how left to right works when there are multiple paths to take

One Character at a time and Left to Right
6 questions
03:00

I will introduce Greedy, Lazy and Backtracking in this lecture.  There is a question at the end for you to think through how greedy would work

06:00

Let's do a step by step walk through on Greedy with a concrete example and how backtracking is used to find a match.

Greedy
1 question
05:58

Let's review what Lazy means in regex and how backtracking is done differently compared to Greedy

00:55

Let's review the important concepts of Greedy and Lazy with a hands-on demo

06:00

Let's look at another example of backtracking and how regular expression does a thorough evaluation of all paths to find a match

02:03

In this lecture you will learn about the concept of groups. Groups help you in a variety of ways, from simplifying patterns to automatically capturing values for sub patterns.

03:38

Let's discuss two different ways in which groups can be accessed from code, and the advantages and disadvantages of each.

02:39

Let's discuss how to turn off group capture and when you want to do this.

Group
3 questions
01:27

Let's look at a concrete example of the type of pattern where we can safely disable group capture, and at different ways of turning off group capture

03:34

Learn about backreferences and how to refer to a group that is already captured in the pattern. Let's discuss when and why this is useful.

00:52

Let's review backreferences with a demo on how to detect duplicate words. We will also use substitution to remove the duplicate word

02:45

Look ahead and look behind allow you to combine two different pieces of logic in one pattern. They also help you solve more complex problems

03:08

Learn about positive look ahead with a concrete example

01:11

Let's do a hands-on demo on positive look ahead and how to use the interactive tool to visually verify what the look ahead pattern is doing

01:29

Let's review negative look ahead, how and when to use it with a concrete example

03:05

Let's discuss positive lookbehind and when to use it

02:10

Let's discuss negative lookbehind and where to use it

00:40

Summary of all the key concepts we learned in this lecture. This lecture covered key points that will greatly improve your understanding of the regular expression engine

Section 6: Project 1 - Robocopy Log Parsing with Regular Expression
02:24

We will apply all the concepts we learned by parsing an unstructured, free-form text file. Let's review sample log files generated by the Robocopy tool that contain successful transfers and errors. These log files will be parsed using regular expressions.

Let's define the scope of the problem and what we want to accomplish in this project.

06:37

Let's review the patterns we want to define and how to build a solution using regular expressions. In this lecture, we will look at the code to confirm how we can offload all the text parsing work to the regular expression engine and collect the results back.

Includes practical tips on writing smaller patterns instead of one giant catch-all pattern.

01:23

When building complex patterns, we want to build the pattern gradually and interactively.

00:39

Review everything we did in the project and what we learned out of it.

Section 7: Performance - Issues and Techniques for improving performance
01:07

I want to introduce you to performance considerations you need to be aware of when using regular expressions. We will study the issues and discuss techniques to address them.

05:08

Let's discuss greedy capture and backtracking with a road trip analogy

04:12

What can we do to fix excessive backtracking?  Let's expand our analogy to fix the problem

02:05

Let's use the road trip analogy to discuss how Lazy mode would work.

02:43

Concrete example of a regular expression whose performance degrades with every additional character. Shows why partial matches and non-matches have the worst performance.

05:44

Let's discuss why the pattern exhibits exponential run time. What is going on behind the scenes?

06:34

Let's do a hands-on demo to visually see the exponential delay and review techniques to address it. Demonstration of fixes to performance issues!

01:46

How to use the timeout capability in .Net Regex to limit the time taken by a search. This is an insurance policy against unexpected input when a pattern is used in production.

02:00

Hands-on code demo of timeout capability

05:02

Let's discuss the steps performed to prepare a pattern and execute it.

We will compare and contrast static method versus instance method based interaction with the Regex object model.

04:14

Overview of instance based interaction with the Regex object model

08:11

Hands-on demo with code to review performance characteristics of static methods and scenarios under which the performance degrades.

03:38

Hands-on demo with code to review performance characteristics of instance methods and scenarios under which the performance degrades.

We will conclude with recommendations on when to use static versus instance methods, as well as the compile to assembly option.

02:05

Review of the RightToLeft text parsing option and when it is useful.

03:23

Hands-on demo with code to explain how right to left works

Section 8: Project 2 - Sensor Data Parsing and Preparation
10:32

Let's review how sensors typically collect data and how the data is organized. Sensors are low cost monitors that produce a steady stream of data. We will look at a data center example that collects temperature and humidity every 30 seconds. We will read one year's worth of sensor data, then parse and prepare the data using regular expressions.

In this project, we will take advantage of both instance based and static invocation. Includes tips on how to dynamically determine parameter names from the pattern and avoid hard-coding names when iterating through matches and groups.

00:34

Review the capabilities implemented and what we learned with this project

Section 9: Project 3 - Health Care Electronic Medical Record
07:32

In this project, we will use regular expressions to extract information from hospital medical report files sent as an HTML payload.

Section 10: Project 4 - Network Configuration Data Parser
Article

Download the PDF associated with this lecture for the problem description

Section 11: Conclusion
Course Wrapup
00:13

Instructor Biography

Chandra Lingam, Data Scientist and Software Engineer

Chandra Lingam spent 15 years at Intel, developing and managing systems that handled hundreds of terabytes of worldwide factory data. Chandra is an expert in large-scale data management, search and machine learning. He has a Master's degree in Computer Science from ASU and a Bachelor's degree in Computer Science from Thiagarajar College of Engineering, Madurai.

Chandra is the author of the popular iOS educational apps Geometry Test, Math Stripes and Arithmetic Test.
