Efficient Regular Expressions with applications in Splunk

Rating: 3.7 (14 reviews)

Fast regex creation from simple patterns to positive look-aheads

Created by Andrew Landen

Last updated 10/2020

English

What you'll learn

Create efficient and accurate regex from beginner to advanced
Identify patterns in machine data
Create schema-on-the-fly in Splunk

Course content

3 sections • 21 lectures • 4h 2m total length

Introduction to regular expressions 5:40
Regular expressions capture textual patterns for extraction and masking. Masking protects PII (Personally Identifiable Information) and supports compliance with data privacy laws. Data extraction enables reports, analytics, and insights into human and machine activity, trends, and future behavior.
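As a minimal sketch of the two uses named above, the following Python snippet extracts one field and masks another. The log line, field names, and SSN-style format are made up for illustration and are not taken from the course material.

```python
import re

# Hypothetical log line; the field names and values are illustrative only.
line = "2020-10-01 12:00:01 user=alice ssn=123-45-6789 action=login"

# Extraction: capture the value that follows the literal text "user=".
match = re.search(r"user=(\w+)", line)
user = match.group(1) if match else None

# Masking: hide the first five digits of anything shaped like NNN-NN-NNNN,
# keeping only the last four digits via the backreference \1.
masked = re.sub(r"\d{3}-\d{2}-(\d{4})", r"XXX-XX-\1", line)

print(user)
print(masked)
```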
Personal introduction to the course 10:10
Introducing you to the instructor, Andrew Landen, and the course, Efficient Regular Expressions with applications in Splunk.
Splunk Introduction 10:51
An introduction to Splunk (big data, schema-on-the-fly) and how it uses regex.

Regex101 and regex introduction 31:26
An introduction to regex101 and the basics of regex.
Anchors and other basic regex techniques (examples) 22:58
Learn to use unique anchor characters with basic regex to quickly extract data. Regex anchors of "^" at the start of the line and "$" at the end of the line are a different topic for a later lesson, and unrelated to this use of the word "anchor".
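One way to picture this sense of "anchor" is a short literal string that appears right before the data you want. The sketch below assumes a made-up key=value event; the field names are hypothetical.

```python
import re

# Hypothetical event; the keys are illustrative only.
line = 'src=10.1.2.3 dst=10.9.8.7 msg="disk full" status=500'

# "status=" occurs once in the event, so it works as a cheap literal anchor.
status = re.search(r"status=(\d+)", line).group(1)

# 'msg="' anchors the start; [^"]+ then runs up to the closing quote
# without any backtracking.
msg = re.search(r'msg="([^"]+)"', line).group(1)

print(status, msg)
```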
sed and lookaheads (examples) 9:00
Learn how to use regex in sed to replace patterns for masking, and how to use lookaheads to continue capturing only if certain other patterns are also seen.
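A rough Python equivalent of both ideas, assuming two invented events: `re.sub` stands in for a sed `s/.../.../` substitution, and a lookahead gates the capture on another pattern being present.

```python
import re

# Hypothetical events; keys and values are illustrative only.
events = [
    "user=alice password=hunter2 action=login",
    "user=bob password=swordfish action=logout",
]

# sed-style substitution, comparable to s/(password=)\S+/\1****/
masked = [re.sub(r"(password=)\S+", r"\1****", e) for e in events]

# Lookahead: capture the user only when "action=login" appears
# somewhere in the same line. The lookahead consumes no characters.
login_user = re.compile(r"^(?=.*action=login).*?user=(\w+)")
users = [m.group(1) for e in events if (m := login_user.search(e))]

print(masked)
print(users)
```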
Optional and non-capture groups (examples) 12:44
Learn how to use optional and non-capture groups when you need to specify a more complex pattern that may or may not exist and does not need to be captured. Capture groups cost more than non-capture groups because the engine has to keep track of the captured text.
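A small sketch of the idea, assuming a URL-like input chosen only for illustration: the optional scheme and port are grouped with `(?:...)`, so the host is the only capture.

```python
import re

# (?:https?://)? is an optional non-capture group for the scheme,
# (?::\d+)? is an optional non-capture group for the port.
# Only the host is captured.
pattern = re.compile(r"(?:https?://)?([\w.-]+)(?::\d+)?")

inputs = ["https://example.com:8443", "example.org"]
hosts = [pattern.match(s).group(1) for s in inputs]

print(hosts)
print(pattern.groups)  # number of capture groups in the pattern
```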
Back references and lazy/greedy matching (examples) 4:58
Learn to use back references and lazy quantifiers.
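The difference between greedy and lazy quantifiers, and a back reference that forces a closing tag to match its opening tag, can be sketched as follows. The tag-like sample string is an assumption for illustration.

```python
import re

text = "<b>bold</b> and <i>italic</i>"

# Greedy: .+ runs to the last ">" in the string.
greedy = re.search(r"<.+>", text).group(0)

# Lazy: .+? stops at the first ">" it can.
lazy = re.search(r"<.+?>", text).group(0)

# Back reference: \1 must repeat whatever group 1 captured,
# so the closing tag name has to equal the opening tag name.
pairs = re.findall(r"<(\w+)>(.*?)</\1>", text)

print(greedy)
print(lazy)
print(pairs)
```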
Back references, group iterations, and look arounds (examples) 7:55
More work with back references, iterations and look arounds.
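A hedged sketch of the three topics together, using invented sample strings: a back reference that finds a doubled word, a repeated group that keeps only its final iteration, and a lookbehind/lookahead pair around an amount.

```python
import re

# Back reference: find a word immediately repeated.
dup = re.search(r"\b(\w+)\s+\1\b", "restart the the service").group(1)

# Group iteration: a capture group inside a repeated group
# retains only the text from its final iteration.
last = re.match(r"(?:(\d+)\.){3}", "10.20.30.40").group(1)

# Look arounds: digits preceded by "$" and followed by " USD",
# with neither the "$" nor " USD" included in the match.
amount = re.search(r"(?<=\$)\d+(?= USD)", "total $250 USD").group(0)

print(dup, last, amount)
```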
Character classes and their negatives (examples) 9:26
An introduction to character classes, their shortcuts, and their negatives, with applications.
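For instance, a shortcut class, its negative, and a negated bracket class might be used like this. The sample line is hypothetical.

```python
import re

# Hypothetical record; keys and values are illustrative only.
line = "id=42; name=Ann_Lee; note=n/a"

# \d is the digit shortcut class.
digits = re.findall(r"\d+", line)

# \S is the negative of \s: runs of non-whitespace characters.
tokens = re.findall(r"\S+", line)

# [^;] is a negated class: everything up to the next semicolon.
values = re.findall(r"=([^;]+)", line)

print(digits, tokens, values)
```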
Regex coding
Branch reset groups 6:51
Using branch reset groups to reuse the same capture group number across different alternatives.
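Python's standard `re` module does not support the PCRE branch reset syntax `(?|...)`, so the sketch below only emulates its effect: whichever alternative matched, the caller gets one value back. The helper name `account` and the sample inputs are hypothetical.

```python
import re

# In PCRE, (?|user=(\w+)|uid:(\w+)) would place either capture in group 1.
# Python's re numbers these groups 1 and 2 instead, so this helper picks
# whichever group actually participated in the match.
pattern = re.compile(r"user=(\w+)|uid:(\w+)")

def account(line):
    m = pattern.search(line)
    # lastindex is the index of the last group that matched.
    return m.group(m.lastindex) if m else None

print(account("user=alice"))
print(account("login uid:bob"))
```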
Avoiding Explosive quantifiers with Possessive quantifiers and atomic groups 7:23
Explosive quantifiers can easily lead to catastrophic backtracking that looks like infinite matching. Here we will learn how to use possessive quantifiers and atomic groups to prevent backtracking into the group and keep the number of steps down. We also cover how to spot issues in alternation and other groupings related to explosive quantifiers.
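Python's `re` supports atomic groups `(?>...)` and possessive quantifiers such as `a++` natively only from version 3.11. The sketch below instead emulates an atomic group with a lookahead plus a back reference, which works on older versions too; this is a substitute technique for illustration, not necessarily the form used in the lecture.

```python
import re

# Explosive: nested quantifiers let the engine split a run of "a" in
# exponentially many ways when the trailing "b" is missing.
# Shown for reference only; it is never executed here.
explosive = r"(a+)+b"

# Emulated atomic group: the lookahead grabs the whole run of "a" into
# group 1, then \1 consumes exactly that text. The engine cannot
# backtrack into a lookahead, so the run is never re-split.
atomic_like = re.compile(r"(?=(a+))\1b")

no_match = atomic_like.search("a" * 1000 + "c")  # fails quickly
found = atomic_like.search("xaaab")

print(no_match)
print(found.group(0))
```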
Application of the Explosive Quantifier lesson 8:13
Application of the explosive quantifier lesson
Using Positional Anchors and Custom Boundaries 6:28
Learn how to identify a position based on a pattern immediately before and/or after without moving position of the engine so that parts of those patterns can also be matched as needed.
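Zero-width lookarounds are one common way to match a position rather than text. The sketch below assumes two illustrative cases: a custom word boundary that also treats "-" as part of a word, and a position-only match used to insert thousands separators.

```python
import re

# Custom boundary: "error" counts only when it is not glued to a word
# character or hyphen on either side. Nothing around it is consumed.
custom = re.compile(r"(?<![\w-])error(?![\w-])")
hits = custom.findall("error non-error error-code errors")

# Position-only match: the pattern matches the empty string at every
# position that has a digit before it and a multiple of three digits
# after it, so re.sub inserts a comma without replacing any digit.
grouped = re.sub(r"(?<=\d)(?=(?:\d{3})+\b)", ",", "total 1234567 bytes")

print(hits)
print(grouped)
```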
Advanced Regex Concepts

Using Splunk to extract, mask, and filter at the SPL search line 24:43
rex extracts fields from raw events or from other fields; adding max_match=0 enables multi-value matches at the SPL line
adding mode=sed to rex enables sed-style masking
regex enables pattern-based filtering of events, matching against either raw events or fields
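The three operations above can be mimicked outside Splunk to see what each one does. This Python sketch is only an analogy under stated assumptions: `findall` stands in for a multi-value extraction, `re.sub` for sed-mode masking, and a list filter for event filtering. The events and the card-like number are invented, and no SPL syntax is shown.

```python
import re

# Hypothetical raw events.
events = [
    "src=10.0.0.1 dst=10.0.0.2 card=4111111111111111",
    "heartbeat ok",
]

ip = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")

# Extraction with every match kept (a multi-value result per event).
ips = [ip.findall(e) for e in events]

# Masking: keep only the last four digits of a 16-digit number.
masked = [re.sub(r"\d{12}(\d{4})", r"XXXXXXXXXXXX\1", e) for e in events]

# Filtering: keep only the events that contain an IP-like pattern.
kept = [e for e in events if ip.search(e)]

print(ips)
print(masked[0])
print(kept)
```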
Using the auto-extract tool and managing the field extraction knowledge object 9:04
Although the auto-extract tool generates poor regex, it lets you see your regex applied directly to the raw data, and it offers a quick way to add the extraction to the correct sourcetype with the correct permissions so that it is loaded into memory for fast application.
Using sed in Splunk to do character substitution 4:43
Character substitution may be more interesting than useful, and it follows the related sed discussion only to a limited degree, but it is always good to have another tool in the belt.
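Assuming the lecture refers to sed's character-by-character transliteration (the `y/.../.../` command), the closest Python analogue is `str.translate`. The sample string is hypothetical.

```python
# Map each digit to "#", one character at a time, similar in spirit
# to sed's y/0123456789/##########/.
table = str.maketrans("0123456789", "#" * 10)

scrubbed = "call 555-0199".translate(table)

print(scrubbed)
```

Unlike a regex substitution, this maps single characters by position in the two strings and does no pattern matching.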
Multiple simultaneous automatic field extractions in Splunk 10:06
Here we set up automatic extractions for multiple fields in a single regex with Splunk.
Multi-match in Splunk 7:30
Matching multiple optional fields in a single regex in Splunk
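A single regex with several named groups, some wrapped in optional non-capture groups, is one way to sketch this. The example uses Python's `(?P<name>...)` spelling for named groups, and the field names and events are assumptions for illustration.

```python
import re

# One pattern, three named fields. "ip" and "port" are optional:
# each sits inside an optional non-capture group.
pattern = re.compile(
    r"user=(?P<user>\w+)"
    r"(?:\s+ip=(?P<ip>[\d.]+))?"
    r"(?:\s+port=(?P<port>\d+))?"
)

full = pattern.search("user=alice ip=10.0.0.1 port=22").groupdict()
partial = pattern.search("user=bob port=8080").groupdict()

print(full)
print(partial)
```

Fields whose optional group did not participate come back as `None`, which makes missing values easy to detect downstream.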
Recursion 8:42
Using (?R), (?1), and \1 for pattern recursion and capture matching of palindromes and nested parentheses
Extracting Fields in Splunk from Configuration Files 24:07
Extracting fields from si_conf for dependency relationship mapping

Requirements

Pattern recognition

Description

In this course, you will learn to apply regular expressions to search, filter, extract and mask data efficiently and effectively in Splunk following a workshop format on real data.

Regular expressions, when well crafted, enable very efficient and effective parsing of text for patterns. The most important regex skills are pattern recognition, mastery of regex techniques, and simplicity that minimizes the number of "steps" the engine takes. A simpler regex with solid leading anchors tends to be more efficient. A deeper understanding of regex gives you access to more effective techniques for keeping it simple. Pattern recognition connects the regex code to the problem being solved.

We will rely on the regex101 website to assist in crafting, verifying and explaining the process. Splunk will be used to showcase practical applications with big data. A test after the main course section will test some of the more basic levels of understanding.

Because it is so easy to craft terrible regular expressions, the goal of this course is to shine a light on the efficiency of different regex techniques so that you can track the progression and efficiency of your skills. Textual pattern matching can be a very interesting and complicated subject, but a foundation in efficiency and quality control can greatly improve the speed and effectiveness of your regex.

Who this course is for:

Developers
Big Data
Database Engineers
Splunkers
Linux admins
Dev Ops

Course content

Introduction • 3 lectures • 27min

Regex Best Practices • 11 lectures • 2hr 7min

Regex in Splunk • 7 lectures • 1hr 29min