Elasticsearch 5 and Elastic Stack - In Depth and Hands On!
4.6 (155 ratings)
Instead of using a simple lifetime average, Udemy calculates a course's star rating by considering a number of different factors such as the number of ratings, the age of ratings, and the likelihood of fraudulent ratings.
1,449 students enrolled
Search, analyze, and visualize big data on a cluster with Elasticsearch, Logstash, Beats, Kibana, and more.
Last updated 7/2017
English
Includes:
  • 8 hours on-demand video
  • Full lifetime access
  • Access on mobile and TV
  • Certificate of Completion
What Will I Learn?
  • Install and configure Elasticsearch on a cluster
  • Create search indices and mappings
  • Search full-text and structured data in several different ways
  • Import data into Elasticsearch using several different techniques
  • Integrate Elasticsearch with other systems, such as Spark, Kafka, relational databases, S3, and more
  • Aggregate structured data using buckets and metrics
  • Use Logstash and the "ELK stack" to import streaming log data into Elasticsearch
  • Use Filebeat and the Elastic Stack to import streaming data at scale
  • Analyze and visualize data in Elasticsearch using Kibana
  • Manage operations on production Elasticsearch clusters
  • Use cloud-based solutions including Amazon's Elasticsearch Service and Elastic Cloud
Requirements
  • You need access to a Windows, Mac, or Ubuntu PC with 20GB of free disk space
  • You should have some familiarity with web services and REST
  • Some familiarity with Linux will be helpful
  • Exposure to JSON-formatted data will help
Description

Elasticsearch is a powerful tool not only for powering search on big websites, but also for analyzing big data sets in a matter of milliseconds! It's an increasingly popular technology, and a valuable skill to have in today's job market. This comprehensive course covers it all, from installation to operations, with 62 lectures including 8 hours of video.

We'll cover setting up search indices on an Elasticsearch cluster, and querying that data in many different ways. Fuzzy searches, partial matches, search-as-you-type, pagination, sorting - you name it. And it's not just theory: every lesson has hands-on examples where you'll practice each skill using a virtual machine running Elasticsearch on your own PC.

We cover, in depth, the often-overlooked problem of importing data into an Elasticsearch index. Whether it's via raw RESTful queries, scripts using Elasticsearch APIs, or integration with other "big data" systems like Spark and Kafka - you'll see many ways to get Elasticsearch started from large, existing data sets at scale. We'll also stream data into Elasticsearch using Logstash and Filebeat - commonly referred to as the "ELK Stack" (Elasticsearch / Logstash / Kibana) or the "Elastic Stack".

Elasticsearch isn't just for search anymore - it has powerful aggregation capabilities for structured data. We'll bucket and analyze data using Elasticsearch, and visualize it using the Elastic Stack's web UI, Kibana.

You'll learn how to manage operations on your Elastic Stack, using X-Pack to monitor your cluster's health, and how to perform operational tasks like scaling up your cluster, and doing rolling restarts. We'll also spin up Elasticsearch clusters in the cloud using Amazon Elasticsearch Service and the Elastic Cloud.

Elasticsearch is positioning itself to be a much faster alternative to Hadoop, Spark, and Flink for many common data analysis requirements. It's an important tool to understand, and it's easy to use! Dive in with me and I'll show you what it's all about.

Who is the target audience?
  • Any technologist who wants to add Elasticsearch to their toolchest for searching and analyzing big data sets.
Curriculum For This Course
62 Lectures
07:59:19
Installing and Understanding Elasticsearch
5 Lectures 43:07

We'll talk about why Elasticsearch is important and what you can expect from this course. Then, we'll install a virtual Ubuntu machine right on your own desktop PC, install Elasticsearch on it, and search the complete works of William Shakespeare!

Preview 17:12

Let's look at the components of the Elastic Stack from a 30,000-foot level, and see how they all fit together.

Preview 05:44

We'll cover the logical concepts of Elasticsearch, how indices work, and different ways to interact with Elasticsearch.

Using Elasticsearch
09:01

Let's talk about how Elasticsearch scales horizontally on a cluster, using primary and replica shards.

Preview 06:46

Quiz time! Let's see what you learned about Elasticsearch at a conceptual level.

Quiz: Elasticsearch Concepts and Architecture
04:24
Mapping and Indexing Data
11 Lectures 01:25:02

We'll walk through setting up SSH on your server, and how to connect to it from your desktop.

Connecting to your Cluster
06:54

The Movielens dataset will be used throughout the course; let's just familiarize ourselves with it.

Getting to Know the Movielens Data Set
04:16

We'll define a mapping, or a schema, in Elasticsearch for our movie data prior to importing it.

Create a Mapping for MovieLens
15:01
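
As a rough illustration of what such a mapping can look like, the sketch below builds an Elasticsearch 5-style mapping body in Python. The type name "movie" and the particular fields and field types are assumptions made for this example, not necessarily the ones used in the lecture.

```python
import json


def build_movie_mapping():
    """Return a mapping body for a hypothetical 'movie' type (Elasticsearch 5.x layout)."""
    return {
        "mappings": {
            "movie": {
                "properties": {
                    "id": {"type": "integer"},
                    "year": {"type": "integer"},
                    # 'keyword' fields are indexed as exact values, not analyzed text.
                    "genre": {"type": "keyword"},
                    # 'text' fields are analyzed for full-text search.
                    "title": {"type": "text", "analyzer": "english"},
                }
            }
        }
    }


# Serialize the body the way it would be sent in an index-creation request.
mapping_json = json.dumps(build_movie_mapping(), indent=2)
print(mapping_json)
```

A body like this would typically be sent when the index is created, before any documents are imported.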

We'll insert a single movie into our index the "hard way" - using a JSON request over a REST query.

Import a Single Movie via JSON / REST
05:00

We'll use Elasticsearch's JSON-based bulk API for inserting many movies into our index with a single request.

Insert Many Movies at Once
05:32
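
To give a feel for the bulk format, here is a minimal sketch that assembles a newline-delimited JSON body in Python. The index name "movies", the type name "movie", and the two sample records are made up for illustration; only the general two-lines-per-document layout reflects the bulk API.

```python
import json


def build_bulk_body(movies, index="movies", doc_type="movie"):
    """Build a newline-delimited JSON body for a bulk request.

    Each document contributes two lines: an action line, then the document itself.
    """
    lines = []
    for movie in movies:
        action = {"create": {"_index": index, "_type": doc_type, "_id": str(movie["id"])}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(movie))
    # Every line, including the last one, must end with a newline.
    return "".join(line + "\n" for line in lines)


# Hypothetical sample records, not real MovieLens rows.
sample_movies = [
    {"id": 1, "title": "Example Movie One", "year": 2015, "genre": ["Sci-Fi"]},
    {"id": 2, "title": "Example Movie Two", "year": 2016, "genre": ["Drama"]},
]

body = build_bulk_body(sample_movies)
print(body)
```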

Learn how to atomically update an existing document in Elasticsearch, and how it's handled under the hood (it may surprise you!)

Updating Data in Elasticsearch
06:13

Let's practice deleting a document, and see what really happens under the hood.

Deleting Data in Elasticsearch
02:43

Time to try it yourself without my guidance! Practice what you've learned so far.

[Exercise] Insert, Update, and Delete a Fictitious Movie
04:27

What happens if two different clients try to update a document at the same time? We'll practice a way to avoid these possible sources of contention.

Preview 08:45
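
The idea behind version-checked writes can be sketched with a toy in-memory store. This is a simulation written for this example, not Elasticsearch code: the class and method names are hypothetical. In Elasticsearch 5 the comparable mechanism is a document version supplied with the write request, and the update API also offers a retry_on_conflict option.

```python
class VersionConflict(Exception):
    """Raised when a write carries a stale version number."""


class VersionedStore:
    """Toy in-memory store that mimics version-checked writes."""

    def __init__(self):
        self._docs = {}  # doc_id -> (version, source)

    def get(self, doc_id):
        return self._docs[doc_id]

    def index(self, doc_id, source, expected_version=None):
        current_version = self._docs.get(doc_id, (0, None))[0]
        # Reject the write if the caller's view of the document is out of date.
        if expected_version is not None and expected_version != current_version:
            raise VersionConflict(f"expected {expected_version}, found {current_version}")
        new_version = current_version + 1
        self._docs[doc_id] = (new_version, source)
        return new_version


store = VersionedStore()
store.index("doc-1", {"title": "Original"})  # stored as version 1

# Two clients both read version 1, then both try to write.
version_seen, _ = store.get("doc-1")
store.index("doc-1", {"title": "Edit from client A"}, expected_version=version_seen)
try:
    store.index("doc-1", {"title": "Edit from client B"}, expected_version=version_seen)
    conflict = False
except VersionConflict:
    conflict = True  # client B has to re-read the document and retry

print(conflict)
```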

Let's dive into the nuances of how Elasticsearch breaks up your text into search terms, and how you can control it.

Using Analyzers and Tokenizers
12:35

What do you do with relational data? You don't have to completely de-normalize it with Elasticsearch; let's create an example of grouping movies together into franchises with parent/child data modeling.

Data Modeling with Elasticsearch
13:36
Searching with Elasticsearch
11 Lectures 01:15:06

Query-string search requests allow quick experimentation without constructing full JSON requests. Let's see how it works, and its limitations.

Using Query-String Search
09:13

We'll practice some more with the preferred interface for Elasticsearch, using JSON search request bodies.

Using JSON Search
09:54

Full-text search queries sometimes produce unexpected results. We'll illustrate this by searching for Star Wars movies, and see how phrase search can help produce what we're really after.

Full-Text vs. Phrase Search
06:04

Practice using URI and full-body search queries to find Star Wars films released after 1980. Do a phrase search for bonus points!

[Exercise] Search for New Star Wars Films Two Different Ways
03:49

We'll practice generating subsets of search results, which can be used for pagination to the end user - and cover the limitations of pagination.

Pagination
06:14
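
The arithmetic behind paging can be sketched in a few lines of Python. The helper name is hypothetical; "from" and "size" are the search parameters involved, and the 10,000 limit used here assumes the default result window of an index, which can be configured differently.

```python
def page_params(page, page_size=10, max_result_window=10000):
    """Translate a 1-based page number into 'from' and 'size' search parameters."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    offset = (page - 1) * page_size
    # Deep pages are expensive: every shard must collect and sort offset + size hits.
    if offset + page_size > max_result_window:
        raise ValueError("deep pagination: from + size exceeds the result window")
    return {"from": offset, "size": page_size}


print(page_params(1))      # first page
print(page_params(3, 20))  # third page of 20 results
```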

We'll practice sorting our results, in this case by release date, and discuss the nuances involved.

Sorting
07:24

Filters are a very efficient way to refine your search results, but can become complex. Let's practice with a search for sci-fi films that don't include the term "Trek" released between 2010 and 2015.

Using Filters
04:24
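
A query along these lines might be built as follows. This is a sketch, not the lecture's exact request: the field names "genre", "title", and "year" are assumptions about the index, and the year range is treated as inclusive on both ends.

```python
import json


def build_scifi_query(excluded_term="trek", start_year=2010, end_year=2015):
    """Build a bool query: sci-fi films, excluding a title term, within a year range."""
    return {
        "query": {
            "bool": {
                # Must match the genre.
                "must": {"match": {"genre": "Sci-Fi"}},
                # Must not mention the excluded term in the title.
                "must_not": {"match": {"title": excluded_term}},
                # Filters narrow the result set without affecting relevance scores.
                "filter": {"range": {"year": {"gte": start_year, "lte": end_year}}},
            }
        }
    }


query_json = json.dumps(build_scifi_query(), indent=2)
print(query_json)
```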

Try it yourself! I'll get you started, and then show you how I did it.

[Exercise] Search for Science Fiction Movies Before 1960, Sorted by Title
03:09

Make your full-text search resilient to typos using fuzzy queries. We'll see how to specify just how "fuzzy" you want your queries to be, and get some results back in spite of some misspellings.

Fuzzy Queries
06:16
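
Fuzziness is expressed as an edit distance: the number of single-character changes needed to turn one term into another. The sketch below computes plain Levenshtein distance in Python to show the idea. It is a simplified stand-in rather than Elasticsearch's implementation, and unlike some variants it does not count a swap of two adjacent characters as a single edit.

```python
def levenshtein(a, b):
    """Return the minimum number of single-character inserts, deletes, or substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep a matching character
            ))
        previous = current
    return previous[-1]


def fuzzy_match(term, candidate, fuzziness=1):
    """Return True if candidate is within the allowed edit distance of term."""
    return levenshtein(term, candidate) <= fuzziness


print(levenshtein("intersteller", "interstellar"))
print(fuzzy_match("intrsteller", "interstellar", fuzziness=2))
```

A larger fuzziness value tolerates more typos, at the cost of matching more unrelated terms.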

The inverted index can be leveraged to power partial matching, where you can search for some prefix of a search term. We'll illustrate this by searching for movies released in any year that begins with 201.

Partial Matching
05:26

We'll see what N-Grams are, how they can be used for prefix searching, and how to implement a search-as-you-type system using N-Grams.

Preview 13:13
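
To make the N-gram idea concrete, here is a small Python sketch that generates N-grams and edge N-grams (prefixes) for a term. The function names and default lengths are chosen for this example; in Elasticsearch the equivalent work is done by an analyzer at index time.

```python
def ngrams(term, n):
    """Return every contiguous substring of length n."""
    return [term[i:i + n] for i in range(len(term) - n + 1)]


def edge_ngrams(term, min_gram=1, max_gram=4):
    """Return the prefixes of term whose lengths fall between min_gram and max_gram."""
    longest = min(max_gram, len(term))
    return [term[:n] for n in range(min_gram, longest + 1)]


print(ngrams("star", 2))
print(edge_ngrams("star"))
```

Indexing the edge N-grams of each title term means that a partially typed word such as "sta" can match as an ordinary term lookup, which is the basis of search-as-you-type.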
Importing Data Into Your Index - Big or Small
10 Lectures 01:29:22

We'll write Python scripts to import data using Elasticsearch's REST interface directly, and using a higher-level API.

Importing Data from Scripts
15:09

If you're comfortable with programming, try building upon the previous lecture to create your own script to import movie tags into a new "tags" index. I'll show you my solution to compare yours with.

[Exercise] Import Movie Tags Into a New Index with a Python Script.
03:32

Logstash is an extremely handy tool for importing existing and streaming data from server logs into an Elasticsearch index. Let's see what it's about and how it works.

Preview 04:31

Let's install Logstash on our Ubuntu system and configure it.

Installing Logstash
07:49

As an example, we'll use Logstash to parse and insert log entries from a real Apache access log.

Preview 04:51

Logstash supports a very wide variety of sources and destinations. Let's set up a MySQL instance with MovieLens data, and use Logstash to copy this data into an Elasticsearch index.

Importing Data from MySQL using Logstash
14:34

As another example, we'll import data stored on Amazon's S3 service (a cloud-based distributed file system) into our Elasticsearch index using Logstash.

Importing Data from AWS S3 using Logstash
08:14

Kafka serves a similar role to Logstash, in that it collects and publishes data. We'll see how to connect it to your Elasticsearch cluster, which may come in handy if you have an existing Kafka setup publishing streaming data that you want to index.

Integrating Kafka with Elasticsearch
09:00

Apache Spark can also read and write to Elasticsearch. We'll see how you can use Spark to crunch big data in complex ways, and output the results into an Elasticsearch index.

Integrating Spark and Hadoop with Elasticsearch
12:21

See if you can build upon the code in the previous lecture to write a Spark driver script that imports movie ratings into a new "spark" index with a "ratings" type.

[Exercise] Import Movie Ratings from Spark to Elasticsearch
09:21
Aggregation
5 Lectures 44:52

Learn how simple aggregations work, why they're important, and practice by finding how many 5-star ratings are in our data, and what the average rating for Star Wars is.

Buckets and Metrics
12:04

Practice generating histogram data from aggregations - break down the distribution of movie ratings, and of movie release years.

Histograms
07:36
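
The bucketing behind a histogram can be sketched in plain Python: each value is rounded down to the nearest multiple of the interval, and values sharing that key are counted together. This is an illustration of the concept with a hypothetical helper and a short sample list, not a query against real course data.

```python
import math
from collections import Counter


def histogram(values, interval):
    """Count values per fixed-width bucket, keyed by each bucket's lower bound."""
    if interval <= 0:
        raise ValueError("interval must be positive")
    counts = Counter(math.floor(value / interval) * interval for value in values)
    return dict(sorted(counts.items()))


# Sample release years, bucketed by decade.
release_years = [1977, 1980, 1983, 1999, 2002, 2005, 2015]
by_decade = histogram(release_years, 10)
print(by_decade)
```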

Elasticsearch has special capabilities for aggregating time-based data. Let's break down web server hits by hour, and Googlebot hits by hour.

Aggregating Time Series Data
07:16

Use aggregations to search for a spike in error response codes in my web server's access log data, and narrow it down to a specific hour.

[Exercise] When Did my Site Go Down?
04:05

The ability to nest different aggregations inside each other can lead to very complex and powerful queries. Let's use a nested aggregation to find the average rating of each Star Wars movie.

Preview 13:51
Using Kibana
3 Lectures 18:03

Let's get Kibana installed and configured.

Installing Kibana
06:05

Although Kibana is often used to visualize log data, it can do so much more. Let's use the power of Kibana to gain some new insights into the works of William Shakespeare.

Preview 08:44

Practice using Kibana, by visualizing the plays with the most lines in them.

[Exercise] Find the Shakespeare Plays with the Most Lines
03:14
Analyzing Log Data with the Elastic Stack
4 Lectures 23:08

The Elastic Stack is more than just Elasticsearch, Logstash, and Kibana these days. Let's look at the Beats framework, and how it fits in.

The ELK Stack and Elastic Stack
04:12

We'll install Filebeat and configure it to directly import data into an index from an access log.

Install, Configure, and Use Filebeat
07:32

We'll see how Kibana's built-in dashboards can quickly produce all the visualizations you need for your server log data.

Preview 07:05

Use Kibana to narrow down the origin of a spike of 404 error codes in my web log data.

[Exercise] Narrow Down the Source of 404 Errors
04:19
Elasticsearch Operations
8 Lectures 01:01:44

The number of primary shards in your index cannot change - how do you select the right number of shards with future growth in mind?

How Many Shards Should I Use?
05:48
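
One way to see why the primary shard count is fixed: a document's shard is derived from a hash of its routing value (by default its ID) taken modulo the number of primary shards. The Python sketch below illustrates this with zlib.crc32 as a stand-in hash, which is an assumption made for the example; Elasticsearch uses its own hash function.

```python
import zlib


def pick_shard(routing_value, number_of_primary_shards):
    """Map a routing value to a primary shard number."""
    # crc32 is only a stand-in hash chosen for this sketch.
    return zlib.crc32(routing_value.encode("utf-8")) % number_of_primary_shards


doc_ids = [str(n) for n in range(1000)]
with_five = {doc_id: pick_shard(doc_id, 5) for doc_id in doc_ids}
with_six = {doc_id: pick_shard(doc_id, 6) for doc_id in doc_ids}

# Count the documents that would be looked up on a different shard after the change.
moved = sum(1 for doc_id in doc_ids if with_five[doc_id] != with_six[doc_id])
print(f"{moved} of {len(doc_ids)} documents would map to a different shard")
```

Because changing the divisor sends most documents to a different shard, existing documents could no longer be found where they were stored, which is why growth has to be planned for up front or handled by reindexing or adding indices.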

We'll add new indices as a scaling strategy, and see how it works.

Preview 07:41

How do you choose the optimal hardware configuration for your Elasticsearch cluster? Let's talk about the different considerations.

Choosing Your Hardware
03:04

An important configuration setting in production is the size of the memory heap dedicated to Elasticsearch. How do you balance this against memory needed by the OS and file system cache?

Heap Sizing
03:16

X-Pack is a paid add-on to the Elastic Stack that provides monitoring and alerting capabilities for your Elasticsearch cluster. Let's use a free trial to play around with it and see what it can do.

Monitoring with X-Pack
11:55

Let's simulate hardware failures and see how a properly configured cluster can be resilient to the failure of any given node.

Practicing Failover
12:24

We'll back up our indices to a snapshot saved to disk, and practice restoring our index from it.

Snapshots
09:30

Often you'll need to restart all the machines in your cluster in order to update software or the OS. Learn how to do this safely and without disruption to the clients of your cluster.

Rolling Restarts
08:06
Elasticsearch in the Cloud
3 Lectures 34:04

We'll set up an Amazon ES cluster, configure it, and see if it works - and talk about managing security with it.

Preview 10:19

The security considerations of cloud-based services make Logstash integration a little more complicated. Let's see how to connect Logstash to your AWS-based Elasticsearch cluster securely.

Preview 14:09

Elastic.co offers its own cloud-based service built on top of AWS, called Elastic Cloud. Let's see what it offers.

Using Elastic Cloud
09:36
You Made It!
2 Lectures 04:51

There's more to explore with Elasticsearch - let's cover additional resources, and where you can go from here.

I Made It! Now What?
03:45

Visit my website for discounts on my other big data and data science / machine learning courses - and let's stay in touch on social media!

Bonus Lecture: Special Offers On My Other Courses.
01:06
About the Instructor
Sundog Education by Frank Kane
4.5 Average rating
15,332 Reviews
73,626 Students
9 Courses
Training the World in Big Data and Machine Learning

Sundog Education's mission is to make highly valuable career skills in big data, data science, and machine learning accessible to everyone in the world. Our consortium of expert instructors shares our knowledge in these emerging fields with you, at prices anyone can afford. 

Sundog Education is led by Frank Kane and owned by Frank's company, Sundog Software LLC. Frank spent 9 years at Amazon and IMDb, developing and managing the technology that automatically delivers product and movie recommendations to hundreds of millions of customers, all the time. Frank holds 17 issued patents in the fields of distributed computing, data mining, and machine learning. In 2012, Frank left to start his own successful company, Sundog Software, which focuses on virtual reality environment technology, and teaching others about big data analysis.

Frank Kane
4.5 Average rating
14,940 Reviews
69,922 Students
7 Courses
Founder, Sundog Education

Frank spent 9 years at Amazon and IMDb, developing and managing the technology that automatically delivers product and movie recommendations to hundreds of millions of customers, all the time. Frank holds 17 issued patents in the fields of distributed computing, data mining, and machine learning. In 2012, Frank left to start his own successful company, Sundog Software, which focuses on virtual reality environment technology, and teaching others about big data analysis.