
This video provides an overview of the entire course.
In this video, we will get started with a broad recap of ElasticSearch.
In this video, we will familiarize ourselves with the components of an ElasticSearch cluster.
In this video, we will understand how Apache Lucene fits into the ElasticSearch technology.
In this video, we will gain an understanding of the most common type of index.
Learn how to optimize cluster performance with indices.
In this video, we will look at how to optimize search performance by making indices smaller.
Learn how to leverage scoring to improve search results.
Understand how boosting affects query results.
The goal of this video is to gain an understanding of the often-confusing concept of query rescoring.
Setting up a production-ready cluster is the most often overlooked challenge in learning ElasticSearch. Configurations aside, understanding where to start relative to your needs can be an enormous task.
Proper settings in ElasticSearch can help to avoid split-brain scenarios and ping timeouts.
Many newcomers do not realize that certain configurations must be in place to ensure that a cluster runs successfully in production.
While many engineers understand the basics of authentication and authorization, it's necessary to go beyond the basics to properly implement security in an ElasticSearch cluster.
Of all the security constructs (realms) in ElasticSearch, arguably the most popular is the native realm. Understanding the native realm is critical to one's ability to effectively implement security in ElasticSearch.
While authentication deals with login credentials, authorization handles what an authenticated user can and can't do. Understanding authorization is critical for cluster security and management.
By default, all information passed between nodes, including passwords, is sent in plain text. For this reason, node-to-node encryption is required in production.
One viable security option involves controlling the IP addresses that can access a production cluster, as well as setting up nodes to communicate via private IP addresses.
The only way to ensure a healthy cluster is to monitor it; knowing what to monitor and how to identify potential problems is essential.
While monitoring directly from the production cluster remains an option, a far more productive way to monitor in ElasticSearch is to set up a dedicated monitoring cluster. This ensures that even if your production cluster goes offline, stats will still be available.
Even the most optimized ElasticSearch cluster can go down or experience setbacks. As such, having an effective backup provides an immediate way to roll back and restore data.
Repositories and snapshots are the two necessary components when backing up in ElasticSearch. Both require deeper understanding.
Machine Learning has become the standard in data analysis today. ElasticSearch 5.5 introduced a turnkey Machine Learning integration that adds substantial power to data analysis.
To implement Machine Learning in ElasticSearch, it is important to understand exactly how the two interface with one another.
As experience is the best teacher, the most effective way to learn how to implement machine learning in ElasticSearch is to do it.
The best way to close out this course is with helpful tips for production and security.
Here we wrap things up.
ElasticSearch is a Lucene-based search engine for distributed search and analytics. This course will serve as a hands-on guide as you explore the features of ElasticSearch 5.0. You will go beyond the basics and master advanced concepts in ElasticSearch distributed searching, indexing, optimization, administration, and much more.
You will be able to harness knowledge of advanced topics in ElasticSearch optimization and administration, including the inner workings of ElasticSearch, advanced queries, sharding for optimal performance, low-level index configurations, and the best of ElasticSearch administration. Packed with easy-to-follow examples, this video course will ensure that you have a firm understanding of the advanced topics of ElasticSearch 5.0 and of how to effectively optimize your cluster.
About The Author
Ethan Anthony is a San Francisco-based Data Scientist who specializes in distributed, data-centric technologies. He is also the Founder of XResults, where the vision is to harness the power of big data to deliver intuitive customer-facing solutions, largely to non-technical professionals. Ethan is Harvard-educated in the areas of data science and software engineering. He began using ElasticSearch in 2012 and has since delivered solutions based on the Elastic Stack to a broad range of clientele. Ethan has also consulted globally with firms in a cross-section of industry verticals, from the US to the Far East.