Elasticsearch Architecture

Sundog Education by Frank Kane
Founder, Sundog Education. Machine Learning Pro

Lecture description

Let's talk about how Elasticsearch scales horizontally on a cluster, using primary and replica shards.

Learn more from the full course

Elasticsearch 6 and Elastic Stack - In Depth and Hands On!

Search, analyze, and visualize big data on a cluster with Elasticsearch, Logstash, Beats, Kibana, and more.

08:03:25 of on-demand video • Updated May 2019

  • Install and configure Elasticsearch 6 on a cluster
  • Create search indices and mappings
  • Search full-text and structured data in several different ways
  • Import data into Elasticsearch using several different techniques
  • Integrate Elasticsearch with other systems, such as Spark, Kafka, relational databases, S3, and more
  • Aggregate structured data using buckets and metrics
  • Use Logstash and the "ELK stack" to import streaming log data into Elasticsearch
  • Use Filebeat and the Elastic Stack to import streaming data at scale
  • Analyze and visualize data in Elasticsearch using Kibana
  • Manage operations on production Elasticsearch clusters
  • Use cloud-based solutions including Amazon's Elasticsearch Service and Elastic Cloud
Let's talk about Elasticsearch's architecture and how it actually scales itself out to run on an entire cluster of computers, so it can scale up as needed. The main trick is that an index in Elasticsearch is split into what we call shards, and every shard is basically a self-contained instance of Lucene in and of itself. The idea is that if you have a cluster of computers, you can spread these shards out across multiple machines; as you need more capacity, you can just throw more machines into your cluster and add more shards to the index, so it can spread that load out more efficiently.

The way it works is that once you talk to a given server on your Elasticsearch cluster, it figures out which document you're actually interested in and hashes it to a particular shard ID. There's a mathematical function that can very quickly figure out which shard owns a given document and redirect you to the appropriate shard on your cluster. So that's the basic idea: we distribute our index among many different shards, and different shards can live on different computers within your cluster.

Now let's talk about the concept of primary and replica shards. This is how Elasticsearch maintains resiliency to failure. One big problem with a cluster of computers is that those computers can fail sometimes, and you need to deal with that. So let's look at this example: we have an index with two primary shards and two replicas. In this example we have three nodes, where a node is basically one installation of Elasticsearch. Usually you'll see one node installed per physical server in your cluster; you can run more than one, but that would be a little unusual. The design is such that if any given node in your cluster goes down, you won't even see it as an end user; the cluster can handle that failure.

Let's take a closer look at what's going on here. In this example I have two primary shards: those are the primary copies of my index data, and that's where write requests are routed initially. That data is then replicated to the replica shards, which can also handle read requests whenever we want. Elasticsearch figures this layout out for you automatically; that's part of what it gives you. If I say I want an index with two primaries and two replicas, it's going to set things up like this, given three nodes.

So let's look at an example. Say node 1 were to fail for some reason; maybe a disk failure, or the power supply burned out, who knows. In that case we lose primary shard 1 and replica shard 0, but it's not a big deal, because we have a replica of shard 1 sitting on node 2 and another replica sitting on node 3. What would happen if node 1 suddenly went away is that Elasticsearch would figure that out and elect one of the replicas on node 2 or node 3 to be the new primary. Since we have those replicas sitting there, we can keep accepting new data and keep servicing read requests; we're now down to one primary and one replica for that shard, but that should get us by until we can restore the capacity we lost with node 1.
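If you want to see a layout like this for yourself, here's a minimal sketch using the _cat API, assuming a cluster reachable at 127.0.0.1:9200 and a hypothetical index named testindex:

    # List every shard of the hypothetical index "testindex".
    # Each row shows the shard number, whether it is a primary (p) or a
    # replica (r), its state, and the node it is allocated to.
    curl -XGET '127.0.0.1:9200/_cat/shards/testindex?v'

If you stop a node and run this again, you can watch Elasticsearch promote a replica to primary and re-allocate shards across the surviving nodes.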
Similarly, let's say node 3 goes away. In that example we lose primary shard 0, but it's OK, because we had a replica of it sitting on node 1 and on node 2, and Elasticsearch can just promote one of those replicas to be the new primary and get by until we restore the capacity we lost. So you can see that, using a scheme like this, we can have a very fault-tolerant system. In fact, we could lose multiple nodes: node 2 is only serving replica shards at this point, so we could even tolerate node 1 and node 2 going away at the same time, in which case we'd be left with a primary on node 3 for both of the shards we care about. It's pretty clever how that works. One thing to note is that it's a good idea to have an odd number of nodes for the sort of resiliency we're talking about. The idea is that your application would round-robin its requests among all the different nodes in your cluster, which spreads out the load of that initial traffic.

Let's talk a little more about what exactly happens when you write new data to, or read data from, your cluster. Say you're indexing a new document into Elasticsearch: that's a write request. Whatever node you talk to will say, "OK, here's where the primary shard for the document you're trying to index lives," and redirect you there. You write the data to the primary shard, on whatever node it lives on, and then it automatically gets replicated to any replicas of that shard. Reads are a little quicker, because they can be routed to the primary shard or to any replica of that shard, which spreads out the load of reads even more efficiently. So the more replicas you have, the more you increase the read capacity of the entire cluster; it's only write capacity that's bottlenecked by the number of primary shards you have.

Now, the kind of sucky thing is that you cannot change the number of primary shards in an index later on: you need to define it up front, when you create your index, because the function that routes documents to shards depends on that number. The syntax for that, through a REST request, is a PUT verb with the index name, followed by a settings structure in JSON that defines the number of primary shards and the number of replicas.

This isn't as bad as it sounds, because a lot of Elasticsearch applications are very read-heavy. If you're powering the search index of a big website like Wikipedia, you're going to get a lot more read requests from the world than index requests for new documents. Often you can just add more replicas to your cluster later on to add more read capacity; it's adding more write capacity that gets a little hairy. Even that isn't the end of the world: if you do need more write capacity, you can always re-index your data into a new index and copy it over. But you want to plan ahead and make sure you have enough primary shards up front to handle any growth you might reasonably expect in the near future. We'll talk about how to plan for that toward the end of the course. By the way, just as a refresher, let's also look at what actually goes on in this particular PUT request for defining the number of shards.
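As a concrete sketch of that request, again assuming a local node at 127.0.0.1:9200 and a hypothetical index named testindex, it would look something like this:

    # Create "testindex" with three primary shards and one replica of
    # each. number_of_shards is fixed at index creation time, while
    # number_of_replicas can still be changed later.
    curl -XPUT '127.0.0.1:9200/testindex' -H 'Content-Type: application/json' -d '
    {
      "settings": {
        "number_of_shards": 3,
        "number_of_replicas": 1
      }
    }'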
So in this example we're saying we want three primary shards and one replica. How many shards do we actually end up with? The answer is six: we're asking for three primary shards and one replica of each of those primaries, so the three replicas plus the three original primaries give us six. In general, the total is the number of primaries times one plus the number of replicas. If we had two replicas, we would end up with nine total shards: three primaries, plus a total of six replicas, giving us two replica shards for each primary shard. That math can be a little confusing sometimes, but that's the idea.

Anyway, that's the general idea of how Elasticsearch scales and how its architecture works. The important concepts here are primary and replica shards, and how Elasticsearch automatically distributes those shards across nodes that live on different servers in your cluster, to provide resiliency against the failure of any given node. Pretty cool stuff.
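As a quick sanity check on that arithmetic, here's the same hypothetical request with two replicas instead of one, under the same assumptions as above:

    # Same hypothetical index, but with two replicas of each primary:
    # 3 primaries + (3 primaries x 2 replicas) = 9 shards in total.
    curl -XPUT '127.0.0.1:9200/testindex' -H 'Content-Type: application/json' -d '
    {
      "settings": {
        "number_of_shards": 3,
        "number_of_replicas": 2
      }
    }'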