Installing Elasticsearch [Step by Step]

A free video tutorial from Sundog Education by Frank Kane
Founder, Sundog Education. Machine Learning Pro
4.5 instructor rating • 22 courses • 454,266 students

Lecture description

We'll talk about why Elasticsearch is important and what you can expect from this course. Then, we'll install a virtual Ubuntu machine right on your own desktop PC, install Elasticsearch on it, and search the complete works of William Shakespeare!

Learn more from the full course

Elasticsearch 6 and Elastic Stack - In Depth and Hands On!

Search, analyze, and visualize big data on a cluster with Elasticsearch, Logstash, Beats, Kibana, and more.

08:03:25 of on-demand video • Updated May 2019

  • Install and configure Elasticsearch 6 on a cluster
  • Create search indices and mappings
  • Search full-text and structured data in several different ways
  • Import data into Elasticsearch using several different techniques
  • Integrate Elasticsearch with other systems, such as Spark, Kafka, relational databases, S3, and more
  • Aggregate structured data using buckets and metrics
  • Use Logstash and the "ELK stack" to import streaming log data into Elasticsearch
  • Use Filebeat and the Elastic Stack to import streaming data at scale
  • Analyze and visualize data in Elasticsearch using Kibana
  • Manage operations on production Elasticsearch clusters
  • Use cloud-based solutions including Amazon's Elasticsearch Service and Elastic Cloud
Hi, I'm Frank Kane, from Sundog Education. I've used my decade of experience at Amazon.com and IMDB.com to teach a hundred thousand people around the world about big data and machine learning. Elasticsearch is a hot technology you need to know about in the field of big data. It's not just used for powering full-text searches on big websites anymore. Increasingly, it's being used as a real-time alternative to more complex systems like Hadoop and Spark. Elasticsearch can aggregate and graph structured data quickly, and at massive scale.

In this course you'll gain hands-on experience with Elasticsearch, all the way from installation to advanced usage. We'll create search indices and mappings, import data into Elasticsearch in several different ways, aggregate structured data, and use hosted Elasticsearch clusters from Amazon and Elastic.co. You'll also get your hands dirty with the entire Elastic Stack, including Elasticsearch, Logstash, X-Pack, Kibana, and the Beats framework. Together, these technologies form a complete system for collecting, aggregating, monitoring, and visualizing your big data. I designed this course for any technologist who wants to add Elasticsearch and the Elastic Stack to their tool chest for analyzing big data, and we all know these are highly valuable skills to have in today's job market. Let's dive right in!

In the real world, you'll probably be using Elasticsearch on a cluster of Linux machines. So we'll be using Linux in this course, Ubuntu in particular. Now if you don't have an Ubuntu system handy, that's totally okay. I'm going to walk you through setting up a virtual machine on your Windows or Mac PC that lets you run Ubuntu inside your existing operating system. It's actually really easy to do. Once we've got an Ubuntu machine up and running, we'll install Elasticsearch, and just for fun, we'll create a search index of the complete works of William Shakespeare, and mess around with it.
After that, we'll take a step back and talk about Elasticsearch and its architecture at a high level, so you have all the basics you need for later sections of this course. Roll up your sleeves and let's get to work.

Alright, let's do this! Let's go ahead and install Elasticsearch right on your own PC. Now Elasticsearch is gonna be running on an Ubuntu Linux system for this course, and if you don't already have an Ubuntu system sitting around, that's okay. What we're gonna do is show you how to install VirtualBox on your Mac or Windows PC, and that will allow you to run Ubuntu right on your own desktop within a little virtual environment. Once we have Ubuntu installed inside VirtualBox, we'll install Elasticsearch on it, and after that, we'll load the complete works of William Shakespeare into Elasticsearch, and see if we can successfully search that. So that's a lot to do in this one lecture. Let's dive right into it.

Let's talk about system requirements really briefly. Pretty much any PC should be able to handle this. You don't need a ton of resources for Elasticsearch. If you do run into trouble, however, make sure that you have virtualization enabled in your BIOS settings on your PC, and specifically make sure that Hyper-V virtualization is off, if that is an option in your BIOS. You can just go through these steps and see if you run into trouble. These are basically troubleshooting steps. Also, beware that the anti-virus program called Avast is known to conflict with VirtualBox, so if you're using Avast, you'll need to switch to a different one, or turn it off while you work through this course.

Now if you head over to sundog-education.com/elasticsearch, there you'll find step-by-step instructions for what we're about to do, as well as troubleshooting tips if you run into trouble, and you'll also find a link to the course slides there as well. So be sure to head over there for reference materials and any troubleshooting steps you may need.
With that, let's dive in and just get this done. Let's get you set up, so let's go ahead and download VirtualBox, which is what we're gonna use to run your Ubuntu image on your desktop PC. Just head over to virtualbox.org, and there should be a big, friendly download button. Go ahead and select the operating system you're on. For me, that's Windows. 118 megabytes later, that should come down, and it's just your standard Windows installer. Go ahead and click that, and you can accept the defaults, nothing really special here. Accept any security warnings, and be aware that it will interrupt your network interfaces while it's installing. Let's go ahead and start it up now that it's done installing, and if you do run into any trouble with installing VirtualBox, head on over to sundog-education.com/elasticsearch, and there'll be some troubleshooting tips for you there.

So here we have it. Next thing we need to do is download an Ubuntu image so that we can actually install that in our virtual machine. So head on over to Ubuntu.com, just like that, U-B-U-N-T-U.com. Head over to Downloads and Server. We want to get the latest ISO image for Ubuntu Server. So just hit the download button there and down it comes. Pretty big download, it's gonna be about 800, 900 megabytes or so.

Now that our Ubuntu disk image is downloaded, I'm gonna switch back to VirtualBox here, the Oracle VM VirtualBox Manager, and click the New button, and give this thing a name. Let's call it, I don't know, ubuntu-elasticsearch, whatever you want. It's going to be a Linux system, and it's going to be an Ubuntu 64-bit system. Hit Next, and set the memory size to somewhere in the middle. I have a 16 gigabyte machine, so I'm gonna go ahead and allocate half of that memory to this virtual machine, for Ubuntu. If you have less than that, just, you know, pick around halfway, but I wouldn't go below two gigabytes if I were you. So I'm sticking with eight gigabytes here.
Go ahead and create a virtual hard disk, and accept the default for the format. Dynamically allocated is fine. Let's give this 20 gigabytes of space. We do need a little bit extra there to work with. If you want to make sure that it's being stored on a disk where you have space for it, you can click on that icon there and pick a drive that has sufficient free disk space. Hit Create, and now hit Start, and navigate to where you downloaded that ISO file for Ubuntu, so for me that's in my Downloads folder, and hit Start. That should kick off the installer for the Ubuntu operating system itself.

So now that I'm in this installer, I can use the Enter key to accept the defaults or the arrow keys to change to a different language if I want to. I'll hit Enter to install Ubuntu Server, and that will kick off the installer for Ubuntu itself. I'm gonna go ahead and accept the defaults here. English, United States, that is in fact where I am. Go ahead and change that if you need to. I'm not gonna detect the keyboard layout either; I'll just stick with the default English layout. Next we need to give our host a name. Ubuntu's fine, doesn't really matter. We need to type in the full name for your user, so for me that's Frank Kane. For you, it's probably something else. Hit Tab to get to the Continue button, then hit Enter. And you need a username. I'm gonna use fkane for myself, but again, use whatever you want for your account. Hit Tab when you're done, and hit Enter to continue, and enter a password that you'll remember. Again, Tab to the Continue button and reenter it to make sure you didn't fat-finger it. We don't need to encrypt things. It does have my time zone correct, so I'll accept that. Go ahead and accept the guided partitioning. Accept all the defaults here. We will Tab to say Yes to write those changes to disk. Remember we're in a sandbox here, so we're not really messing with our primary disk system for Windows here.
Hit Tab, then Continue, Tab, then Yes. I'm not behind a proxy, so I'm gonna hit Tab, and say Continue. I'll go ahead and set no automatic updates. It's not that important in this case. We're gonna install our own software, so I'm gonna hit Tab and select Continue here, just stick with the standard system software. Go ahead and let it install the GRUB boot loader. Hit Enter here. Don't worry, it's not really messing with your real master boot record on your disk. It has its own little sandbox environment.

Ubuntu has finished installing! Let's hit Continue and go ahead and let it start up and boot up for the first time. Here we go! Just let it do its thing here. We have a login prompt, so let's go ahead and type in the account and password that we set up during installation, and we're in! We actually have an Ubuntu system up and running within our desktop. How cool is that! I think that's kind of awesome. Now if you run into any trouble, you can go ahead and refer to our website there, sundog-education.com/elasticsearch, for troubleshooting tips. All the latest tips and tricks will be there if you have any difficulties, but hopefully you got to this point without a problem.

So the next thing we need to do is actually open up some network ports so we can communicate with our server from our desktop environment. To do that, go back to the VirtualBox Manager here. Select our image here, ubuntu-elasticsearch, and hit Settings, then select Network, and then open up Advanced, and then Port Forwarding. Hit the Add button here. We're gonna create a rule for Elasticsearch itself on 127.0.0.1, with port 9200 for both the Host and the Guest ports, just like that. Hit the Add button again, and we'll also add a rule for Kibana, which we'll talk about later. It's the web UI for Elasticsearch, also on 127.0.0.1. The port for this is 5601. Just like that. And finally, we'll open up a port for SSH, 'cause we need to connect to this thing somehow.
That's also gonna be 127.0.0.1, and this time, port 22. So everything should look like this at this point. Double check, and if it looks good, hit OK. OK again, and we're done with that for now.

So Elasticsearch 6 requires Java version 8, so the first thing we need to do is install the Java environment. Just type in the following: sudo apt-get install openjdk-8-jre-headless -y. You'll have to reauthenticate, and give that a minute or two to come down and install. Once that's done, we'll also type in sudo apt-get install openjdk-8-jdk-headless -y and let that come down as well. Alright, the Java development kit has been installed.

Now we can install Elasticsearch itself, so first we need to actually update our repositories so that Ubuntu knows where to find it. Now follow along carefully. There's a lot to type here and all it takes is one little typo to mess things up. Start by saying wget -qO - (that's a dash, a lowercase q, and a capital letter O, not the number zero, then a space, a dash, and another space; these spaces are important), followed by https://artifacts.elastic.co/GPG-KEY-elasticsearch (that's just co, not com, with GPG-KEY in uppercase and elasticsearch in lowercase), then space, pipe, space, sudo apt-key add, space, dash. Okay, you should see an OK response. Next up, sudo apt-get install apt-transport-https. This is sort of a safety thing to do. It might not actually do anything. Okay, that's good, and now we will say echo "deb https://artifacts.elastic.co/packages/6.x/apt stable main" (that's 6.x 'cause we're installing Elasticsearch 6 and the Elastic Stack 6), and we're not done yet: space, pipe, another space, then sudo tee (T-E-E) -a /etc/apt/sources.list.d/elastic-6.x.list. Alright, so far so good.

Now we can actually install Elasticsearch itself. To do that we can say sudo apt-get update (to update our repositories), double ampersand, sudo apt-get install elasticsearch, and this is where the magic happens. Off it goes!
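Since those commands are easy to mistype when dictated aloud, here they are collected as a small shell sketch. It assumes an Ubuntu guest with sudo access. The function names are made up for illustration, and the -y on apt-transport-https is an addition the lecture doesn't type. Nothing runs until you call the functions yourself.

```shell
#!/usr/bin/env bash
# Sketch of the install steps from the lecture (Elasticsearch 6 on Ubuntu).
# Function names are hypothetical; the commands are the ones dictated above.

ES_MAJOR="6.x"

# Prints the apt source line for the Elastic 6.x repository.
elastic_repo_line() {
  echo "deb https://artifacts.elastic.co/packages/${ES_MAJOR}/apt stable main"
}

# Java 8 runtime and development kit.
install_java() {
  sudo apt-get install openjdk-8-jre-headless -y
  sudo apt-get install openjdk-8-jdk-headless -y
}

# Trust Elastic's signing key and register the repository.
# apt-key is what the lecture uses; newer Ubuntu releases deprecate it.
add_elastic_repo() {
  wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -
  sudo apt-get install -y apt-transport-https   # -y is an addition here
  elastic_repo_line | sudo tee -a "/etc/apt/sources.list.d/elastic-${ES_MAJOR}.list"
}

# Refresh the package lists, then install Elasticsearch.
install_elasticsearch() {
  sudo apt-get update && sudo apt-get install elasticsearch
}

# To run everything in order:
#   install_java && add_elastic_repo && install_elasticsearch
```

Keeping the repository line in its own function makes it easy to eyeball before it's appended to the sources list, since tee -a will happily append a mistyped line too.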
Now we just need to update the default configuration for Elasticsearch a little bit. So to do that, type in sudo vi /etc/elasticsearch/elasticsearch.yml. I want you to scroll down to where it says network.host. I want you to hit the I key. That will enter Insert Mode in this text editor, and now we can use the cursor to go over to the N in network and hit Backspace to get rid of that comment character, and scroll over to the end of 192.168.0.1. What we're gonna do is change that to 0.0.0.0, just like that, and that will open up Elasticsearch to other hosts, so we can actually access it from our Windows desktop, or your PC's desktop. With that done, hit Escape to get out of Insert Mode, and then type in :wq. That writes the file and quits the editor.

Alright, so now we have Elasticsearch set up. We just need to run it, and set it up so that it runs automatically whenever we boot up our virtual image here. So to do that, we can say sudo /bin/systemctl (for system control) daemon-reload (make sure you spell daemon right). The next step is to actually enable Elasticsearch as a service: sudo /bin/systemctl enable elasticsearch.service, and finally we can start it with sudo /bin/systemctl start elasticsearch.service. It'll take a few seconds for Elasticsearch to spin up, but yeah, it's basically up and running now.

Let's see if it's actually working yet. We can just test it by typing in curl 127.0.0.1:9200. 127.0.0.1 is the IP address of your local host, and 9200 is the port that Elasticsearch runs on, and we should see something like this. So hey, we actually got a response back from Elasticsearch, and at the end, you should see a little thing that says, "tagline" : "You Know, for Search". So this indicates that Elasticsearch is actually properly installed and up and running on your virtual image here, on your virtual Ubuntu machine. So congratulations!
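If you'd rather not edit the file in vi, the same configuration change and service setup can be sketched in shell. This is an alternative to what the lecture does, not the lecture's own method: the sed one-liner and the function names are assumptions, and it expects GNU sed as shipped with Ubuntu. The systemctl and curl commands are the ones from the lecture.

```shell
#!/usr/bin/env bash
# Sketch: configure, enable, and smoke-test Elasticsearch.
# Function names are hypothetical; nothing runs until you call them.

ES_CONFIG="/etc/elasticsearch/elasticsearch.yml"

# Uncomment network.host (if commented) and set it to 0.0.0.0.
# $1 = path to the config file. Prefix the call with SUDO=sudo for the
# real file, which is owned by root.
set_network_host() {
  ${SUDO:-} sed -i -E 's/^#?[[:space:]]*network\.host:.*/network.host: 0.0.0.0/' "$1"
}

# Register the service so it starts at boot, then start it now.
start_elasticsearch() {
  sudo /bin/systemctl daemon-reload
  sudo /bin/systemctl enable elasticsearch.service
  sudo /bin/systemctl start elasticsearch.service
}

# Should return a small JSON document that includes the tagline.
check_elasticsearch() {
  curl 127.0.0.1:9200
}

# Example:
#   SUDO=sudo set_network_host "$ES_CONFIG"
#   start_elasticsearch
#   check_elasticsearch
```

If check_elasticsearch reports a refused connection right after starting the service, give it a few more seconds and try again; as noted above, Elasticsearch takes a little while to spin up.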
You've actually set up an Ubuntu server on a virtual machine, and set up Elasticsearch properly on it, and it's actually sitting there, waiting for you to use it. So just to get a little bit of a payoff from all this effort, let's do something fun. Let's actually load the complete works of William Shakespeare, and index them in Elasticsearch.

So to do that, do the following: wget http://media.sundog-soft.com/es6/shakes-mapping.json (make sure you don't forget the dash in sundog-soft). This is just retrieving a little file for the course that tells Elasticsearch how to store the Shakespeare data: what the various field types are, and how to store them on the back end. Now to submit that to Elasticsearch, we can use the curl command. (curl is just a command that sends an HTTP request, by the way.) We have to say -H, with a capital H, to send the appropriate header. So curl -H 'Content-Type: application/json' (pay attention to capitalization). Then we can say -XPUT, to put this mapping file into 127.0.0.1:9200/shakespeare, indicating that we're creating a new index called shakespeare, in lowercase, and then --data-binary @shakes-mapping.json. So this is basically submitting the contents of that JSON file to Elasticsearch. We got back an acknowledgement, looks good. Hit Enter a couple times to get a clean prompt.

So now we can retrieve the Shakespeare data itself with the following command: wget http://media.sundog-soft.com/es6/shakespeare_6.0.json. This is just a Shakespeare data set that comes from the Elastic website itself as a demo. If you wanna take a quick peek at it, you can, with less shakespeare_6.0.json. I'm using the Tab key, by the way, to autocomplete that file name. You can see there's a bunch of Shakespearean data there, one entry for each line in every play, so lots of data. Kind of funny how it all came down so quickly. It's a good reminder of how cavalierly we throw around large amounts of data these days, huh?
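Written out as a shell sketch, the download and index-creation steps above look roughly like this. The URLs, file names, and curl flags come from the lecture; the function names and the ES_HOST and DATA_URL variables are made up for illustration, and nothing runs until you call the functions.

```shell
#!/usr/bin/env bash
# Sketch: fetch the course files and create the shakespeare index.
# Function and variable names are hypothetical.

ES_HOST="127.0.0.1:9200"
DATA_URL="http://media.sundog-soft.com/es6"

# Builds the URL for an index, e.g. 127.0.0.1:9200/shakespeare.
index_url() {
  echo "${ES_HOST}/$1"
}

# Downloads one of the course files into the current directory.
fetch_course_file() {
  wget "${DATA_URL}/$1"
}

# Creates an index from a mapping file.
# $1 = index name (Elasticsearch index names must be lowercase)
# $2 = mapping file
create_index() {
  curl -H 'Content-Type: application/json' -XPUT "$(index_url "$1")" \
       --data-binary "@$2"
}

# Example, in the order used in the lecture:
#   fetch_course_file shakes-mapping.json
#   create_index shakespeare shakes-mapping.json
#   fetch_course_file shakespeare_6.0.json
```

Note the capital -H and the double dash in --data-binary; those are the two spots where a typo is most likely when copying the command by ear.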
So next we'll actually submit the complete works of William Shakespeare into our Elasticsearch index here. To do that, we can just type in the following: curl -H 'Content-Type: application/json' -XPOST 'localhost:9200/shakespeare/doc/_bulk?pretty' --data-binary @shakespeare_6.0.json. The _bulk part is for the bulk API of Elasticsearch, and ?pretty makes sure we get nicely formatted results back. That will take a few minutes to insert. We are, after all, taking the complete works of William Shakespeare and indexing them into our search engine, into Elasticsearch. So let's just wait for that to finish, and we'll come back when it's done.

Alright, that took a little bit of time, but hey, it is the complete works of William Shakespeare, after all. Let's get a little bit of a payoff and actually query all that data now. To do that, we can just type in the following: curl -H 'Content-Type: application/json' -XGET (because now we're getting information back from Elasticsearch) '127.0.0.1:9200/shakespeare/_search?pretty' (shakespeare for the index name, _search 'cause we're doing a search query, and ?pretty to get nicely formatted results), and then -d followed by a single quote, meaning that we're gonna send the following data as part of the request. This will be in JSON format, so we'll start with a curly bracket. Then we will say "query": { "match_phrase": { "text_entry": "to be or not to be". So let's find out what play that famous line actually came from, huh? And we need to close off those curly brackets, one, two, three, and then add a single quote to close off this command. Let's see what happens.

Hey, it worked! So it turns out "To be, or not to be: that is the question" is a line from the play Hamlet, on apparently speech number 19, line number 3.1.64, if you care. But hey, how cool is that? We've actually installed Ubuntu. We've installed Elasticsearch in Ubuntu. We've indexed the complete works of William Shakespeare, and we've queried that index all in one little lesson here.
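The bulk load and the search can be sketched the same way. The endpoints and the match_phrase query are the ones from the lecture. The function names and the ES_HOST variable are assumptions for illustration, and the little query builder does no JSON escaping, so it only suits plain phrases without quotes or backslashes.

```shell
#!/usr/bin/env bash
# Sketch: bulk-load the Shakespeare data and run a phrase search.
# Function names are hypothetical; nothing runs until you call them.

ES_HOST="127.0.0.1:9200"

# Builds the JSON body for a match_phrase query on the text_entry field.
# No escaping is done, so pass plain text only.
match_phrase_query() {
  printf '{"query":{"match_phrase":{"text_entry":"%s"}}}' "$1"
}

# Sends a bulk-format file to the shakespeare index via the bulk API.
# $1 = data file, e.g. shakespeare_6.0.json
bulk_load() {
  curl -H 'Content-Type: application/json' \
       -XPOST "${ES_HOST}/shakespeare/doc/_bulk?pretty" \
       --data-binary "@$1"
}

# Searches the shakespeare index for an exact phrase.
search_phrase() {
  curl -H 'Content-Type: application/json' \
       -XGET "${ES_HOST}/shakespeare/_search?pretty" \
       -d "$(match_phrase_query "$1")"
}

# Example:
#   bulk_load shakespeare_6.0.json
#   search_phrase 'to be or not to be'
```

Building the JSON body in one place keeps the three closing curly brackets out of the command line, which is where they tend to get miscounted.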
So I hope you feel a sense of accomplishment after all of that. It all gets a lot easier from here. Now that we have everything set up and installed, all we have to do is play with it, and that's what we're gonna do for the next several hours in this course. So congratulations on getting this far, and let's move on.