Installing Scala and Spark on Linux (Ubuntu)

Jose Portilla
A free video tutorial from Jose Portilla
Head of Data Science, Pierian Data Inc.
4.6 instructor rating • 34 courses • 2,467,280 students

Lecture description

Full guide to installing Spark and Scala on a Linux Ubuntu Platform

Learn more from the full course

Scala and Spark for Big Data and Machine Learning

Learn the latest Big Data technology - Spark and Scala, including Spark 2.0 DataFrames!

10:06:42 of on-demand video • Updated September 2019

  • Use Scala for Programming
  • Use Spark 2.0 DataFrames to read and manipulate data
  • Use Spark to Process Large Datasets
  • Understand how to use Spark on AWS and DataBricks
Hello, everyone, and welcome to the Scala and Spark Ubuntu installation lecture. In this lecture we'll be walking you through how to install Scala and its dependencies, such as Java, and also how to install Spark, as well as the Atom text editor. Let's get started.
All right, here I am at my Ubuntu desktop. The first thing you need to do is open up your terminal. If you don't know how to do that, you can always search your computer for terminal and it should bring it up. In this case I already have it open, so I'll just go ahead and close that search.
Here at the terminal, we need to install Java first. You can check whether Java is already installed by typing java -version and hitting Enter. If you get something like this, it means you need to install Java and the JDK in order to make sure that Scala installs correctly.
The first thing you want to do is say sudo apt-get update. Keep in mind, when we're using sudo, you're going to need to enter your password, so if you don't have a password for your Ubuntu account, you're going to need to get that. Then just put in your password; since this is a sudo command, it will work, and it will make sure that apt-get is fully updated. That updates our package index, which makes sure that we actually grab the latest versions of everything we need.
Another quick note: you can check out a link to a written instruction file in the resources section for this lecture. So in case you just want to read instead of watching this video, you can just read through the commands.
Once you've gone ahead and run sudo apt-get update, the next thing to do is install the default JDK. So say sudo apt-get install default-jdk and hit Enter. It'll ask you if you want to continue; enter y for yes, and it will install the JDK for us. I'm going to jump ahead in time to the finished installation.
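The Java steps above can be sketched as a few shell functions. This is only a sketch, assuming a stock Ubuntu system with apt-get available; the function names (have_command, install_default_jdk, ensure_java) are made up for illustration and are not from the lecture. Nothing is installed until you call a function yourself.

```shell
# Succeeds if the named program can be found on the PATH.
have_command() {
    command -v "$1" >/dev/null 2>&1
}

# Refresh the package index, then install Ubuntu's default JDK.
# Both commands use sudo, so expect a password prompt and a y/n prompt.
install_default_jdk() {
    sudo apt-get update
    sudo apt-get install default-jdk
}

# Install the JDK only when java is missing, then print the Java version.
ensure_java() {
    have_command java || install_default_jdk
    java -version
}
```

Running ensure_java in the terminal mirrors what the lecture does by hand: check for Java first, install if needed, then confirm the version.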
All right, so Java has finished downloading and installing. Now we need to download and install Scala. In your terminal, say sudo apt-get install scala and hit Enter. You'll get a prompt that says, do you want to continue? Type y for yes, and it will download and install Scala for us. Let's jump in time to the finished installation.
All right, so we've finished installing Scala. To make sure that everything installed correctly, go ahead and type scala into your terminal, and you should see a Scala REPL, or read-evaluate-print loop. If you want to really make sure that everything's working, say println("hello"), with hello in double quotes, hit Enter, and you should see hello back. If you have everything running like this so far, you are good. Type :q to quit the Scala interpreter, and you can type clear to clear your terminal.
So far we've installed Java and we've installed Scala. Now it's time to install Spark. In order to install Spark, we need to have Git installed, so to make sure we have Git installed, just say sudo apt-get install git, hit Enter, say yes to continue, and let that install. I'm going to jump again in time to the finished installation.
In order to install Spark, we need to download the entire package, so open up a browser. In this case I'm just using the Firefox web browser. Go to spark.apache.org and hit Enter, and you should see this page come up. Once that's loaded, click on Download and make sure that you're using the latest version of Spark, 2.0 or later. If you want, you can use 2.0.1, 2.0.2, etc., but it has to be 2.0 or later to make sure everything we do in this course works.
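As a rough sketch of the steps above, the installs and the version rule can be written as shell functions. The function names are hypothetical, and the one-line Scala check uses the runner's -e option instead of the interactive REPL shown in the lecture, which assumes the Scala 2 runner that Ubuntu packages.

```shell
# Install Scala and Git from the Ubuntu repositories (sudo will prompt).
install_scala_and_git() {
    sudo apt-get install scala
    sudo apt-get install git
}

# Non-interactive stand-in for typing println("hello") at the Scala REPL.
scala_hello() {
    scala -e 'println("hello")'
}

# Succeeds when a Spark version string such as 2.0.1 has a major version
# of 2 or more, which is the requirement stated in the lecture.
is_spark_2_or_later() {
    major=${1%%.*}
    [ "$major" -ge 2 ] 2>/dev/null
}
```

For example, is_spark_2_or_later 2.0.2 succeeds, while is_spark_2_or_later 1.6.3 fails, so an older download would be rejected before you unpack it.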
You want the package pre-built for Hadoop 2.7 and later. You can do a direct download, or you can select an Apache mirror if the direct download is slow, but the direct download should be fine for us. Go ahead and click here to download this .tgz file, then click OK and let that download. I'm going to jump ahead in time to where this has finished downloading.
All right, my download has completed. I have this Spark 2.0.1, and it's pre-built for Hadoop 2.7. Note that it's a .tgz file, so we need to extract it. I have it under my Downloads folder. I will move it to Home just so it's easier to find later on, so under Home I can now see this .tgz file. In case it's not under Downloads for you, you may also want to check your tmp, or temporary, folder. Your browser can also help: over here there's a button to display the progress of ongoing downloads, which should help you find it. Chrome has a very similar thing, where along the bottom it shows you where downloaded files are actually saved.
Now click on the terminal and go home. You can do this by just saying cd, which changes directory to your home directory, and now you should see that the spark .tgz file is there. Say tar xvf and then begin typing spark; you can use Tab to autocomplete this, and it should extract everything that we need. So let that extract, and I will jump forward in time.
All right, now that it has finished extracting, I should be able to just say ls to list and see that extracted spark folder. Change directory to spark, again using Tab to autocomplete. List the folders here and you should see bin as a folder, so say cd bin. Then go ahead and list again, and you should see things such as spark-submit, spark-sql, spark-shell, sparkR, etc. We're going to show you how to work with those later on.
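The extraction step above can be sketched as a small shell function. The function name is made up for illustration, and the archive name in the usage note is an assumption based on the version mentioned in the lecture; use whatever file name your download actually has.

```shell
# Extract a Spark .tgz archive into a destination folder (default: home).
# Usage: extract_spark <archive> [destination]
extract_spark() {
    archive=$1
    dest=${2:-$HOME}
    # x = extract, v = list each file as it is unpacked, f = read this file
    tar xvf "$archive" -C "$dest"
}
```

With the archive in your home directory, something like extract_spark ~/spark-2.0.1-bin-hadoop2.7.tgz should leave a spark folder next to it, and listing its bin folder with ls should show the launcher scripts.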
Right now, we're going to keep things simple and just open up the Spark shell, which is basically a read-evaluate-print loop sort of terminal for Spark instead of just Scala. To do this, type ./spark-shell and hit Enter, and this should eventually bring up the Spark shell. As you're running this, you may see some warnings, but don't worry, we'll show you later on how to set logging levels so you don't see a bunch of warnings as you're starting up a Spark shell. Right now, we'll keep things simple. You should see that it's starting a Spark session and a Spark context. It also gives you a little web link here: if you scroll up, you can grab that http URL, and it has a web interface for you to check out, which is actually really cool.
To make sure everything's working here under Spark, type println("hello world"); the text has to be in double quotes. Hit Enter and you should see hello world back. That means everything's working correctly for us, which is exactly what we need. You now have Spark, Scala, and Java all set up on your Ubuntu computer. Type :q to quit out of this.
Next, let's show you how to install the Atom text editor, which is the first IDE we'll be working with to start off. When we're learning Scala, we will actually not even be using an editor; we'll go line by line at the Scala or Spark shell prompt. But later on we will expand our knowledge to working with a text editor, and after that we will show you how to work with IntelliJ. For now, keeping things simple, let's just search for Atom text editor, hit Enter, and click on the first link; it should be atom.io, and this will take you to a download page. Once you're at this website, click Download .deb, and this will begin to download the Debian package for Atom. Say Save File and then let that download. I'm going to jump ahead in time for this to finish downloading.
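Going back to the Spark shell check earlier in this step, here is a minimal sketch of the same test run without typing at the prompt. It assumes the spark folder was extracted as described; the function name spark_hello is hypothetical, and feeding the lines in through a pipe is a stand-in for the interactive session shown in the lecture.

```shell
# Start the Spark shell from the given Spark folder, send it one println
# line followed by :q, and let it exit.
# Usage: spark_hello <path-to-spark-folder>
spark_hello() {
    spark_home=$1
    printf 'println("hello world")\n:q\n' | "$spark_home/bin/spark-shell"
}
```

If the setup is working, hello world should appear somewhere in the output, usually after the startup warnings the lecture mentions.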
All right, now that that has finished downloading, let's find that file. Right here is the Atom file, and you can open up its file location. In this case, it's right here under Downloads. Once you've located the .deb file, you can right-click and you should be able to see Open With Software Install. Click on that, and this should eventually install your Atom software. You should see Ubuntu Software load up; then click Install, and this will go ahead and install the Atom text editor. To get Atom installed, you may need to put in your password, so put that in and authenticate.
Now that we have Atom installed and open, there's one last thing we need to do, and that's to install some packages that will help us program with Scala and Spark. You can click here, under Install a Package, on Open Installer. Here you are going to search for Scala and hit Enter. It's going to search packages for Scala, and we want the Scala language support in Atom; that should just be language-scala. Click Install, and that will install Scala language support for this text editor, meaning things such as code completion and text highlighting. Once that has finished installing, we want to install one more package. Go and search for terminal, and we will install a package which will allow us to open a terminal right here in Atom. There are lots of options for this; I prefer platformio-ide-terminal, so let's install that one.
All right, now that we have this terminal plugin downloaded, let's show you how to create an example Scala script. Click down here; there's a plus that will open up a new terminal, and then we will go ahead and create an example Scala script. Just come up here; you can close any of these tabs and then go to File, New File. Let's just say println("My first Scala script"), with double quotes there. This is just a print line command, and we'll learn how to do a lot more Scala later on. Then down here where it says Plain Text, click on that and type in Scala. This will mark the file as Scala, and you should then see the syntax highlighting.
Now, we haven't saved this yet, so press Control or Command plus S to save it. We will save this as my first script, with the .scala extension. To keep things simple, I'm just putting it right in my home directory. Save that, and then let's show you how to run this file. I will call the Spark shell. In this case, I need to give the whole folder path, so I will say spark, wherever that was, then bin, then spark-shell. Depending on where you are located in your terminal, you may need to put in something like home, user, spark, etc. Hit Enter and that should load up the Spark shell. As long as your folder locations are exactly the same as what I showed you earlier, this will load up the Spark shell.
Once that has loaded up, we can just say :load and then type in the name of your script, in this case my first script with the .scala extension. Hit Enter, and you should see my first Scala script as the output. Perfect.
If you have any questions on this download and installation process, feel free to post to the Q&A forums, but make sure you search the Q&A forums first and check out the written guide. As long as you followed everything exactly as I showed you, you should have been able to follow along with the exact same steps. All right. Thanks, everyone, and I'll see you at the next lecture.
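The script-and-load workflow above can be sketched from the terminal as well. The function names and the file name used in the usage note are hypothetical (the lecture does not spell out the exact file name), and the lecture itself creates the file in Atom, not from the shell.

```shell
# Write the one-line example script to the given path.
write_first_script() {
    printf 'println("My first Scala script")\n' > "$1"
}

# Print the two things to type: the full path to spark-shell,
# then the :load line to enter once the Spark shell prompt appears.
# Usage: show_load_steps <path-to-spark-folder> <path-to-script>
show_load_steps() {
    spark_home=$1
    script=$2
    echo "$spark_home/bin/spark-shell"
    echo ":load $script"
}
```

For instance, write_first_script ~/my_first_script.scala creates the file, and show_load_steps with your spark folder and that file prints the launcher path and the :load command to run inside the shell.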