Note: This course is built on top of the "Real World Vagrant For Distributed Computing - Toyin Akin" course.
This course enables you to package a complete Spark development environment into your own custom 2.3GB Vagrant box.
Once built you no longer need to manipulate your Windows machine in order to get a fully fledged Spark environment to work. With the final solution, you can boot up a complete Apache Spark environment in under 3 minutes!!
Install any version of Spark you prefer. We have codified the build for 1.6.2 or 2.0.1, but it's pretty easy to extend this to a new version.
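To give a feel for why adding a version is easy, here is a minimal sketch in Python of mapping a Spark version to the binaries it would download. It is not the course's own automation, which lives in the Vagrant provisioning. The helper name `spark_archive_url`, the use of the Apache archive site and the `hadoop2.6` build are all assumptions made for illustration.

```python
# Hypothetical helper, not taken from the course material.
# Assumes the pre-built binaries are fetched from the Apache archive
# and that a "hadoop2.6" build is wanted.
SUPPORTED_VERSIONS = ("1.6.2", "2.0.1")  # the two versions the course codifies
ARCHIVE_ROOT = "https://archive.apache.org/dist/spark"


def spark_archive_url(version, hadoop_profile="hadoop2.6"):
    """Return the download URL of the pre-built Spark binaries for `version`."""
    name = "spark-{0}-bin-{1}".format(version, hadoop_profile)
    return "{0}/spark-{1}/{2}.tgz".format(ARCHIVE_ROOT, version, name)


# Print the archive each supported version would resolve to.
for version in SUPPORTED_VERSIONS:
    print(version, "->", spark_archive_url(version))
```

Under these assumptions, supporting another release would mean adding one more entry to `SUPPORTED_VERSIONS`.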
Why Apache Spark ...
Apache Spark runs programs up to 100x faster than Hadoop MapReduce in memory, or 10x faster on disk.
Apache Spark has an advanced DAG (directed acyclic graph) execution engine that supports acyclic data flow and in-memory computing.
Apache Spark offers over 80 high-level operators that make it easy to build parallel apps. And you can use it interactively from the Scala, Python and R shells.
Apache Spark can combine SQL, streaming, and complex analytics.
Apache Spark powers a stack of libraries including SQL and DataFrames, MLlib for machine learning, GraphX, and Spark Streaming. You can combine these libraries seamlessly in the same application.
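As a small taste of those high-level operators, here is a minimal word count sketch for the Python shell. It assumes the `sc` SparkContext that the `pyspark` shell creates for you. The function names `tokenize` and `word_count` are hypothetical, chosen only for this illustration.

```python
from operator import add


def tokenize(line):
    """Split one line of text into lower-case words."""
    return line.lower().split()


def word_count(sc, lines):
    """Count words by chaining a handful of RDD operators.

    `sc` is assumed to be the SparkContext provided by the pyspark shell.
    """
    pairs = (
        sc.parallelize(lines)              # distribute the input lines
        .flatMap(tokenize)                 # one record per word
        .map(lambda word: (word, 1))       # pair each word with a count of 1
        .reduceByKey(add)                  # sum the counts for each word
        .collect()                         # bring (word, count) pairs to the driver
    )
    return dict(pairs)
```

Inside the `pyspark` shell you could try it with `word_count(sc, ["to be or not to be"])`. Each operator in the chain runs in parallel across the data without any explicit threading code.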
If you are testing or playing around with Apache Spark, don't contaminate your computer! Keep your PC or Mac clean by simply building and running a custom Spark development environment within a Virtual Machine.
Suggested Spark Udemy curriculum courses to follow ...
Here we make sure that we are all on the same page. We boot up a vanilla virtual machine. A good grasp of basic Vagrant commands is essential.
Here I show you how to simply modify your Vagrantfile to switch to a graphical CentOS (RHEL) Linux Virtual Machine. We will navigate briefly within this Virtual Machine. This is the image we will be configuring. Amazing that you have a graphical O/S in under 1.2GB!
Here we tune the development Virtual Machine by configuring the clock (synced with external servers), hostname, firewall and some low-level O/S settings.
Java has already been configured to be installed. Here we add Maven to the picture.
Here, the Scala programming language will be installed.
At last! We install the Eclipse tool for Scala and show that the same development environment that you know and love on Windows/Mac is the same on CentOS. Oh yes, you can still code in Java!
Scala has already been configured to be installed. Here we add sbt to the picture.
We go through downloading the Spark binaries and adding the automation and configuration of Spark within Vagrant.
Here we add a working Spark example. This example uses a combination of tools: sbt, sbt-eclipse, ScalaIDE and Spark.
Here we perform some cosmetic changes: we change the keyboard layout, create a link to the ScalaIDE and then generate our final development Virtual Machine. All the hard work above will be contained in a ~2.3GB Vagrant box file.
Hard to believe... This new environment will now boot up in under 2.5 minutes (on my machine anyway!). We also execute the Spark example within the ScalaIDE to ensure everything works. You can now give this final box and Vagrantfile to your colleague and they can have a Spark environment up and running in under 2.5 minutes.
I spent 6 years at "Royal Bank of Scotland" and 5 years at the investment bank "BNP Paribas" developing and managing Interest Rate Derivatives services as well as engineering and deploying In Memory DataBases (Oracle Coherence), NoSQL and Hadoop clusters (Cloudera) into production.
In 2016, I left to start my own training venture, POC-D ("Proof Of Concept - Delivered"), which focuses on delivering training on IMDB (In Memory Database), NoSQL, BigData and DevOps technology.
From Q3 2017, this will also include FinTech training in Capital Markets using Microsoft Excel (Windows), JVM languages (Java/Scala) as well as .NET (C#, VB.NET, C++/CLI, F# and IronPython).
I have a YouTube channel, publishing snippets of my videos. These are not courses, simply ad-hoc videos discussing various distributed computing ideas.
Check out my website and/or YouTube for more info.
See you inside ...