Note: This course builds on the "Real World Vagrant For Distributed Computing - Toyin Akin" course.
"NoSQL", "Big Data", "DevOps" and "In Memory Database" technologies are hot and highly valuable skills to have, and this course will teach you how to quickly create a distributed environment on which to deploy them.
A combination of VirtualBox and Vagrant will transform your desktop machine into a virtual cluster. However, this needs to be configured correctly. Simply enabling multi-node within Vagrant is not good enough; it needs to be tuned. Developers and operators within large enterprises, including investment banks, all use Vagrant to simulate production environments.
After all, if you are developing against or operating a distributed environment, it needs to be tested, both in terms of the code deployed and the deployment code itself.
You'll learn, on your own Microsoft Windows computer or laptop, the same techniques these enterprises use.
Vagrant provides easy to configure, reproducible, and portable work environments built on top of industry-standard technology and controlled by a single consistent workflow to help maximize the productivity and flexibility of you and your team.
This course will use VirtualBox to carve out your virtual environment. However, the same skills learned with Vagrant can be used to provision virtual machines on VMware, AWS, or any other provider.
If you are a developer, this course will help you isolate dependencies and their configuration within a single disposable, consistent environment, without sacrificing any of the tools you are used to working with (editors, browsers, debuggers, etc.). Once you or someone else creates a single Vagrantfile, you just need to run vagrant up and everything is installed and configured for you to work. Other members of your team create their development environments from the same configuration. Say goodbye to "works on my machine" bugs.
If you are an operations engineer, this course will help you build a disposable environment and consistent workflow for developing and testing infrastructure management scripts. You can quickly test your deployment scripts and more using local virtualization such as VirtualBox or VMware (this course uses VirtualBox). Ditch your custom scripts to recycle EC2 instances, stop juggling SSH prompts to various machines, and start using Vagrant to bring sanity to your life.
If you are a designer, this course will help you with distributed installation of software in order for you to focus on doing what you do best: design. Once a developer configures Vagrant, you do not need to worry about how to get that software running ever again. No more bothering other developers to help you fix your environment so you can test designs. Just check out the code, vagrant up, and start designing.
Introduction to Hortonworks HDP 2.5. Includes Spark 2.0 and Zeppelin!
Here we describe the Vagrant topology that we are going to build. We also make a small change to the format of our /etc/hosts file. Download the Vagrant-VirtualBox-HDP-RESOURCES.zip file to view the Vagrant file we will be starting with.
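As a rough illustration of how a topology maps onto /etc/hosts entries, the Ruby sketch below renders one host line per node. The host names, the domain and the IP addresses are assumptions chosen for illustration; they are not the values used in the course's Vagrant file.

```ruby
# Hypothetical topology: names and addresses are illustrative only.
EXAMPLE_NODES = [
  { name: "ambari",  ip: "10.10.87.100" },
  { name: "master1", ip: "10.10.87.101" },
  { name: "slave1",  ip: "10.10.87.111" },
]

EXAMPLE_DOMAIN = "example.local" # assumed domain, not from the course

# Build one /etc/hosts line per node: IP, fully qualified name, short alias.
def hosts_entries(nodes, domain)
  nodes.map do |n|
    format("%-14s %s.%s %s", n[:ip], n[:name], domain, n[:name])
  end
end

puts hosts_entries(EXAMPLE_NODES, EXAMPLE_DOMAIN)
```

Keeping the node list in one place means the same data can drive both the hosts file and the virtual machine definitions, so the two are less likely to drift apart.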
Here we configure our local Hortonworks HDP 2.5 repository. We do not want to be connecting to the internet when deploying our Hadoop cluster. Especially if we are trying to simulate PRODUCTION!! Thus we download the HDP binaries ahead of time.
Here we download the HDP binaries and verify that our repository is working after we execute a vagrant up command.
Here we tune our CentOS operating system settings. These settings apply across all the virtual machines. We tune or configure NTP, the firewall, swappiness and TCP buffers.
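To give a feel for the swappiness and TCP buffer part of this tuning, here is a minimal Ruby sketch that renders a sysctl configuration fragment. The kernel keys are standard Linux sysctl names, but the values are assumptions for illustration and are not the settings used in the course.

```ruby
# Illustrative kernel settings; the values are assumptions, not the course's.
EXAMPLE_SYSCTL = {
  "vm.swappiness"     => 1,           # discourage swapping on Hadoop nodes
  "net.core.rmem_max" => 16_777_216,  # maximum socket receive buffer, in bytes
  "net.core.wmem_max" => 16_777_216,  # maximum socket send buffer, in bytes
}

# Render the settings in the "key = value" form that sysctl reads.
def sysctl_conf(settings)
  settings.map { |key, value| "#{key} = #{value}" }.join("\n") + "\n"
end

print sysctl_conf(EXAMPLE_SYSCTL)
```

A provisioning script could write this output to a file under /etc/sysctl.d/ on each virtual machine, so that every node in the cluster receives identical settings.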
Here we configure a local web server to serve our Hortonworks HDP RPM components.
In this video a low amount of memory was configured in order to get the webserver running. We do increase this in later videos. I would advise increasing the memory to 2GB or even 3GB for this video if you want to avoid sluggish webserver behaviour.
NOTE: As a resource, I have also included a minimal Vagrant file with associated scripts that can be used to test access to the /hdp250/ web server directory. This helps rule out user errors in the web server setup up to this stage. To run it, set the memory on line #2 of the Vagrant file, modify the path to the repository directory on line #4, then run vagrant up. You can then access the hdp250 location at 10.10.87.10/hdp250/
Here we install the Apache Ambari Server, the Hadoop management tool.
Here we finally configure the Ambari Server and Agent. We run the virtual machines and then build the final deployable "images" for the Ambari Server and Agent. We no longer need to install these management tools when we run our cluster. We simply boot up the virtual machines and start the services from these new "images", which is much quicker.
Here we register the new images (Manager and Agent) with Vagrant.
Here we perform "vagrant up" on our final multi-virtual-machine topology for our Hadoop cluster. We will be booting up three Hadoop master nodes and four Hadoop slave nodes, plus the Ambari Server node.
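A topology of this shape (one Ambari Server node, three masters and four slaves) could be sketched in a Vagrantfile roughly as follows. The box names, host names, IP addresses and memory sizes are all hypothetical placeholders, not the course's actual values; only the Vagrant calls themselves (Vagrant.configure, config.vm.define, the private_network option and the virtualbox provider block) are standard Vagrant configuration.

```ruby
# Vagrantfile sketch. Box names, host names, IPs and memory are hypothetical.

# Build the node list: one Ambari Server, then the masters and the slaves.
def cluster_nodes(masters: 3, slaves: 4)
  nodes = [{ name: "ambari", box: "example/ambari-server",
             ip: "10.10.87.100", memory: 2048 }]
  (1..masters).each do |i|
    nodes << { name: "master#{i}", box: "example/ambari-agent",
               ip: "10.10.87.#{100 + i}", memory: 2048 }
  end
  (1..slaves).each do |i|
    nodes << { name: "slave#{i}", box: "example/ambari-agent",
               ip: "10.10.87.#{110 + i}", memory: 2048 }
  end
  nodes
end

# The guard lets plain Ruby load this file for a quick check of the node list.
if defined?(Vagrant)
  Vagrant.configure("2") do |config|
    cluster_nodes.each do |n|
      config.vm.define n[:name] do |node|
        node.vm.box = n[:box]
        node.vm.hostname = n[:name]
        node.vm.network "private_network", ip: n[:ip]
        node.vm.provider "virtualbox" do |vb|
          vb.memory = n[:memory] # megabytes
        end
      end
    end
  end
end
```

With a file like this in place, vagrant up boots every machine, while vagrant up master1 boots a single node, which is handy when memory on the host is tight.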
Bonus : We perform some Cluster troubleshooting.
Here we detail the steps of registering the Ambari agents to the Ambari Server and performing a Hadoop installation.
Quick run through via the second pass. All green !!
Congratulations, you made it! We take a quick look at Zeppelin.
I spent 6 years at "Royal Bank of Scotland" and 5 years at the investment bank "BNP Paribas" developing and managing Interest Rate Derivatives services as well as engineering and deploying In Memory DataBases (Oracle Coherence), NoSQL and Hadoop clusters (Cloudera) into production.
In 2016, I left to start my own training venture, POC-D ("Proof Of Concept - Delivered"), which focuses on delivering training on IMDB (In Memory Database), NoSQL, BigData and DevOps technology.
From Q3 2017, this will also include FinTech training in Capital Markets using Microsoft Excel (Windows), JVM languages (Java/Scala) as well as .NET (C#, VB.NET, C++/CLI, F# and IronPython).
I have a YouTube channel where I publish snippets of my videos. These are not courses, simply ad-hoc videos discussing various distributed computing ideas.
Check out my website and/or YouTube for more info.
See you inside ...