Real World Vagrant - Hortonworks Data Platform 2.5

Build a Distributed Cluster of Hortonworks 2.5 Manager and Agent nodes with a single command! Includes Spark 2.0!
4.4 (4 ratings)
169 students enrolled
Created by Toyin Akin
Last updated 4/2017
English
30-Day Money-Back Guarantee
Includes:
  • 3.5 hours on-demand video
  • 3 Articles
  • 4 Supplemental Resources
  • Full lifetime access
  • Access on mobile and TV
  • Certificate of Completion
What Will I Learn?
  • Simply run a single command on your desktop, go for a coffee, and come back to a running distributed environment ready for cluster deployment
  • Quickly build an environment where Hortonworks and Hadoop software can be installed
  • Automate the installation of software across multiple Virtual Machines
View Curriculum
Requirements
  • Basic programming or scripting experience is required.
  • You will need a desktop PC and an Internet connection. The course is created with Windows in mind.
  • The software needed for this course is freely available
  • You will require a computer with virtualization chipset support (VT-x). Most computers purchased in the last five years should be sufficient
  • Optional : Some exposure to Linux and/or Bash shell environment
  • 64-bit Windows operating system required (Windows 7 or above recommended)
  • This course is not recommended if you have no desire to work with distributed computing
Description

Note: This course builds on top of the "Real World Vagrant For Distributed Computing - Toyin Akin" course

"NoSQL", "Big Data", "DevOps" and "In Memory Database" technology are a hot and highly valuable skill to have – and this course will teach you how to quickly create a distributed environment for you to deploy these technologies on. 

A combination of VirtualBox and Vagrant will transform your desktop machine into a virtual cluster. However, this needs to be configured correctly. Simply enabling multiple machines within Vagrant is not good enough; the configuration needs to be tuned. Developers and operators within large enterprises, including investment banks, use Vagrant to simulate production environments.
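
For a flavour of what this looks like in practice, here is a minimal multi-machine Vagrantfile sketch. The box name, node names, IP addresses and memory sizes are illustrative assumptions, not the exact values used in the course:

```ruby
# Minimal multi-machine sketch (illustrative values only).
Vagrant.configure("2") do |config|
  config.vm.box = "centos/7"                      # assumed base box

  (1..2).each do |i|
    config.vm.define "node#{i}" do |node|
      node.vm.hostname = "node#{i}.example.com"
      node.vm.network "private_network", ip: "192.168.56.#{10 + i}"

      # Defining multiple machines is not enough - tune the provider too.
      node.vm.provider "virtualbox" do |vb|
        vb.memory = 2048
        vb.cpus   = 2
      end
    end
  end
end
```

A single vagrant up then boots and configures every node defined in the file.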

After all, if you are developing against or operating a distributed environment, it needs to be tested: both the code you deploy and the deployment code itself.

You'll learn the same techniques these enterprise teams use, on your own Microsoft Windows computer or laptop.

Vagrant provides easy-to-configure, reproducible, and portable work environments built on top of industry-standard technology, controlled by a single consistent workflow to help maximize the productivity and flexibility of you and your team.

This course will use VirtualBox to carve out your virtual environment. However the same skills learned with Vagrant can be used to provision virtual machines on VMware, AWS, or any other provider.

If you are a developer, this course will help you isolate dependencies and their configuration within a single disposable, consistent environment, without sacrificing any of the tools you are used to working with (editors, browsers, debuggers, etc.). Once you or someone else creates a single Vagrantfile, you just need to run vagrant up and everything is installed and configured for you to work. Other members of your team create their development environments from the same configuration. Say goodbye to "works on my machine" bugs.

If you are an operations engineer, this course will help you build a disposable environment and consistent workflow for developing and testing infrastructure management scripts. You can quickly test your deployment scripts and more using local virtualization such as VirtualBox or VMware (VirtualBox for this course). Ditch your custom scripts to recycle EC2 instances, stop juggling SSH prompts to various machines, and start using Vagrant to bring sanity to your life.

If you are a designer, this course will help you automate the distributed installation of software so that you can focus on doing what you do best: design. Once a developer configures Vagrant, you do not need to worry about how to get that software running ever again. No more bothering other developers to help you fix your environment so you can test designs. Just check out the code, vagrant up, and start designing.

Who is the target audience?
  • Software engineers who want to expand their skills into the world of distributed computing
  • Systems engineers who want to expand their skill set beyond a single server
  • Developers who want to write and test their code against a valid distributed environment
Curriculum For This Course
18 Lectures
03:18:23
+
Deploy a multi-node Hadoop Cluster with Hortonworks 2.5
2 Lectures 19:47

Introduction to Hortonworks 2.5. Includes Spark 2.0 and Zeppelin!

Preview 19:47

Suggested course curriculum to follow ...
Preview 00:00
+
Set up our Local Hortonworks Infrastructure - On Vagrant
7 Lectures 01:10:09
Base Vagrant file
00:43

Here we describe the Vagrant topology that we are going to build. We also make a small change to the format of our /etc/hosts file. Download the Vagrant-VirtualBox-HDP-RESOURCES.zip file to view the Vagrant file we will be starting with. (A minimal sketch of a hosts-file provisioner follows this entry.)


Preview 19:41
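
As a rough illustration of the topology and the /etc/hosts change described above, a shell provisioner can append an entry for every node so that the machines resolve each other by name. The hostnames and IP addresses below are assumptions, not the course's exact values:

```ruby
# Sketch: give every VM the same /etc/hosts entries (illustrative values).
NODES = {
  "ambari"  => "192.168.56.10",
  "master1" => "192.168.56.11",
  "slave1"  => "192.168.56.21",
}

hosts_script = NODES.map { |name, ip|
  "echo '#{ip} #{name}.example.com #{name}' >> /etc/hosts"
}.join("\n")

Vagrant.configure("2") do |config|
  config.vm.box = "centos/7"
  NODES.each do |name, ip|
    config.vm.define name do |node|
      node.vm.hostname = "#{name}.example.com"
      node.vm.network "private_network", ip: ip
      node.vm.provision "shell", inline: hosts_script
    end
  end
end
```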

Here we configure our local Hortonworks HDP 2.5 repository. We do not want to be connecting to the internet when deploying our Hadoop cluster, especially if we are trying to simulate PRODUCTION!! Thus we download the HDP binaries ahead of time. (A sketch of such a repository definition follows this entry.)

Here we configure our local Hortonworks HDP 2.5 repository.
09:25
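
A hedged sketch of what pointing a node at a local repository can look like. The repository id, file name and baseurl layout are assumptions for illustration; the 10.10.87.10/hdp250/ address appears later in the web-server lecture notes:

```ruby
# Sketch: point yum at a local HDP repository instead of the internet
# (repo id, file name and layout are illustrative assumptions).
repo_script = <<-SCRIPT
cat > /etc/yum.repos.d/hdp-local.repo <<'EOF'
[HDP-2.5-local]
name=HDP 2.5 local repository
baseurl=http://10.10.87.10/hdp250/
gpgcheck=0
enabled=1
EOF
yum clean all
SCRIPT

Vagrant.configure("2") do |config|
  config.vm.provision "shell", inline: repo_script
end
```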

Here we download the HDP binaries and verify our repository is working after we execute a Vagrant up command.

Download the HDP binaries and verify our repository is working
10:47

HDP-UTILS-1.1.0.21 April 2017 update
00:21

Here we tune our CentOS O/S settings. This applies across all the Virtual Machines. We tune or configure ntp, the firewall, swappiness and TCP buffers. (A sketch follows this entry.)

Tune the O/S of the Virtual Machines. ntp, firewall, swappiness, TCP buffers ...
10:47
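
A sketch of the kind of O/S tuning applied to every node. The exact values are assumptions, and the TCP buffer tuning covered in the video is omitted here:

```ruby
# Sketch: CentOS 7 tuning applied to all nodes (illustrative values).
tune_script = <<-SCRIPT
# Reduce swapping on Hadoop nodes
sysctl -w vm.swappiness=1
echo 'vm.swappiness=1' >> /etc/sysctl.conf

# Stop the firewall getting in the way of cluster traffic (lab use only)
systemctl stop firewalld
systemctl disable firewalld

# Keep the clocks in sync across the cluster
yum install -y ntp
systemctl enable ntpd
systemctl start ntpd
SCRIPT

Vagrant.configure("2") do |config|
  config.vm.provision "shell", inline: tune_script
end
```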

Here we configure a local web server to serve our Hortonworks and HDP rpm components.

In this video a low amount of memory was configured in order to get the web server running. We increase this in later videos. I would advise increasing the memory to 2GB or even 3GB for this video if you want to avoid sluggish web-server behaviour.


NOTE: I have also included, as a resource, a minimal Vagrant file with associated scripts that can be used to test access to the /hdp250/ web-server directory. This is in order to eliminate user errors up to this stage (for the web server). To run it, set the memory on line #2 of the Vagrant file, modify the path to the repository directory on line #4, then vagrant up. You can access the hdp250 location via 10.10.87.10/hdp250/. (A comparable sketch follows this entry.)

Configure a Local WebServer to handle our Hortonworks and HDP rpm components.
18:25
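
A comparable sketch of a minimal web-server VM that exposes a local repository directory over HTTP. The memory size and the host path to the downloaded binaries are assumptions you would adjust (mirroring lines #2 and #4 of the supplied Vagrant file); the 10.10.87.10 address comes from the lecture note above:

```ruby
# Sketch: a single VM serving the downloaded HDP binaries over HTTP.
VM_MEMORY = 2048            # assumption - increase if the web server is sluggish
REPO_PATH = "C:/hdp250"     # hypothetical host path to the downloaded binaries

Vagrant.configure("2") do |config|
  config.vm.box = "centos/7"
  config.vm.network "private_network", ip: "10.10.87.10"
  config.vm.synced_folder REPO_PATH, "/var/www/html/hdp250"

  config.vm.provider "virtualbox" do |vb|
    vb.memory = VM_MEMORY
  end

  # Serve the repository so other nodes can reach it at 10.10.87.10/hdp250/
  config.vm.provision "shell", inline: <<-SCRIPT
yum install -y httpd
systemctl enable httpd
systemctl start httpd
SCRIPT
end
```
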
+
Install and Package the Hortonworks Components
4 Lectures 49:41

Here we install the Apache Ambari Server, the Hadoop management tool.

Part I - Install Apache Ambari Server. The Hadoop Management Tool.
11:53

Here we install the Apache Ambari Server, the Hadoop management tool. (A sketch of the server setup follows Part II.)
Part II - Install Apache Ambari Server. The Hadoop Management Tool.
11:45
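
A hedged sketch of installing and silently setting up the Ambari Server from a shell provisioner, assuming an Ambari repository definition is already in place from the local-repository step:

```ruby
# Sketch: install Ambari Server on its node (assumes an ambari .repo file exists).
ambari_server_script = <<-SCRIPT
yum install -y ambari-server
ambari-server setup -s    # -s = silent setup with defaults (embedded database)
ambari-server start
SCRIPT

Vagrant.configure("2") do |config|
  config.vm.define "ambari" do |ambari|
    ambari.vm.provision "shell", inline: ambari_server_script
  end
end
```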

Here we finally configure the Ambari Server and Agent, run the Virtual Machines, and then build the final deployable "images" for the Ambari Server and Agent. We no longer need to install these management tools when we run our cluster; we simply boot up the Virtual Machines and start the services from these new "images". Much quicker.

Package the Final Deployable Images for Ambari Server and Agent.
18:04

Here we register the new images (Manager and Agent) with Vagrant. (A sketch of a Vagrantfile that uses the new boxes follows this entry.)

Register the new image with Vagrant
07:59
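
Once a tuned VM has been exported with vagrant package and registered with vagrant box add, later Vagrantfiles only need to reference the new box. The box names below are illustrative, not the course's exact names:

```ruby
# Sketch: a Vagrantfile that reuses the packaged, pre-installed boxes.
Vagrant.configure("2") do |config|
  config.vm.define "ambari" do |node|
    node.vm.box = "hdp25-ambari-server"   # hypothetical box name
  end

  config.vm.define "master1" do |node|
    node.vm.box = "hdp25-ambari-agent"    # hypothetical box name
  end
end
```
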
+
Boot up the Final Multi Virtual Machine Topology for our Hadoop Cluster
2 Lectures 25:05

Here we perform "vagrant up" on our final multi Virtual Machine topology for our Hadoop cluster. We will be booting up three Hadoop master nodes and four Hadoop slave nodes, plus the Ambari Server node. (A sketch of this topology follows this entry.)

Preview 13:35
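
A sketch of what the final topology can look like as a Vagrantfile: one Ambari Server node, three masters and four slaves. The box names, IP ranges and memory sizes are illustrative assumptions:

```ruby
# Sketch: final cluster topology (illustrative box names, IPs and memory).
Vagrant.configure("2") do |config|
  config.vm.box = "hdp25-ambari-agent"

  config.vm.define "ambari" do |node|
    node.vm.box = "hdp25-ambari-server"
    node.vm.network "private_network", ip: "192.168.56.10"
  end

  (1..3).each do |i|
    config.vm.define "master#{i}" do |node|
      node.vm.network "private_network", ip: "192.168.56.#{10 + i}"
      node.vm.provider("virtualbox") { |vb| vb.memory = 4096 }
    end
  end

  (1..4).each do |i|
    config.vm.define "slave#{i}" do |node|
      node.vm.network "private_network", ip: "192.168.56.#{20 + i}"
      node.vm.provider("virtualbox") { |vb| vb.memory = 2048 }
    end
  end
end
```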

Bonus: We perform some cluster troubleshooting.

Bonus : We perform some Cluster troubleshooting
11:30
+
Install Hadoop - HDP
2 Lectures 24:31

Here we detail the steps of registering the Ambari Agents with the Ambari Server and performing a Hadoop installation. (A sketch of the agent configuration follows this entry.)

First Pass - Install Hadoop ...
18:05
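
A hedged sketch of pointing an Ambari Agent at the Ambari Server before registration. The server hostname is an assumption, and the course may instead register the hosts through the Ambari web UI:

```ruby
# Sketch: install the Ambari Agent and point it at the server (hostname assumed).
agent_script = <<-SCRIPT
yum install -y ambari-agent
sed -i 's/^hostname=.*/hostname=ambari.example.com/' /etc/ambari-agent/conf/ambari-agent.ini
ambari-agent start
SCRIPT

Vagrant.configure("2") do |config|
  config.vm.provision "shell", inline: agent_script
end
```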

Quick run-through of the second pass. All green!

Second Pass - Install Hadoop ...
06:26
+
Conclusion
1 Lecture 09:12

Congratulations, you made it! We take a quick look at Zeppelin.

Conclusion
09:12
About the Instructor
Toyin Akin
3.8 Average rating
135 Reviews
1,374 Students
15 Courses
Big Data Engineer, Capital Markets FinTech Developer

I spent 6 years at Royal Bank of Scotland and 5 years at the investment bank BNP Paribas, developing and managing Interest Rate Derivatives services as well as engineering and deploying In Memory Databases (Oracle Coherence), NoSQL and Hadoop clusters (Cloudera) into production.

In 2016, I left to start my own training company, POC-D ("Proof Of Concept - Delivered"), which focuses on delivering training on IMDB (In Memory Database), NoSQL, Big Data and DevOps technology.

From Q3 2017, this will also include FinTech training in Capital Markets using Microsoft Excel (Windows), JVM languages (Java/Scala) as well as .NET (C#, VB.NET, C++/CLI, F# and IronPython).

I have a YouTube channel publishing snippets of my videos. These are not courses, simply ad-hoc videos discussing various distributed computing ideas.

Check out my website and/or YouTube for more info

See you inside ...