Real World Vagrant - Automate a Cloudera Manager Build
4.5 (14 ratings)
Instead of using a simple lifetime average, Udemy calculates a course's star rating by considering a number of different factors such as the number of ratings, the age of ratings, and the likelihood of fraudulent ratings.
238 students enrolled
Build a Distributed Cluster of Cloudera Manager and any number of Cloudera Manager Agent nodes with a single command!
Created by Toyin Akin
Last updated 1/2017
English
Includes:
  • 3 hours on-demand video
  • 2 Articles
  • 3 Supplemental Resources
  • Full lifetime access
  • Access on mobile and TV
  • Certificate of Completion
What Will I Learn?
  • Simply run a single command on your desktop, go for a coffee, and come back with a running distributed environment for cluster deployment
  • Quickly build an environment where Cloudera and Hadoop software can be installed
  • Ability to automate the installation of software across multiple Virtual Machines
Requirements
  • Basic programming or scripting experience is required.
  • You will need a desktop PC and an Internet connection. The course is created with Windows in mind.
  • The software needed for this course is freely available
  • You will require a computer with a Virtualization chipset support - VT-x. Most computers purchased over the last five years should be good enough
  • Optional : Some exposure to Linux and/or Bash shell environment
  • 64-bit Windows operating system required (Would recommend Windows 7 or above)
  • This course is not recommended if you have no desire to work with/in distributed computing
  • This course is built on top of - "Real World Vagrant For Distributed Computing"
Description

Note : This course is built on top of the "Real World Vagrant For Distributed Computing - Toyin Akin" course

"NoSQL", "Big Data", "DevOps" and "In Memory Database" technologies are hot and highly valuable skills to have, and this course will teach you how to quickly create a distributed environment to deploy these technologies on.

A combination of VirtualBox and Vagrant will transform your desktop machine into a virtual cluster. However, this needs to be configured correctly. Simply enabling multinode within Vagrant is not good enough; it needs to be tuned. Developers and operators within large enterprises, including investment banks, use Vagrant to simulate production environments.

After all, if you are developing against or operating a distributed environment, it needs to be tested. Tested in terms of code deployed and the deployment code itself.

You'll learn the same techniques these enterprise guys use on your own Microsoft Windows computer/laptop.

Vagrant provides easy to configure, reproducible, and portable work environments built on top of industry-standard technology and controlled by a single consistent workflow to help maximize the productivity and flexibility of you and your team.

This course will use VirtualBox to carve out your virtual environment. However the same skills learned with Vagrant can be used to provision virtual machines on VMware, AWS, or any other provider.

If you are a developer, this course will help you isolate dependencies and their configuration within a single disposable, consistent environment, without sacrificing any of the tools you are used to working with (editors, browsers, debuggers, etc.). Once you or someone else creates a single Vagrantfile, you just need to run vagrant up and everything is installed and configured for you to work. Other members of your team create their development environments from the same configuration. Say goodbye to "works on my machine" bugs.
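As a rough illustration of the "single Vagrantfile" idea above, here is a minimal multi-machine sketch for one Cloudera Manager VM plus a configurable number of agent VMs. It is not the course's actual Vagrantfile: the helper name cluster_nodes, the host names, IP addresses, memory sizes and the "centos/7" base box are all assumptions chosen for illustration.

```ruby
# Vagrantfile sketch: one manager VM plus N agent VMs.
# Every name, IP address and memory size below is a hypothetical placeholder.

# Build the list of machines for a cluster with the given number of agents.
def cluster_nodes(agent_count)
  manager = { name: "manager", ip: "192.168.50.10", memory: 4096 }
  agents = (1..agent_count).map do |i|
    { name: "agent#{i}", ip: "192.168.50.#{10 + i}", memory: 2048 }
  end
  [manager] + agents
end

nodes = cluster_nodes(2)

# The Vagrant constant only exists when Vagrant itself loads this file;
# the guard lets the node list be inspected with plain Ruby as well.
if defined?(Vagrant)
  Vagrant.configure("2") do |config|
    config.vm.box = "centos/7" # assumed base box

    nodes.each do |node|
      config.vm.define node[:name] do |machine|
        machine.vm.hostname = node[:name]
        machine.vm.network "private_network", ip: node[:ip]
        machine.vm.provider "virtualbox" do |vb|
          vb.memory = node[:memory]
        end
      end
    end
  end
end
```

With a file along these lines, running vagrant up would bring up every machine defined in the list, and changing the agent count changes the size of the cluster.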

If you are an operations engineer, this course will help you build a disposable environment and consistent workflow for developing and testing infrastructure management scripts. You can quickly test your deployment scripts and more using local virtualization such as VirtualBox or VMware (this course uses VirtualBox). Ditch your custom scripts to recycle EC2 instances, stop juggling SSH prompts to various machines, and start using Vagrant to bring sanity to your life.

If you are a designer, this course will help you with distributed installation of software in order for you to focus on doing what you do best: design. Once a developer configures Vagrant, you do not need to worry about how to get that software running ever again. No more bothering other developers to help you fix your environment so you can test designs. Just check out the code, vagrant up, and start designing.

Here is a suggested curriculum, based on the current state of my Cloudera courses.

My Hadoop courses are based on Vagrant so that you can practice and destroy your virtual environment before applying the installation onto real servers/VMs.

For those with little or no knowledge of the Hadoop ecosystem, see the Udemy course: Big Data Intro for IT Administrators, Devs and Consultants

I would first practice with Vagrant so that you can carve out a virtual environment on your local desktop. You don't want to corrupt your physical servers if you do not understand the steps or make a mistake. Udemy course : Real World Vagrant For Distributed Computing

I would then, on the virtual servers, deploy Cloudera Manager plus agents. Agents are the guys that will sit on all the slave nodes, ready to deploy your Hadoop services. Udemy course: Real World Vagrant - Automate a Cloudera Manager Build

Then deploy the Hadoop services across your cluster (via the installed Cloudera Manager in the previous step). We look at the logic regarding the placement of master and slave services. Udemy course : Real World Hadoop - Deploying Hadoop with Cloudera Manager

If you want to play around with HDFS commands (hands-on distributed file manipulation), see the Udemy course: Real World Hadoop - Hands on Enterprise Distributed Storage.

You can also automate the deployment of the Hadoop services via Python (using the Cloudera Manager Python API). But this is an advanced step and thus I would make sure that you understand how to manually deploy the Hadoop services first. Udemy course : Real World Hadoop - Automating Hadoop install with Python!

There is also the upgrade step. Once you have a running cluster, how do you upgrade to a newer Hadoop cluster (both Cloudera Manager and the Hadoop services)? Udemy course: Real World Hadoop - Upgrade Cloudera and Hadoop hands on


Who is the target audience?
  • Software engineers who want to expand their skills into the world of distributed computing
  • System Engineers that want to expand their skillsets beyond the single server
  • Developers who want to write/test their code against a valid distributed environment
Curriculum For This Course
17 Lectures
03:11:16
Vagrant for Big Data Testing
2 Lectures 08:14

Here we justify using Vagrant to automate a Cloudera Manager build

Preview 08:14

Suggested course curriculum to follow ...
Preview 00:00
Set up our Vagrantfile so that we can build our box templates
9 Lectures 01:52:21
Base Vagrant file
00:43

Here we walk through a simple Vagrant Script

Here we walk through a simple Vagrant Script
09:16

Even though we use the Vagrant hostmanager to manage the /etc/hosts file, we take control and handle the guest /etc/hosts file ourselves.

Modify the hosts file to make it Cloudera friendly
16:50
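
To give a feel for what handling the guest /etc/hosts file ourselves could look like, here is a small sketch that renders the file from a node list. It assumes that "Cloudera friendly" means each host name maps to its cluster IP (not to a loopback address) in "IP, fully qualified name, short name" order; the helper name hosts_file, the domain and the addresses are hypothetical.

```ruby
# Render /etc/hosts content for a cluster.
# Assumption: every node is listed as "<ip> <fqdn> <short name>" and only
# localhost maps to the loopback address.
def hosts_file(nodes, domain)
  lines = ["127.0.0.1 localhost localhost.localdomain"]
  nodes.each do |node|
    fqdn = "#{node[:name]}.#{domain}"
    lines << "#{node[:ip]} #{fqdn} #{node[:name]}"
  end
  lines.join("\n") + "\n"
end

# Hypothetical two-node cluster.
example_nodes = [
  { name: "manager", ip: "192.168.50.10" },
  { name: "agent1",  ip: "192.168.50.11" }
]
puts hosts_file(example_nodes, "example.local")
```

The same text would be written to every guest so that all nodes agree on names and addresses.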

In this lecture, we download the Cloudera Manager rpms and create a local repository. As we will be automating the installation of the Cloudera components, the installation will be non-interactive.

Here we download the Cloudera Manager rpms and create a local repository
19:56
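
As a sketch of the local repository idea, the helper below renders a yum .repo definition that points at a directory of downloaded rpms. The repository id, description and baseurl are hypothetical, the directory is assumed to already carry repository metadata (for example generated with createrepo), and gpgcheck=0 is an assumption that only makes sense for a throwaway lab environment.

```ruby
# Render a yum repository definition for a local package directory.
# All argument values used below are hypothetical placeholders.
def yum_repo_file(id:, description:, baseurl:)
  <<~REPO
    [#{id}]
    name=#{description}
    baseurl=#{baseurl}
    enabled=1
    gpgcheck=0
  REPO
end

puts yum_repo_file(
  id: "cloudera-manager-local",
  description: "Local Cloudera Manager packages",
  baseurl: "file:///vagrant/repo/cloudera-manager"
)
```

A file like this would typically be placed under /etc/yum.repos.d/ on each guest, so that a non-interactive yum install can resolve the packages without reaching the Internet.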

In this video, we configure the CentOS O/S: firewall, NTP, TCP buffers and swappiness settings. We do as much as possible to follow best practice for tuning the O/S for Hadoop nodes.

Here we configure the CentOS O/S: firewall, NTP, TCP buffers and swappiness
13:27
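
The kernel part of this tuning can be sketched as a list of sysctl commands generated from a settings table. The keys are standard Linux sysctl names, but the values are illustrative assumptions rather than the course's settings, and the helper name sysctl_commands is hypothetical. Firewall and NTP handling are left out because the commands differ between CentOS releases.

```ruby
# Turn a table of kernel settings into "sysctl -w" commands.
def sysctl_commands(settings)
  settings.map { |key, value| "sysctl -w #{key}=#{value}" }
end

# Illustrative values only; tune them for your own hardware and workload.
example_settings = {
  "vm.swappiness"     => 1,        # discourage swapping on Hadoop nodes
  "net.core.rmem_max" => 16777216, # maximum socket receive buffer, in bytes
  "net.core.wmem_max" => 16777216  # maximum socket send buffer, in bytes
}

script = sysctl_commands(example_settings).join("\n")
puts script
```

Inside a Vagrantfile, a string like script could be handed to the shell provisioner, for example config.vm.provision "shell", inline: script. Note that sysctl -w only changes the running kernel; keeping the values across reboots would also mean writing them to /etc/sysctl.conf.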

Here, we set up a local webserver to house Cloudera's CDH Parcels. CDH Parcels hold the binaries for the Hadoop cluster. Cloudera's Parcels are an alternative to rpms.

Set up Local Webserver to house Cloudera's CDH Parcels
19:25

In this lecture, we find Cloudera's Online Parcel Repository and download a Parcel.

Here we find Cloudera's Online Parcel Repository and download a Parcel
07:06

Here, we complete the Cloudera setup by automating the installation of Cloudera Manager and Agents

Automate the Installation of Cloudera Manager and Agents
18:11

Here, we quickly validate our vagrant template file by walking through the test Cloudera Manager UI

Quickly validate our template by walking through the Cloudera Manager UI
07:27
Package the Manager and Agent into Vagrant image templates
5 Lectures 01:05:03

Here we create two Cloudera Vagrant boxes. We export the Manager and Agent Virtual Machines. These will become our base boxes to boot up our cluster. No need to install components anymore!

Package the Cloudera Components. Manager and Agent
12:12

Here we boot up a Cluster topology using the new Cloudera vagrant base boxes

Preview 18:37

Here we have our first pass of deploying a Hadoop Cluster.

First pass - Deploying a Hadoop Cluster.
19:45

Here we quickly go through our final pass of installing the cluster

Second pass - Deploying a Hadoop Cluster
12:27

We look at the issues you may face when deploying services that require access to the embedded PostgreSQL database, such as Hive, and detail the solution.

Bonus - Hadoop Services that require access to a database.
02:02
Conclusion
1 Lecture 05:39

Final words

Conclusion
05:39
About the Instructor
Toyin Akin
3.8 Average rating
135 Reviews
1,374 Students
15 Courses
Big Data Engineer, Capital Markets FinTech Developer

I spent 6 years at "Royal Bank of Scotland" and 5 years at the investment bank "BNP Paribas"  developing and managing Interest Rate Derivatives services as well as engineering and deploying In Memory DataBases (Oracle Coherence), NoSQL and Hadoop clusters (Cloudera) into production.

In 2016, I left to start my own training, POC-D ("Proof Of Concept - Delivered"), which focuses on delivering training on IMDB (In Memory Database), NoSQL, BigData and DevOps technology.

From Q3 2017, this will also include FinTech Training in Capital Markets using Microsoft Excel (Windows), JVM languages (Java/Scala) as well as .NET (C#, VB.NET, C++/CLI, F# and IronPython).

I have a YouTube Channel, publishing snippets of my videos. These are not courses. Simply ad-hoc videos discussing various distributed computing ideas.

Check out my website and/or YouTube for more info

See you inside ...