Hadoop Starter Kit

Hadoop learning made easy and fun. Learn HDFS, MapReduce and introduction to Pig and Hive with FREE cluster access.
4.5 (5,898 ratings)
57,561 students enrolled
Last updated 2/2017
English
Price: Free
Includes:
  • 3.5 hours on-demand video
  • 9 Supplemental Resources
  • Full lifetime access
  • Access on mobile and TV
  • Certificate of Completion
What Will I Learn?
  • Understand the Big Data problem in terms of storage and computation
  • Understand how Hadoop approaches the Big Data problem and provides a solution to it
  • Understand the need for another file system like HDFS
  • Work with HDFS
  • Understand the architecture of HDFS
  • Understand the MapReduce programming model
  • Understand the phases in MapReduce
  • Envision a problem in MapReduce
  • Write a MapReduce program with complete understanding of program constructs
  • Write Pig Latin instructions
  • Create and query Hive tables
Requirements
  • Basic Linux commands
  • Basic Java knowledge is needed only to follow the MapReduce programming lessons in Java; the Pig, Hive and other lessons do not require Java knowledge
Description

The objective of this course is to walk you step by step through all the core components in Hadoop and, more importantly, to make the Hadoop learning experience easy and fun.

By enrolling in this course you can also get free access to our multi-node Hadoop training cluster so you can try out what you learn right away in a real multi-node distributed environment.

ABOUT INSTRUCTOR(S)

We are a group of Hadoop consultants who are passionate about Hadoop and Big Data technologies. Four years ago, when we were looking for Big Data consultants to work on our own projects, we could not find qualified candidates because the Big Data industry was still very new, so we set out to train candidates in Big Data ourselves, giving them deep, real-world insight into Hadoop.

WHAT YOU WILL LEARN IN THIS COURSE

In the first section you will learn what Big Data is, with examples. We will discuss the factors to consider when deciding whether a problem is a Big Data problem. We will talk about the challenges existing technologies face with Big Data computation. We will break down the Big Data problem in terms of storage and computation, and understand how Hadoop approaches the problem and provides a solution.

In the HDFS section, you will learn about the need for another file system like HDFS. We will compare HDFS with traditional file systems and discuss its benefits. We will also work with HDFS and go over its architecture.

In the MapReduce section you will learn the basics of MapReduce and the phases involved in it. We will go over each phase in detail and understand what happens in each one. Then we will write a MapReduce program in Java to calculate the maximum closing price for each stock symbol from a stock dataset.

In the next two sections, we will introduce you to Apache Pig & Hive. We will try to calculate the maximum closing price for stock symbols from a stock dataset using Pig and Hive.

Who is the target audience?
  • This course is for anyone who wants to learn about Big Data technologies.
  • No advanced programming knowledge is needed
  • This course is for anyone who wants to learn about distributed computing and Hadoop
Curriculum For This Course
15 Lectures
03:19:50
Welcome & Let's Get Started
1 Lecture 02:55

In this short video, we will give an overview of the course and walk through each section of this course.

Course Introduction
02:55
Introduction to Big Data
2 Lectures 32:40

In this lesson, we will learn -

  1. What Big Data is, with some examples
  2. The problems that come with Big Data in terms of storage and computation
  3. What Hadoop offers as a solution to the Big Data problems
  4. How traditional solutions compare with Hadoop
What Is Big Data?
17:54

In this lesson, we will -

  1. Take a sample Big Data problem
  2. Analyze the problem and understand its complexities in terms of storage and computation
  3. Finally, work on a solution together
Understanding Big Data Problem
14:46

Test your understanding of Big Data
7 questions
HDFS
3 Lectures 43:45

In this lesson, we will learn -

  1. What a file system is and its features
  2. Existing file systems
  3. Limitations of existing file systems in distributed computing
  4. How HDFS differs from a local file system
  5. Basics of HDFS
  6. Benefits of HDFS
HDFS - Why Another Filesystem?
13:29

In this lesson, we will see -

  1. Practical differences between HDFS and the local file system
  2. How to manipulate files and directories in HDFS
  3. Commands to check or update permissions and replication, and to run a file system check (a command sketch follows below)
  4. Where physical blocks are located, and how this is configured in hdfs-site.xml


HDFS commands location in cluster - /hirw-starterkit/hdfs/commands
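
For reference, here is a minimal sketch of the style of HDFS shell commands this lesson works with; the paths and file names below are illustrative assumptions, not taken from the course material.

  # List the contents of an HDFS directory
  hdfs dfs -ls /user/hirw

  # Copy a local file into HDFS and read it back
  hdfs dfs -put stocks.csv /user/hirw/input/
  hdfs dfs -cat /user/hirw/input/stocks.csv

  # Update permissions and the replication factor of a file
  hdfs dfs -chmod 644 /user/hirw/input/stocks.csv
  hdfs dfs -setrep 3 /user/hirw/input/stocks.csv

  # Run a file system check and show block locations
  hdfs fsck /user/hirw/input/stocks.csv -files -blocks -locations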

Working With HDFS
17:26

In this lesson, we will learn about -

  1. Data Node
  2. Name Node
  3. Information held by Name Node
  4. HDFS configuration files
  5. Topology - Node, Rack, Cluster
HDFS Architecture
12:50

Test your understanding of HDFS
6 questions
MapReduce
4 Lectures 56:14

In this lesson we will learn MapReduce using a good illustrative example. You will not be bored with the Word Count problem, we promise! This lesson covers the following -

  1. The basics of MapReduce
  2. Introduction to the phases of MapReduce
  3. Introduction to technical terms like Mapper, Reducer, InputSplit etc.
Introduction To MapReduce
08:51

In this lesson we will -

  1. Dive deeper into each phase of MapReduce
  2. Learn the difference between an InputSplit and a Block
  3. Understand the significance of the Shuffle phase
  4. Look at the Partitioner, Combiner, etc. (a small sketch follows below)
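
To give a flavor of these components, here is a minimal, hypothetical Java sketch of a custom Partitioner; the class name and key/value types are assumptions, not taken from the course code. A Combiner, by contrast, is usually just a Reducer class registered on the job with job.setCombinerClass(...).

  import org.apache.hadoop.io.FloatWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapreduce.Partitioner;

  // Hypothetical sketch: decide which reducer receives each stock symbol
  public class SymbolPartitioner extends Partitioner<Text, FloatWritable> {
      @Override
      public int getPartition(Text key, FloatWritable value, int numPartitions) {
          // Records with the same symbol always land on the same reducer
          return (key.toString().hashCode() & Integer.MAX_VALUE) % numPartitions;
      }
  }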


Dissecting MapReduce Components
18:05

In this lesson we will write a MapReduce program in Java to calculate the maximum closing price for each stock symbol from a stocks dataset. We will walk through every single line of code and understand the programming concepts involved in writing MapReduce code.

Location of code, jar, readme file in cluster - /hirw-starterkit/mapreduce/stocks
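
As a taste of the map side, here is a hypothetical sketch of such a mapper. The class name and column positions are assumptions; they presume a comma-separated layout like exchange,symbol,date,open,high,low,close,volume, which may differ from the dataset used in the course.

  import java.io.IOException;

  import org.apache.hadoop.io.FloatWritable;
  import org.apache.hadoop.io.LongWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapreduce.Mapper;

  // Hypothetical sketch: emit (symbol, closing price) for every input record
  public class MaxClosePriceMapper
          extends Mapper<LongWritable, Text, Text, FloatWritable> {

      @Override
      public void map(LongWritable key, Text value, Context context)
              throws IOException, InterruptedException {
          String[] fields = value.toString().split(",");
          String symbol = fields[1];                      // stock symbol (assumed column)
          float closePrice = Float.parseFloat(fields[6]); // closing price (assumed column)
          context.write(new Text(symbol), new FloatWritable(closePrice));
      }
  }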

Dissecting MapReduce Program (Part 1)
12:05

In this lesson we continue writing the MapReduce program in Java to calculate the maximum closing price for each stock symbol from a stocks dataset. We will walk through every single line of code and understand the programming concepts involved in writing MapReduce code.

Location of code, jar, readme file in cluster - /hirw-starterkit/mapreduce/stocks
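
And here is a hypothetical sketch of the reduce side, which keeps the largest closing price seen for each symbol; again, the class name and type choices are assumptions rather than the course's actual code.

  import java.io.IOException;

  import org.apache.hadoop.io.FloatWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapreduce.Reducer;

  // Hypothetical sketch: for each stock symbol, output the maximum closing price
  public class MaxClosePriceReducer
          extends Reducer<Text, FloatWritable, Text, FloatWritable> {

      @Override
      public void reduce(Text key, Iterable<FloatWritable> values, Context context)
              throws IOException, InterruptedException {
          float maxClosePrice = Float.MIN_VALUE;
          for (FloatWritable value : values) {
              maxClosePrice = Math.max(maxClosePrice, value.get());
          }
          context.write(key, new FloatWritable(maxClosePrice));
      }
  }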

Dissecting MapReduce Program (Part 2)
17:13

Test your understanding of MapReduce
6 questions
Apache Pig
1 Lecture 12:05

This lesson will give you a very good introduction to Apache Pig. We will write Pig Latin instructions to calculate the maximum closing price for each stock symbol from a stocks dataset.

Location of pig script, readme file in cluster - /hirw-starterkit/pig/stocks
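
For a flavor of Pig Latin, here is a minimal sketch of such a script; the input path, field names and types are assumptions and may not match the script shipped in the cluster.

  -- Hypothetical sketch: maximum closing price per stock symbol
  stocks        = LOAD '/user/hirw/input/stocks' USING PigStorage(',')
                  AS (exchange_name:chararray, symbol:chararray, trade_date:chararray,
                      open:float, high:float, low:float, close:float, volume:long);
  grouped       = GROUP stocks BY symbol;
  max_by_symbol = FOREACH grouped GENERATE group AS symbol, MAX(stocks.close) AS max_close;
  DUMP max_by_symbol;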

Introduction to Apache Pig
12:05
Apache Hive
1 Lecture 08:28

This lesson will give you a very good introduction to Apache Hive. We will create a Hive table and calculate the maximum closing price for each stock symbol from a stocks dataset with a simple query.

Location of Hive script in cluster - /hirw-starterkit/pig/stocks
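
For a flavor of what this lesson builds toward, here is a minimal HiveQL sketch; the table name, columns, types and data location are assumptions, not the course's actual DDL.

  -- Hypothetical sketch: external table over a comma-separated stocks dataset
  CREATE EXTERNAL TABLE IF NOT EXISTS stocks (
      exchange_name STRING,
      symbol        STRING,
      trade_date    STRING,
      open          FLOAT,
      high          FLOAT,
      low           FLOAT,
      close         FLOAT,
      volume        BIGINT)
  ROW FORMAT DELIMITED
  FIELDS TERMINATED BY ','
  LOCATION '/user/hirw/input/stocks';

  -- Maximum closing price per stock symbol
  SELECT symbol, MAX(close) AS max_close
  FROM stocks
  GROUP BY symbol;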

Introduction to Apache Hive
08:28

Test your understanding of Pig & Hive
3 questions
Hadoop Administrator In Real World (Upcoming Course)
2 Lectures 37:15

In this lecture we will learn about the benefits of Cloudera Manager, the differences between Packages and Parcels, and the lifecycle of Parcels.

Cloudera Manager - Introduction
13:08

In this lecture we will see how to install a 3-node Hadoop cluster on AWS using Cloudera Manager.

Cloudera Manager - Installation
24:07
Our Hadoop Developer course
1 Lecture 06:28
BONUS: Hadoop In Real World Course: Become an Expert Hadoop Developer
06:28
About the Instructor
Hadoop In Real World
4.5 Average rating
6,661 Reviews
60,027 Students
3 Courses
Expert Big Data Consultants

We are a group of Senior Hadoop Consultants who are passionate about Hadoop and Big Data technologies. We have experience across several key domains from finance and retail to social media and gaming. We have worked with Hadoop clusters ranging from 50 all the way to 800 nodes.

We have been teaching Hadoop for several years now. Check out our FREE and successful Hadoop Starter Kit course on Udemy.