
In this short video, we will give an overview of the course and walk through each of its sections.
In this lesson, we will learn -
In this lesson, we will -
In this lesson, we will learn -
In this lesson, we will see -
HDFS commands location in cluster - /hirw-starterkit/hdfs/commands
In this lesson, we will learn about -
In this lesson we will learn MapReduce using a good illustrative example. We promise not to bore you with the usual Word Count problem! This lesson covers the following -
In this lesson we will -
In this lesson we will write a MapReduce program in Java to calculate the maximum closing price for each stock symbol in a stocks dataset. We will walk through every line of code and understand the programming concepts involved in writing MapReduce code.
Location of code, jar, readme file in cluster - /hirw-starterkit/mapreduce/stocks
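To give a feel for the problem before opening the code on the cluster, here is a minimal sketch in plain Java that imitates the map, shuffle and reduce steps in memory. It is not the course's program and does not use the Hadoop API. The class name MaxClosePriceSketch, the method names and the record layout (symbol,date,closePrice) are assumptions made for illustration only; the real dataset may be laid out differently.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory imitation of MapReduce for the "maximum closing price" problem.
// Hypothetical sketch: names and record layout are assumptions, not the course code.
public class MaxClosePriceSketch {

    // Map step: turn one input record into a (symbol, closePrice) pair.
    // Assumed record layout: symbol,date,closePrice
    static Map.Entry<String, Double> map(String line) {
        String[] fields = line.split(",");
        return new SimpleEntry<>(fields[0].trim(), Double.parseDouble(fields[2].trim()));
    }

    // Shuffle step: group all closing prices that share the same symbol.
    static Map<String, List<Double>> shuffle(List<Map.Entry<String, Double>> pairs) {
        Map<String, List<Double>> grouped = new TreeMap<>();
        for (Map.Entry<String, Double> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reduce step: collapse the prices of one symbol into their maximum.
    static double reduce(List<Double> closePrices) {
        double max = Double.NEGATIVE_INFINITY;
        for (double price : closePrices) {
            max = Math.max(max, price);
        }
        return max;
    }

    // Runs the three steps in order and returns symbol -> maximum closing price.
    static Map<String, Double> run(List<String> lines) {
        List<Map.Entry<String, Double>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.add(map(line));
        }
        Map<String, Double> result = new TreeMap<>();
        for (Map.Entry<String, List<Double>> group : shuffle(pairs).entrySet()) {
            result.put(group.getKey(), reduce(group.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample records, for illustration only.
        List<String> lines = Arrays.asList(
                "AAA,2010-01-04,10.0",
                "BBB,2010-01-04,31.0",
                "AAA,2010-01-05,12.5",
                "BBB,2010-01-05,29.75");
        System.out.println(run(lines));
    }
}
```

In a real Hadoop job the map and reduce steps run on different nodes and the framework performs the shuffle; here everything runs in a single process so the data flow is easy to follow.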
This lesson will give you a solid introduction to Apache Pig. We will write Pig Latin instructions to calculate the maximum closing price for each stock symbol in a stocks dataset.
Location of pig script, readme file in cluster - /hirw-starterkit/pig/stocks
This lesson will give you a solid introduction to Apache Hive. We will create a Hive table and calculate the maximum closing price for each stock symbol in a stocks dataset with a simple query.
Location of Hive script in cluster - /hirw-starterkit/pig/stocks
In this lecture we will learn about the benefits of Cloudera Manager, the differences between Packages and Parcels, and the lifecycle of Parcels.
In this lecture we will see how to install a 3-node Hadoop cluster on AWS using Cloudera Manager.
The objective of this course is to walk you step by step through all the core components of Hadoop and, more importantly, to make learning Hadoop easy and fun.
By enrolling in this course you can also get free access to our multi-node Hadoop training cluster so you can try out what you learn right away in a real multi-node distributed environment.
ABOUT INSTRUCTOR(S)
We are a group of Hadoop consultants who are passionate about Hadoop and Big Data technologies. Four years ago, when we were looking for Big Data consultants to work on our own projects, we could not find qualified candidates because the big data industry was very new. So we set out to train qualified candidates in Big Data ourselves, giving them deep, real-world insight into Hadoop.
WHAT YOU WILL LEARN IN THIS COURSE
In the first section you will learn what big data is, with examples. We will discuss the factors to weigh when deciding whether a problem is a big data problem. We will talk about the challenges existing technologies face when it comes to big data computation. We will break down the Big Data problem in terms of storage and computation, and understand how Hadoop approaches the problem and provides a solution.
In the HDFS section, you will learn why another file system like HDFS is needed. We will compare HDFS with traditional file systems and discuss its benefits. We will also work with HDFS and discuss its architecture.
In the MapReduce section you will learn the basics of MapReduce and the phases involved. We will go over each phase in detail and understand what happens in it. Then we will write a MapReduce program in Java to calculate the maximum closing price for each stock symbol in a stocks dataset.
In the next two sections, we will introduce you to Apache Pig and Apache Hive. We will calculate the maximum closing price for each stock symbol in a stocks dataset, first using Pig and then using Hive.