Big Data and Hadoop for Beginners - with Hands-on!
4.2 (548 ratings)
Instead of using a simple lifetime average, Udemy calculates a course's star rating by considering a number of different factors such as the number of ratings, the age of ratings, and the likelihood of fraudulent ratings.
22,446 students enrolled
Everything you need to know about Big Data. Learn Hadoop, HDFS, MapReduce, Hive & Pig by designing a data pipeline.
Created by Andalib Ansari
Last updated 3/2017
English
Current price: $10 Original price: $200 Discount: 95% off
30-Day Money-Back Guarantee
Includes:
  • 3 hours on-demand video
  • 1 Article
  • 7 Supplemental Resources
  • Full lifetime access
  • Access on mobile and TV
  • Certificate of Completion
What Will I Learn?
  • Understand different technology trends, salary trends, Big Data market and different job roles in Big Data
  • Understand what Hadoop is for, and how it works
  • Understand the complex architecture of Hadoop and its components
  • Hadoop installation on your machine
  • Understand how MapReduce, Hive and Pig can be used to analyze big data sets
  • High quality documents
  • Demos: Running HDFS commands, Hive queries, Pig queries
  • Sample data sets and scripts (HDFS commands, Hive sample queries, Pig sample queries, Data Pipeline sample queries)
  • Start writing your own code in Hive and Pig to process huge volumes of data
  • Design your own data pipeline using Pig and Hive
  • Understand modern data architecture: Data Lake
  • Practice with Big Data sets
Requirements
  • Basic knowledge of SQL and RDBMS would be a plus
  • Machine: Mac, Linux/Unix, or Windows
Description

The main objective of this course is to help you understand the complex architecture of Hadoop and its components, point you in the right direction to get started, and get you working with Hadoop and its components quickly.

It covers everything you need as a Big Data beginner. Learn about the Big Data market, different job roles, technology trends, the history of Hadoop, HDFS, the Hadoop Ecosystem, Hive and Pig. In this course, we will see how a beginner should get started with Hadoop. The course comes with a lot of hands-on examples that will help you learn Hadoop quickly.

The course has 6 sections and focuses on the following topics:

Big Data at a Glance: Learn about Big Data and the different job roles required in the Big Data market. Know the Big Data salary trends around the globe. Learn about the hottest technologies and their trends in the market.

Getting Started with Hadoop: Understand Hadoop and its complex architecture. Learn the Hadoop Ecosystem with simple examples. Learn about the different versions of Hadoop (Hadoop 1.x vs Hadoop 2.x), the different Hadoop vendors in the market, and Hadoop on Cloud. Understand how Hadoop uses the ELT approach. Learn to install Hadoop on your machine. We will run HDFS commands from the command line to manage HDFS.

Getting Started with Hive: Understand what kind of problem Hive solves in Big Data. Learn its architectural design and working mechanism. Learn about the data models in Hive, the different file formats supported by Hive, Hive queries, and more. We will run queries in Hive.

Getting Started with Pig: Understand how Pig solves problems in Big Data. Learn its architectural design and working mechanism. Understand how Pig Latin works in Pig and how it differs from SQL. Includes demos of running different queries in Pig.

Use Cases: Real-life applications are important for understanding Hadoop and its components, so we will learn by designing a sample data pipeline in Hadoop to process big data. Also, understand how companies are adopting a modern data architecture, i.e. the Data Lake, in their data infrastructure.

Practice: Practice with huge data sets. Learn design and optimization techniques by building data models and data pipelines on data sets from real-life applications.

Check out some of our reviews from real students:

"I liked the hands-on approach. very helpful."

"Overall definitely worth the money for what you get, I learnt so much about Big Data."

"I absolutely recommend taking this course."

"Loved it. Saved lots of time searching information on the internet."

"Very informative, and the course gave me what I was looking for. Thanks!"

"Big Data introduction can be daunting with several new keywords and components that one needs to understand. But, this course very clearly explains to a beginner about the architecture and different tools that can be leveraged in a big data project. It also has indications on the scope of big data in the industry, different roles one can perform in the big data space and also cover various commercial distributions of big data. Overall, a great course for a beginner to get started on the fundamentals of big data. Use Case is a bonus !"

Who is the target audience?
  • This course is for anyone (students, developers, managers) who is interested in learning big data. It treats everyone as a beginner and teaches all the fundamentals of Big Data, Hadoop and its complex architecture.
Curriculum For This Course
31 Lectures
03:07:55
Welcome to the Course
1 Lecture 03:49

A brief introduction to the course, and what you need to get started.

Preview 03:49
Big Data at a Glance
5 Lectures 28:03

This high-level introduction will help you understand what Big Data is, how it is generated, who is using it, and how we can use it.

Introduction to Big Data
09:23

This lecture discusses the different job roles required in the Big Data industry. It will also help you understand the skills you need for a specific job role in Big Data.

Job Roles in Big Data
06:30

Understand salary trends across different job roles in Big Data.

Salary Analysis
02:55

Understand why Big Data is so disruptive. Learn about the latest technology trends in the market, and how Big Data plays an important role in them.

Technology Trends in the Market
06:30

This lecture explains how a beginner should get started with Big Data, and how to proceed from there.

Advice for Big Data Beginners
02:45
Getting Started with Hadoop
8 Lectures 49:36

In this lecture, we will learn about the history of Hadoop, the Hadoop data storage engine, and the Hadoop data processing engine, with a simple demo to show how it works.

Introduction to Hadoop
08:23
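
As a rough illustration of how a MapReduce-style processing engine works, the sketch below counts words in plain Python. It is not the Hadoop API and is not taken from the course; the function names (map_phase, shuffle_phase, reduce_phase, word_count) and the sample lines are assumptions made for this example. It only mimics, on a single machine, the map, shuffle, and reduce steps that Hadoop distributes across a cluster.

```python
from collections import defaultdict


def map_phase(lines):
    """Map step: emit a (word, 1) pair for every word in every line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle_phase(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce step: combine the values of each key into a single count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(lines):
    """Chain the three steps, as a MapReduce job would (here on one machine)."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


if __name__ == "__main__":
    # Hypothetical sample input, standing in for lines of a file stored in HDFS.
    sample = ["big data is big", "hadoop stores big data"]
    print(word_count(sample))
```

In a real cluster, the map and reduce steps run in parallel on the nodes that hold the data, and the framework handles the shuffle between them.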

In this lecture, we will look at the components of the Hadoop Ecosystem and how they work with each other.

Hadoop Ecosystem
05:01

Here we will learn about the architecture of Hadoop and its different versions (i.e. Hadoop 1.x and Hadoop 2.x), and understand the enhancements and improvements made in Hadoop 2.x over Hadoop 1.x.

Hadoop 1.x vs Hadoop 2.x
14:13

Data processing, cleaning and transformation are important steps when dealing with any amount of data. In this lecture, we will understand how Hadoop uses the ELT approach, compared with the traditional ETL approach.

ETL vs ELT
03:19
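
To make the difference between ETL and ELT concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not course material: the record fields (city, fare) and the functions clean, etl, and elt are hypothetical. In ETL, data is transformed before it is loaded; in ELT, raw data is loaded as-is and the transformation is applied later, when the data is read.

```python
def clean(record):
    """Hypothetical transformation: trim and title-case the city, parse the fare."""
    return {"city": record["city"].strip().title(), "fare": float(record["fare"])}


def etl(raw_records):
    """ETL: transform first, then load only the cleaned rows into the target."""
    warehouse = [clean(r) for r in raw_records]
    return warehouse


def elt(raw_records):
    """ELT: load the raw rows untouched, and transform them when queried."""
    data_store = list(raw_records)  # raw data is kept exactly as it arrived

    def query():
        # The transformation runs at read time, on top of the stored raw data.
        return [clean(r) for r in data_store]

    return data_store, query


if __name__ == "__main__":
    raw = [{"city": " mumbai ", "fare": "12.5"}]
    print(etl(raw))
    store, query = elt(raw)
    print(store)
    print(query())
```

Keeping the raw rows, as the ELT variant does, means a different transformation can be applied later without reloading the source data.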

Various Hadoop distributions are available from different vendors in the market. We will briefly cover them, and understand how they make Hadoop easier to use and move to production.

Different Hadoop Vendors
04:20

A simple guide to installing Hadoop on your machine (Mac/Windows/Others).

Hadoop Installation
14 pages

We will learn important Hadoop commands for working with HDFS. This will help you when you are doing POCs or working in production.

Preview 09:09

In this lecture, we will learn the benefits of working with Hadoop on the cloud, and how easy it is to install, manage and scale there.

Hadoop on Cloud
05:11
Getting Started with Hive
7 Lectures 43:17

We will briefly cover how Hive is used to process large volumes of data, and how it works.

Introduction to Hive
02:41

A deep dive into Hive, where we will learn about the Hive architecture and how it works internally.

Hive Architecture
02:28

Understand Hive Data Models with detailed explanations which you would need to know when you start working with Hive.

Hive Data Model
07:55

We will briefly cover the different file formats that Hive understands, and compare them.

File Formats in Hive (Text, Parquet, RCFile, ORC)
04:40

Hive is a data warehouse solution built on top of Hadoop. In this lecture, we will learn how Hive queries are similar to SQL, and how easy it is to write a query in Hive.

SQL vs HQL
03:46

Understand how we can build custom functions in Hive to process huge volumes of data.

UDF & UDAF in Hive
02:57

A very nice demo on Hive to understand how Hive works on top of Hadoop. Different exercises for you to play with Hive.

Hive Demo
18:50
Getting Started with Pig
7 Lectures 31:37

A high-level introduction to Pig, which is built on top of Hadoop to process huge volumes of data.

Introduction to Pig
02:57

A deep dive into the Pig architecture.

Pig Architecture
01:39

Learn about Data Models in Pig which you would need to know when you are starting to work with Pig.

Pig Data Model
02:17

We will cover Pig Latin, the data flow language in Pig that is used to design data pipelines for processing big data.

How Pig Latin Works
02:57

Understand the similarities and differences between SQL and Pig Latin.

SQL vs PIG
05:32

In this lecture, we will learn what UDF is, and how it can be used to design custom functions to process Big data.

UDF in Pig
03:26

A demo on Pig to understand how it is used to process huge volumes of data. Includes lots of exercises for you to play with Pig.

Pig Demo
12:49
Use Cases
2 Lectures 13:23

In this lecture, we will cover real-life applications of Pig and Hive, and understand them by designing a data pipeline that uses both.

Designing Data Pipeline using Pig and Hive
07:59

Understand how various organizations are adopting a modern data architecture (i.e. the Data Lake) in production.

Data Lake
05:24
Practice
1 Lecture 04:20

In this exercise, we will analyze Taxi Trips data by designing a data warehouse using Hive. The tables contain billions of rows to analyze. By doing this exercise, you will learn:

  • Designing Optimized Data Model in Hive
  • Query Optimization Techniques 
  • ETL process to load data into Dimension and Fact Tables
  • Automated Data Pipeline Techniques and much more
Practice-1: Analyzing Taxi Trips Data
04:20
About the Instructor
Andalib Ansari
4.2 Average rating
546 Reviews
22,446 Students
1 Course
Big Data Consultant

Andalib Ansari is a Big Data consultant based out of Mumbai. He helps companies and people solve business problems using Big Data technologies. He is also passionate about guiding and training people on different Big Data tools and technologies.

He has solid exposure to Big Data tools and technologies, and has worked with various clients, including top-level Mobile Network Operators (MNOs) from Latin America and the US, to solve business problems across different use cases. He has also designed optimized data pipelines using Big Data technologies on the cloud.