Complete Big Data & Hadoop Masterclass
5.0 (2 ratings)
17 students enrolled
Become a master of Big Data, a skill for the future job market
Last updated 6/2017
English
Curiosity Sale
Current price: $10 Original price: $25 Discount: 60% off
30-Day Money-Back Guarantee
Includes:
  • 5 hours on-demand video
  • Full lifetime access
  • Access on mobile and TV
  • Certificate of Completion
What Will I Learn?
  • Hadoop, MapReduce, HDFS, Spark, Pig, Hive, HBase, MongoDB, Cassandra, Flume - the list goes on! Over 25 technologies.
Requirements
  • An interest in learning and a willingness to face interesting challenges is enough!
Description

The world of Hadoop and "Big Data" can be intimidating - hundreds of different technologies with cryptic names form the Hadoop ecosystem. With this course, you'll not only understand what those systems are and how they fit together - but you'll go hands-on and learn how to use them to solve real business problems!

Learn and master the most popular big data technologies in this comprehensive course, taught by a former engineer and senior manager from Amazon and IMDb. We'll go way beyond Hadoop itself, and dive into all sorts of distributed systems you may need to integrate with.

  • Install and work with a real Hadoop installation right on your desktop with Hortonworks and the Ambari UI
  • Manage big data on a cluster with HDFS and MapReduce
  • Write programs to analyze data on Hadoop with Pig and Spark
  • Store and query your data with Sqoop, Hive, MySQL, HBase, Cassandra, MongoDB, Drill, Phoenix, and Presto
  • Design real-world systems using the Hadoop ecosystem
  • Learn how your cluster is managed with YARN, Mesos, Zookeeper, Oozie, Zeppelin, and Hue
  • Handle streaming data in real time with Kafka, Flume, Spark Streaming, Flink, and Storm

Understanding Hadoop is a highly valuable skill for anyone working at companies with large amounts of data.

Almost every large company you might want to work at uses Hadoop in some way, including Amazon, eBay, Facebook, Google, LinkedIn, IBM, Spotify, Twitter, and Yahoo! And it's not just technology companies that need Hadoop; even the New York Times uses Hadoop for processing images.

This course is comprehensive, covering over 25 different technologies in 5 hours of video lectures. It's filled with hands-on activities and exercises, so you get some real experience in using Hadoop - it's not just theory.

You'll find a range of activities in this course for people at every level. If you're a project manager who just wants to learn the buzzwords, there are web UIs for many of the activities in the course that require no programming knowledge. If you're comfortable with command lines, we'll show you how to work with them too. And if you're a programmer, I'll challenge you with writing real scripts on a Hadoop system using Scala, Pig Latin, and Python.

You'll walk away from this course with a real, deep understanding of Hadoop and its associated distributed systems, and you can apply Hadoop to real-world problems. Plus a valuable completion certificate is waiting for you at the end! 

Please note that the focus of this course is on application development, not Hadoop administration, although you will pick up some administration skills along the way.

I hope to see you in the course soon!

What are the requirements?

  • You will need access to a PC running 64-bit Windows, macOS, or Linux with an Internet connection if you want to participate in the hands-on activities and exercises. You must have at least 8 GB of RAM on your system; 10 GB or more is recommended. If your PC does not meet these requirements, you can still follow along in the course without doing the hands-on activities.
  • Some activities will require some prior programming experience, preferably in Python or Scala.
  • A basic familiarity with the Linux command line will be very helpful.

What am I going to get from this course?

  • Design distributed systems that manage "big data" using Hadoop and related technologies.
  • Use HDFS and MapReduce for storing and analyzing data at scale.
  • Use Pig and Spark to create scripts to process data on a Hadoop cluster in more complex ways.
  • Analyze relational data using Hive and MySQL
  • Analyze non-relational data using HBase, Cassandra, and MongoDB
  • Query data interactively with Drill, Phoenix, and Presto
  • Choose an appropriate data storage technology for your application
  • Understand how Hadoop clusters are managed by YARN, Tez, Mesos, Zookeeper, Zeppelin, Hue, and Oozie.
  • Publish data to your Hadoop cluster using Kafka, Sqoop, and Flume
  • Consume streaming data using Spark Streaming, Flink, and Storm

What is the target audience?

  • Software engineers and programmers who want to understand the larger Hadoop ecosystem, and use it to store, analyze, and vend "big data" at scale.
  • Project, program, or product managers who want to understand the lingo and high-level architecture of Hadoop.
  • Data analysts and database administrators who are curious about Hadoop and how it relates to their work.
  • System architects who need to understand the components available in the Hadoop ecosystem, and how they fit together.

-Rakesh N Chinta

Who is the target audience?
  • Anyone who wants to learn: Hadoop, MapReduce, HDFS, Spark, Pig, Hive, HBase, MongoDB, Cassandra, Flume - the list goes on! Over 25 technologies.
Curriculum For This Course
46 Lectures
05:00:28
Introduction to Hadoop (5 Lectures, 26:37)
  • Hadoop and other solutions (07:25)
  • Distributed and architecture - An overview (02:54)
  • Hadoop versions and releases (05:16)
Hadoop Ecosystem setup (4 Lectures, 51:51)
  • Set up Hadoop (28:57)
  • Linux Ubuntu - Tips and Tricks (04:34)
  • HDFS commands (10:32)
  • Running a MapReduce function (07:48)
The Hadoop architecture (5 Lectures, 26:42)
  • HDFS concepts (04:35)
  • HDFS architecture (06:35)
  • HDFS read and write (04:54)
  • HDFS concepts and applications (04:04)
  • Special commands (06:34)
Understanding Hadoop MapReduce (7 Lectures, 53:24)
  • MapReduce introduction (06:05)
  • Understanding MapReduce (05:12)
  • Understanding MapReduce part 2 (05:19)
  • Running your first MapReduce function (10:31)
  • Combiner and Tool Runner (11:05)
  • Recap Map, Reduce and architecture part 1 (07:27)
  • Recap Map, Reduce and architecture part 2 (07:45)
MapReduce types and formats (4 Lectures, 22:42)
  • MapReduce types and formats (05:37)
  • Experiments and Defaults (07:11)
  • IO Format Classes (06:16)
  • Experiments with file Concepts (03:38)
Classic MapReduce and YARN (8 Lectures, 47:52)
  • Anatomy of MapReduce Job Run (04:22)
  • Job run classic MapReduce (07:54)
  • Failure scenario of MapReduce (03:45)
  • Job Run and YARN (09:45)
  • Failure Scenario of YARN (05:18)
  • Job scheduling in MapReduce (05:06)
  • Shuffle and sort (04:32)
  • Performance tuning and features (07:10)
Advanced MapReduce Concepts (8 Lectures, 38:05)
  • Looking at counters (06:21)
  • Hands on counter (03:32)
  • Sorting ideas with partition function (07:19)
  • Sorting ideas with partition function - continuation (05:31)
  • Map side join operation (04:42)
  • Reduce side join operation (04:29)
  • Side distribution of the data (03:47)
  • Hadoop Streaming and Hadoop Pipes (02:24)
Introduction to the Hadoop ecosystem (4 Lectures, 32:19)
  • Introduction to Pig (09:24)
  • Introduction to Hive (10:07)
  • Introduction to Sqoop (08:43)
  • Knowing Sqoop (04:05)
Final Frontier and Certification / Exam Preparation / Tips and strategies (1 Lecture, 00:56)
  • Final Frontier and Certification / Exam Preparation / Tips and strategies (00:56)
About the Instructor
Rakesh Naga Chinta
5.0 Average rating
2 Reviews
19 Students
2 Courses
CEO of Hubstrike, Entrepreneur, Intern at Google, Author.


Rakesh Naga Chinta is an entrepreneur, SDE intern at Google, strategic business analyst, and author of several best-selling books.

A Harvard alumnus with a burning passion for problem-solving and entrepreneurship.

He currently works as a product management intern at Google, where his skills are tested and sharpened every single day.

He is the CEO and founder of several companies, including RILAD, Hubstrike, Internetout, Techcodebit, Appscalar, Midscore, and Appendscore.