Complete Hadoop Framework including kafka,spark and mongo db
4.1 (120 ratings)
924 students enrolled

Complete hands-on learning of the Hadoop framework and its ecosystem, including advanced topics such as Apache Spark and Kafka
Last updated 3/2017
Current price: $10 Original price: $20 Discount: 50% off
30-Day Money-Back Guarantee
  • 18 hours on-demand video
  • 18 Supplemental Resources
  • Full lifetime access
  • Access on mobile and TV
  • Certificate of Completion
What Will I Learn?
  • Importance of the Hadoop framework in big data analytics
  • Understanding the Hadoop framework in detail
  • Hands-on experience with data ingestion techniques: Apache Sqoop and Apache Flume
  • Hands-on experience with MapReduce programming and its lesser-known concepts
  • Hands-on experience with Apache Hive programming, performance tuning, and UDFs
  • Understand and work with Pig
  • Real-time data streaming analysis with Apache Spark and its ecosystem
  • Understand and work with Apache Kafka
  • Process workflow automation using Oozie
  • Understand and work with MongoDB
  • Case studies, practical explanations, and interview questions
Requirements
  • Be familiar with SQL concepts and programming basics
  • Download the Cloudera QuickStart VM (CDH 5.8) and install VMware Workstation Player. Environment setup guidance is covered in the lectures.

Data analytics is the practice of using data to drive business strategy and performance. It includes a range of approaches and solutions, from looking backward to evaluate what happened in the past to looking forward to do scenario planning and predictive modelling. Data analytics spans all functional areas of a business to address a continuum of opportunities in information management, performance optimisation, and analytic insights. Organizations now realize the inherent value of transforming big data into actionable insights. Data science is the highest form of big data analytics: it produces the most accurate actionable insights, identifying what will happen next and what to do about it.

Hadoop is an open-source software framework for storing data and running applications on clusters of commodity hardware. It provides massive storage for any kind of data, enormous processing power and the ability to handle virtually limitless concurrent tasks or jobs. Hadoop is not just an effective distributed storage system for large amounts of data, but also, importantly, a distributed computing environment that can execute analyses where the data is.

This course provides a detailed explanation of the Hadoop framework and its ecosystem. All concepts are explained in detail with examples, and business use cases are presented as case studies. Newer big data technologies such as Apache Spark, Apache Kafka, and MongoDB are also covered. In addition, interview questions for each ecosystem component and resume preparation tips are included.

Who is the target audience?
  • This course is aimed at students who have some prior knowledge of programming and SQL concepts.
  • Anyone interested in pursuing a career as a Hadoop developer
Curriculum For This Course
90 Lectures
Course Introduction
1 Lecture 02:27

This video gives an overview of the course and the topics it covers, along with a brief note on the trainer's profile and experience.

Preview 02:27
BigData Introduction
3 Lectures 39:16

This video introduces big data in detail: its challenges and its sources, with real-world example scenarios.

Big Data Introduction and Sources of Big Data

This video introduces Hadoop, the different roles in Hadoop, and the distributors of Hadoop. A real-world example is discussed to demonstrate the roles.

Hadoop Introduction

This video gives an overview of the Hadoop ecosystem, introducing each component and its importance in the big data stack.

Hadoop Ecosystems Overview
HDFS Architecture
4 Lectures 52:24

This video explains distributed architecture, how HDFS addresses the big data storage problem, and the daemon services of the Hadoop 1 architecture.

HDFS Architecture1

Continuing from the previous video, this video explains the HDFS architecture in detail.

HDFS Architecture2

Continuing from the previous video, this video explains in detail the concepts of edge nodes and cluster nodes, and the responsibilities of the JobTracker and the NameNode.

Preview 16:00

This video covers the disadvantages of the Hadoop 1 architecture and introduces the YARN architecture and its daemon services. It also explains in detail how Hadoop 2 overcomes the limitations of Hadoop 1.

Hadoop2 - YARN Architecture
Environment Setup and Hadoop Linux Commands
3 Lectures 47:53

This video explains how to set up Hadoop in pseudo-distributed mode and lists the software to download for the Hadoop QuickStart virtual environment.

Environment Setup and Hadoop ecosystems

This video covers the Linux commands used to interact with HDFS. With these basic commands, a user can store big data in HDFS and apply business logic to the data held there.

Hadoop Linux commands

This video shows how to connect remotely to a cluster node or edge node from a Windows desktop using putty.exe, and how to transfer files from a Windows machine to a DataNode using WinSCP.

Remote Desktop connection to cluster node via PuTTY and file transfer via WinSCP
1 Lecture 02:00

This quiz contains questions on the HDFS architecture and the daemon services in Hadoop 1 and Hadoop 2. Students can use it to check their understanding of the HDFS architecture discussed in the previous sections.

HDFS Architecture Quiz
20 questions

This video explains in detail how Hadoop handles NameNode failure.

HDFS high availability
Data Ingestion Using Apache Sqoop and Apache Flume
8 Lectures 01:48:11
Data Ingestion from local to hdfs

This video explains the second approach: ingesting data from a remote machine to an edge node or cluster node, using the SFTP protocol on Linux and WinSCP on Windows.

Data Ingestion from remote machine to edge node or clusternode

This video provides hands-on experience with data ingestion from an RDBMS (a MySQL database) to HDFS.

Preview 20:31

This video provides a practical demo of the incremental append scenario in Sqoop.

Incremental Append in sqoop
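
As a rough illustration of the incremental append idea, the sketch below simulates it in plain Python rather than calling Sqoop itself. In Sqoop, an append-mode incremental import picks up only rows whose check column is greater than the last imported value. The function name, the `orders` rows, and the `id` column are hypothetical and chosen only for this example.

```python
def incremental_append(source_rows, check_column, last_value):
    """Simulate one incremental-append run.

    Returns the rows whose check column is greater than last_value,
    plus the value to remember for the next run.
    """
    new_rows = [row for row in source_rows if row[check_column] > last_value]
    # If nothing new arrived, keep the previous last value.
    new_last_value = max((row[check_column] for row in new_rows), default=last_value)
    return new_rows, new_last_value


# Hypothetical source table with an auto-incrementing id column.
orders = [
    {"id": 1, "item": "pen"},
    {"id": 2, "item": "ink"},
    {"id": 3, "item": "pad"},
]

# First run: everything with id > 0 is imported.
first_import, last_value = incremental_append(orders, "id", 0)

# A new row arrives in the source; the second run imports only that row.
orders.append({"id": 4, "item": "cap"})
second_import, last_value = incremental_append(orders, "id", last_value)
```

Keeping the last value between runs is what lets each run skip rows that were already imported.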

EnclosedBY and escapedBy in sqoop

This video gives a practical demonstration of Sqoop commands for tasks such as running queries, importing specific columns, and importing all the tables from a database.

Sqoop Commands and other attributes

This video introduces Apache Flume: its components, its architecture, and the properties it uses.

Apache Flume Introduction

This video provides a practical demo of ingesting streaming data from an external source folder into HDFS using the spoolDir source property in Flume.

Apache Flume Demo
Apache Hive
17 Lectures 03:07:06

This video introduces Hive and shows how to create managed tables and load data into them.

Hive Introduction and Managed Tables

This video provides a hands-on demonstration of creating external tables in Hive.

External Tables in Hive

This video provides a diagrammatic explanation of the Hive architecture and its components. It also shows how to run Hive queries in GUI mode using Hue.

Hive Architecture

This video provides a hands-on demo of partitioning data in Hive and the advantages of doing so. It also discusses the types of partitioning available in Hive.

Hive Partitioning
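
To make the partitioning idea concrete, here is a minimal Python sketch (not Hive code) of how a partitioned table is laid out: each combination of partition-column values gets its own `column=value` directory, so a filter on a partition column only has to read the matching directories. The table path `/warehouse/sales`, the column names, and the helper functions are assumptions made up for this illustration.

```python
from collections import defaultdict


def partition_path(table_dir, partition_columns, row):
    """Build a Hive-style partition directory such as .../country=IN/year=2016."""
    parts = [f"{col}={row[col]}" for col in partition_columns]
    return "/".join([table_dir] + parts)


def write_partitioned(table_dir, partition_columns, rows):
    """Group rows by partition directory.

    The partition columns are encoded in the path, so they are dropped
    from the stored row data.
    """
    layout = defaultdict(list)
    for row in rows:
        path = partition_path(table_dir, partition_columns, row)
        data = {k: v for k, v in row.items() if k not in partition_columns}
        layout[path].append(data)
    return dict(layout)


def prune(layout, column, value):
    """Return only the directories a filter on a partition column must read."""
    return [path for path in layout if f"{column}={value}" in path.split("/")]


# Hypothetical sales rows, partitioned by country and year.
sales = [
    {"order_id": 1, "amount": 250, "country": "IN", "year": 2016},
    {"order_id": 2, "amount": 100, "country": "US", "year": 2016},
    {"order_id": 3, "amount": 75, "country": "IN", "year": 2017},
]
layout = write_partitioned("/warehouse/sales", ["country", "year"], sales)
india_dirs = prune(layout, "country", "IN")
```

In this sketch, a query restricted to `country = 'IN'` touches two of the three directories and never reads the US data.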

This video provides a detailed hands-on demo of dividing data into buckets and the properties that must be set to load data into bucketed tables.

Hive Bucketing
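
The bucketing idea can be sketched in a few lines of Python. This is a simplified model, not Hive's implementation: it assumes an integer bucketing column whose hash is the value itself, and assigns each row to bucket `hash mod number_of_buckets`. The function names and the sample `users` rows are hypothetical.

```python
def bucket_for(key, num_buckets):
    """Pick a bucket for an integer key: non-negative hash modulo bucket count."""
    # Masking keeps the hash non-negative before taking the modulo.
    return (key & 0x7FFFFFFF) % num_buckets


def assign_buckets(rows, column, num_buckets):
    """Distribute rows across a fixed number of buckets by the bucketing column."""
    buckets = [[] for _ in range(num_buckets)]
    for row in rows:
        buckets[bucket_for(row[column], num_buckets)].append(row)
    return buckets


# Hypothetical table bucketed on user_id into 3 buckets.
users = [{"user_id": i} for i in range(1, 7)]
buckets = assign_buckets(users, "user_id", 3)
bucket_ids = [[row["user_id"] for row in bucket] for bucket in buckets]
# bucket_ids is [[3, 6], [1, 4], [2, 5]]
```

Because the same key always lands in the same bucket, two tables bucketed the same way on the join column can be joined bucket by bucket.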

Certain Hive features are disabled by default and must be enabled by setting properties. For example, dynamic partitioning and loading data into bucketed tables both need extra properties to be set. This video covers them.

SET properties in hive

Input datasets can also be XML documents. This demo shows how to process XML documents in Hive.

Xml parsing in hive

This video shows how to process JSON files in Hive, with a detailed example.

Json file processing in hive

Besides the Hive CLI, which is deprecated, the Beeline shell can be used to connect to the Hive server. This video gives a hands-on demonstration of connecting with Beeline and using it.

Beeline Mode in hive

Hive supports various file formats, which differ in the compression techniques used and the resulting size. This video explains these file formats in detail.

Various File Formats in Hive (Text,RC,ORC,Sequence)

This video demonstrates the various Hive file formats with an example.

Demo for File formats in Hive

This video provides a hands-on demonstration of complex data types in Hive, such as structs, unions, arrays, and maps.

Complex data types in Hive

This video explains the properties that must be set to enable update and delete operations in Hive.

Update and delete operations in hive

Hive converts joins over multiple tables into a single MapReduce job if the same column is used in the join clauses for every table. This video demonstrates the map-side join in Hive.

Hive Joins
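
The map-side join mentioned above can be pictured with a short Python sketch. It is an illustration of the general technique, not Hive's code: the small table is loaded into an in-memory lookup, and the large table is streamed past it, so no shuffle or reduce step is needed. The table contents and the function name are hypothetical.

```python
def map_side_join(big_rows, small_rows, key):
    """Inner-join two tables by holding the small one in memory.

    Each row of the big table is matched against an in-memory lookup
    built from the small table, the way each mapper would do it.
    """
    lookup = {}
    for row in small_rows:
        lookup.setdefault(row[key], []).append(row)

    for row in big_rows:
        for match in lookup.get(row[key], []):
            # Merge the matching small-table row with the big-table row.
            yield {**match, **row}


# Hypothetical tables: a large fact table and a small dimension table.
orders = [
    {"order_id": 1, "dept_id": 10},
    {"order_id": 2, "dept_id": 20},
    {"order_id": 3, "dept_id": 99},  # no matching department, so it is dropped
]
departments = [
    {"dept_id": 10, "dept_name": "books"},
    {"dept_id": 20, "dept_name": "music"},
]
joined = list(map_side_join(orders, departments, "dept_id"))
```

The approach only pays off when one side is small enough to fit in memory on every mapper.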

Hive Join Demo2

This video demonstrates how to execute Hive scripts.

Hive UDFs

Performance tuning techniques in Apache Hive
Quiz 2
0 Lectures 00:00

Self-assessment quiz on the Sqoop, Flume, and Hive ecosystems

Quiz on Apache sqoop, Flume and Hive ecosystems
18 questions
Apache Pig
7 Lectures 01:26:39

This video explains the importance of Pig for data cleansing operations in Hadoop and introduces the basic commands to start with.

Pig Introduction

Working with pig Commands

This video explains the usage of the GROUP and COGROUP commands in Apache Pig.

Group and CoGroup in PIG
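
The difference between the two commands can be sketched in Python (this is not Pig Latin). GROUP collects the tuples of one relation into a bag per key, while COGROUP groups two relations by key and returns one bag per relation, leaving a bag empty when that relation has no tuples for the key. The relations and function names are hypothetical, and the results are sorted here only to make the output deterministic.

```python
from collections import defaultdict


def group(relation, key_index):
    """Collect the tuples of one relation into a bag per key."""
    bags = defaultdict(list)
    for record in relation:
        bags[record[key_index]].append(record)
    return sorted(bags.items())


def cogroup(left, left_key, right, right_key):
    """Group two relations by key, returning (key, left_bag, right_bag)."""
    left_bags = defaultdict(list)
    right_bags = defaultdict(list)
    for record in left:
        left_bags[record[left_key]].append(record)
    for record in right:
        right_bags[record[right_key]].append(record)
    keys = sorted(set(left_bags) | set(right_bags))
    # A key missing from one relation gets an empty bag on that side.
    return [(k, left_bags.get(k, []), right_bags.get(k, [])) for k in keys]


# Hypothetical relations: (name, dept_id) and (dept_id, dept_name).
students = [("alice", 1), ("bob", 2), ("carol", 1)]
depts = [(1, "eng"), (3, "ops")]

grouped = group(students, 1)
cogrouped = cogroup(students, 1, depts, 0)
```

Here `cogrouped` has an entry for department 3 with an empty student bag, which a plain GROUP on `students` would never produce.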

SPLIT command in PIG

This video explains in detail the FILTER, JOIN, RANK, FLATTEN, ORDER BY, and DISTINCT commands in Pig.


In this video you will learn how to create and execute a Pig script.

Executing a pigScript

In this video you will learn how to work with user-defined functions in Pig.

Working with pigUDFs
Core Java Programming
4 Lectures 38:13

This video introduces the basic building blocks of programming and core Java concepts.

Introduction to core java programming and its importance in hadoop

The basic building blocks of core Java programming are explained in detail in this video, along with the use of the Eclipse IDE.

Java Programming basics

This video explains inheritance, polymorphism, abstraction, and encapsulation in Java with a sample hands-on demo. Interfaces are also explained with an example program.

Object Oriented Programming Features in Java

Access Specifiers, final and static Keywords, and Exception Handling in Java
7 More Sections
About the Instructor
Srikanth Gorripati
4.1 Average rating
121 Reviews
931 Students
2 Courses
Software Developer

I have wide experience in conveying technical concepts through simple video lectures, so that listeners gain practical knowledge of database systems and data analytics. I have worked as a software professional at a leading MNC for the last 6 years. I hold a Master's degree in Information Technology. I have given various guest lectures on Java, Android, Hadoop, and MongoDB, and have trained close to 22,000 associates, students, and professionals in the last 6 years. I am a certified Hadoop Developer and Java Developer. I currently work as a Big Data Hadoop Developer.