Projects in Hadoop and Big Data - Learn by Building Apps

A Practical Course to Learn Big Data Technologies While Developing Professional Projects
4.0 (94 ratings)
3,422 students enrolled
$19 (52% off $40)
  • Lectures 43
  • Length 10 hours
  • Skill Level Intermediate Level
  • Languages English
  • Includes Lifetime access
    30 day money back guarantee!
    Available on iOS and Android
    Certificate of Completion


About This Course

Published 10/2015 English

Course Description

The most awaited Big Data course on the planet is here. The course covers all the major big data technologies within the Hadoop ecosystem and weaves them together in real-life projects. While taking the course you will not only learn the nuances of Hadoop and its associated technologies, but also see how they solve real-world problems and how they are used by companies worldwide.

This course will help you take a quantum leap and build Hadoop solutions that solve real-world problems. Be warned, though: this course is not for the faint-hearted. It will test your abilities and knowledge while helping you build cutting-edge know-how in one of the most active technology spaces today. The course focuses on the following projects:

Add Value to Existing Data - Learn how technologies such as MapReduce apply to clustering problems. The project focuses on removing duplicate or equivalent values from a very large data set with MapReduce.
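
The dedup idea can be sketched in miniature. This is not the course's source code, just an illustrative pure-Python stand-in for a Hadoop Streaming-style job: the mapper emits each record as a key, the shuffle groups equal keys together, and the reducer keeps one record per group.

```python
import itertools

def mapper(records):
    """Emit (record, None) pairs, mimicking a streaming mapper."""
    for rec in records:
        yield rec.strip(), None

def reducer(mapped):
    """Keep one value per key group, mimicking the shuffle/reduce phase."""
    grouped = itertools.groupby(sorted(mapped, key=lambda kv: kv[0]),
                                key=lambda kv: kv[0])
    for key, _group in grouped:
        yield key

def dedup(records):
    """Run the two phases locally; on a cluster Hadoop does the sort/shuffle."""
    return list(reducer(mapper(records)))
```

On a real cluster the same mapper/reducer pair would run over HDFS splits; the logic of collapsing duplicates per key is identical.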

Hadoop Analytics and NoSQL - Parse a Twitter stream with Python, extract keywords with Apache Pig and map them to HDFS, pull from HDFS and push to MongoDB with Pig, and visualise the data with Node.js.
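
The first step, parsing a tweet and pulling out keywords, looks roughly like this in Python. This is a simplified sketch, not the project code: the stop-word list and the `text` field name are illustrative, and at scale the same tokenisation would be done by the Pig step.

```python
import json
import re

# Illustrative stop-word list; the real project would use a fuller one.
STOPWORDS = {"the", "a", "an", "to", "of", "and", "is", "rt"}

def extract_keywords(tweet_json):
    """Parse one tweet (a JSON string) and return lowercase keywords,
    keeping hashtags and mentions as single tokens."""
    tweet = json.loads(tweet_json)
    words = re.findall(r"[#@]?\w+", tweet.get("text", "").lower())
    return [w for w in words if w not in STOPWORDS]
```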

Kafka Streaming with Yarn and Zookeeper - Set up a Twitter stream with Python, set up a Kafka stream with Java code for producers and consumers, and package and deploy the Java code with Apache Samza.
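
The producer/consumer round-trip at the heart of this project can be illustrated without a cluster. The `Broker` class below is an in-memory stand-in for Kafka (the project itself uses Java clients against a real broker): topics are append-only logs, and each consumer group tracks its own read offset.

```python
from collections import defaultdict

class Broker:
    """In-memory stand-in for a Kafka broker, for illustration only."""
    def __init__(self):
        self.topics = defaultdict(list)   # topic -> append-only log
        self.offsets = defaultdict(int)   # (group, topic) -> next offset

    def produce(self, topic, message):
        """Append a message to the topic's log, like a Kafka producer."""
        self.topics[topic].append(message)

    def consume(self, group, topic):
        """Return the next unread message for this consumer group, or None."""
        offset = self.offsets[(group, topic)]
        log = self.topics[topic]
        if offset >= len(log):
            return None
        self.offsets[(group, topic)] = offset + 1
        return log[offset]
```

Because offsets are per group, two groups can independently replay the same topic, which is the property Samza jobs rely on.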

Real-Time Stream Processing with Apache Kafka and Apache Storm - This project also works with Twitter streams, but uses Kafka and Apache Storm, and you will learn to use each of them effectively.
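
Storm topologies in the project are written in Java, but the core abstraction, a bolt that processes one tuple at a time, can be sketched in plain Python. The word-count bolt below is a hypothetical example, not the course's topology:

```python
from collections import Counter

class WordCountBolt:
    """Plain-Python sketch of a Storm bolt: execute() receives one tuple
    (here, a tweet's text) and emits running (word, count) pairs."""
    def __init__(self):
        self.counts = Counter()

    def execute(self, tweet_text):
        emitted = []
        for word in tweet_text.lower().split():
            self.counts[word] += 1
            emitted.append((word, self.counts[word]))
        return emitted
```

In a real topology the emitted pairs would flow to downstream bolts; Storm handles the routing, parallelism, and failure recovery.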

Big Data Applications for the Healthcare Industry with Apache Sqoop and Apache Solr - Set up the relational schema for a healthcare data dictionary used by the U.S. Department of Veterans Affairs, and demonstrate the underlying technology and conceptual framework. Demonstrate issues with certain join queries that fail on MySQL, map the technology to a Hadoop/Hive stack with Sqoop and HCatalog, and show how this stack performs the query successfully.
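
Moving a MySQL table into Hive is done with the `sqoop import` command line. As a sketch, the helper below assembles such a call; the host, database, table, and user names are placeholders, not the project's actual configuration.

```python
def sqoop_import_cmd(table, mysql_host, db, user, hive_table):
    """Assemble the `sqoop import` CLI call that copies one MySQL
    table into a Hive table. All names here are placeholders."""
    return [
        "sqoop", "import",
        "--connect", f"jdbc:mysql://{mysql_host}/{db}",
        "--username", user,
        "--table", table,
        "--hive-import",            # load the data into Hive after import
        "--hive-table", hive_table,
    ]
```

Once the data lands in Hive, the joins that choked MySQL run as distributed MapReduce jobs instead.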

Log Collection and Analytics with the Hadoop Distributed File System using Apache Flume and Apache HCatalog - Use Apache Flume and Apache HCatalog to map a real-time log stream to HDFS and tail the file as a Flume event stream. Map the data from HDFS to Python with Pig, and use Python modules for analytic queries.
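
Once Flume has delivered the log events to HDFS, the analytics side comes down to parsing log lines. A minimal sketch for Apache-combined-style access logs (the exact log format in the project may differ):

```python
import re

# One Apache-combined-style access-log line: host, timestamp, request,
# status, and response size. Adjust the pattern for other log formats.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)

def parse_line(line):
    """Parse one log line into a dict, or return None if malformed."""
    m = LOG_PATTERN.match(line)
    return m.groupdict() if m else None
```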

Data Science with Hadoop Predictive Analytics - Create structured data with MapReduce, map the data from HDFS to Python with Pig, run a Python machine-learning logistic regression, and use Python modules for regression metrics and supervised training.
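
The logistic-regression step can be sketched with NumPy alone. This is a bare-bones batch-gradient-descent trainer for illustration, assuming the Pig-exported features arrive as a NumPy matrix; the project may well use a library model instead.

```python
import numpy as np

def train_logistic(X, y, lr=0.1, epochs=500):
    """Train logistic regression by batch gradient descent.
    X: (n_samples, n_features) floats; y: 0/1 labels."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # sigmoid predictions
        grad = p - y                             # gradient of log-loss w.r.t. logits
        w -= lr * (X.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b

def predict(X, w, b):
    """Threshold the sigmoid output at 0.5 to get 0/1 class labels."""
    return (1.0 / (1.0 + np.exp(-(X @ w + b))) >= 0.5).astype(int)
```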

Visual Analytics with Apache Spark on Yarn - Create structured data with MapReduce, map the data from HDFS to Python with Spark, convert Spark DataFrames and RDDs to Python data structures, and perform visualisations in Python.
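
The map/filter/collect pattern used before handing data to a plotting library can be shown without a Spark installation. `LocalRDD` below is a toy, in-memory stand-in for a Spark RDD, written only to illustrate the shape of the API:

```python
from functools import reduce as _reduce

class LocalRDD:
    """Tiny in-memory stand-in for a Spark RDD (illustration only)."""
    def __init__(self, data):
        self.data = list(data)

    def map(self, f):
        return LocalRDD(f(x) for x in self.data)

    def filter(self, f):
        return LocalRDD(x for x in self.data if f(x))

    def reduce(self, f):
        return _reduce(f, self.data)

    def collect(self):
        # In PySpark this is the step that pulls distributed data back
        # to the driver as a plain Python list, ready for visualisation.
        return list(self.data)
```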

Customer 360 Degree View, Big Data Analytics for E-commerce - Demonstrate the use of the e-commerce analytics tool Datameer to perform many of the analytic queries from Parts 6, 7, and 8, in the context of sentiment analysis of a Twitter stream.
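
The kind of sentiment query Datameer runs over the Twitter stream reduces, in its simplest form, to lexicon-based scoring. The word lists below are purely illustrative:

```python
# Toy sentiment lexicons; real sentiment analysis uses far larger lists
# (or a trained model) and handles negation, emoji, etc.
POSITIVE = {"great", "love", "awesome", "good"}
NEGATIVE = {"bad", "hate", "awful", "broken"}

def sentiment(tweet_text):
    """Score a tweet: +1 per positive word, -1 per negative word."""
    score = 0
    for word in tweet_text.lower().split():
        if word in POSITIVE:
            score += 1
        elif word in NEGATIVE:
            score -= 1
    return score
```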

Putting It All Together: Big Data with Amazon Elastic MapReduce - Run the clustering code on an AWS Elastic MapReduce cluster, then use the AWS Java SDK to spin up a dedicated task cluster with the same attributes.
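
Spinning up an EMR cluster through the SDK comes down to one `RunJobFlow` API request. The project does this with the AWS Java SDK; the Python helper below just sketches the request body, with field names following the EMR API and all values as placeholders:

```python
def emr_request(name, key_pair, instance_count):
    """Build a RunJobFlow-style request dict for a small EMR cluster.
    Release label, instance types, and roles are placeholder values."""
    return {
        "Name": name,
        "ReleaseLabel": "emr-5.0.0",
        "Instances": {
            "Ec2KeyName": key_pair,
            "InstanceCount": instance_count,
            "MasterInstanceType": "m4.large",
            "SlaveInstanceType": "m4.large",
            "KeepJobFlowAliveWhenNoSteps": False,  # terminate when steps finish
        },
        "Applications": [{"Name": "Hadoop"}],
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "ServiceRole": "EMR_DefaultRole",
    }
```

A dedicated task cluster is the same request with different instance groups; the SDK call then returns the job-flow ID you poll for completion.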


After this course you will be able to confidently build almost any system within the Hadoop family of technologies. The course comes with complete source code and fully operational virtual machines, which help you build the projects quickly without spending too much time on system setup. The course also comes with English captions. So buckle up and join us on our journey into Big Data.

What are the requirements?

  • A working knowledge of Hadoop is expected before starting this course
  • Basic programming knowledge of Java and Python is recommended

What am I going to get from this course?

  • Understand the Hadoop Ecosystem and Associated Technologies
  • Learn Concepts to Solve Real World Problems
  • Learn the Updated Changes in Hadoop
  • Use Code Examples Present Here to Create Your own Big Data Services
  • Get fully functional VMs, fine-tuned and created specifically for this course

What is the target audience?

  • Students who want to use Hadoop and Big Data in their workplace and want to learn the implementation details of big data technologies.

What do you get with this course?

Not for you? No problem.
30 day money back guarantee.

Forever yours.
Lifetime access.

Learn on the go.
Desktop, iOS and Android.

Get rewarded.
Certificate of completion.

Curriculum

Section 1: Introduction
Introduction
03:32
13 pages

Source VMs for the Projects

Section 2: Add Value to Existing Data with Mapreduce
Introduction to the Project
15:00
Build and Run the Basic Code
14:08
Understanding the Code
13:54
Dependencies and packages
14:43
Section 3: Hadoop Analytics and NoSQL
Introduction to Hadoop Analytics
15:46
Introduction to NoSQL Database
15:28
Solution Architecture
14:50
Installing the Solution
09:11
Section 4: Kafka Streaming with Yarn and Zookeeper
Introduction to Kafka Yarn and Zookeeper
14:29
Code Structure
15:22
Creating Kafka Streams
15:17
Yarn Job with Samza
15:35
Section 5: Real Time Stream processing with Apache Kafka and Apache Storm
Real Time Streaming
15:09
Hortonbox Virtual Machine
14:58
Running in Cluster Mode
15:30
Submitting the Storm Jar
13:58
Section 6: Big Data Applications for the Healthcare Industry with Apache Sqoop and Apache Solr
Introduction to the Project
14:14
Introduction to HDDAccess
14:46
Sqoop, Hive and Solr
13:56
Hive Usage
16:06
Section 7: Log collection and analytics with the Hadoop Distributed File System using Apache Flume and Apache HCatalog
Apache Flume and HCatalog
15:18
Install and Configure Apache Flume
14:51
Visualisation of the Data
14:51
Embedded Pig Scripts
13:36
Section 8: Data Science with Hadoop Predictive Analytics
Introduction to Data Science
14:48
Source Code Review
14:52
Setting Up the Machine
15:09
Project Review
15:10
Section 9: Visual Analytics with Apache Spark on Yarn
Project Setup
15:30
Setting Up Java Dependencies
15:24
Spark Analytics with PySpark
15:36
Bringing it all together
13:50
Section 10: Customer 360 degree view, Big Data Analytics for e-commerce
Ecommerce and Big Data
14:59
Installing Datameer
15:43
Analytics and Visualizations
15:50
Demonstration
13:29
Section 11: Putting it all together Big Data with Amazon Elastic Map Reduce
Introduction to the Project
15:55
Configuration
15:28
Setting Up Cluster on EMR
15:01
Dedicated Task Cluster on EMR
15:28
Section 12: Summary
Summary
02:03

Instructor Biography

Eduonix Learning Solutions, 1+ Million Students Worldwide | 200+ Courses

Eduonix creates and distributes high-quality technology training content. Our team of industry professionals has been training developers for more than a decade. We aim to teach technology the way it is used in the industry and the professional world, and we have a professional team of trainers covering technologies ranging from mobility and the web to enterprise systems, databases, and server administration.

Ready to start learning?
Take This Course