Udemy
Learn How to Create Hadoop MapReduce Jobs in Python
Rating: 3.4 out of 5 (37 ratings)
658 students

Hadoop MapReduce Jobs Using Python
Created by Inflame Tech
Last updated 8/2020
English

What you'll learn

  • Understand what Hadoop is
  • Understand MapReduce, the heart of Big Data and Hadoop
  • Run jobs using Python
  • Design and implement the Mapper and Reducer phases in Python
  • Execute and run Hadoop Streaming jobs
  • Integrate the Mapper and Reducer phases with a Java driver class

Course content

1 section • 37 lectures • 4h 40m total length
  • 1.1 Prerequisites (0:44)
  • 1.2 Course Module (2:30)
  • 1.3 Why MapReduce with Python (1:45)
  • 2.1 What is Apache Hadoop (7:47)
  • 2.2 Comparison with RDBMS (4:10)
  • 2.3 HDFS in Hadoop (7:29)
  • 2.4 Cluster modes of Hadoop (2:42)
  • 2.5 HDFS and MapReduce (4:47)
  • 3.1 MapReduce Model (3:39)
  • 3.2 Why MapReduce (5:25)
  • 3.3 Map and Reduce Operation (5:19)
  • 3.4 Data Flow In MapReduce (5:35)
  • 3.5 MapReduce Daemons (7:07)
  • 4.1 Introduction to Hadoop Streaming (4:31)
  • 4.2 Streaming Command Options (7:28)
  • 4.3 Generic Command Options (3:11)
  • 4.4 MapReduce Sample Program-1 (28:56)
  • 4.5 MapReduce Sample Program-2 (21:52)
  • 5.1 Chaining of MR Jobs (10:34)
  • 5.2 Custom Combiner (17:51)
  • 5.3 GenericOptionsParser (3:28)
  • 5.4 Distributed Cache (11:17)
  • 6.1 JUnit Testing (9:27)
  • 6.2 Analysis of IRIS dataset (11:11)
  • 6.3 Built-in and Custom Counters in Hadoop (12:58)
  • 6.4 Custom Partitioner (5:47)
  • 6.5 Hadoop Sequence File Format (6:09)
  • 6.6 Read Write Sequence File (5:27)
  • 7.1 Hadoop Data Types (1:46)
  • 7.2 Processing of XML File (6:49)
  • 7.3 Data Compression with Hadoop (15:14)
  • 7.4 Data Serialization using Avro-Theo (9:21)
  • 8.1 Limitations of Hadoop 1.x (7:28)
  • 8.2 Hadoop 2.x with YARN (7:27)
  • 8.3 YARN and its Processing Application (4:54)
  • 8.4 YARN MR Application Execution Flow (5:26)
  • 8.5 Hadoop 2.x Cluster Architecture (2:56)

Requirements

  • Basics of Computer Science
  • Basics of Hadoop would be beneficial but not required
  • Basics of Object Oriented Programming

Description


Apache Hadoop is an open-source software framework for distributed storage and distributed processing of very large data sets on computer clusters built from commodity hardware. MapReduce is the processing framework at the heart of Apache Hadoop, and through Hadoop Streaming it lets developers write Hadoop jobs in languages other than Java. In this course we'll learn how to create MapReduce jobs with Python. The course will give you in-depth knowledge of the concepts and of different approaches to analysing datasets with Python.

This course on MapReduce jobs with Python will help you understand how to program MapReduce jobs in Python, how to set up an environment for running them, and how to submit and execute MapReduce applications in a Python environment. We will start from the beginning and then dive into the advanced concepts of MapReduce.
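To give a flavour of what a MapReduce job in Python can look like, here is a minimal word-count sketch. It is not taken from the course material: the function names `mapper`, `reducer` and `run_locally` are hypothetical, and the sort step only imitates, in a single process, the shuffle that Hadoop performs between the two phases.

```python
from itertools import groupby


def mapper(lines):
    """Map phase: emit one tab-separated "word<TAB>1" record per word."""
    for line in lines:
        for word in line.strip().lower().split():
            yield f"{word}\t1"


def reducer(sorted_records):
    """Reduce phase: sum the counts for each word.

    Assumes the records arrive sorted by key, which is how Hadoop
    Streaming hands mapper output to a reducer.
    """
    parsed = (record.rstrip("\n").split("\t", 1) for record in sorted_records)
    for word, group in groupby(parsed, key=lambda pair: pair[0]):
        total = sum(int(count) for _, count in group)
        yield f"{word}\t{total}"


def run_locally(lines):
    """Simulate map -> sort (stand-in for the shuffle) -> reduce in one process."""
    return list(reducer(sorted(mapper(lines))))
```

Calling `run_locally(["Hadoop streams data", "hadoop maps data"])` returns one tab-separated record per distinct word with its count. On a real cluster the two phases would typically live in separate scripts that read from standard input and write to standard output, and they would be passed to the Hadoop Streaming jar through its `-mapper` and `-reducer` options.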

Who this course is for:

  • Big Data Professionals
  • Hadoop Developers
  • Python Developers who want to move into the field of Big Data
  • Students who are interested in Hadoop MapReduce