Learn How to Create Hadoop MapReduce Jobs in Python

Rating: 3.4 (37 reviews)

Hadoop MapReduce Jobs Using Python

Created by Inflame Tech

Last updated 8/2020

English

What you'll learn

Understand what Hadoop is
Understand MapReduce, the heart of Big Data and Hadoop
Run jobs using Python
Design and implement the Mapper and Reducer phases in Python
Execute Hadoop Streaming jobs
Integrate the Mapper and Reducer phases with a Java driver class

Course content

1 section • 37 lectures • 4h 40m total length

1.1 Prerequisites (0:44)
1.2 Course Module (2:30)
1.3 Why MapReduce with Python (1:45)
2.1 What is Apache Hadoop (7:47)
2.2 Comparison with RDBMS (4:10)
2.3 HDFS in Hadoop (7:29)
2.4 Cluster modes of Hadoop (2:42)
2.5 HDFS and MapReduce (4:47)
3.1 MapReduce Model (3:39)
3.2 Why MapReduce (5:25)
3.3 Map and Reduce Operation (5:19)
3.4 Data Flow In MapReduce (5:35)
3.5 MapReduce Daemons (7:07)
4.1 Introduction to Hadoop Streaming (4:31)
4.2 Streaming Command Options (7:28)
4.3 Generic Command Options (3:11)
4.4 MapReduce Sample Program-1 (28:56)
4.5 MapReduce Sample Program-2 (21:52)
5.1 Chaining of MR Jobs (10:34)
5.2 Custom Combiner (17:51)
5.3 GenericOptionsParser (3:28)
5.4 Distributed Cache (11:17)
6.1 JUnit Testing (9:27)
6.2 Analysis of IRIS dataset (11:11)
6.3 Built-in and Custom Counters in Hadoop (12:58)
6.4 Custom Partitioner (5:47)
6.5 Hadoop Sequence File Format (6:09)
6.6 Read Write Sequence File (5:27)
7.1 Hadoop Data Types (1:46)
7.2 Processing of XML File (6:49)
7.3 Data Compression with Hadoop (15:14)
7.4 Data Serialization using Avro-Theo (9:21)
8.1 Limitations of Hadoop 1.x (7:28)
8.2 Hadoop 2.x with YARN (7:27)
8.3 YARN and its Processing Application (4:54)
8.4 YARN MapReduce Application Execution Flow (5:26)
8.5 Hadoop 2.x Cluster Architecture (2:56)

Requirements

Basics of Computer Science
Basics of Hadoop would be beneficial but are not required
Basics of Object Oriented Programming

Description

Apache Hadoop is an open-source software framework for distributed storage and distributed processing of very large data sets on computer clusters built from commodity hardware. MapReduce is the heart of Apache Hadoop: a processing framework whose jobs developers can write in different languages. In this course we'll learn how to create MapReduce jobs with Python. The course provides in-depth knowledge of the concepts and of different approaches to analysing datasets with Python.

This course on MapReduce jobs with Python will help you understand MapReduce programming in Python, how to set up an environment for running MapReduce jobs in Python, and how to submit and execute MapReduce applications in a Python environment. We will start from the beginning and then dive into the advanced concepts of MapReduce.
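
To give a feel for the kind of job described above, here is a minimal word-count sketch in the Hadoop Streaming style, where the mapper and reducer read lines and write tab-separated key/value records. It is an illustration only and is not taken from the course material; the file name wordcount.py, the map/reduce command-line argument, and the function names are assumptions made for this example.

```python
import sys
from itertools import groupby


def mapper(lines):
    """Map phase: emit one tab-separated (word, 1) record per word."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def reducer(sorted_records):
    """Reduce phase: sum the counts for each word.

    Assumes the records arrive sorted by key, which is how the
    shuffle/sort step hands them to a streaming reducer.
    """
    pairs = (record.rstrip("\n").split("\t", 1) for record in sorted_records)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"


def main(argv):
    # Hypothetical convention for this sketch: "wordcount.py map" or
    # "wordcount.py reduce". With no phase given, nothing is read or written.
    if len(argv) != 2 or argv[1] not in ("map", "reduce"):
        return
    step = mapper if argv[1] == "map" else reducer
    for record in step(sys.stdin):
        print(record)


if __name__ == "__main__":
    main(sys.argv)
```

The same pipeline can be tried locally without a cluster, for example with `cat input.txt | python wordcount.py map | sort | python wordcount.py reduce`, where input.txt is any text file and `sort` stands in for Hadoop's shuffle step. On a cluster, the two commands would be passed to Hadoop Streaming through its -mapper and -reducer options.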

Who this course is for:

Big Data Professionals
Hadoop Developers
Python Developers who want to move into the field of Big Data
Students who are interested in Hadoop MapReduce