Master Big Data - Apache Spark/Hadoop/Sqoop/Hive/Flume
4.6 (60 ratings)
2,233 students enrolled

In-depth course on Big Data: Apache Spark, Hadoop, Sqoop, Flume & Apache Hive, plus Big Data cluster setup
Created by Navdeep Kaur
Last updated 5/2020
English
English [Auto-generated]
This course includes
  • 7 hours on-demand video
  • 12 downloadable resources
  • Full lifetime access
  • Access on mobile and TV
  • Certificate of Completion
What you'll learn
  • Hadoop Distributed File System (HDFS) and common Hadoop commands.
  • Lifecycle of a sqoop command.
  • Use the sqoop import command to migrate data from MySQL to HDFS.
  • Use the sqoop import command to migrate data from MySQL to Hive.
  • Work with various file formats, compressions, file delimiters, where clauses and queries while importing the data.
  • Understand split-by and boundary queries.
  • Use incremental mode to migrate data from MySQL to HDFS.
  • Using sqoop export, migrate data from HDFS to MySQL.
  • Using sqoop export, migrate data from Hive to MySQL.
  • Understand the Flume architecture.
  • Using Flume, ingest data from Twitter and save it to HDFS.
  • Using Flume, ingest data from netcat and save it to HDFS.
  • Using Flume, ingest data from exec and show it on the console.
  • Flume interceptors.
Requirements
  • None
Description

In this course, you will start by learning what the Hadoop Distributed File System (HDFS) is and the most common Hadoop commands required to work with the Hadoop file system.
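A few of the HDFS shell commands of the kind covered, shown as a short sketch (the paths are illustrative, and a running Hadoop cluster is assumed):

```shell
# List the contents of an HDFS directory
hdfs dfs -ls /user/demo

# Create a directory and copy a local file into HDFS
hdfs dfs -mkdir -p /user/demo/input
hdfs dfs -put sales.csv /user/demo/input/

# View a file and check space usage
hdfs dfs -cat /user/demo/input/sales.csv | head
hdfs dfs -du -h /user/demo

# Copy a file from HDFS back to the local file system
hdfs dfs -get /user/demo/input/sales.csv ./sales_copy.csv
```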


Then you will be introduced to Sqoop Import:

  • Understand the lifecycle of a sqoop command.

  • Use the sqoop import command to migrate data from MySQL to HDFS.

  • Use the sqoop import command to migrate data from MySQL to Hive.

  • Use various file formats, compressions, file delimiters, where clauses and queries while importing the data.

  • Understand split-by and boundary queries.

  • Use incremental mode to migrate data from MySQL to HDFS.
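The import options above can be sketched as sqoop command lines roughly like the following (the database name, credentials, table and paths are illustrative, not taken from the course):

```shell
# Import a MySQL table into HDFS as compressed Avro, filtering rows
# with --where and parallelizing mappers with --split-by
sqoop import \
  --connect jdbc:mysql://localhost:3306/retail_db \
  --username retail_user -P \
  --table orders \
  --where 'order_status = "COMPLETE"' \
  --split-by order_id \
  --target-dir /user/demo/orders \
  --as-avrodatafile \
  --compress --compression-codec org.apache.hadoop.io.compress.SnappyCodec

# Incremental append: import only rows whose order_id is beyond
# the last value already imported
sqoop import \
  --connect jdbc:mysql://localhost:3306/retail_db \
  --username retail_user -P \
  --table orders \
  --target-dir /user/demo/orders \
  --incremental append --check-column order_id --last-value 68883
```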


Further, you will learn how to use Sqoop Export to migrate data:

  • What sqoop export is.

  • Using sqoop export, migrate data from HDFS to MySQL.

  • Using sqoop export, migrate data from Hive to MySQL.
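An export in the style described above might look roughly like this (table, path and delimiter are illustrative; the target table must already exist in MySQL):

```shell
# Export an HDFS directory back into a MySQL table
sqoop export \
  --connect jdbc:mysql://localhost:3306/retail_db \
  --username retail_user -P \
  --table order_summary \
  --export-dir /user/demo/order_summary \
  --input-fields-terminated-by ','
```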



Further, you will learn about Apache Flume:

  • Understand the Flume architecture.

  • Using Flume, ingest data from Twitter and save it to HDFS.

  • Using Flume, ingest data from netcat and save it to HDFS.

  • Using Flume, ingest data from exec and show it on the console.

  • Describe Flume interceptors and see examples of using them.

  • Flume multi-agent flows.

  • Flume consolidation.
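A minimal single-agent Flume configuration of the kind covered, wiring a netcat source through a memory channel into an HDFS sink (the agent name, port and HDFS path are illustrative):

```properties
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# netcat source listening on a local port
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444

# memory channel buffers events between source and sink
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000

# HDFS sink writes events under a dated directory
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = /user/demo/flume/%Y-%m-%d
a1.sinks.k1.hdfs.fileType = DataStream
a1.sinks.k1.hdfs.useLocalTimeStamp = true

# bind source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
```

The agent would then be started with `flume-ng agent --conf conf --conf-file netcat-hdfs.conf --name a1`.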


In the next section, we will learn about Apache Hive:

  • Hive Introduction

  • External & Managed Tables

  • Working with Different File Formats: Parquet, Avro

  • Compressions

  • Hive Analysis

  • Hive String Functions

  • Hive Date Functions

  • Partitioning

  • Bucketing
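The Hive topics above can be sketched in HiveQL roughly as follows (the table, columns and location are illustrative, not taken from the course):

```sql
-- An external, partitioned table stored as Parquet
CREATE EXTERNAL TABLE IF NOT EXISTS sales (
  order_id INT,
  customer STRING,
  amount   DOUBLE
)
PARTITIONED BY (order_date STRING)
STORED AS PARQUET
LOCATION '/user/demo/sales';

-- String and date functions applied in an analysis query
SELECT upper(customer)                    AS customer,
       date_format(order_date, 'MMM-yyyy') AS month,
       sum(amount)                        AS total
FROM   sales
GROUP  BY upper(customer), date_format(order_date, 'MMM-yyyy');
```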


Finally, you will learn about Apache Spark:

  • Spark Introduction

  • Cluster Overview

  • RDDs

  • DAG / Stages / Tasks

  • Actions & Transformations

  • Transformation & Action Examples

  • Spark DataFrames

  • Spark DataFrames: Working with Different File Formats & Compression

  • DataFrame APIs

  • Spark SQL

  • DataFrame Examples

  • Spark with Cassandra Integration
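The Spark topics above can be sketched in Scala roughly as follows (paths and column names are illustrative, and a Spark runtime is assumed; this is not code from the course):

```scala
import org.apache.spark.sql.SparkSession

object SalesReport {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.appName("SalesReport").getOrCreate()
    import spark.implicits._

    // RDD side: a classic word count using map/reduceByKey transformations
    val counts = spark.sparkContext.textFile("/user/demo/logs.txt")
      .flatMap(_.split("\\s+"))
      .map(word => (word, 1))
      .reduceByKey(_ + _)           // transformation: lazy until an action runs
    counts.take(5).foreach(println) // action: triggers the DAG

    // DataFrame side: read Parquet and query with the DataFrame API and SQL
    val sales = spark.read.parquet("/user/demo/sales")
    sales.groupBy($"customer").sum("amount").show()

    sales.createOrReplaceTempView("sales")
    spark.sql("SELECT customer, sum(amount) FROM sales GROUP BY customer").show()

    spark.stop()
  }
}
```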


Who this course is for:
  • Anyone who wants to learn big data in detail
Course content
79 lectures • 06:55:09 total length
+ Hadoop Introduction
4 lectures 38:34
Yarn Cluster Overview
07:41
Cluster Setup on Google Cloud
20:55
Environment Update
00:42
+ Sqoop Import
13 lectures 01:01:22
Managing Target Directories
02:38
Working with Different Compressions
06:17
Conditional Imports
04:26
Split-by and Boundary Queries
08:27
Field Delimiters
03:18
Incremental Appends
03:14
Sqoop Hive Import
03:31
Sqoop List Tables/Database
04:13
Sqoop Import Practice1
04:57
Sqoop Import Practice2
04:17
Sqoop Import Practice3
03:32
+ Sqoop Export
2 lectures 06:09
Export from Hdfs to Mysql
03:39
Export from Hive to Mysql
02:30
+ Apache Flume
8 lectures 40:06
Flume Introduction & Architecture
02:32
Exec Source and Logger Sink
03:41
Moving data from Twitter to HDFS
09:25
Moving data from NetCat to HDFS
04:39
Flume Interceptors
01:56
Flume Interceptor Example
04:53
Flume Multi-Agent Flow
06:49
Flume Consolidation
06:11
+ Apache Hive
14 lectures 01:05:37
Hive Introduction
03:41
Hive Database
03:04
Hive Managed Tables
06:23
Hive External Tables
02:26
Hive Inserts
05:30
Hive Analytics
04:21
Working with Parquet
03:29
Compressing Parquet
04:27
Working with Fixed File Format
03:04
Alter Command
06:12
Hive String Functions
06:21
Hive Date Functions
05:39
Hive Partitioning
07:16
Hive Bucketing
03:44
+ Spark Introduction
4 lectures 23:35
Spark Intro
03:46
Resilient Distributed Datasets
02:52
Cluster Overview
06:51
DAG Overview
10:06
+ Spark: Transformations & Actions
10 lectures 48:36
Map/FlatMap Transformation
04:28
Filter/Intersection
04:00
Union/Distinct Transformation
02:23
GroupByKey/ Group people based on Birthday months
05:53
ReduceByKey / Total Number of students in each Subject
06:44
SortByKey / Sort students based on their rollno
06:03
MapPartition / MapPartitionWithIndex
06:20
Change number of Partitions
03:34
Join / join email address based on customer name
03:06
Spark Actions
06:05
+ Spark RDD Practice
6 lectures 42:19
Scala Tuples
03:05
Filter Error Logs
10:22
Frequency of word in Text File
08:35
Population of each city
03:53
Orders placed by Customers
09:20
Average rating of a movie
07:04
+ Spark Dataframes & Spark SQL
14 lectures 01:05:49
Dataframe Intro
02:16
Dataframe from JSON Files
08:42
Dataframe from Parquet Files
07:26
Dataframe from CSV Files
05:14
Dataframe from Avro File
07:13
Working with XML
03:22
Working with Columns
05:23
Working with String
04:05
Working with Dates
03:47
Dataframe Filter API
02:50
DataFrame API Part1
04:51
DataFrame API Part2
06:25
Spark SQL
01:41
Working with Hive Tables in Spark
02:34
+ Spark with Cassandra
4 lectures 23:02
Creating Spark RDD from Cassandra Table
09:13
Processing Cassandra data in Spark
08:18
Cassandra Rows to Case Class
02:33
Saving Spark RDD to Cassandra
02:58