Learn Apache Spark with Python
3.7 (2 ratings)
Course Ratings are calculated from individual students’ ratings and a variety of other signals, like age of rating and reliability, to ensure that they reflect course quality fairly and accurately.
148 students enrolled

A Complete Guide and Integration of Apache Spark Framework and Python Programming
Last updated 2/2019
English
This course includes
  • 8 hours on-demand video
  • Full lifetime access
  • Access on mobile and TV
  • Certificate of Completion
What you'll learn
  • Introduction to PySpark
  • Filtering RDDs
  • Install and run Apache Spark on a desktop computer or on a cluster
  • Understand how Spark SQL lets you work with structured data
  • Understand Spark through worked examples, and more
Course content
74 lectures 07:51:45
+ Module 2 Introduction to Big Data and Hadoop
18 lectures 01:24:05
Big Data Overview
03:08
Facts about Big Data
03:32
Big Data Scenarios
02:16
Apache Hadoop Framework
03:00
Top Hadoop Users
02:20
History of Hadoop
02:04
Difference between RDBMS and Hadoop
01:20
Cluster Modes in Hadoop
01:07
Hadoop Ecosystem
05:23
HDFS Daemons and MapReduce Daemons
02:57
Hadoop Cluster Architecture
03:08
Top Reasons Why You Should Learn Hadoop
07:49
Hadoop distributions and compatibilities
00:46
Hadoop Ecosystem in Detail
18:40
Hadoop Distributed File System
08:01
HDFS Files and Blocks
04:06
HDFS components and architecture
09:21
HDFS File Read and Write
05:07
+ Module 3 Apache Spark Framework
10 lectures 48:51
Batch and Real Time Analytics
03:29
Why Spark When Hadoop Is Already There
01:49
Introduction to Apache Spark
03:53
Features of Apache Spark
03:12
Users and Use Cases of Apache Spark
11:00
Job Execution Flow and Spark Execution
00:51
Spark Unified Stack
07:32
Complete Picture of Apache Spark
04:39
Apache Spark Architecture
08:22
Top Companies Using Spark
04:04
+ Module 4 Python Programming Language
15 lectures 02:29:09
Getting Started with Python
00:18
Introduction to Python
06:16
Advantages and facts about Python
07:10
First Python program
01:35
Program execution and Python IDE
04:25
Built-in types in Python
02:30
Numbers Data Type in Python
05:23
String and List Data Type
18:58
Dictionary, Tuples and Sets
22:19
Variables and assignment
11:16
Hands-On
18:38
Hands-On
12:04
Hands-On
16:00
Hands-On
17:17
Hands-On
05:00
+ Module 5 Advanced Part of Apache Spark with Python
16 lectures 01:03:36
Downloading and Setup of winutils
01:40
Setting up Environment Variables
02:15
Running the first Spark Program
01:36
Downloading and Extracting movie ratings datasets
01:43
Running Ratings Counter Spark Program
04:36
Understanding key value pairs with an example
08:18
Filtering RDD using an example
09:48
Finding maximum temperature by location
03:01
Map vs FlatMap
02:09
Understanding FlatMap using Word Count example
07:56
Sorting the word count results
04:26
Total Amount Spent Example
07:55
Sorting the Total Amount Spent Example result
02:11
+ Module 6 Deep Dive Into Spark with Python
6 lectures 01:18:40
Most popular movie example
06:31
Understanding Broadcast Variables with an example
08:05
Finding Similar Movies Example
23:26
Finding Most Popular Superhero example
10:37
Superhero Degrees of Separation Part1
08:13
Superhero Degrees of Separation Part 2
21:48
+ Module 7 SparkSQL in Apache Spark with Python
3 lectures 19:36
Executing SQL commands
08:40
Using SQL style functions instead of queries
02:36
Using DataFrames instead of RDDs
08:20
+ Module 8 MLlib in Apache Spark with Python
2 lectures 22:33
Using MLlib to produce movie recommendations
10:03
Using DataFrames with MLlib: an example
12:30
Requirements
  • Basic knowledge of Big Data
  • Basics of Java and object-oriented programming (OOP)
  • Basic computer programming terminology
Description

Apache Spark is the hottest Big Data skill today. More and more organizations are adopting Apache Spark to build their big data processing and analytics applications, and the demand for Apache Spark professionals is skyrocketing. Learning Apache Spark is a great route to good jobs, better quality of work and the best remuneration packages.

You might already know Apache Spark as a fast and general engine for big data processing, with built-in modules for streaming, SQL, machine learning and graph processing. It is well known for its speed, ease of use, generality and ability to run virtually everywhere. And although Spark is one of the most in-demand tools for data engineers, data scientists can also benefit from it for exploratory data analysis, feature extraction, supervised learning and model evaluation.

The course covers many Apache Spark with Python topics, including:

  • What makes Spark a powerful tool for Big Data and Data Science

  • Learn the fundamentals of Spark, including Resilient Distributed Datasets (RDDs), actions and transformations

  • Explore Spark SQL with CSV, JSON and MySQL (JDBC) data sources

  • Convenient links to download all source code

Who this course is for:
  • Java developers who want to move to Python, a lightweight language, for handling Big Data.
  • Hadoop developers who want to learn Spark, a fast processing engine.
  • Python developers who want to upgrade their skills to handle and process Big Data using Apache Spark.
  • Any professionals or students who want to learn Big Data.