Apache Spark 3 - Spark Programming in Scala for Beginners
4.6 (288 ratings)
1,834 students enrolled

Data Engineering using Spark Structured API
Last updated 7/2020
English
This course includes
  • 7 hours on-demand video
  • 53 downloadable resources
  • Full lifetime access
  • Access on mobile and TV
  • Certificate of Completion
What you'll learn
  • Apache Spark Foundation and Spark Architecture
  • Data Engineering and Data Processing in Spark
  • Working with Data Sources and Sinks
  • Working with Data Frames, Data Sets and Spark SQL
  • Using IntelliJ Idea for Spark Development and Debugging
  • Unit Testing, Managing Application Logs and Cluster Deployment
Course content
60 lectures • 06:48:00 total length
+ Installing and Using Apache Spark
5 lectures 22:15
Apache Spark in Local Mode Command Line REPL
04:25
Apache Spark in the IDE - IntelliJ IDEA
06:54
Apache Spark in Cloud - Databricks Community and Notebooks
04:33
Check your knowledge
3 questions
Apache Spark in Hadoop Ecosystem - Zeppelin Notebooks
03:39
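The fastest way to follow along with the section above is the local-mode command-line REPL. As a minimal sketch (assuming a Spark 3.x `spark-shell`, where the `spark` SparkSession is pre-created; the column name is illustrative):

```scala
// Typed into spark-shell in local mode; `spark` already exists
val df = spark.range(1, 6).toDF("n")            // a tiny DataFrame holding 1..5
df.selectExpr("n", "n * n as square").show()    // runs a small distributed job
```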
+ Spark Execution Model and Architecture
9 lectures 36:11
Execution Methods - How to Run Spark Programs?
05:01
Spark Distributed Processing Model - How Does Your Program Run?
03:11
Spark Execution Modes and Cluster Managers
04:56

Spark execution modes and cluster managers are among the most confusing topics. Check your understanding by taking this quiz. Read each option carefully and select the most appropriate answer.

Check your knowledge
10 questions
Summarizing Spark Execution Models - When to use What?
02:24
Working with Spark Shell - Demo
04:31
Installing Multi-Node Spark Cluster - Demo
05:38
Working with Notebooks in Cluster - Demo
05:42
Working with Spark Submit - Demo
03:06
Section Summary
01:42
Check your knowledge
10 questions
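The spark-submit demo above ties the execution modes together. As a hedged sketch of the shape such a command takes (the class name, JAR path, and master URL are hypothetical, and it requires a Spark 3.x installation on the PATH):

```shell
# Hypothetical example: submit a packaged application to a local master
spark-submit \
  --class example.HelloSpark \
  --master "local[3]" \
  target/hello-spark.jar
```

Swapping `--master` for `yarn` or a `spark://` URL is how the same application moves from local mode to a cluster manager.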
+ Spark Programming Model and Developer Experience
13 lectures 01:46:42
Creating Spark Project Build Configuration
05:32
Configuring Spark Project Application Logs
10:01
Check your knowledge
5 questions
Creating Spark Session
05:10
Configuring Spark Session
10:53
Check your knowledge
5 questions
Data Frame Introduction
08:17
Data Frame Partitions and Executors
05:24
Spark Transformations and Actions
11:46
Spark Jobs, Stages and Tasks
08:32
Understanding your Execution Plan
09:32
Unit Testing Spark Application
05:27
Debugging Spark Driver and Executor
07:15
Spark Application Logs in a Cluster
13:23
Rounding off Summary
05:30
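The section above walks through building a Spark project, creating a SparkSession, and the transformation/action model. A minimal sketch of how those pieces fit together (the application name, master URL, and range are illustrative, assuming a Spark 3.x dependency on the classpath):

```scala
import org.apache.spark.sql.SparkSession

object HelloSpark {
  def main(args: Array[String]): Unit = {
    // Build a SparkSession for local development
    val spark = SparkSession.builder()
      .appName("HelloSpark")
      .master("local[3]")
      .getOrCreate()

    val df = spark.range(1, 100).toDF("id")   // a DataFrame of 1..99
    val even = df.where("id % 2 = 0")         // transformation: lazy, no job yet
    println(even.count())                     // action: triggers a Spark job
    spark.stop()
  }
}
```

The lazy transformation followed by an eager action is exactly what surfaces as jobs, stages, and tasks in the Spark UI.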
+ Spark Structured API Foundation
7 lectures 37:32
Introduction to Spark APIs
06:11
Introduction to Spark RDD API
11:56
Dataset vs. DataFrame
06:31
Working with Spark Dataset
05:48
Working with Spark SQL
02:40
Spark SQL Engine and Catalyst Optimizer
02:56
Section Summary
01:30
+ Spark Data Sources and Sinks
8 lectures 58:23
Introduction to Spark Sources and Sinks
06:44
Spark DataFrameReader API
05:00
Reading CSV, JSON and Parquet files
08:04
Creating Spark DataFrame Schema
06:08
Spark DataFrameWriter API
06:09
Writing Your Data and Managing Layout
11:46
Spark Databases and Tables
05:33
Working with Spark SQL Tables
08:59
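The DataFrameReader and DataFrameWriter APIs covered above follow a consistent builder pattern. A hedged sketch (the file paths and schema-inference options are hypothetical; `spark` is assumed to be an active SparkSession):

```scala
// Source: read a CSV file into a DataFrame
val flightDF = spark.read
  .format("csv")
  .option("header", "true")
  .option("inferSchema", "true")
  .load("data/flights.csv")          // hypothetical input path

// Sink: write the same data out as Parquet
flightDF.write
  .format("parquet")
  .mode("overwrite")                  // save mode controls behavior on existing data
  .save("output/flights-parquet")     // hypothetical output path
```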
+ Spark Dataframe and Dataset Transformations
7 lectures 01:00:33
Introduction to Data Transformation
02:44
Working with Dataframe Rows
05:03
Dataframe Rows and Unit Testing
07:16
Dataframe Rows and Unstructured data
07:17
Working with Dataframe Columns
13:36
Creating and Using UDF
09:00
Miscellaneous Transformations
15:37
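Among the transformations listed above, UDFs are the escape hatch for logic the built-in column functions cannot express. A hedged sketch (the DataFrame and column names are hypothetical):

```scala
import org.apache.spark.sql.functions.udf

// Hypothetical UDF: flag flights longer than 1000 miles
val isLong = udf((miles: Int) => miles > 1000)

val withFlag = flightDF
  .withColumn("long_haul", isLong(flightDF("distance")))
  .select("origin", "dest", "distance", "long_haul")
```

Built-in column expressions are preferred when they suffice, since the Catalyst optimizer cannot see inside a UDF.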
+ Aggregations in Apache Spark
3 lectures 19:10
Aggregating Dataframes
09:10
Grouping Aggregations
04:33
Windowing Aggregations
05:27
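The three aggregation styles above differ mainly in scope. A hedged sketch contrasting grouping and windowing aggregations (DataFrame and column names are hypothetical):

```scala
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions._

// Grouping aggregation: one row per origin
val totals = flightDF.groupBy("origin")
  .agg(sum("distance").as("total_distance"))

// Windowing aggregation: a running total per origin, one row per input row
val w = Window.partitionBy("origin")
  .orderBy("flight_date")
  .rowsBetween(Window.unboundedPreceding, Window.currentRow)

val running = flightDF.withColumn("running_distance", sum("distance").over(w))
```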
+ Spark Dataframe Joins
5 lectures 45:53
Dataframe Joins and column name ambiguity
07:45
Outer Joins in Dataframe
07:27
Internals of Spark Join and shuffle
09:14
Optimizing your joins
12:05
Implementing Bucket Joins
09:22
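The first lecture in the section above deals with column-name ambiguity, which commonly surfaces as an AnalysisException after a join. One hedged sketch of avoiding it by renaming the join key up front (both DataFrames and their columns are hypothetical):

```scala
// flightDF:  origin, dest, distance
// airportDF: code, airport_name
val airports = airportDF.withColumnRenamed("code", "origin_code")

val joined = flightDF.join(
  airports,
  flightDF("origin") === airports("origin_code"),
  "left_outer"    // keep unmatched flights, with nulls for airport columns
)
```

After the rename, every column in the joined result has a unique name, so later `select` calls stay unambiguous.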
Requirements
  • Programming Knowledge Using Scala Programming Language
  • A Recent 64-bit Windows/Mac/Linux Machine with 8 GB RAM
Description

This course does not require any prior knowledge of Apache Spark or Hadoop. Spark architecture and fundamental concepts are explained carefully to bring you up to speed and help you grasp the content of this course.


About the Course

I created the Apache Spark 3 - Spark Programming in Scala for Beginners course to help you understand Spark programming and apply that knowledge to build data engineering solutions. The course is example-driven and follows a working-session-like approach: we take a live-coding approach and explain all the needed concepts along the way.

Who should take this Course?

I designed this course for software engineers who want to develop data engineering pipelines and applications using Apache Spark, and for data architects and data engineers who are responsible for designing and building their organization’s data-centric infrastructure. It also suits managers and architects who do not work on Spark implementation directly but work with the people who implement Apache Spark at the ground level.

Spark Version used in the Course

This course uses Apache Spark 3.x. All the source code and examples in this course have been tested on the Apache Spark 3.0.0 open-source distribution.

Who this course is for:
  • Software engineers and architects who want to design and develop Big Data engineering projects using Apache Spark
  • Programmers and developers who aspire to grow and learn data engineering using Apache Spark