Databricks Fundamentals & Apache Spark Core

Learn how to process big data using Databricks & Apache Spark 2.4 and 3.0.0 - the DataFrame API and Spark SQL
Bestseller
4.5 (57 ratings)
8,685 students enrolled
Created by Wadson Guimatsa
Last updated 7/2020
English
Current price: $69.99 Original price: $99.99 Discount: 30% off
30-Day Money-Back Guarantee
This course includes
  • 12 hours on-demand video
  • 5 downloadable resources
  • Full lifetime access
  • Access on mobile and TV
  • Certificate of Completion
What you'll learn
  • Databricks
  • Apache Spark Architecture
  • Apache Spark DataFrame API
  • Apache Spark SQL
  • Selecting and manipulating columns of a DataFrame
  • Filtering, dropping, sorting rows of a DataFrame
  • Joining, reading, writing and partitioning DataFrames
  • Aggregating DataFrames rows
  • Working with User Defined Functions
  • Using the DataFrameWriter API
Course content
71 lectures • 12:08:16 total length
+ Introduction to Databricks and Apache Spark
4 lectures 29:31
Introduction to Databricks
08:47
Write your first Apache Spark Code
09:59
Practice: Find customer with the same birthday as you
02:08
+ The DataFrame API: Basics
9 lectures 41:36
Create a DataFrame from a CSV file
06:10
Configure options to read a CSV file
06:20
How to reference columns of a DataFrame
05:20
Understand the DataFrame Schema: Part 1
02:22
Understand the DataFrame Schema: Part 2
04:11
Specify a DataFrame Schema using a DDL-formatted string : Part 1
03:20
Specify a DataFrame Schema using a DDL-formatted string : Part 2
05:37
Spark Architecture: The Organization of a DataFrame
02:57
+ The DataFrame API: Transforming Data
18 lectures 02:46:34
Adding columns to a DataFrame
09:44
Renaming columns of a DataFrame
04:43
Removing columns from a DataFrame
01:27
Filtering rows from a DataFrame
12:09
Joining multiple DataFrames: Part 1
02:44
Joining multiple DataFrames: Part 2
08:45
Aggregation: Count
04:37
Aggregation: Count Distinct
02:08
Aggregation: Get the Min value
09:54
Aggregation: Get the Max value
02:06
Aggregation: Get the Sum and SumDistinct
05:59
Aggregation: Average and Mean
05:37
Aggregation: Grouping data - Part 1
01:09
Aggregation: Grouping data - Part 2
06:45
Practice: Business Query 1
22:06
Practice: Business Query 2
17:37
Apache Spark Architecture: How Apache Spark Transforms Data Internally
31:42
User Defined Function
17:22
+ Spark SQL & SQL Fundamentals
20 lectures 03:33:29
Run SQL on a DataFrame: TempView
11:17
Run SQL on a DataFrame: GlobalView
06:18
Databases: List, Create, Delete, Select
09:07
Tables: Unmanaged
10:14
Tables: Managed
14:05
SQL Fundamentals: Select Clause & Select Expression
18:13
SQL Fundamentals: Where Clause, Equality Checks
12:12
SQL Fundamentals: Handling NULLs in Where Clause
05:09
SQL Fundamentals: Aggregations - Sum, Count, AVG, Mean
14:29
SQL Fundamentals: Group By Clause
11:57
SQL Fundamentals: Having Clause
13:47
SQL Fundamentals: Order By Clause
04:56
SQL Fundamentals: Inner Joins
10:10
SQL Fundamentals: Left Outer Joins
10:06
SQL Fundamentals: Right Outer Joins
07:12
SQL Fundamentals: Predicates and Operators, like predicate
05:56
SQL Fundamentals: Case Expressions
04:50
Practice : Business Query 3
18:15
Practice: Business Query 4
18:14
Practice: Business Query 5
07:02
+ Working with different types of data
11 lectures 02:53:28
Converting literals to Spark Types: The lit function
05:51
Working with booleans
18:43
Working with numbers
19:22
Working with strings
22:56
Working with dates and timestamps
26:25
Complex Types: Structs
16:38
Complex Types: Arrays
17:21
Complex Types: Maps
11:54
Handling NULL Values: Drop NULL Values
14:57
Handling NULL Values: Replace NULL Values
09:33
+ Data Sources
4 lectures 01:26:16
DataFrameReader: Read CSV Files
32:23
DataFrameWriter: Write Data
30:14
Create DataFrame manually
07:05
Requirements
  • Basic Scala knowledge
  • Basic SQL knowledge
Description

Welcome to this course on Databricks and Apache Spark 2.4 and 3.0.0.

Apache Spark is a big-data processing framework that runs at scale.
In this course, we will learn how to write Spark applications using Scala and SQL.

Databricks is a company founded by the creators of Apache Spark.
Databricks offers a managed and optimized version of Apache Spark that runs in the cloud.

The main focus of this course is to teach you how to use the DataFrame API & SQL to accomplish tasks such as:

  • Write and run Apache Spark code using Databricks

  • Read and Write Data from the Databricks File System - DBFS (see the sketch after this list)

  • Explain how Apache Spark runs on a cluster with multiple Nodes
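To give a flavour of the first two points, here is a minimal Scala sketch of reading a CSV file from DBFS and writing it back as Parquet. It assumes a Databricks notebook, where the spark session is already provided; the file paths are hypothetical.

    // Read a CSV file from DBFS (path is hypothetical)
    val customers = spark.read
      .option("header", "true")       // first line holds the column names
      .option("inferSchema", "true")  // let Spark guess the column types
      .csv("dbfs:/FileStore/tables/customers.csv")

    customers.show(5)                 // display the first five rows

    // Write the same data back to DBFS in Parquet format
    customers.write
      .mode("overwrite")
      .parquet("dbfs:/FileStore/tables/customers_parquet")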

Use the DataFrame API and SQL to perform data manipulation tasks such as the following (a short sketch appears after the list):

  • Selecting, renaming and manipulating columns

  • Filtering, dropping and aggregating rows

  • Joining DataFrames

  • Creating UDFs and using them with the DataFrame API or Spark SQL

  • Writing DataFrames to external storage systems
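As a rough illustration of these operations, here is a short Scala sketch. The orders and customers DataFrames, their columns, and the output path are hypothetical examples, not taken from the course material.

    import org.apache.spark.sql.functions._

    // Select, rename and filter
    val bigOrders = orders
      .select("order_id", "customer_id", "amount")
      .withColumnRenamed("amount", "order_amount")
      .filter(col("order_amount") > 100)

    // A simple user-defined function
    val toUpper = udf((s: String) => if (s == null) null else s.toUpperCase)

    // Join, group and aggregate, then apply the UDF
    val report = bigOrders
      .join(customers, Seq("customer_id"), "inner")
      .groupBy("country")
      .agg(count("*").as("orders"), sum("order_amount").as("revenue"))
      .withColumn("country", toUpper(col("country")))

    // Write the result to external storage
    report.write.mode("overwrite").parquet("dbfs:/FileStore/output/report")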

List and explain the elements of the Apache Spark execution hierarchy, such as the following (illustrated by the sketch after the list):

  • Jobs

  • Stages

  • Tasks
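The sketch below shows how this hierarchy surfaces in practice, reusing the hypothetical report DataFrame from the previous sketch. Transformations only build an execution plan; an action submits a job, which Spark splits into stages at shuffle boundaries and into one task per partition.

    // Transformations (select, filter, groupBy, ...) are lazy: nothing has run yet.
    // An action such as count() submits a job to the cluster.
    val rowCount = report.count()

    // explain() prints the physical plan; the Exchange operators it contains are
    // the shuffles at which Spark cuts a job into separate stages.
    report.explain()

    println(s"Rows in report: $rowCount")

The Spark UI, reachable from any Databricks cluster, lists the resulting jobs, their stages, and the tasks inside each stage.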


Who this course is for:
  • Software developers curious about big data, data engineering and data science
  • Beginner data engineers who want to learn how to work with Databricks
  • Beginner data scientists who want to learn how to work with Databricks