Databricks Fundamentals & Apache Spark Core
What you'll learn
- Databricks
- Apache Spark Architecture
- Apache Spark DataFrame API
- Apache Spark SQL
- Selecting and manipulating columns of a DataFrame
- Filtering, dropping, sorting rows of a DataFrame
- Joining, reading, writing and partitioning DataFrames
- Aggregating DataFrames rows
- Working with User Defined Functions
- Using the DataFrameWriter API
Requirements
- Basic Scala knowledge
- Basic SQL knowledge
Description
Welcome to this course on Databricks and Apache Spark 2.4 and 3.0.0.
Apache Spark is a distributed big-data processing framework that runs at scale.
In this course, we will learn how to write Spark Applications using Scala and SQL.
Databricks is a company founded by the creators of Apache Spark.
Databricks offers a managed and optimized version of Apache Spark that runs in the cloud.
The main focus of this course is to teach you how to use the DataFrame API & SQL to accomplish tasks such as:
- Write and run Apache Spark code using Databricks
- Read and write data from the Databricks File System (DBFS)
- Explain how Apache Spark runs on a cluster with multiple nodes
- Use the DataFrame API and SQL to perform data manipulation tasks such as:
  - Selecting, renaming and manipulating columns
  - Filtering, dropping and aggregating rows
  - Joining DataFrames
- Create UDFs and use them with the DataFrame API or Spark SQL
- Write DataFrames to external storage systems
- List and explain the elements of the Apache Spark execution hierarchy:
  - Jobs
  - Stages
  - Tasks
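As a small taste of what the course covers, here is a minimal Scala sketch of the DataFrame API. The dataset, column names, and paths are illustrative only; it assumes a Databricks notebook (where `spark` is already provided) or a local SparkSession:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

// In a Databricks notebook `spark` already exists; locally we build one.
val spark = SparkSession.builder()
  .appName("dataframe-sketch")
  .master("local[*]")
  .getOrCreate()

import spark.implicits._

// Hypothetical in-memory data; on Databricks you would typically read
// from DBFS instead, e.g. spark.read.parquet("dbfs:/path/to/sales")
val sales = Seq(
  ("2023-01-01", "books", 12.50),
  ("2023-01-01", "games", 59.99),
  ("2023-01-02", "books", 7.99)
).toDF("date", "category", "amount")

// Filtering rows, then grouping and aggregating
val totals = sales
  .filter($"amount" > 5.0)
  .groupBy($"category")
  .agg(sum($"amount").as("total"))

// A simple UDF, usable from both the DataFrame API and Spark SQL
val shout = udf((s: String) => s.toUpperCase)
spark.udf.register("shout", shout)
totals.select(shout($"category").as("category"), $"total").show()

// Writing out with the DataFrameWriter API, partitioned by a column
// totals.write.mode("overwrite").partitionBy("category").parquet("dbfs:/tmp/totals")
```

Running an action such as `show()` triggers a job, which Spark breaks into stages and tasks — the execution hierarchy examined in the course.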
Who this course is for:
- Software developers curious about big data, data engineering and data science
- Beginner data engineers who want to learn how to work with Databricks
- Beginner data scientists who want to learn how to work with Databricks
Instructor
I'm a software developer specializing in building data-intensive applications.
I've been developing software for over 10 years.
I've worked in data-intensive industries such as finance and industrial image processing.
Over the years, the volume of data produced by systems and humans outgrew the storage and compute capacity of legacy relational database systems, so I had to learn new tools and frameworks for processing big data.
As a data engineer, I'm very motivated and passionate about building applications that can leverage the power and flexibility of cloud computing and big-data processing frameworks.