Apache Spark, Spark sql & Streaming basic to advance 2024
Rating: 4.0 out of 5 (21 ratings)
184 students

A complete guide to Spark 2.4 with practice sessions, Spark job performance tuning, Structured Streaming, and Scala basics
Created by Anshul Jain
Last updated 8/2025
English

What you'll learn

  • Apache Spark in the big data ecosystem
  • Spark internal architecture
  • Integration of Spark with the Hive warehouse
  • One course to learn Spark, Spark SQL, DStreams, and Structured Streaming
  • Basic Scala programming for Spark
  • Spark Streaming basics with Java
  • Spark performance-boosting techniques
  • How to design a big data project using Spark 2.x with Hive 2.x
  • Structured Streaming with Spark 3
  • Code a Spark or streaming application in Eclipse and run it on a YARN cluster
  • Free Google Cloud big data environment setup for hands-on practice

Course content

4 sections • 36 lectures • 9h 19m total length
  • Introduction to Apache Spark (4:11)

    Agenda of Course

  • What Is Apache Spark? (7:51)

    What is Apache Spark?

    Why Apache Spark?

    Who uses Apache Spark?

    Real use cases of Apache Spark

  • Spark on Google Cloud for Free, with Hive and Hadoop Set Up in One Click (16:20)

    Get a Google Cloud Platform free trial for the first 90 days and use Google Cloud machines for practice.

    You will not be charged. We will create an account, create a cluster, and run spark-shell directly.

    That's it, the setup is done.

    https://console.cloud.google.com/

    Another option: https://cloud.ibm.com/catalog

    Create a big data cluster, check the Spark and Hadoop applications, then shut down the cluster.

  • Spark Working Architecture (17:07)

    Components in Apache Spark

    How the Spark engine actually works

  • Spark Architecture 2: How a Job Is Executed and the Different Modes of Execution (10:01)

    Spark in local mode, standalone mode, client mode, and cluster mode

    How exactly a Spark job is executed in the Spark engine

  • Set Up Spark 2.4 with Hadoop 2.x on a Local Machine (13:57)

    Configure a local machine running Unix, Linux, or Ubuntu

    Install Java 1.8.x

    Install Hadoop 2.8.x

    Install Hive 3.x or Hive 2.x

    Install Spark 2.4.x

    Install Scala 2.11.3

    Set the properties required for a standalone setup

  • Spark Shell & Scala Basics for Spark (15:14)

    Spark shell in local mode

    Spark shell in the Google Cloud environment

    Scala basics

  • SparkSession & SparkContext (8:24)

    What exactly SparkContext and SparkSession are

    Tabs in the Spark job UI

  • Scala Basics 2 for Spark Programming (23:43)

    For loops, switch-case, var and val in Scala programming

    How to read a file line by line in the Spark shell using Scala code

  • Broadcast & Accumulators in Spark Using Java (14:30)

    Special variables in Spark

    Broadcast variables and accumulators

  • Spark RDD, Transformation & Action (18:34)

    What is an RDD? RDD features

    Transformation

    Action

  • Spark Word Count Program (15:55)

    Logic & demo

    map, flatMap, reduceByKey

  • Dataset in Spark (21:18)

    What is a Spark Dataset?

    RDD vs DataFrame vs Dataset

    How to analyze Spark shell commands in the Spark UI

  • Reading Different Data Formats with the Spark Engine (22:53)

    Read CSV, JSON, XML, or text data in Spark

    See https://github.com/databricks/spark-xml to learn more about XML file read and write operations

  • Writing Data in Different Formats & Configuring Spark Job Parameters (15:35)

    How to write data in JSON, Parquet, Avro, or text format to a local or Hadoop path
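
As a rough illustration of the logic behind the word count lecture listed above (flatMap, map, reduceByKey), here is a minimal plain-Java sketch. It uses only the JDK's Stream API, not the Spark API, and it is not taken from the course material. The class name `WordCount`, the method `countWords`, and the sample lines are all hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class WordCount {

    // Hypothetical helper: counts words with the same three steps as the
    // classic Spark word count, expressed with JDK streams instead of RDDs.
    static Map<String, Integer> countWords(List<String> lines) {
        return lines.stream()
                // flatMap step: each line becomes zero or more words
                .flatMap(line -> Arrays.stream(line.trim().split("\\s+")))
                // drop the empty token produced by blank lines
                .filter(word -> !word.isEmpty())
                .collect(Collectors.toMap(
                        word -> word,    // key: the word itself
                        word -> 1,       // map step: pair every word with 1
                        Integer::sum,    // reduceByKey step: sum the 1s per word
                        TreeMap::new));  // sorted map, so printing is deterministic
    }

    public static void main(String[] args) {
        // Sample input is made up for illustration only
        List<String> lines = List.of(
                "spark makes big data simple",
                "big data with spark");
        System.out.println(countWords(lines));
    }
}
```

In Spark the same three steps run in parallel across partitions, and reduceByKey combines values per key; the merge function passed to `Collectors.toMap` plays that role here on a single machine.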

Requirements

  • Knowledge of basic Java
  • Knowledge of basic SQL

Description

I have worked as a Big Data Solution Designer in the IT industry for the last few years. I am putting all my learning and experience into this video series so that you can understand how the Spark ecosystem works, work like a professional big data engineer, and get a good job. The course has been updated for the latest version.

Benefits of this course:

Enroll in this course to get end-to-end knowledge of Apache Spark, Spark SQL, Spark Streaming, Spark with Hive, real-world use cases, designing a big data project with the Spark ecosystem, and use cases that come up in interviews. This course is rare of its kind and covers even fine details of Spark that are hard to find online.

In this course you will learn step by step, from very basic Spark to advanced Spark as it is actually used in real-time projects, including the latest Spark 3.x version:

Spark setup, all file formats, Hive optimization concepts such as partitioning, bucketing, and joins, and expert-style Spark code review: all demo / interactive sessions

Spark Google Cloud account setup for hands-on practice with all concepts

Spark SQL clauses: DISTRIBUTE BY, ORDER BY, CLUSTER BY, SORT BY

Scala coding basics

Coding an application in Eclipse with Java 8 as a Maven project using the Spark API

Window functions such as rank, row_number, and dense_rank: all demo / interactive sessions

RDD, Dataset & DataFrame APIs

Different ways to create or insert data in a Hadoop or Hive table

Spark job configuration optimization

Spark application DAG analysis and debugging using the Spark UI

Spark Streaming & Structured Streaming with coding in Java

Performance techniques that big companies use to query data fast
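
To make the difference between the window functions named in the list above (rank, row_number, dense_rank) concrete, here is a small plain-Java sketch. It is an illustration written for this page, not course code and not the Spark SQL API. The class `RankingDemo`, the method `rankDescending`, and the sample scores are hypothetical, and it assumes a single window ordered by value in descending order.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class RankingDemo {

    // Hypothetical helper: for values sorted in descending order, returns one
    // int[]{rowNumber, rank, denseRank} per value, following the usual SQL rules.
    static List<int[]> rankDescending(List<Integer> values) {
        List<Integer> sorted = new ArrayList<>(values);
        sorted.sort(Comparator.reverseOrder());

        List<int[]> result = new ArrayList<>();
        int rank = 0;
        int denseRank = 0;
        for (int i = 0; i < sorted.size(); i++) {
            int rowNumber = i + 1; // row_number: always increases by one
            boolean tie = i > 0 && sorted.get(i).equals(sorted.get(i - 1));
            if (!tie) {
                rank = rowNumber;  // rank: jumps to the row position, leaving gaps after ties
                denseRank++;       // dense_rank: increases by one, leaving no gaps
            }
            result.add(new int[] {rowNumber, rank, denseRank});
        }
        return result;
    }

    public static void main(String[] args) {
        // Sample scores are made up; the two 90s tie
        List<Integer> scores = List.of(90, 80, 90, 70);
        List<Integer> sorted = new ArrayList<>(scores);
        sorted.sort(Comparator.reverseOrder());
        List<int[]> rows = rankDescending(scores);
        for (int i = 0; i < rows.size(); i++) {
            int[] r = rows.get(i);
            System.out.println(sorted.get(i) + " row_number=" + r[0]
                    + " rank=" + r[1] + " dense_rank=" + r[2]);
        }
    }
}
```

For the scores 90, 90, 80, 70, row_number gives 1, 2, 3, 4, rank gives 1, 1, 3, 4, and dense_rank gives 1, 1, 2, 3. In Spark SQL the same functions are applied over a window specification rather than a sorted list.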

This course is a full package that explains even rarely used commands and concepts in Spark. After completing it, you should find no Spark topic left uncovered. The course was made with the real implementation of Spark in live projects in mind.

Additionally, you can download the step-by-step installation guide (doc) to install Scala and Apache Spark.

Who this course is for:

  • IT engineers who want to move their career into big data technologies
  • Beginners in big data, Hadoop, and Spark
  • Students who want to crack interviews for big data positions
  • Data analysts who work with large data or continuous flows of data