Apache Spark and Scala

A Complete Guide to Processing Big Data with Spark
3.8 (33 ratings)
256 students enrolled
  • Lectures 67
  • Length 7.5 hours
  • Skill Level All Levels
  • Languages English
  • Includes Lifetime access
    30 day money back guarantee!
    Available on iOS and Android
    Certificate of Completion

About This Course

Published 4/2016 English

Course Description

This course on Apache Spark and Scala aims to build advanced expertise in the big data Hadoop ecosystem. It provides a standard skill set for becoming a specialist, building on the skills of a big data Hadoop developer.

The course starts with a detailed description of the limitations of MapReduce and how Spark can help overcome them. It then takes a deeper dive into the Scala programming language.

Moving on, it covers running Spark as a standalone cluster and builds an understanding of Resilient Distributed Datasets (RDDs).

The course also covers Spark SQL concepts: running SQL queries through the SQL context and Hive queries through the Hive context.

This course provides the material required for building a career path from big data Hadoop developer to big data Hadoop architect.


What are the requirements?

  • Prior knowledge of Apache Hadoop is an added advantage, but not compulsory
  • Fundamental understanding of any programming language

What am I going to get from this course?

  • Understand the limitations of Hadoop MapReduce and how Spark overcomes them
  • Gain expertise in the Scala programming language and its characteristics
  • Be able to work with RDDs and create applications in Spark
  • Gain a thorough understanding of Spark SQL by using SQL queries in Spark

What is the target audience?

  • Students who aspire to gain a deep understanding of Apache Spark
  • Professionals looking for a career in real-time big data analytics

Curriculum

Section 1: Module 1: Introduction to Big Data, Hadoop and Spark
1.1 Overview of Big Data
03:27
1.2 Introduction to Apache Hadoop
02:29
1.3 Hadoop Distributed File System
05:00
1.4 Hadoop MapReduce
03:33
1.5 Introduction to Apache Spark
05:12
1.6 Characteristics of Apache Spark
02:44
1.7 Users and Use Cases of Apache Spark
07:45
1.8 Job Execution Flow and Spark Execution
01:12
1.9 Spark Unified Stack
01:08
1.10 Complete Picture of Apache Spark
06:37
1.11 Why Spark with Scala
02:12
1.12 Apache Spark Architecture
02:16
Section 2: Module 2: Introduction to Scala Programming Language
2.1 Introduction to Scala
10:15
2.2 Scala Basic Syntax
05:11
2.3 Scala Class and Objects
04:03
2.4 If-else Statements in Scala
08:31
2.5 Loops in Scala
09:32
Section 3: Module 3: Advanced Scala Programming
3.1 Functions and Procedures in Scala
08:19
3.2 Access Modifiers
06:25
3.3 Strings and Arrays
10:45
3.4 Scala Collections
14:29
3.5 Scala Traits
03:58
3.6 Pattern Matching
07:25
3.7 Scala Extractors
05:41
3.8 Scala Exception Handling
03:29
3.9 Scala File I/O
09:26
Section 4: Module 4: Apache Spark RDDs
4.1 Programming with RDDs
02:33
4.2 Starting with Spark
02:16
4.3 Creating RDDs
02:10
4.4 RDD Operations
03:36
4.5 Lifecycle of Spark
01:41
Section 5: Module 5: Apache Spark RDDs II
5.1 Spark Caching
03:12
5.2 Common Transformations and Actions
06:03
5.3 Spark Functions
13:56
5.4 Some More Spark Functions
10:15
Section 6: Module 6: Working with Key-Value pairs
6.1 Key Value Pairs
05:24
6.2 Aggregate Functions
10:55
6.3 Working with Aggregate Functions
20:01
6.4 Joins in Spark
18:44
6.5 Practical: Word Count Example
07:38
Section 7: Module 7: Advanced Spark Programming
7.1 Spark Shared Variables
13:47
7.2 Spark and Fault Tolerance
01:53
7.3 Broadcast Variables
11:16
7.4 Numeric RDD Operations
03:18
7.5 Per-Partition Operations
11:23
Section 8: Module 8: Running Spark jobs on Cluster
8.1 Spark Runtime Architecture
02:45
8.2 Spark Driver
03:41
8.3 Executors
01:07
8.4 Cluster Managers
06:11
8.5 Cluster Managers II
03:14
Section 9: Module 9: Spark SQL
9.1 Introduction to Spark SQL
04:59
9.2 Starting Point-SQL Context
07:42
9.3 Hive with Spark SQL
10:20
9.4 Spark SQL Caching
10:28
Section 10: Module 10: Spark Streaming
People.json, Employee.json
Article
Section 11: Module 11: Machine Learning in Spark
11.1 Machine Learning with MLlib
06:07
11.2 MLlib Data Types
10:07
11.3 Labeled Point Data Types
08:05
11.4 Local Matrices in MLlib
06:25
11.5 MLlib Algorithms
09:45
11.6 Classification and Regression
07:58
11.7 Clustering
12:43
Section 12: Module 12: GraphX in Spark
12.1 GraphX Introduction
08:14
12.2 Creating Graphs
17:03
12.3 Graph Operators
08:55
12.4 Subgraph Transformation
09:12
12.5 Computation with mapReduceTriplets
03:43

Instructor Biography

Digitorious Technologies, Make Learning Smarter

Digitorious Technologies is a leading publisher of development courses that provide in-depth knowledge and high-quality training. Its mission is to give the right direction to people who are looking for a career in the IT and software industry. Digitorious is the best place for learning new technologies and making things easy to understand virtually.
