Advanced Apache Spark for Data Scientists and Developers
3.6 (32 ratings)
270 students enrolled
Created by Adastra Academy
Last updated 1/2016
English
Includes:
  • 2.5 hours on-demand video
  • 29 Supplemental Resources
  • Full lifetime access
  • Access on mobile and TV
  • Certificate of Completion
What Will I Learn?
  • Understand the functionality of Spark's four built-in libraries
  • Create real-world applications using Spark’s libraries
  • Understand how to develop, debug and optimize the performance of Spark applications
Requirements
  • Completion of an introductory Apache Spark course. Adastra Academy's Introduction to Apache Spark for Developers and Engineers is recommended.
  • A beginner-to-intermediate understanding of the Scala programming language. Adastra Academy's Scala in Practice is recommended.
  • A basic understanding of Apache Hadoop and Big Data
Description

Apache Spark is an open-source data processing engine. Spark is designed to provide fast processing of large datasets and high performance for a wide range of analytics applications. Unlike MapReduce, Spark supports in-memory cluster computing, which greatly improves the speed of iterative algorithms and interactive data mining tasks.

Adastra Academy’s Advanced Apache Spark includes video lectures, worked application examples, a guide to installing the NetBeans Integrated Development Environment, and quizzes. Through this course, you will learn about Spark’s four built-in libraries (Spark Streaming, Spark SQL with DataFrames, MLlib, and GraphX) and how to develop, build, tune, and debug Spark applications. The course exercises are designed to make you proficient at creating fully functional, real-world applications using the Apache Spark libraries. Unlike other courses, this one takes a guided, ground-up approach to learning Spark, aimed at building expert-level skills.

Who is the target audience?
  • Data Scientists
  • Developers
  • Data Engineers
Curriculum For This Course
71 Lectures
05:33:19
Introduction to Advanced Apache Spark
3 Lectures 04:19

Spark Installation
16 pages

Spark Installation Quiz
1 question

IDE Installation
14 pages

IDE Installation Quiz
1 question
Tuning and Debugging
7 Lectures 14:55



Data Serialization
01:56

Memory Tuning
03:36

Level of Parallelism
02:20

Section Topics
00:33
Spark Streaming
16 Lectures 21:06
Introduction and Topics
00:41

Overview of Spark Streaming
01:17

Linking Input Sources
00:52

Streaming Context
01:15

Discretized Streams (DStreams)
00:47

Input DStreams
02:29

Hands-on Exercise 1: Spark Streaming
11 pages

Stateless Transformations on DStreams
03:51

Stateful Transformations
03:30

Hands-on Exercise 2: Spark Streaming
6 pages

Output Operations
01:54

Hands-on Exercise 3: Spark Streaming
7 pages

Checkpointing
00:46

Caching and Persisting
00:44

Tuning and Debugging
02:28

Section Topics
00:32
Spark SQL
14 Lectures 58:17
Introduction to Spark SQL
00:59

Spark SQL Overview
06:48

The Spark Shell hands-on
2 pages

Hands-on Exercise 1: part a) Import CSV
30 pages

Schema Inference
06:25

Data Query Select
05:19

Data Query Select
1 question

DataFrameReader and DataFrameWriter
08:11

Hands-on Exercise 1: part b) Import JSON
18 pages

Data Query INNER JOINs
06:40

Data Query INNER JOINs
2 questions

Group By, Order By, Window Functions
05:41

Group By, Order By, Window Functions
2 questions

Data Query OUTER JOINs, SEMI JOIN
09:50

Data Query OUTER JOINs, SEMI JOIN
1 question

Custom UDF (User Defined Function)
04:41

Custom UDF (User Defined Function)
1 question

API or SQL?
03:43

Hands-on Exercise 2: Spark SQL
18 pages
Spark MLlib
15 Lectures 31:50
Introduction and Topics
00:41

Machine Learning
01:17

MLlib
02:32

Basic Statistics
01:00

Optimization
01:49

Classification
06:20

Hands-on Exercise 1: Spark MLlib: Classification
12 pages

Validation
01:07

Regression
02:18

Clustering
03:51

Hands-on Exercise 2: Spark MLlib: Clustering
12 pages

Feature Extraction and Transformation
01:00

Dimensionality Reduction
05:23

Collaborative Filtering
00:55

Evaluation Metrics
03:37
Spark GraphX
16 Lectures 24:52
Introduction to Spark GraphX
07:18

Graph creation examples
2 pages

Graph Operators Overview, Information about a Graph
03:18

Information about a graph example
1 page

Transform Graph Items
02:35

Transform graph items examples
1 page

Modify Graph Structure
01:24

Modify graph structure example
1 page

Graph Neighborhood Aggregations
02:30

Neighborhood Aggregations Examples
2 pages

Graph Algorithms
02:36

Triangle Count Example
1 page

Pregel: Graph-Parallel Computation
02:11

Pregel Example
1 page

Optimized Graph Representation
03:00

Hands-on Exercise: Spark GraphX
23 pages
About the Instructor
Adastra Academy
4.0 Average rating
1,054 Reviews
10,066 Students
6 Courses
Emerging Data Management and Analytics Technology Educators

We're focused on the tools and technologies that matter most for today and tomorrow.

Adastra Academy is a leading source of training and development for Information Management professionals and individuals interested in Data Management and Analytics technology. Our dedication to identifying and mastering emerging technologies means our students are among the first to gain access to critical skills. Our programs consist of hands-on labs and real-world examples, allowing students to easily apply their new knowledge.

As a division of Adastra Corporation, we draw on twenty years of Information Management knowledge, experience, services, and solutions to fuel the Academy and to advance Information Management professionals everywhere.