Introduction to Apache Spark for Developers and Engineers
4.4 (81 ratings)
446 students enrolled
Basic to intermediate level introduction to Apache Spark that provides the main skills required to use the technology
Created by Adastra Academy
Last updated 9/2015
English
30-Day Money-Back Guarantee
Includes:
  • 2 hours on-demand video
  • 17 Supplemental Resources
  • Full lifetime access
  • Access on mobile and TV
  • Certificate of Completion
What Will I Learn?
  • Identify and understand the concepts of Big Data
  • Clearly describe Apache Spark
  • Understand and explain the various components of the Spark framework
  • Differentiate between Spark and Hadoop MapReduce
  • Download, install and use Spark on a local machine
  • Identify and understand the main Scala programming language concepts
  • Develop basic Spark applications
  • Explain and use Spark Resilient Distributed Datasets
Requirements
  • Basic understanding of Big Data concepts
  • Some understanding of a programming language such as Python, Java or Scala
  • Administrator privileges on a computer to download and install software
Description

What is Apache Spark?

Apache Spark is an open-source Big Data processing engine designed for fast processing of large datasets across a wide range of applications. Spark supports in-memory cluster computing, which greatly improves the speed of iterative algorithms and interactive data mining tasks.

Course Outcomes

'Introduction to Apache Spark' includes illuminating video lectures, practical hands-on Scala and Spark exercises, a guide to local installation of Spark, and quizzes. In this course, we guide students through:

  • An explanation of the Spark framework
  • The basics of programming in Scala, Spark's native language
  • An outline of how to work with Spark's primary abstraction, resilient distributed datasets (RDDs)

Upon completion of the course, students will be able to explain core concepts relating to Spark, understand the fundamentals of coding in Scala, and execute basic programming and data manipulation in Spark. This course will take approximately 8 hours to complete.

Recommended Experience

Programming Languages recommended for this course:

  • Scala (course exercises are in Scala)
  • Java
  • Python

Recommended for:

  • Data scientists and engineers
  • Developers
  • Individuals with a basic understanding of: Apache Hadoop, Big Data, programming languages (Scala, Java, or Python)

For students unfamiliar with Big Data and Hadoop, the course will provide a brief overview of each topic.

Why Adastra Academy?

Adastra Academy is a leading source of training and development for Information Management professionals and individuals interested in Data Management and Analytics technology. Our dedication to identifying and mastering emerging technologies guarantees our students are the first to have access to these quality courses. For an exceptional learning experience, our programs include hands-on labs and real-world examples, allowing students to easily apply their new knowledge.

Who is the target audience?
  • Big Data Developers
  • Data Engineers
  • Big Data Consultants
  • Data Scientists
Curriculum For This Course
55 Lectures
02:52:48
Overview of Big Data
5 Lectures 17:31

This lecture discusses:

  • What big data is
  • Creation history of Hadoop
  • Overview of the MapReduce model
Preview 06:40

This lecture discusses:

  • Traditional data warehousing features
  • Big data features
1.3 Big Data Features and Traditional Data Warehousing Characteristics
05:40

This lecture discusses:

  • How big data tools fit into an enterprise solution
1.4 Use Case: Adastra's Big Data Reference Architecture
02:01

1.5 Section Conclusion
00:46

1.6 Big Data Concepts Quiz
3 questions
What is Apache Spark
5 Lectures 13:02
2.1 Introduction and Topics
00:33

This lecture discusses:

  • What Apache Spark is
  • Spark programming languages
  • Spark's built-in libraries
Preview 02:54

This lecture discusses:

  • Creation history of Spark
  • Spark's growth
  • Companies using Spark
2.3 Spark's History
03:21

This lecture discusses:

  • Comparison of Spark and MapReduce
  • Reasons for choosing Spark
2.4 Why Use Spark
05:38

2.5 Section Conclusion
00:36

2.6 Spark Concepts Quiz
5 questions
Spark Infrastructure
7 Lectures 22:04
3.1 Introduction and Topics
00:31

This lecture discusses:

  • Spark deployment modes
    • Local stand-alone
    • Stand-alone cluster
    • Shared cluster
3.2 Spark Deployment Modes
03:49

3.3 Hands-on Exercise: Installing Stand-Alone Spark
00:15

This hands-on exercise will guide you through:

  • Installation of Scala
  • Local installation of stand-alone Apache Spark
  • Downloading of sample data used for course exercises
3.4 Hands-on Exercise: Install Stand-Alone Spark on your computer
16 pages

3.5 Spark Install Quiz
1 question

This lecture discusses:

  • Cluster managers
  • Spark core
  • Built-in libraries
3.6 The Spark Framework
10:08

This lecture discusses:

  • Driver program
  • SparkContext
  • Executors
  • Stand-alone applications
3.7 Spark Application Concepts
06:45

3.8 Section Conclusion
00:36

3.9 Spark Infrastructure Quiz
5 questions
The Scala Programming Language
25 Lectures 32:22
4.1 Introduction and topics
00:36

This lecture discusses:

  • Introduction to Scala
  • Scala main features
4.2 Scala Introduction & Language Features
05:06

This lecture discusses:

  • Scala base types
4.3 Scala Language Basics-Base Types
02:17

This hands-on exercise provides practice with:

  • Scala base types
4.4 Hands-on Examples: Scala Base Types
5 pages

This lecture discusses:

  • Scala operators
4.5 Scala Language Basics-Operators
03:36

This hands-on exercise provides practice with:

  • Scala operators
4.6 Hands-on Examples: Scala Operators
5 pages

This lecture discusses:

  • Variables in Scala
4.7 Scala Language Constructs-Variables
01:52

This hands-on exercise provides practice with:

  • Variables in Scala
4.8 Hands-on Examples: Scala Variables
1 page

4.9 Scala Language Constructs-Variables Quiz
2 questions

This lecture discusses:

  • Arrays in Scala
4.10 Scala Language Constructs-Arrays
02:17

This hands-on exercise gives practice with:

  • Arrays in Scala
4.11 Hands-on Examples: Scala Arrays
1 page

This lecture discusses:

  • Lists in Scala
4.12 Scala Language Constructs-Lists
02:18

This hands-on exercise provides practice with:

  • Lists in Scala
4.13 Hands-On Exercise: Scala Lists
2 pages

This lecture discusses:

  • Collections in Scala
4.14 Scala Language Constructs-Collections
02:06

4.15 Quiz: Scala Arrays and Lists
2 questions

This lecture discusses:

  • Scala IF expressions
4.16 Scala Language Constructs-IF Expressions
02:39

This hands-on exercise provides practice with:

  • Scala IF expressions
4.17 Hands-On Exercise: Scala IF Expressions
1 page

This lecture discusses:

  • Scala Match-case expressions
4.18 Scala Language Constructs-MATCH-CASE Expressions
01:28

This hands-on exercise provides practice with:

  • Scala Match-case expressions
4.19 Hands-On Exercise: Scala MATCH-CASE Expressions
1 page

This lecture discusses:

  • Scala while loop expressions
  • Scala for loop expressions
4.20 Scala Language Constructs-WHILE & FOR Loop Expressions
02:09

This hands-on exercise provides practice with:

  • Scala while loop expressions
  • Scala for loop expressions
4.21 Hands-On Exercise: Scala WHILE & FOR Loop Expressions
2 pages

4.22 Quiz: Scala Loops and Execution Flow
1 question

This lecture discusses:

  • Functions in Scala
4.23 Scala Language Basics-Functions
01:39

This hands-on exercise provides practice with:

  • Functions in Scala
4.24 Hands-On Exercise: Scala Functions
1 page

4.25 Quiz: Scala Functions: Greatest Common Divisor
1 question

This lecture discusses:

  • Anonymous functions in Scala
4.26 Scala Language Basics-Anonymous Functions
03:26
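
For readers new to the topic, here is a minimal sketch of what anonymous functions (function literals) look like in Scala. It is not taken from the course materials; the object name and the values `double` and `isEven` are made up for illustration.

```scala
object AnonymousFunctionSketch {
  // A function literal assigned to a value: takes an Int and returns its double.
  val double: Int => Int = n => n * 2

  // Underscore shorthand for a one-argument function literal.
  val isEven: Int => Boolean = _ % 2 == 0

  def main(args: Array[String]): Unit = {
    val numbers = List(1, 2, 3, 4, 5)

    // Function values passed to higher-order methods.
    val doubledEvens = numbers.filter(isEven).map(double)

    // A two-argument function literal written inline.
    val total = numbers.foldLeft(0)((acc, n) => acc + n)

    println(doubledEvens) // List(4, 8)
    println(total)        // 15
  }
}
```

The same style of inline function is what Spark code typically passes to operations such as `map` and `filter`.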

This hands-on exercise provides practice with:

  • Anonymous functions in Scala
4.27 Hands-on Examples: Anonymous Functions
1 page

4.28 Scala Functions - Create your own function
2 questions

4.29 Scala Functions - quiz solution
2 pages

4.30 Section Conclusion
00:53
Resilient Distributed Datasets
13 Lectures 32:49
5.1 Introduction and sections
01:09

This lecture discusses:

  • What are Resilient Distributed Datasets (RDDs)?
  • Why use RDDs?
5.2 Resilient Distributed Datasets-Overview
03:04

This lecture discusses:

  • RDD Operations
    • Transformations
      • RDD Fault Tolerance
      • Directed Acyclic Graph
      • Lazy Evaluation
    • Actions
5.3 Resilient Distributed Datasets
10:26
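
The split between transformations and actions listed above can be illustrated locally without a Spark installation. The sketch below is only an analogy, not Spark code and not part of the course materials: it uses a plain Scala collection view, which also defers work until a result is requested. The object name and the call counter are hypothetical.

```scala
object LazyEvaluationSketch {
  // Returns (calls before the action, result of the action, calls after the action).
  def run(): (Int, Int, Int) = {
    var calls = 0 // how many times the mapping function has actually run

    // "Transformations": the view only records map and filter; nothing runs yet.
    val pipeline = (1 to 10).toList.view
      .map { n => calls += 1; n * n }
      .filter(_ % 2 == 0)
    val callsBefore = calls

    // "Action": asking for a result forces the recorded steps to execute.
    val total = pipeline.sum
    (callsBefore, total, calls)
  }

  def main(args: Array[String]): Unit = {
    val (before, total, after) = run()
    println(s"calls before action: $before")
    println(s"sum of even squares: $total")
    println(s"calls after action: $after")
  }
}
```

In Spark itself, the analogous pattern would be RDD transformations such as `map` and `filter` followed by an action such as `count` or `reduce`. A view does not provide RDD features such as partitioning or fault tolerance.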

This hands-on exercise provides practice with:

  • Creating RDDs
  • Performing transformations and actions on RDDs
5.4 Hands-On Exercise: RDDs Lazy Evaluation & Actions
5 pages

5.5 RDDs Lazy Evaluation & Actions
2 questions

This lecture discusses:

  • RDD creation methods
    • Loading from an external dataset
    • Parallelizing an existing dataset
    • Creating from an existing RDD
5.6 Resilient Distributed Datasets-How to Create
03:32

This hands-on exercise provides practice with:

  • Creating RDDs from a collection
5.7 Hands-On Exercise: Creating an RDD from a Collection
1 page

5.8 RDD Creation
2 questions

This lecture discusses several topics relating to RDD key/value pairs:

  • What pair RDDs are
  • Creating pair RDDs
  • Performing transformations on pair RDDs
5.9 Pair Resilient Distributed Datasets
04:18

This hands-on exercise provides practice with:

  • Creating pair RDDs
5.10 Hands-On Exercise: Pair RDDs
6 pages

5.11 Pair RDDs - Joining datasets
1 question

This lecture discusses:

  • RDD persistence
    • cache() method
    • persist() method
5.12 Resilient Distributed Datasets-Persistence
04:15

This lecture discusses:

  • shuffle operations
  • shared variables
    • broadcast variables
    • accumulator variables
5.13 Resilient Distributed Datasets-Shared Variables
04:55

This hands-on exercise provides practice with:

  • Creating and using shared variables
5.14 Hands-on Examples: Distributed Shared Variables
4 pages

5.15 "Advanced" data processing with Spark
1 question

5.16 "Advanced" data processing with Spark - quiz solution
1 page

5.17 Section Conclusion
01:10
About the Instructor
Adastra Academy
4.1 Average rating
1,209 Reviews
11,180 Students
6 Courses
Emerging Data Management and Analytics Technology Educators

We're focused on the tools and technologies that matter most for today and tomorrow.

Adastra Academy is a leading source of training and development for Information Management professionals and individuals interested in Data Management and Analytics technology. Our dedication to identifying and mastering emerging technologies guarantees our students are the first to gain access to critical skills. Our programs consist of hands-on labs and real-world examples, allowing students to easily apply their new knowledge.

As a division of Adastra Corporation, we leverage twenty years of world-class Information Management knowledge, experience, services and solutions to fuel the Academy and to advance Information Management professionals everywhere.