Spark Starter Kit
4.6 (454 ratings)
Instead of using a simple lifetime average, Udemy calculates a course's star rating by considering a number of different factors such as the number of ratings, the age of ratings, and the likelihood of fraudulent ratings.
6,939 students enrolled
NOT another "What is Spark?" course! Explore Spark in depth and get a strong foundation in Spark.
Last updated 6/2017
English
English [Auto-generated]
Price: Free
Includes:
  • 3.5 hours on-demand video
  • 7 Supplemental Resources
  • Full lifetime access
  • Access on mobile and TV
  • Certificate of Completion
What Will I Learn?
  • Learn about the similarities and differences between Spark and Hadoop.
  • Explore the challenges Spark tries to address, which will give you a good idea of the need for Spark.
  • Learn how Spark is faster than Hadoop, so that you understand the reasons behind Spark’s performance and efficiency.
  • Before we talk about what an RDD is, we explain in detail why something like an RDD is needed.
  • You will get a strong foundation in understanding RDDs in depth, and then we take a step further to point out and clarify some of the common misconceptions about RDDs among new Spark learners.
  • You will understand the types of dependencies between RDDs and, more importantly, we will see why dependencies are important.
  • We will walk you through, step by step, how the program we write gets translated into actual execution behind the scenes in a Spark cluster.
  • You will get a very good understanding of some of the key concepts behind Spark’s execution engine and the reasons why it is efficient.
  • Master fault tolerance by simulating a fault situation and examining how Spark recovers from it.
  • You will learn how memory and the contents in memory are managed by Spark.
  • Understand the need for a new programming language like Scala.
  • Examine object-oriented programming vs. functional programming.
  • Explore Scala's features and functions.
Requirements
  • Basic Hadoop concepts. Don't know Hadoop? Don't worry, sign up for our free Hadoop Starter Kit course.
Description

When our students asked us to create a course on Spark, we looked at other Spark courses on the market and at the questions students commonly ask on sites like Stack Overflow and other forums when they try to learn Spark. We saw a recurring theme.

Most courses and other online help, including Spark's documentation, are not good at helping students understand the foundational concepts. They explain what Spark is, what an RDD is, what "this" is and what "that" is, but students were most interested in understanding the core fundamentals and, more importantly, in answering questions like:

  1. Why do we need Spark when we have Hadoop?
  2. What is the need for RDDs?
  3. How is Spark faster than Hadoop?
  4. How does Spark achieve the speed and efficiency it claims?
  5. How does memory get managed in Spark?
  6. How does fault tolerance work in Spark?

and that is exactly what you will learn in this free Spark Starter Kit course. The aim of this course is to give you a strong foundation in Spark.

Who is the target audience?
  • Anyone who is interested in distributed systems and computing and big data related technologies.
Curriculum For This Course
17 Lectures
03:17:56
Get Set GO!
2 Lectures 12:16

In this lecture we will go over the prerequisites and the structure of the course.

Let's get started
05:43

In this lecture we will explain how to get Spark running on your computer in less than 10 minutes.

Running Spark on your computer
06:33
Introduction to Spark
3 Lectures 36:33

In this lecture, we will compare Hadoop and Spark on four fundamental aspects: storage, computation, computational speed and resource management.

Spark vs. Hadoop - who wins?
15:30

To understand a solution like Spark, we first need to understand the problems Spark is going to solve. In this lesson we will talk about the pain points, challenges or inefficiencies that Spark tries to solve in two different areas: iterative machine learning and interactive data mining.

Challenges Spark tries to address
12:24

If you plan to learn Spark, or any technology, you need a clear understanding of why that technology is better than similar technologies in the ecosystem. You should know not only why the technology is better but also how it is better, and that is exactly the goal of this lesson.

How is Spark faster than Hadoop?
08:39
RDD - Core of Spark
3 Lectures 31:30

The purpose of this lesson is to help you understand an important issue with in-memory distributed computation on big datasets: fault tolerance. This lesson will help you understand the need for RDDs.

The need for RDD
11:29

In this lesson, we are going to go one level deeper and understand what an RDD is. Most aspiring Spark learners have looked at the RDD paper, but don't worry, we will explain RDDs without asking you to refer to the RDD paper.

What is RDD?
12:30

We see a lot of misconceptions when it comes to RDDs, especially among new Spark learners. We can’t let that happen to our students, so we created this separate lesson to address those misconceptions.

What an RDD is not
07:31
Execution in Spark (Behind the scenes)
4 Lectures 59:26

In this lesson we are going to calculate the maximum volume for each stock symbol in our stocks dataset. Our goal in this chapter is to understand the types of operations we can perform on RDDs.

First program in Spark
16:04
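
The "maximum volume per stock symbol" computation described for this lecture can be sketched outside Spark. The following is a minimal plain-Python illustration, not the course's own program: the sample records, the symbols, and the function name max_volume_by_symbol are all hypothetical, and the real stocks dataset and its columns are not shown on this page. In Spark the same idea is usually expressed as a per-key reduction over (symbol, volume) pairs; here a dictionary plays that role.

```python
# Hypothetical sample records: (stock symbol, traded volume).
# These values are made up for illustration; they are not the course dataset.
records = [
    ("AAA", 1200),
    ("BBB", 300),
    ("AAA", 4500),
    ("BBB", 950),
    ("CCC", 70),
]


def max_volume_by_symbol(rows):
    """Return a dict mapping each symbol to the largest volume seen for it."""
    best = {}
    for symbol, volume in rows:
        # Keep the larger of the stored volume and the new one (reduce by key).
        if symbol not in best or volume > best[symbol]:
            best[symbol] = volume
    return best


result = max_volume_by_symbol(records)
for symbol in sorted(result):
    print(symbol, result[symbol])
```

The loop handles one record at a time and only ever keeps one value per symbol, which is the property that lets this kind of aggregation be split across partitions and merged afterwards.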

In this lesson we will explore the types of dependencies between RDDs. More importantly, we will see why dependencies between RDDs are important to understand.

What are dependencies & why are they important?
11:11

In this lesson we will learn how a logical plan in Spark gets converted to a physical plan and finally ends up as jobs, stages and tasks in Spark.

Program to Execution (Part 1)
13:01

In this lesson we will learn how a logical plan in Spark gets converted to a physical plan and finally ends up as jobs, stages and tasks in Spark.

Program to Execution (Part 2)
19:10
2 important concepts in Spark
2 Lectures 22:38

Spark's speciality is in-memory computing. In this lesson we will explore what is kept in memory, what is not, and how Spark manages memory.

Memory management
15:04

Most Spark learners don’t have a good grasp of fault tolerance, and we don’t blame them: fault tolerance is an abstract concept, and you can’t get a handle on it until you see things in action. So in this lesson we come full circle and demonstrate how fault tolerance works in Spark.

Fault tolerance
07:34
A short chapter on Scala
3 Lectures 35:33

In this lesson, we will explore object-oriented programming principles vs. functional programming principles. We will also see the need for a new programming language like Scala.

Introduction to Scala
12:05

In this lesson we will learn about some important concepts and features in Scala. Don't worry, we won't use the HelloWorld sample because it is not cool :-)

First program in Scala (not Hello World)
11:45

In this lesson we will explore functions in Scala. We will talk about anonymous and higher-order functions.

Scala functions
11:43
About the Instructor
Hadoop In Real World
4.5 Average rating
7,227 Reviews
66,645 Students
3 Courses
Expert Big Data Consultants

We are a group of Senior Hadoop Consultants who are passionate about Hadoop and Big Data technologies. We have experience across several key domains from finance and retail to social media and gaming. We have worked with Hadoop clusters ranging from 50 all the way to 800 nodes.

We have been teaching Hadoop for several years now. Check out our FREE and successful Hadoop Starter Kit course at Udemy.