Big Data Processing using Apache Spark
0.0 (0 ratings)
1 student enrolled
Leverage one of the most efficient and widely adopted Big Data processing frameworks - Apache Spark
Created by Packt Publishing
Last updated 6/2017
English
Includes:
  • 1.5 hours on-demand video
  • 1 Supplemental Resource
  • Full lifetime access
  • Access on mobile and TV
  • Certificate of Completion

What Will I Learn?
  • Understand the Spark API and its architecture
  • Know the difference between the RDD and DataFrame APIs
  • Learn to join large amounts of data
  • Start a project using Apache Spark
  • Discover how to write efficient jobs using Apache Spark
  • Test Spark code correctly
  • Leverage Apache Spark to process big data more rapidly
Requirements
  • A basic understanding and functional knowledge of Apache Spark and big data are required.
Description

Every year, the amount of data we need to store and analyze grows. When we aggregate all the data about our users and analyze it to find insights, terabytes of data undergo processing. To handle such volumes, we need a technology that can distribute computations across many machines and make them more efficient. Apache Spark is such a technology: it lets us process big data in a fast and scalable way.

In this course, we will learn how to leverage Apache Spark to process big data quickly. We will cover the basics of the Spark API and its architecture in detail. In the second section of the course, we will learn about data mining and data cleaning, looking at the structure of the input data and how it is loaded. In the third section, we will write actual jobs that analyze data. By the end of the course, you will have a sound understanding of the Spark framework, which will help you write the code you need to process big data.

About the Author

Tomasz Lelek is a software engineer who programs mostly in Java and Scala. He is a fan of microservices architecture and functional programming, and he dedicates considerable time and effort to becoming better every day. He recently dived into Big Data technologies such as Apache Spark and Hadoop. Tomasz is passionate about nearly everything associated with software development. He has spoken at conferences in Poland - Confitura and JDD (Java Developers Day) - as well as at the Krakow Scala User Group, and has conducted a live coding session at the Geecon Conference.

Who is the target audience?
  • If you are a software engineer interested in Big Data processing, then this course is for you.
Curriculum For This Course
13 Lectures
01:24:40
+
Writing Big Data Processing Using Apache Spark
5 Lectures 31:12

This video will give an overview of the entire course.

Preview 01:37

In this video, we will cover the Spark Architecture.

Overview of the Apache Spark and its Architecture
11:29

This video focuses on creating a project.

Start a Project Using Apache Spark, Look at build.sbt
03:32
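The page does not show the project definition this lecture walks through; a minimal build.sbt for a Spark project might look like the following sketch (the project name and version numbers are assumptions, not taken from the course):

```scala
// build.sbt - hypothetical minimal definition for a Spark project.
// Versions are illustrative; the course may use different ones.
name := "spark-processing"

scalaVersion := "2.11.12"

libraryDependencies ++= Seq(
  // "provided" because spark-submit supplies Spark at runtime
  "org.apache.spark" %% "spark-core" % "2.1.1" % "provided"
)
```

The `%%` operator appends the Scala binary version to the artifact name, which is how sbt picks the Spark build matching your Scala version.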

This video shows the installation of spark-submit on our machine.

Creating the Spark Context
07:00
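As a rough sketch of what creating the Spark context involves (app name and master are placeholders, not the course's code; on a real cluster the master is usually supplied by spark-submit):

```scala
import org.apache.spark.{SparkConf, SparkContext}

object Main {
  def main(args: Array[String]): Unit = {
    // "local[*]" runs Spark on all local cores - handy for development.
    val conf = new SparkConf()
      .setAppName("big-data-processing") // hypothetical app name
      .setMaster("local[*]")

    val sc = new SparkContext(conf)
    try {
      println(sc.version) // the context is now ready to create RDDs
    } finally {
      sc.stop() // always release the context's resources
    }
  }
}
```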

In this video, we will look at the Spark API.

Looking at API of Spark
07:34
+
Data Mining and Data Cleaning
4 Lectures 21:30

Thinking about the problem we want to solve.

Preview 04:42

In this video, we will learn about Spark API to load data.

Using RDD API in the Data Mining Process
04:22

In this video, we will cover how to load input data.

Loading Input Data
04:42

In this video, we look at how to tokenize input data.

Cleaning Input Data
07:44
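The cleaning step described here boils down to a tokenizer. A minimal sketch in plain Scala (the function name and cleaning rules are assumptions, not the course's code) - in a Spark job this logic would be applied to an RDD[String] via flatMap:

```scala
// Lower-case the text, replace punctuation with spaces, split on
// whitespace, and drop empty tokens.
def tokenize(line: String): Seq[String] =
  line.toLowerCase
    .replaceAll("[^a-z0-9\\s]", " ") // punctuation becomes whitespace
    .split("\\s+")                   // split into candidate tokens
    .filter(_.nonEmpty)              // discard empty strings
    .toSeq
```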
+
Writing Job Logic
4 Lectures 31:58

This video shows how to implement the word-counting logic.

Preview 07:37

In this video, we will focus on solving problems.

Using RDD API Transformations and Actions to Solve a Problem
10:23
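A common way transformations and actions combine for the word-count problem looks roughly like this (a sketch, assuming an already-created SparkContext `sc` and a placeholder input path - not the course's exact code):

```scala
// Transformations (flatMap, filter, map, reduceByKey) build the plan
// lazily; the action (collect) triggers the actual computation.
val counts = sc.textFile("input.txt")          // RDD[String], one line each
  .flatMap(_.toLowerCase.split("\\W+"))        // transformation: words
  .filter(_.nonEmpty)                          // drop empty tokens
  .map(word => (word, 1))                      // transformation: (word, 1) pairs
  .reduceByKey(_ + _)                          // shuffle + sum per word
  .collect()                                   // action: results to the driver
```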

This video shows how to write a robust Spark test suite.

Testing Spark Job
09:38
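One common pattern for testing Spark jobs - an assumption here, not necessarily the course's exact approach - is to run against a local master, build the RDD from in-memory data, and assert on the collected result:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object WordCountTest {
  def main(args: Array[String]): Unit = {
    // local[2] gives two worker threads, enough to exercise the shuffle.
    val sc = new SparkContext(
      new SparkConf().setAppName("test").setMaster("local[2]"))
    try {
      val counts = sc.parallelize(Seq("a b", "b c")) // in-memory test input
        .flatMap(_.split(" "))
        .map((_, 1))
        .reduceByKey(_ + _)
        .collectAsMap()
      assert(counts("b") == 2) // "b" appears once in each line
    } finally {
      sc.stop()
    }
  }
}
```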

This video shows how to run our Apache Spark job on two text books.

Summary of Data Processing
04:20
About the Instructor
Packt Publishing
3.9 Average rating
8,274 Reviews
59,159 Students
687 Courses
Tech Knowledge in Motion

Packt has been committed to developer learning since 2004. A lot has changed in software since then - but Packt has remained responsive to these changes, continuing to look forward at the trends and tools defining the way we work and live. And how to put them to work.

With an extensive library of content - more than 4,000 books and video courses - Packt's mission is to help developers stay relevant in a rapidly changing world. From new web frameworks and programming languages to cutting-edge data analytics and DevOps, Packt takes software professionals in every field to what's important to them now.

From skills that will help you develop and future-proof your career to immediate solutions to everyday tech challenges, Packt is a go-to resource to make you a better, smarter developer.

Packt Udemy courses continue this tradition, bringing you comprehensive yet concise video courses straight from the experts.