HDPCD:Spark using Scala

Prepare for Hortonworks HDP Certified Developer - Spark using Scala as the programming language

4.3 (349 ratings) · 5,215 students enrolled
Current price: $34.99 Original price: $49.99 Discount: 30% off
30-Day Money-Back Guarantee
This course includes
  • 18 hours on-demand video
  • 2 downloadable resources
  • Full lifetime access
  • Access on mobile and TV
  • Certificate of Completion
What you'll learn
  • Learn Scala, Spark, HDFS, and related tools in preparation for the HDPCD Spark certification
Requirements
  • Basic programming skills
  • Hortonworks Sandbox, a valid account for ITVersity Big Data labs, or any Hadoop cluster where Hadoop, Hive, and Spark are well integrated
  • A 64-bit operating system with the minimum memory required by your chosen environment
Description

This course covers the overall syllabus of the HDPCD:Spark certification:

  • Scala Fundamentals - basic Scala programming using the REPL

  • Getting Started with Spark - different setup options and the setup process

  • Core Spark - transformations and actions to process the data

  • Data Frames and Spark SQL - leverage SQL skills on top of Data Frames created from Hive tables or RDDs

  • One week of complimentary lab access

  • Exercises - a set of self-evaluated exercises to test your skills for certification purposes
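As a taste of the functional style the Scala Fundamentals and Core Spark sections build on, here is a minimal sketch using plain Scala collections (no Spark needed); the order items and revenue figures are invented for illustration:

```scala
// Hypothetical order items as (orderId, revenueInCents) pairs,
// invented for illustration.
val orderItems = Seq((1, 29998), (1, 19999), (2, 12999), (4, 4998))

// Transformation-style step: keep only items worth at least $100.
val bigItems = orderItems.filter { case (_, revenue) => revenue >= 10000 }

// Action-style step: total revenue via map followed by reduce.
val totalRevenue = bigItems.map(_._2).reduce(_ + _)

println(totalRevenue)  // 62996
```

The same filter/map/reduce vocabulary carries over to Spark's RDD API, which is why the course starts with Scala collections.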

After completing the course, you should have the confidence to take the certification exam and pass it.

All the demos are given on our state-of-the-art Big Data cluster. You can avail yourself of one week of complimentary lab access by filling in the form provided as part of the welcome message.

Who this course is for:
  • Anyone who wants to prepare for the HDPCD Spark certification using Scala
Course content
86 lectures, 17:49:34 total length
Scala Fundamentals (18 lectures, 03:13:28)
  • Setting up Scala (10:51)
  • Basic Programming Constructs (18:53)
  • Setup Scala on Windows (07:23)
  • Functions (18:35)
  • Object Oriented Concepts - Class (17:42)
  • Object Oriented Concepts - Object (13:02)
  • Object Oriented Concepts - Case Classes (11:14)
  • Collections - Seq, Set and Map (08:56)
  • Basic Map Reduce Operations (14:08)
  • Setting up Data Sets for Basic I/O Operations (04:23)
  • Basic I/O Operations and using Scala Collections APIs (16:23)
  • Tuples (04:56)
  • Development Cycle - Developing Source code (07:24)
  • Development Cycle - Compile source code to jar using SBT (09:32)
  • Development Cycle - Setup SBT on Windows (02:48)
  • Development Cycle - Compile changes and run jar with arguments (04:21)
  • Development Cycle - Setup IntelliJ with Scala (12:07)
  • Development Cycle - Develop Scala application using SBT in IntelliJ (10:50)
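The case classes, collections, and tuples covered in this section can be sketched in a few lines of plain Scala; the Order type and sample records below are invented for illustration (the course works with a similar retail data set):

```scala
// Hypothetical order record, invented for illustration.
case class Order(id: Int, status: String)

val orders = Seq(Order(1, "CLOSED"), Order(2, "PENDING"), Order(3, "CLOSED"))

// Seq -> Map: count orders by status; groupBy and map work with tuples.
val countByStatus: Map[String, Int] =
  orders.groupBy(_.status).map { case (status, os) => (status, os.size) }

// Set: the distinct statuses seen in the data.
val statuses: Set[String] = orders.map(_.status).toSet

println(countByStatus("CLOSED"))  // 2
```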
Spark Getting Started (11 lectures, 01:33:49)
  • Introduction (02:48)
  • Setup Options (02:10)
  • Setup using tar ball (06:55)
  • Setup using Hortonworks Sandbox (06:16)
  • Using labs.itversity.com (09:00)
  • Using Windows - Putty and WinSCP (10:33)
  • Using Windows - Cygwin (14:46)
  • HDFS - Quick Preview (20:24)
  • YARN - Quick Preview (09:53)
  • Setup Data Sets (08:09)
  • Curriculum (02:55)
Core Spark - Transformations and Actions with advanced features (20 lectures, 04:58:27)
  • Introduction (03:45)
  • Setup Spark on Windows (23:15)
  • Problem Statement and Environment (12:11)
  • Initialize the Job (20:26)
  • Resilient Distributed Data Sets (17:05)
  • Previewing the Data (09:55)
  • Filtering the Data (14:42)
  • Accumulators (17:22)
  • Converting to Key Value Pairs - using map (13:15)
  • Joining Data Sets (14:16)
  • Get Daily Revenue by Product - reduceByKey (20:50)
  • Get Daily Revenue and count by Product - aggregateByKey (09:55)
  • Execution Life Cycle (19:54)
  • Broadcast Variables (15:36)
  • Sorting the Data - By Date in Ascending order and revenue in Descending Order (16:32)
  • Saving data back to HDFS (13:29)
  • Add spark dependencies to sbt (08:01)
  • Develop as Scala based application (25:34)
  • Run locally using spark-submit (09:03)
  • Ship and run it on big data cluster (13:21)
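To give a flavor of the reduceByKey-style aggregation covered in this section, here is a minimal sketch of the same per-key daily-revenue logic on plain Scala collections (Spark's pair RDD API offers an analogous reduceByKey; the (date, productId) records below are invented for illustration):

```scala
// Hypothetical ((date, productId), revenueInCents) records,
// invented for illustration.
val revenueByKey = Seq(
  (("2014-01-01", 10), 29999L),
  (("2014-01-01", 10), 29999L),
  (("2014-01-01", 20), 12999L),
  (("2014-01-02", 10), 29999L)
)

// Plain-collections analogue of rdd.reduceByKey(_ + _):
// group records by key, then reduce each group's values.
val dailyRevenueByProduct: Map[(String, Int), Long] =
  revenueByKey.groupBy(_._1).map { case (key, recs) =>
    (key, recs.map(_._2).reduce(_ + _))
  }

println(dailyRevenueByProduct(("2014-01-01", 10)))  // 59998
```

On an RDD the grouping and reduction happen per partition and across the cluster, but the result has the same shape.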
Spark SQL using Scala (22 lectures, 04:03:10)
  • Introduction to Spark SQL and Objectives (04:40)
  • Different interfaces to run SQL - Hive, Spark SQL (09:25)
  • Create database and tables of text file format - orders and order_items (25:00)
  • Create database and tables of ORC file format - orders and order_items (10:18)
  • Running queries using Scala - spark-shell (03:51)
  • Functions - Getting Started (05:11)
  • Functions - String Manipulation (22:23)
  • Functions - Date Manipulation (13:44)
  • Functions - Aggregations in brief (05:49)
  • Functions - case and nvl (14:10)
  • Row level transformations (08:30)
  • Joining data from multiple tables (18:10)
  • Group by and aggregations (11:41)
  • Sorting the data (07:27)
  • Set operations - union and union all (05:39)
  • Analytics functions - aggregations (15:53)
  • Analytics functions - ranking (08:39)
  • Windowing functions (07:48)
  • Creating Data Frames and register as temp tables (16:00)
  • Write Spark Application - Processing Data using Spark SQL (08:38)
  • Write Spark Application - Saving Data Frame to Hive tables (07:20)
  • Data Frame Operations (12:54)
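A rough sketch of the kind of spark-shell query this section works toward, combining joins, group by, and aggregations. This is an illustrative fragment, not the course's exact solution: it assumes the orders and order_items Hive tables from earlier lectures, with retail_db-style column names (order_id, order_date, order_status, order_item_order_id, order_item_subtotal); adjust names to your schema, and on Spark 2.x use spark.sql in place of sqlContext.sql:

```scala
// Run from spark-shell, where sqlContext (Spark 1.6) is predefined.
// Table and column names are assumptions based on the retail_db data set.
val dailyRevenue = sqlContext.sql("""
  SELECT o.order_date, ROUND(SUM(oi.order_item_subtotal), 2) AS revenue
  FROM orders o
  JOIN order_items oi ON o.order_id = oi.order_item_order_id
  WHERE o.order_status IN ('COMPLETE', 'CLOSED')
  GROUP BY o.order_date
  ORDER BY o.order_date
""")

dailyRevenue.show()
```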
Exercises or Problem Statements with Solutions (13 lectures, 03:42:32)
  • Introduction about exercises (03:26)
  • General Guidelines about Exercises or Problem Statements (05:52)
  • General Guidelines - Initializing the Job (13:32)
  • Getting crime count per type per month - Understanding Data (16:04)
  • Getting crime count per type per month - Implementing the logic - Core API (25:21)
  • Getting crime count per type per month - Implementing the logic - Data Frames (25:08)
  • Getting crime count per type per month - Validating Output (06:29)
  • Get inactive customers - using Core Spark API (leftOuterJoin) (24:20)
  • Get inactive customers - using Data Frames and SQL (20:51)
  • Get top 3 crimes in RESIDENCE - using Core Spark API (21:35)
  • Get top 3 crimes in RESIDENCE - using Data Frame and SQL (20:04)
  • Convert NYSE data from text file format to parquet file format (18:50)
  • Get word count - with custom control arguments, num keys and file format (21:00)
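The shape of the first exercise, crime count per type per month, can be sketched on plain Scala collections; the exercise itself runs against a crime data set on HDFS, and the records below are invented for illustration:

```scala
// Hypothetical crime records as (date, crimeType), invented for illustration.
val crimes = Seq(
  ("2016-01-15", "THEFT"),
  ("2016-01-20", "THEFT"),
  ("2016-01-05", "BATTERY"),
  ("2016-02-11", "THEFT")
)

// Derive a (month, type) key per record, then count per key - the same
// shape as a Core API solution built from map plus a per-key count.
val countPerTypePerMonth: Map[(String, String), Int] =
  crimes
    .map { case (date, crimeType) => (date.substring(0, 7), crimeType) }
    .groupBy(identity)
    .map { case (key, recs) => (key, recs.size) }

println(countPerTypePerMonth(("2016-01", "THEFT")))  // 2
```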