CCA 131 - Cloudera Certified Hadoop and Spark Administrator

Prepare for CCA 131 by setting up cluster from scratch and performing tasks based on scenarios derived from curriculum.
3.9 (279 ratings)
5,809 students enrolled
Last updated 6/2019
English
English [Auto-generated]
This course includes
  • 21.5 hours on-demand video
  • Full lifetime access
  • Access on mobile and TV
  • Certificate of Completion
What you'll learn
  • Prepare for CCA 131 Administrator Exam
  • Provision Cluster from GCP (Google Cloud Platform)
  • Create Virtual Machines using Vagrant
  • Set up Ansible for server automation
  • Set up an 8-node cluster from scratch using CDH
  • Understand the architecture of HDFS, YARN, Spark, Hive, Hue, and many more
Requirements
  • Basic Linux Skills
  • A 64-bit computer with a minimum of 4 GB RAM
  • Operating System - Windows 10, macOS, or a Linux flavor
Description

CCA 131 is a certification exam conducted by the leading Big Data vendor, Cloudera. This online proctored exam is scenario based, which means it is very hands-on: you are given a multi-node cluster and need to complete the assigned tasks.

To prepare for the certification, one needs hands-on exposure to building and managing clusters. However, with limited infrastructure it is difficult to practice on a laptop. We understand that problem and built the course around Google Cloud Platform, where you can get up to $300 of credit (while the offer lasts) and use it to get hands-on exposure to building and managing Big Data clusters using CDH.

Required Skills

Install - Demonstrate an understanding of the installation process for Cloudera Manager, CDH, and the ecosystem projects.

  • Set up a local CDH repository

  • Perform OS-level configuration for Hadoop installation

  • Install Cloudera Manager server and agents

  • Install CDH using Cloudera Manager

  • Add a new node to an existing cluster

  • Add a service using Cloudera Manager
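
As an illustration of the local-repository step above, here is a minimal, hedged sketch of the kind of helper you might run on each node to point yum at a local Cloudera Manager mirror. The hostname repo.example.com, the mirror path, and the repository name are placeholders, not values from the course; it assumes you have already mirrored the RPMs (and run createrepo) behind an httpd server.

```python
#!/usr/bin/env python3
"""Hypothetical sketch: point a node at a local Cloudera Manager yum mirror.

Assumes the CM RPMs are already mirrored to http://repo.example.com/cloudera-cm/
and served over httpd, with repository metadata generated by createrepo.
Hostnames, paths, and the repo id below are placeholders.
"""

REPO_FILE = "/etc/yum.repos.d/cloudera-manager.repo"

REPO_CONTENT = """\
[cloudera-manager]
name=Cloudera Manager (local mirror)
baseurl=http://repo.example.com/cloudera-cm/
gpgcheck=0
enabled=1
"""

def main() -> None:
    # Write the repo definition; run this (e.g. via Ansible) on every cluster node
    # so `yum install cloudera-manager-server` / `...-agent` resolves locally.
    with open(REPO_FILE, "w") as handle:
        handle.write(REPO_CONTENT)
    print(f"Wrote {REPO_FILE}")

if __name__ == "__main__":
    main()
```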

Configure - Perform basic and advanced configuration needed to effectively administer a Hadoop cluster

  • Configure a service using Cloudera Manager

  • Create an HDFS user's home directory

  • Configure NameNode HA

  • Configure ResourceManager HA

  • Configure proxy for Hiveserver2/Impala
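
As a small example of the "Create an HDFS user's home directory" task above, here is a hedged sketch using the standard hdfs dfs commands from Python. It assumes the hdfs CLI is on the PATH and that you run it with HDFS superuser rights (for example via sudo -u hdfs); the username alice is a placeholder.

```python
#!/usr/bin/env python3
"""Illustrative sketch: create an HDFS home directory for a new user.

Assumes the `hdfs` CLI is on PATH and this runs with HDFS superuser rights.
The username is a placeholder.
"""
import subprocess

def create_home(user: str) -> None:
    home = f"/user/{user}"
    # -p makes this idempotent if the directory already exists
    subprocess.run(["hdfs", "dfs", "-mkdir", "-p", home], check=True)
    # Hand ownership to the user so their jobs can write into it
    subprocess.run(["hdfs", "dfs", "-chown", f"{user}:{user}", home], check=True)

if __name__ == "__main__":
    create_home("alice")
```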

Manage - Maintain and modify the cluster to support day-to-day operations in the enterprise

  • Rebalance the cluster

  • Set up alerting for excessive disk fill

  • Define and install a rack topology script

  • Install new type of I/O compression library in cluster

  • Revise YARN resource assignment based on user feedback

  • Commission/decommission a node
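
The rack topology item above refers to a script HDFS calls with one or more node IPs or hostnames and from which it expects one rack path per argument on stdout (configured via net.topology.script.file.name, or through Cloudera Manager's rack settings). Below is a minimal sketch; the IP-to-rack mapping is made up for illustration, and a real script would usually read it from a file kept in sync with your data center layout.

```python
#!/usr/bin/env python3
"""Minimal sketch of a rack topology script.

HDFS invokes the script with node IPs/hostnames as arguments and expects
one rack path per argument on stdout. The mapping below is illustrative only.
"""
import sys

RACKS = {
    "192.168.1.11": "/rack1",
    "192.168.1.12": "/rack1",
    "192.168.1.21": "/rack2",
}

DEFAULT_RACK = "/default-rack"

for node in sys.argv[1:]:
    print(RACKS.get(node, DEFAULT_RACK))
```

Once the script is installed on the NameNode host, the cluster picks it up after the relevant HDFS configuration is updated and the service restarted.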

Secure - Enable relevant services and configure the cluster to meet goals defined by security policy; demonstrate knowledge of basic security practices

  • Configure HDFS ACLs

  • Install and configure Sentry

  • Configure Hue user authorization and authentication

  • Enable/configure log and query redaction

  • Create encrypted zones in HDFS
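
To illustrate the HDFS ACL items above, here is a hedged sketch that grants a named user read access and then prints the effective ACL. It assumes ACLs are enabled on the cluster (dfs.namenode.acls.enabled=true); the user and path are placeholders.

```python
#!/usr/bin/env python3
"""Illustrative sketch: grant a colleague read access with an HDFS ACL.

Assumes dfs.namenode.acls.enabled=true and that the `hdfs` CLI is available;
user and path names are placeholders.
"""
import subprocess

PATH = "/data/projects/sales"

# Add a named-user ACL entry giving read/execute access
subprocess.run(["hdfs", "dfs", "-setfacl", "-m", "user:bob:r-x", PATH], check=True)

# Verify the effective ACL (a '+' flag also shows up in `hdfs dfs -ls` output)
subprocess.run(["hdfs", "dfs", "-getfacl", PATH], check=True)
```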

Test - Benchmark the cluster operational metrics, test system configuration for operation and efficiency

  • Execute file system commands via HTTPFS

  • Efficiently copy data within a cluster/between clusters

  • Create/restore a snapshot of an HDFS directory

  • Get/set ACLs for a file or directory structure

  • Benchmark the cluster (I/O, CPU, network)
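
For the "Execute file system commands via HTTPFS" item above, here is a small sketch that lists an HDFS directory through the HttpFS REST endpoint (WebHDFS API). It assumes an HttpFS role running on its default port 14000 with simple (pseudo) authentication; the hostname, directory, and user are placeholders.

```python
#!/usr/bin/env python3
"""Illustrative sketch: list an HDFS directory through HttpFS (WebHDFS REST API).

Assumes an HttpFS role on port 14000 with simple (pseudo) authentication;
the hostname, path, and user below are placeholders.
"""
import requests

HTTPFS = "http://httpfs-host.example.com:14000/webhdfs/v1"

resp = requests.get(
    f"{HTTPFS}/user/alice",
    params={"op": "LISTSTATUS", "user.name": "alice"},
    timeout=30,
)
resp.raise_for_status()

for entry in resp.json()["FileStatuses"]["FileStatus"]:
    print(entry["type"], entry["pathSuffix"])
```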

Troubleshoot - Demonstrate ability to find the root cause of a problem, optimize inefficient execution, and resolve resource contention scenarios

  • Resolve errors/warnings in Cloudera Manager

  • Resolve performance problems/errors in cluster operation

  • Determine reason for application failure

  • Configure the Fair Scheduler to resolve application delays
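
As a starting point for "Determine reason for application failure" above, here is a hedged sketch that pulls failed applications from the YARN ResourceManager REST API and prints their diagnostics. It assumes the ResourceManager web endpoint on its default port 8088; the hostname is a placeholder, and on an HA cluster you would query the active ResourceManager.

```python
#!/usr/bin/env python3
"""Illustrative sketch: list failed YARN applications to start root-cause analysis.

Assumes the ResourceManager REST endpoint on its default port 8088;
the hostname is a placeholder.
"""
import requests

RM = "http://rm-host.example.com:8088"

resp = requests.get(f"{RM}/ws/v1/cluster/apps", params={"states": "FAILED"}, timeout=30)
resp.raise_for_status()

apps = resp.json().get("apps") or {}
for app in apps.get("app", []):
    # `diagnostics` usually carries the first hint of the root cause;
    # the container logs (yarn logs -applicationId <id>) carry the rest.
    print(app["id"], app["finalStatus"], app.get("diagnostics", "")[:120])
```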

Our Approach

  • You will start by creating a Cloudera QuickStart VM (in case you have a laptop with 16 GB RAM and a quad-core CPU). This will help you get comfortable with Cloudera Manager.

  • You will be able to sign up for GCP and avail credit of up to $300 while the offer lasts. Credits are valid for up to a year.

  • You will then get a brief overview of GCP and provision 7 to 8 virtual machines using templates. You will also attach external disks to configure for HDFS later.

  • Once the servers are provisioned, you will set up Ansible for server automation.

  • You will set up a local repository for Cloudera Manager and the Cloudera Distribution of Hadoop (CDH) using packages.

  • You will then set up Cloudera Manager with a custom database, and then install the Cloudera Distribution of Hadoop using the wizard that comes as part of Cloudera Manager.

  • As part of setting up the Cloudera Distribution of Hadoop, you will set up HDFS, learn HDFS commands, set up YARN, configure HDFS and YARN High Availability, understand schedulers, set up Spark, transition to parcels, set up Hive and Impala, and set up HBase, Kafka, and more.

  • Once all the services are configured, we will revise for the exam by mapping what we have built to the exam's required skills (a small health-check sketch using the Cloudera Manager REST API follows this list).
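
Once the cluster from this section is running, a quick way to revise service by service is to poll Cloudera Manager's REST API. The sketch below is a hedged example only: the host, credentials, cluster name ("Cluster 1"), and API version (v19) are assumptions, not values from the course; check /api/version on your own Cloudera Manager instance first.

```python
#!/usr/bin/env python3
"""Hedged sketch: check service health through the Cloudera Manager REST API.

Assumes CM listens on its default port 7180 with basic authentication.
Host, credentials, cluster name, and API version are placeholders.
"""
import requests

CM = "http://cm-host.example.com:7180/api/v19"
AUTH = ("admin", "admin")  # default credentials; change them on a real cluster

resp = requests.get(f"{CM}/clusters/Cluster%201/services", auth=AUTH, timeout=30)
resp.raise_for_status()

# Each service entry reports its state and an overall health summary
for svc in resp.json()["items"]:
    print(svc["name"], svc["serviceState"], svc["healthSummary"])
```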

Who this course is for:
  • System Administrators who want to understand Big Data eco system and setup clusters
  • Experienced Big Data Administrators who want to prepare for the certification exam
  • Entry-level professionals who want to learn the basics and set up Big Data clusters
Course content
169 lectures 21:15:09
+ Introduction - CCA 131 Cloudera Certified Hadoop and Spark Administrator
5 lectures 21:51
Understanding required skills for the certification
04:42
Understanding the environment provided while taking the exam
03:03
Signing up for the exam
04:28
+ Getting Started - Provision instances from Google Cloud
10 lectures 01:16:48
Introduction
01:46
Setup Ubuntu using Windows Subsystem
05:28
Create template for Big Data Server
04:47
Provision Servers for Big Data Cluster
09:52
Review Concepts
11:43
Setting up gcloud
12:58
Cluster Topology
11:30
+ Getting Started - Setup local yum repository server – CDH
6 lectures 51:39
Introduction
08:29
Overview of yum
07:44
Setup httpd service
10:32
Setup local yum repository - Cloudera Manager
09:48
Setup local yum repository - Cloudera Distribution of Hadoop (CDH)
04:49
Copy repo files
10:17
+ Install CM and CDH - Setup CM, Install CDH and Setup Cloudera Management Service
8 lectures 47:52
Introduction
01:44
Setup Pre-requisites
05:41
Install Cloudera Manager
06:06
Licensing and Installation Options
05:16
Install CM and CDH on all nodes
09:05
CM Agents and CM Server
04:44
Setup Cloudera Management Service
06:03
Cloudera Management Service – Components
09:13
+ Install CM and CDH - Configure Zookeeper
6 lectures 48:33
Introduction
02:12
Learning Process
05:25
Setup Zookeeper
10:06
Review important properties
06:49
Zookeeper Concepts
13:45
Important Zookeeper Commands
10:16
+ Install CM and CDH - Configure HDFS and Understand Concepts
12 lectures 02:20:05
Introduction
06:26
Setup HDFS
18:40
Copy Data into HDFS
08:45
Copy Data into HDFS Contd
10:42
Components of HDFS
10:57
Components of HDFS Contd
10:03
Configuration files and Important Properties
19:17
Review Web UIs and log files
16:02
Checkpointing
12:11
Checkpointing Contd
09:24
Namenode Recovery Process
07:15
Configure Rack Awareness
10:23
+ Install CM and CDH - Important HDFS Commands
16 lectures 02:06:03
Introduction
02:17
Getting list of commands and help
14:42
Creating Directories and Changing Ownership
08:38
Managing Files and File Permissions - Deleting Files from HDFS
06:45
Managing Files and File Permissions - Copying Files Local File System and HDFS
13:45
Managing Files and File Permissions - Copying Files within HDFS
06:26
Managing Files and File Permissions - Previewing Data in HDFS
05:00
Managing Files and File Permissions - Changing File Permissions
08:43
Controlling Access using ACLs - Enable ACLs On Cluster
02:59
Controlling Access using ACLs - ACLs On Files
05:05
Controlling Access using ACLs - ACLs On Directories
10:08
Controlling Access using ACLs - Removing ACLs
04:03
Overriding Properties
11:11
HDFS usage commands and getting metadata
11:54
Creating Snapshots
05:21
Using CLI for administration
09:06
+ Install CM and CDH - Configure YARN + MRv2 and Understand Concepts
12 lectures 01:57:02
Introduction
03:38
Setup YARN + MR2
09:28
Run Simple Map Reduce Job
07:15
Components of YARN and MR2
05:48
Configuration files and Important Properties - Overview
04:10
Configuration files and Important Properties - Review YARN Properties
13:18
Configuration files and Important Properties - Review Map Reduce Properties
08:18
Configuration files and Important Properties - Running Jobs
13:05
Review Web UIs and log files
13:20
YARN and MR2 CLI
14:54
YARN Application Life Cycle
05:08
Map Reduce Job Execution Life Cycle
18:40
+ Install CM and CDH - Configuring HDFS and YARN HA
10 lectures 01:00:59
Introduction
01:37
High Availability – Overview
04:14
Configure HDFS Namenode HA
12:16
Review Properties – HDFS Namenode HA
05:41
HDFS Namenode HA – Quick Recap of HDFS typical Configuration
07:52
HDFS Namenode HA – Components
06:35
HDFS Namenode HA – Automatic failover
05:37
Configure YARN Resource Manager HA
03:22
Review – YARN Resource Manager HA
06:10
High Availability – Implications
07:35