CCA131 Cloudera CDH 5 & 6 Hadoop Administrator Master Course

Name: CCA131 Cloudera CDH 5 & 6 Hadoop Administrator Master Course
Rating: 4.5 (1151 reviews)

Master Cloudera CDH administration. Spin up a cluster in AWS, break it, fix it, play with it and learn. Real-time demos on CCA131 topics.

Highest Rated

Created by Muthukumar Subramanian

Last updated 2/2021

English

English, Spanish [Auto]

What you'll learn

Become a successful Hadoop Cloudera administrator
Start working in a Hadoop Cloudera production environment
Install, Configure, Manage, Secure, Test and Troubleshoot Hadoop Cloudera Cluster
Manage and secure production grade Hadoop Cloudera Cluster

Course content

32 sections • 120 lectures • 15h 13m total length

Introduction to iSMAC • 4:59
iSMAC (IoT, Social, Mobility, Analytics and Cloud) will be the five pillars of the next wave of business innovation. This lecture covers how these five technologies will drive the transformation of e-business into digital business, and where Big Data and Cloudera fit in.

Introduction to iSMAC
Place for Big data in iSMAC world
Answering Where, What and How?
Types of analytics - Descriptive, diagnostic, Predictive and Prescriptive
Spectrum of analytics - What happened? What is happening? What will happen? What should happen?
Lambda Architecture Overview and Details • 5:38
Almost every Big Data project falls under the Lambda Architecture. The Lambda Architecture has three layers: the batch layer, the serving layer and the speed layer. The Hadoop ecosystem offers many components for these layers.

Introduction to Lambda Architecture
Three layers of Lambda Architecture
Why is the Lambda Architecture important in the Big Data ecosystem?
Various components serving the different layers
Use cases where the Lambda Architecture is used
Understanding Machine Learning and its Application in the Big Data World • 4:28
Machine learning and deep learning play a significant role in the current Big Data space. You will get answers to the following questions.
How does machine learning answer the question "what?"
What is a model?
How is a model generated from training data?
How is a trained model used or applied?
Why is machine learning / deep learning important for processing unstructured data?

AWS - Introduction To Amazon Web Services (AWS) • 1:13
Amazon Web Services (AWS) provides access to technology resources on a pay-as-you-go model. You pay only for the resources you use. It offers a wide spectrum of resources, including compute, storage, database, network, analytics, security, machine learning and many more.

For this course we will look at what AWS is and how it helps to set up a Cloudera cluster similar to a production environment
AWS being the market leader in cloud, learning Cloudera administration with AWS will be an added advantage
AWS - Signup and Billing (Important - Pricing related) • 4:36
How to sign up for a new account with Amazon Web Services (AWS), how services are charged and how to track the billing.

Signup new account with Amazon Web Services (AWS)
AWS Free Tier
Sample billing calculation for EC2 Instance and EBS Volume
Verify billing information at billing page
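The sample billing calculation for an EC2 instance and an EBS volume can be sketched roughly as follows. This is only an illustration: the hourly and per-GB rates below are made-up placeholders, not AWS prices, and the function names are hypothetical.

```python
# Hypothetical rates for illustration only; check the AWS pricing page for real values.
EC2_RATE_PER_HOUR = 0.10      # assumed USD per instance-hour
EBS_RATE_PER_GB_MONTH = 0.10  # assumed USD per GB-month of provisioned storage
HOURS_PER_MONTH = 730         # common approximation: 8760 hours / 12 months


def ec2_cost(instances: int, hours: float, rate: float = EC2_RATE_PER_HOUR) -> float:
    """Compute charge: instances x hours run x hourly rate."""
    return instances * hours * rate


def ebs_cost(size_gb: float, hours: float, rate: float = EBS_RATE_PER_GB_MONTH) -> float:
    """Storage charge: provisioned GB, prorated by the fraction of the month it existed."""
    return size_gb * rate * (hours / HOURS_PER_MONTH)


if __name__ == "__main__":
    # Four instances, each with a 50 GB volume, kept for 20 hours of practice.
    compute = ec2_cost(instances=4, hours=20)
    storage = 4 * ebs_cost(size_gb=50, hours=20)
    print(f"compute ${compute:.2f} + storage ${storage:.2f} = ${compute + storage:.2f}")
```

Note that an EBS volume keeps accruing storage charges for as long as it exists, even while the instance it is attached to is stopped.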
AWS - Zones and Regions • 2:57
AWS resources are located in multiple physical locations across the globe. A region is a geographic area, and each region contains multiple zones. We will learn about zones and regions.

Advantages of regions and zones
How to achieve high availability with regions and zones
Why this is important for a Big Data cluster
AWS - Launch EC2 Instance • 7:57
AWS Elastic Compute Cloud (EC2) provides secure, resizable compute capacity in the cloud. It gives the user complete control over choosing, scaling up and scaling down the infrastructure as needed. Launching and terminating EC2 instances involves a few mandatory steps.

Objective

Identify the instance type
Selecting the network details
Attaching EBS volume
Setting up security configurations, opening the required IPs and ports
Tagging the system
Attaching a new or existing key to log in to the system
Monitoring the health of the system
Terminating the instance
Verifying whether dependent services were terminated properly
Which components are chargeable
AWS - Simple Storage Service (S3) • 3:04
Learn about Simple Storage Service (S3), what a bucket is and how it can be used in the Big Data space.

Introduction to Simple Storage Service (S3)
Pricing in S3
How to create/delete Bucket
How to add/delete files
AWS - Login to EC2 Instance using Putty • 4:19
Learn about key-based login using the Putty client. As administrators we may need to log in to each machine to do OS-level configuration. This can be done by logging in to the system using the Putty client.

Private/public key pair
Putty client and SSH protocol
PuttyGen to convert a key from PEM (Privacy Enhanced Mail) format into a ppk file
Log in as root to the AWS EC2 instance using the private key
AWS - EC2 AMI (Amazon Machine Image) • 16:09
An Amazon Machine Image (AMI) makes installing any software or framework quick. An entire Amazon Elastic Compute Cloud (EC2) instance can be stored as an AMI.
Objective:
How to create an Amazon Machine Image (AMI)
How an Elastic Block Storage (EBS) volume gets stored as a snapshot
How to create more instances from an AMI
How the security group gets applied to the instance
Disable the Linux firewall and SELinux in the AMI
Covered with CentOS 6
AWS - EC2 Spot Instances • 4:39
AWS provides four types of instance purchasing options.

On-Demand
Spot Instances
Reserved Instances
Dedicated Hosts
Of these, On-Demand is production grade and gives very high availability.
Spot Instances are spare AWS compute capacity made available at a large discount compared to On-Demand. The price is decided by a bidding process, and the highest bidder gets the instance. Spot Instances can be interrupted by EC2 if the bid price falls below the market price. Apart from this, all other features, such as fault tolerance and reliability, are the same as for On-Demand Instances.

Since we are going to learn and not run a production cluster, we can use Spot Instances and get up to roughly 10X the capacity for the same budget. That is a saving of 80 to 90% compared with On-Demand.
Objective
How to find the current pricing
Choosing the right instance type for Cloudera
Checking the memory/processor/network capacity of instances
Creating a bid for Spot Instances
Features available in Spot Instances
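The relationship between the quoted saving and the extra capacity is simple arithmetic, sketched below. The function name is hypothetical and the discounts are the 80% and 90% figures mentioned above, not live spot prices.

```python
def capacity_multiplier(discount: float) -> float:
    """How many times more instance-hours the same budget buys at a given discount.

    discount is the fraction saved versus On-Demand (0.9 means 90% cheaper).
    """
    if not 0 <= discount < 1:
        raise ValueError("discount must be in [0, 1)")
    return 1 / (1 - discount)


if __name__ == "__main__":
    for discount in (0.8, 0.9):
        print(f"{discount:.0%} saving -> {capacity_multiplier(discount):.1f}x capacity")
```

An 80% saving stretches the budget about five times, and a 90% saving about ten times, which is where the "10X" upper bound comes from.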
AWS - Relational Database Service (RDS) • 8:32
Relational Database Service (RDS) is a pay-as-you-go database service. Users can start a MySQL, Aurora, PostgreSQL, MariaDB, Oracle or Microsoft SQL Server database of any capacity and run it for any duration with high availability.

Introduction to RDS
Start MySQL RDS
Connect to MySQL RDS from EC2 Instance
Connect to MySQL RDS from HeidiSQL client
Working with RDS security configuration
Introduction to RDS snapshot and High Availability
Delete RDS
Pricing calculation for RDS

HDFS - Hadoop Distributed File System • 4:17
HDFS (Hadoop Distributed File System) provides highly scalable and highly available storage within Hadoop. HDFS uses a master-worker architecture, which provides horizontal scalability.

Introduction to HDFS
How is the master-worker architecture implemented in HDFS?
How is High Availability (HA) provided with master and worker?
How is scalability provided with master and worker?
How does data get distributed across HDFS?
YARN - Yet Another Resource Negotiator • 2:48
YARN (Yet Another Resource Negotiator) provides a distributed processing environment with a master-worker architecture within Hadoop.

How is the master-worker architecture implemented within YARN?
How is High Availability implemented with master and worker?
How is high scalability implemented with master and worker?
How resource manager works?
Introduction to scheduler and its purpose.
MySQL Database setup and Installation • 14:36
Cloudera services and Cloudera management roles use a database to store metadata. A few services, such as Sqoop, need a SQL database to demonstrate their functionality. We will be using MySQL or MariaDB for this.

Installing MySQL database
Prepare OS firewall to use MySQL database
When to use MariaDB
Securing MySQL/MariaDB Installation
Need for MySQL Driver and its installation procedure
Creating sample database/tables/users
Using UI client like HeidiSQL

Prepare AWS AMI for Cloudera Installation • 14:40
An Amazon Machine Image (AMI) is a convenient way to use a customized image of any system. Starting any number of EC2 instances with a predefined configuration is very easy once we have an AMI that fits our needs.

Select a base AMI with CentOS
Why CentOS for learning?
Choose an instance type with the right memory and processing capacity
Start and log in to the EC2 instance with Putty
Configure the Linux firewall, volume, swappiness and iptables, and prepare the instance to install Cloudera Manager
Prepare AMI with predefined configurations for Cloudera installation
Install - Cloudera Data Hadoop (CDH) Quick Install • 17:42
CDH (Cloudera's Distribution including Apache Hadoop) provides a quick-install option where Cloudera Manager and the cluster can be installed with minimal configuration.

Setting up AWS EC2 for Cloudera Manager
Installing Cloudera Manager
Adding host to be managed
Selecting the required parcels
Installing CDH
Installation Notes • 0:05
Cloudera Installation Phases and Paths • 2:25
Learn about the steps involved in installing CDH. There are six steps in total, and they can be carried out along three different paths. Each path has its own advantages and disadvantages.

Three different paths of installation
Overview on Path A, B and C
Differences in these three paths
Six steps in each path of installation
Cloudera Manager Introduction and Overview • 3:39
Cloudera Manager Server is the admin console for Cloudera Administrator.

Overview on Cloudera Manager
Exploring the options available in Cloudera Manager
Overview on Agents
Overview on Cloudera Management Service
Overview on Cloudera Manager Database
Overview on Cloudera Repository and Parcels
Managing Services
Managing multiple clusters and their services
Cloudera Parcels • 1:42
Cloudera has its own binary distribution format, called parcels, which contains the required programs. Parcels are very similar to Linux packages.

Introduction to Cloudera Parcels
Difference between Parcels and Packages
Advantages of using Parcels over Packages in Cloudera
Cloudera Repository Setup with Apache httpd • 9:52
Set up the Cloudera RPM repository locally on an httpd server. This helps the organisation pin the required version and make the installation available locally.

Installing httpd server locally in AWS EC2 instance
Configure firewall for httpd
Making the Cloudera Manager repository available locally
Making CDH Parcels available locally
Verifying the repository and parcel accessibility
Cloudera Installation Path B with local repository - AMI Prepare • 17:51
Cloudera can be installed along three different paths. Here we will learn about Path B, where every installation step is done manually.

Verify repository for Cloudera installation
Setup Database/users/tables for Cloudera manager
Configure the firewall, Transparent Huge Pages, defragmentation, caching and NTP to work with the Cloudera setup
Creating AMI for future use
Cloudera Installation Path B - Manager Installation and Configuration • 12:23
Cloudera Manager is the heart of Cloudera and can manage any number of clusters and the services within them. As part of the Path B installation, Cloudera Manager is configured along with the Management Service agent and the JDK.

Installing Cloudera Manager
Installing Management Service Agents
Configuring JDK
Prepare and populate MySQL database for Cloudera Manager
Updating configuration file to use the correct port and JDK
Starting the server and agent
Verifying the installation with web UI
Introduction to the various versions of CDH installation
Cloudera Installation Path B - Agent and CDH Installation and Configuration • 17:50
CDH can be installed on multiple hosts by selecting various services and their corresponding roles. As part of the Path B CDH installation, the server, agents, database and roles will be configured manually.

Make AWS security configurations to facilitate Cloudera Manager to manage hosts
Installing Agents in individual hosts
Installing CDH parcels with JDK
Configuring the agent to send heartbeat signals to Cloudera Manager
Starting Agent in all hosts
Configuring auto start of agents on restart of hosts
Installing CDH Parcels in all hosts
Create/Configure required database for report manager, Hue, Hive and Oozie
Installing core Hadoop
Selecting roles for various services
Verifying installation
Add Cluster, Add Service and Delete Cluster life cycle • 10:09
With Cloudera Manager we can add any number of clusters, and add and manage multiple services within each cluster.

Adding a new cluster
Introduction to roles and service in a cluster
Adding various services to a cluster
Adding various roles to a service
Removing service from a cluster
Removing a cluster from Cloudera Manager

HDFS Shell Commands • 12:17
HDFS (Hadoop Distributed File System) works on the Write Once Read Many (WORM) model. We can use the HDFS client to add, read and delete files and folders. We can also perform additional operations such as setting permissions and changing ownership.

Add/Move/Delete files and folders
Changing replication factor
Finding a file's blocks, their locations and rack details
Getting a file from HDFS to local file system
File report on space, quota utilization
Uses of -touchz
Testing whether the entity is a file or folder
Capturing the return value from shell commands
Changing permission and ownership
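A few of the operations listed above map onto `hdfs dfs` sub-commands as sketched below. The `/user/demo` path, the `demo` user and the two helper functions are hypothetical and used only for illustration; the commands are built as argument lists so the sketch can be read without a cluster, and they would only do something on a host with an HDFS client installed.

```python
import subprocess
from typing import List


def hdfs_dfs(*args: str) -> List[str]:
    """Build the argument list for an `hdfs dfs` sub-command."""
    return ["hdfs", "dfs", *args]


def return_code(cmd: List[str]) -> int:
    """Run a command and return its exit status (0 means success)."""
    return subprocess.run(cmd).returncode


# /user/demo is a hypothetical path used only for illustration.
COMMANDS = [
    hdfs_dfs("-mkdir", "-p", "/user/demo/in"),
    hdfs_dfs("-touchz", "/user/demo/in/empty.txt"),   # create a zero-length file
    hdfs_dfs("-setrep", "-w", "2", "/user/demo/in"),  # change the replication factor
    hdfs_dfs("-chmod", "750", "/user/demo/in"),
    hdfs_dfs("-chown", "demo:demo", "/user/demo/in"),
    hdfs_dfs("-get", "/user/demo/in/empty.txt", "/tmp/empty.txt"),
]

# `-test -d` exits with status 0 when the path is a directory.
IS_DIRECTORY = hdfs_dfs("-test", "-d", "/user/demo/in")

if __name__ == "__main__":
    for cmd in COMMANDS:
        print(" ".join(cmd))
    # On a host with an HDFS client, the exit status answers the test:
    # is_dir = return_code(IS_DIRECTORY) == 0
```

Capturing the exit status this way is what lets a script branch on whether an entity is a file or a folder.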
HDFS Trash • 8:27
Hadoop supports moving deleted files to Trash. Trash acts like a recycle bin.

Understand Trash concept in Hadoop
Details on trash interval and trash interval checkpoint
Trash folder location
Trash checkpoint
Trash expunge and skiptrash details

HDFS High Availability (HA) - Concepts • 3:00
High Availability (HA) is provided in HDFS by increasing the replication factor of blocks on the Datanodes and by introducing a Standby Namenode for the master. The fail-over controller and Zookeeper provide automatic fail-over.

Concepts of HA in Datanode
Concept of HA in Namenode
Introduction to Standby Namenode
Functionality of Fail-over controller and Zookeeper
Introduction and need for Journal nodes
HDFS High Availability (HA) - Setup • 3:47
Enabling HDFS HA in Cloudera is as simple as adding the required roles and enabling HA.

Adding nameservice to existing HDFS
Selecting systems for journal nodes
Setting the standby Namenode and fail-over controller
Configuring zookeeper to coordinate the fail-over
HDFS High Availability (HA) - Test • 5:06
The reliability of High Availability can be tested by manually making a Namenode fail and checking whether the fail-over controller is triggered and promotes the Standby Namenode to active Namenode.

Stopping active Namenode
Verifying automatic fail-over controller detection
Purpose of zookeeper in automatic fail-over
HDFS High Availability (HA) - Remove • 4:21
HDFS High Availability is a must for a production environment, but in a test scenario we can disable it to free up resources. We will see how to disable HDFS HA.

HDFS Balancer • 5:37
HDFS stores the actual data as blocks on Datanodes. Sometimes, because nodes are commissioned into or decommissioned from the cluster, the distribution of blocks across Datanodes is not even. This creates uneven processing load and I/O during reads and processing. The HDFS balancer helps to spread the blocks evenly across the Datanodes.

Need for HDFS balancer
Details on HDFS balancer role
Adding HDFS balancer Role
Configuration and setup of threshold
Making balancer to run
Verify balanced blocks
Removing balancer role
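The effect of the balancer threshold can be sketched as below. This is a simplified illustration, not the balancer's actual implementation: it assumes a node counts as balanced when its utilization is within `threshold` percentage points of the cluster-wide utilization, and the node names and sizes are made up.

```python
from typing import Dict, Tuple


def classify(nodes: Dict[str, Tuple[float, float]], threshold: float = 10.0) -> Dict[str, str]:
    """Label each Datanode relative to the cluster average.

    nodes maps a name to (used_bytes, capacity_bytes); threshold is in percentage points.
    """
    total_used = sum(used for used, _ in nodes.values())
    total_cap = sum(cap for _, cap in nodes.values())
    cluster_pct = 100.0 * total_used / total_cap
    result = {}
    for name, (used, cap) in nodes.items():
        node_pct = 100.0 * used / cap
        if node_pct > cluster_pct + threshold:
            result[name] = "over-utilized"
        elif node_pct < cluster_pct - threshold:
            result[name] = "under-utilized"
        else:
            result[name] = "balanced"
    return result


if __name__ == "__main__":
    # Hypothetical three-node cluster, sizes in arbitrary units.
    print(classify({"dn1": (80, 100), "dn2": (50, 100), "dn3": (20, 100)}))
```

A smaller threshold makes the balancer move more blocks before it considers the cluster balanced; a larger one lets it finish sooner with a rougher distribution.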
HDFS Maintenance Mode • 5:06
To apply a software or hardware fix, we have to take a few systems out for maintenance. With Cloudera we can put a host, an entire cluster, a service or a role into maintenance mode. When a host, service or role is in maintenance mode, its alerts are suppressed.

Need for maintenance mode
Putting the cluster into maintenance mode and taking it out again
Taking Datanode to maintenance mode
Impact of maintenance mode with replication factor
Exiting from maintenance mode
HDFS Quota Management • 12:51
Within HDFS, the administrator can limit the number of files and the amount of space used within any folder. This helps the administrator control how much Namenode and Datanode resource users consume.

Understanding Name and Space Quota
Setting Name and Space Quota
Details on How Space Quota works.
Internal working of space allocation while adding file and impacts with Space Quota
Removing Name and Space Quota
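How a space quota is charged can be sketched with two small functions. This illustrates the documented HDFS behavior that every replica counts against the quota and that a block allocation fails if a full block would not fit; the function names and the sizes in the example are hypothetical.

```python
MB = 1024 * 1024


def space_consumed(file_bytes: int, replication: int) -> int:
    """Bytes charged against a space quota: every replica counts."""
    return file_bytes * replication


def can_allocate_block(remaining_quota: int, block_size: int, replication: int) -> bool:
    """A new block is allocated only if a full block times replication still fits."""
    return remaining_quota >= block_size * replication


if __name__ == "__main__":
    # With 128 MB blocks and replication 3, a 300 MB quota cannot take even a 1 MB file,
    # because the allocation check reserves 3 x 128 MB = 384 MB up front.
    print(can_allocate_block(300 * MB, 128 * MB, 3))
    print(space_consumed(1 * MB, 3) // MB, "MB charged once the file is closed")
```

This is why a space quota smaller than block size times replication factor effectively blocks all writes to the folder.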
HDFS Canary Test • 2:48
The HDFS canary test is a regular health check that executes a few client operations to monitor the health of HDFS.

Purpose of Canary test
Which operations are executed as part of the canary test
Implications on allowing Canary test to run
Disabling Canary test and reason for disabling it
HDFS Rack Awareness • 7:21
HDFS can be made aware of how nodes are arranged across availability zones or racks, and can place blocks so that the availability of blocks and files is increased.

Understanding Rack awareness
Achieving rack awareness in Cloud
Verifying Rack awareness
Configuring rack awareness script
Implementing rack awareness
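A rack awareness script can be sketched as below. In plain Hadoop, a topology script is referenced through the `net.topology.script.file.name` property, is called with one or more host addresses as arguments, and prints one rack path per address. The IP addresses and the idea of naming racks after availability zones are assumptions made for this illustration.

```python
#!/usr/bin/env python3
import sys
from typing import Dict, List

# Hypothetical mapping from private IP to a rack named after an availability zone.
RACK_BY_HOST: Dict[str, str] = {
    "10.0.1.11": "/us-east-1a",
    "10.0.1.12": "/us-east-1a",
    "10.0.2.21": "/us-east-1b",
}
DEFAULT_RACK = "/default-rack"


def resolve(hosts: List[str]) -> List[str]:
    """Return one rack path per host, in the same order as the input."""
    return [RACK_BY_HOST.get(host, DEFAULT_RACK) for host in hosts]


if __name__ == "__main__":
    # Hadoop passes one or more addresses as arguments and reads racks from stdout.
    print("\n".join(resolve(sys.argv[1:])))
```

Unknown hosts fall back to `/default-rack`, so a missing entry degrades placement quality rather than breaking the lookup.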

HDFS Edits FSImage Introduction • 4:36
Within HDFS, the Namenode stores metadata and the Datanodes store the blocks. Metadata is held in RAM, and changes to it are recorded in two types of files: Edits and FSImage. Writing changes to Edits and merging Edits into the FSImage lets us recover the metadata if it is ever lost.

Functions and Structure of Edits and FSImage
Drawback of playing back huge Edits
Concept of saving namespace
Concept of Checkpoint
Types of trigger to update the FSImage from Edits
Format of Editlog files
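The two standard triggers for merging Edits into the FSImage can be sketched as follows. The property names and default values are those of Apache Hadoop's `hdfs-default.xml`; a Cloudera cluster may be configured differently, and the function name is hypothetical.

```python
CHECKPOINT_PERIOD_S = 3600   # dfs.namenode.checkpoint.period (Hadoop default, seconds)
CHECKPOINT_TXNS = 1_000_000  # dfs.namenode.checkpoint.txns (Hadoop default)


def should_checkpoint(seconds_since_last: int, uncheckpointed_txns: int,
                      period_s: int = CHECKPOINT_PERIOD_S,
                      max_txns: int = CHECKPOINT_TXNS) -> bool:
    """Checkpoint when either the time trigger or the transaction trigger fires."""
    return seconds_since_last >= period_s or uncheckpointed_txns >= max_txns


if __name__ == "__main__":
    print(should_checkpoint(seconds_since_last=600, uncheckpointed_txns=1_200_000))
    print(should_checkpoint(seconds_since_last=600, uncheckpointed_txns=5_000))
```

The checkpointing node evaluates these conditions periodically; in Apache Hadoop the polling interval is `dfs.namenode.checkpoint.check.period`, which defaults to 60 seconds.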
HDFS Checkpoint Introduction and Deepdive • 6:58
On various triggers, the transactions in Edits are merged into the FSImage. A number of edits files are retained as the logs are rolled.

Reading process of Edits and FSImage on start of Namenode
Checkpoint process
Details on segment in Edit logs
Role played by Standby Namenode and Secondary Namenode during checkpoint process
HDFS Edits FSImage - Offline Image View (OIV) and Offline Edits View (OEV) • 6:36
The content of Edits and FSImage can be viewed using the OIV and OEV utilities.

Using Offline Image Viewer OIV
Using Offline Edit Viewer OEV
How transactions get recorded in the Edit logs
Purpose and use of seen_txid
Purpose of block pool ID, cluster ID, namespace ID and layout version
Generating output from oiv and oev in various format
HDFS Roll Edits • 12:18
HDFS edit logs are rolled under various conditions: the current edits segment is finalized and a new one is started. The rolled transactions are later merged into the FSImage.

Understanding the process of HDFS Roll edits
Triggering condition for edits roll
Understanding checkpoint check period
Various scenarios when the edit logs roll will happen
Manually rolling the edits
HDFS Save Namespace • 16:43
Saving the namespace replays all edit transactions and writes the resulting state into a new FSImage. This reduces the time needed to load metadata into RAM when the Namenode starts.

Loading of metadata on start of Namenode
Reason to save namespace
Introduction to HDFS Safemode
Verifying edit transactions on save namespace
HDFS roll edits, safemode, save namespace scenarios with dfsadmin command

HDFS Snapshot • 13:26
HDFS can take a point-in-time copy of the entire file system or of a specific folder. It does this without storing additional copies of the blocks. Snapshots are very helpful for recovering an earlier state of HDFS.

Understanding Snapshot process in HDFS
Enabling Snapshot for a folder
Internal working of Snapshot
Restoring Snapshot
Disabling Snapshot
HDFS Snapshot Policy • 4:04
The HDFS snapshot process can be automated by creating a policy. A policy takes snapshots at regular intervals, much like a cron job.

Creating HDFS Snapshot policy
Setting the frequency of policy execution
Setting up alerts on Snapshot execution
Verifying Snapshot policy execution
Disabling Snapshot policy
HDFS Edge Node • 12:41
Access to the cluster needs to be protected while still giving end users all the access they require. An Edge node, or Gateway, achieves this. The Edge node also handles client traffic and load, and can be sized to serve many concurrent users accessing the cluster.

Purpose and need of Edge Node or Gateway
How security of the cluster can be increased with Edgenode
Horizontally scaling Edgenode to handle load
Configuring a Gateway
Access HDFS from Gateway
Setting security configuration between Gateway and Cluster to restrict the direct access to cluster by clients
HDFS WebHDFS • 8:47
HDFS can be accessed through a REST API over HTTP. Using the REST API has advantages in terms of abstraction, security and gateway access. It acts as a single point of entry to the system.

Working of WebHDFS
Setting up WebHDFS
Interacting with HDFS using WebHDFS
GET, PUT, POST, DELETE Operations
Create directory/files with WebHDFS
Difference between WebHDFS and httpFS
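The shape of WebHDFS requests can be sketched by building the URLs. The `/webhdfs/v1/<path>?op=...` layout and the operation names come from the Apache Hadoop WebHDFS REST API; the host name, port and path in the example are assumptions, and the helper function is hypothetical. The Namenode HTTP port defaults to 9870 in Hadoop 3 and 50070 in Hadoop 2, so use whatever your cluster is configured with.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(host: str, port: int, path: str, op: str, **params: str) -> str:
    """Build a WebHDFS URL of the form http://host:port/webhdfs/v1/<path>?op=..."""
    query = urlencode({"op": op, **params})
    return f"http://{host}:{port}/webhdfs/v1{quote(path)}?{query}"


# HTTP method used by each operation, per the WebHDFS REST API.
METHOD_BY_OP = {
    "LISTSTATUS": "GET",
    "OPEN": "GET",
    "MKDIRS": "PUT",
    "CREATE": "PUT",
    "APPEND": "POST",
    "DELETE": "DELETE",
}

if __name__ == "__main__":
    # nn.example.com and /user/demo are placeholders for illustration.
    for op in ("MKDIRS", "LISTSTATUS", "DELETE"):
        url = webhdfs_url("nn.example.com", 9870, "/user/demo/dir", op)
        print(METHOD_BY_OP[op], url)
```

The same URLs work against an httpFS endpoint, since httpFS exposes the same REST API from a single gateway host.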
HDFS httpFS • 6:48
httpFS is an additional role provided by Cloudera that works on top of WebHDFS. httpFS acts as a gateway or proxy running on a single system. This provides complete control over, and security for, client interaction with HDFS over the web.

Architecture of httpFS
Advantage and disadvantage of httpFS
Details on httpFS role
Adding/removing httpFS role
Create file/directory with httpFS
Difference between httpFS and WebHDFS
HDFS FSCK Utility • 9:15
HDFS provides a file check utility that reports the status of the file system in terms of under-, over- and mis-replicated blocks, along with the block, file, location and rack details of files and blocks.

Getting File Check Utility Report
Identifying missing/corrupted blocks
When and how full block report gets generated
Identifying block location and details
HDFS Recovery • 7:35
The transactions stored in Edits and FSImage may get corrupted if part of an Edit log or a few segments are lost. This prevents the Namenode from starting. If we have a recent backup we can use it. Otherwise we can use an option provided by the Namenode to recover the Edits and FSImage.

How Namenode Recover works
Simulating the scenario of corrupting HDFS edits
Making the Namenode fail to start
Recover the Edits
Start the Namenode successfully
Understand the process of rectifying corrupted edits segment
HDFS Federation • 19:26
HDFS allows the Namenode to scale horizontally through a process called Federation. Additional Namenodes are added, and each Namenode handles its own namespace. Federation supports the scalability of the Namenode.

Architecture of Federation
Enabling Federation
Verifying the functionality of Federation
Verifying cluster ID, block pool ID and namespace ID
Verifying VERSION file in Namenode and Datanode
Creating Namespace
Disabling Federation
HDFS - Home Directory • 3:41
Every user within HDFS has a dedicated home directory. This is the default directory for adding or reading files.

Configuring users home directory location
HDFS home directory permission and ownership
Configuring HUE/LDAP to add home directory automatically

Cluster Commission and Decommission • 8:53
Hadoop cluster capacity may need to be increased or decreased for various reasons. Hadoop supports commissioning and decommissioning, which add or remove hosts without impacting the running services.

Understand the need of removing/adding hosts to cluster
Behavior of cluster during and after commissioning and decommissioning
Processes of commissioning and de-commissioning
Verifying the state of hosts
Cluster Client Configuration • 6:50
Every service within Cloudera is controlled by configuration files. The same configuration files are required to connect to the service from any client. Cloudera lets you download the client configurations for any or all services.

Purpose of client configuration files
Downloading client configuration files
Using client configuration files
Deploying changed client configuration across cluster and its clients
Cluster Host Template • 3:40
Within a Cloudera cluster we may have to add different types of hosts playing different roles. While adding a host, we need to choose all the required roles. To reduce this complexity we can define a template, so that applying the template to a host automatically selects the required roles.

Understanding Host Templates
Purpose and use of Host Templates
Applying Host template to hosts while commissioning a host or existing managed hosts

Requirements

Linux, cloud basics and system administration knowledge will be an added advantage
AWS account registration. This course will guide you through the setup of a production-grade Hadoop Cloudera cluster in the AWS cloud.
Any prior system setup experience will be an added advantage and will make the learning experience more enjoyable.
Basic understanding of IT administration or development activities

Description

This course is designed for everyone from professionals with zero experience to already skilled professionals who want to enhance their learning. The hands-on sessions cover the end-to-end setup of a Cloudera cluster. We will be using AWS EC2 instances to deploy the cluster.

COURSE UPDATED PERIODICALLY SINCE LAUNCH (Cloudera 6)

What students are saying:

5 stars, "Very clear and adept in delivering the content. Learnt a lot. He covers the material 360 degrees and keeps the students in minds."
5 stars, "This course is an absolute paradigm shift for me. This is really an amazing course, and you shouldn't miss if you are a novice/intermediate level in Cloudera Administration."
5 stars, "Great work by the instructor... highly recommended..."
5 stars, "It is really excellent course. A lot of learning materials."
5 stars, "This course is help me a lot for my certification preparation. thank you!"

The course is targeted at Software Engineers, System Analysts, Database Administrators, DevOps Engineers and System Administrators who want to learn about the Big Data ecosystem with Cloudera. Other IT professionals can also take this course, but might have to do some extra work to understand some of the advanced concepts.

Cloudera being the market leader in the Big Data space, Hadoop Cloudera administration brings huge job opportunities in the Cloudera and Big Data domain. The course covers all the skills required for the CCA131 certification, as follows:

Install - Demonstration and installation of Cloudera Manager, CDH and Hadoop ecosystem components
Configure - Basic to advanced configurations to set up Cloudera Manager, Namenode High Availability (HA) and Resource Manager High Availability (HA)
Manage - Carry out day-to-day activities and operations in a Cloudera cluster, such as cluster balancing, alert setup, rack topology management, commissioning and decommissioning hosts, YARN resource management with the FIFO, Fair and Capacity schedulers, and dynamic Resource Manager configurations
Secure - Enable the relevant services and configuration to add security that meets the organisation's goals, following best practice. Configure extended Access Control Lists (ACL), Sentry, Hue authorization and authentication with LDAP, and HDFS encrypted zones
Test - Access file system commands via HTTPFS, create and restore snapshots of an HDFS directory, get/set extended ACLs for a file or directory, and benchmark the cluster
Troubleshoot - Find the cause of a problem, resolve it and optimize inefficient execution. Identify and filter warnings, predict problems and apply the right solution. Configure dynamic resource pools for better-optimized use of the cluster. Find scalability bottlenecks and size the cluster.
Planning - Sizing and identifying the dependencies, hardware and software requirements.

Getting a real-time distributed environment with any number of machines at enterprise quality is very costly. Thanks to the cloud, any user can create a distributed environment with minimal expenditure and pay only for what they use. AWS is largely technology neutral, and other cloud providers such as Microsoft Azure, IBM Bluemix and Google Compute Cloud work in a similar way.

Content Added on Request

Dec Cloudera 6 Overview and Quick Install

Nov HDFS Redaction

Nov Memory management - Heap Calculation for Roles and Namenode

Nov IO Compression

Nov Charts and Dashboard

Oct File copy, distcp

Oct Command files added for all the section.

Sep Kafka Service Administration

Sep Spark Service Administration

Aug Cluster Benchmarking

Who this course is for:

Those who are taking CCA Hadoop Cloudera Administrator Exam (CCA131)
Anyone who wants to become a Cloudera Hadoop Administrator
Those switching from the Mainframe / Testing / Analytics domain to the Cloudera Hadoop Administration domain
Data Scientists / Technical Architects / Software Developers / Testing and Mainframe Professionals
Hadoop Developers and Hadoop Cloudera Administrators who want to work in a production-like environment
Those who want to create a production-like environment for test or production purposes


Course content

Foundation - iSMAC, Lambda Architecture and Machine Learning • 3 lectures • 15min

AWS - Amazon Web Services • 9 lectures • 53min

Hadoop Foundation on HDFS and YARN • 3 lectures • 22min

Cloudera Installation - Repository setup, httpd, path B installation • 11 lectures • 1hr 48min

HDFS Basics shell commands • 2 lectures • 21min

HDFS High Availability HA - Concept, Setup, Configure, Test, Verify, Remove • 4 lectures • 16min

HDFS Manage - Balancer, Maintenance, Quota Management, Canary Test • 5 lectures • 34min

HDFS Checkpoint. Understand, Manage, Work with Edits, FSImage, Roll Edits • 5 lectures • 47min

HDFS Advanced - Snapshot, WebHDFS, Federation, Recovery, httpFS, Edge Node • 9 lectures • 1hr 26min

Cloudera Manage - Commission, Decommission, Client configuration, Host Template • 3 lectures • 19min
