Apache NiFi Complete Master Course - HDP - Automation ETL
4.0 (265 ratings)
1,346 students enrolled

Next Gen Data Flow. Process and distribute data using a powerful, reliable framework. Apache Nifi, Nifi Registry, Minifi
Last updated 1/2020
This course includes
  • 5 hours on-demand video
  • 3 articles
  • 43 downloadable resources
  • Full lifetime access
  • Access on mobile and TV
  • Certificate of Completion
What you'll learn
  • Apache Nifi (Niagara Files) basics to advanced concepts
  • FlowFile, Processor, Connection, Controller, Process Group, Input and Output Ports, Funnel, etc.
  • Installation, Security, Customization, Scalability of Apache Nifi
  • Develop simple to complex Dataflow and take it to production
  • Nifi Registry - Dataflow registry
  • Hortonworks DataFlow HDF
  • Integrate with Kafka, NoSQL databases, RDBMS, file systems, etc.
  • Process different types of files such as CSV, JSON, text files, etc.
Requirements
  • Basic understanding of data movement and ETL
  • Interest in learning more and upgrading to the latest technology

Apache Nifi is a next-generation framework for creating data pipelines that integrates with almost all popular enterprise systems. It has more than 250 processors and more than 70 controllers.

This course covers all basic to advanced concepts available in Apache Nifi, such as

  • Flowfile

  • Controllers

  • Processors

  • Connections

  • Process Group

  • Funnel

  • Data Provenance

  • Processor relationships

  • Input and Output Ports

This course also covers Apache Nifi subprojects such as

  • Nifi Registry

As part of production maintenance, users may have to make careful decisions to improve performance and handle errors efficiently. To help with this, the demos also cover

  • Handling Throughput and Latency

  • Handling Back Pressure and Yield

  • Error handling

  • Failure Retry

  • Monitoring Bulletin

  • Data Provenance

For a seamless experience with data, it is important to manage latency and throughput and to prioritize data. These are controlled with relationships, yield, and back pressure.

Various processors and controllers for processing different types of data are demonstrated.

Processors used in production scenarios, such as HTTP, RDBMS, NoSQL, S3, CSV, JSON, and Hive, are covered in detail with demos, along with controllers such as SSL and ConnectionPool.

All of these concepts are covered with demos, and real-time implementations are provided.

For easy practice, all the demonstrated flow templates are uploaded as part of the course.

Demo on creating and using a KeyStore and TrustStore for SSL communication.

Using Maven and Eclipse EE to build a custom processor and deploying the nar file to the Nifi libraries.

Who this course is for:
  • Developers, architects, and beginners who want to learn Apache NiFi
  • ETL teams who want to move to the latest technology
Course content
57 lectures 05:03:10
+ Introduction to Apache Nifi
6 lectures 31:00

Introduction to this course.

Preview 01:28
  • Basic understanding of the Apache Nifi project

  • Why Apache Nifi?

  • Compare with Other ETLs.

  • Features that differentiate Apache Nifi from other ETLs

Apache Nifi Introduction
  • Understand what is Dataflow

  • Overview on Apache Nifi UI and its features

  • Dataflow and challenges

  • Apache Nifi key features

  • Nifi role in Push and Pull architecture

Dataflow Introduction - Key Features
  • Install Apache Nifi in Windows

  • Start Apache Nifi

  • Open Apache Nifi UI in browser

Preview 03:57
  • Get to know about various terminologies like

  1. FlowFile

  2. FlowFile Processor

  3. Connection

  4. Flow Controller

  5. Process Group

  • Create a simple workflow

  • Play with Flowfile generator

Terminology Introduction

Understand the various sections of the UI, such as

  • Component Toolbar

  • Global Menu

  • Search

  • Status Bar

  • Navigate Palette

  • Operate Palette

UI Introduction - Play with Apache Nifi User Interface
+ First Baby Step - Flow file Demo
1 lecture 08:31
  • Create a simple Flow

  • Introduction to GetFile and PutFile Processor

  • Processor Configuration

  • Connection Configuration

  • Relationship Termination

Preview 08:31
+ Processors and Connections
5 lectures 27:19

Understand various Category types like

  • Data Ingestion

  • Routing and Mediation

  • Database Access

  • Attribute Extraction

  • System Interaction

  • Data Transformation

  • Sending Data

  • HTTP Access

  • AWS Cloud Access

Processor Category

Understand the various configuration options available on connections, such as

  • Flowfile Expiration

  • Back Pressure

  • Object Threshold

  • Size Threshold

  • Prioritization

  • Various options in connection context menu

  • Queue monitoring

  • Using Queue Empty Options

Connection configuration
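
The object and size thresholds listed above can be pictured with a toy queue. This is a minimal sketch of the general idea only, not NiFi's implementation; the `Connection` class, its method names, and the threshold values are hypothetical and chosen for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Connection:
    """Toy model of a queue between two processors (illustrative only)."""
    object_threshold: int = 3      # hypothetical max number of queued flowfiles
    size_threshold: int = 1024     # hypothetical max total queued bytes
    queue: deque = field(default_factory=deque)

    def queued_bytes(self) -> int:
        return sum(len(payload) for payload in self.queue)

    def back_pressure_applied(self) -> bool:
        # Back pressure engages once either threshold is reached.
        return (len(self.queue) >= self.object_threshold
                or self.queued_bytes() >= self.size_threshold)

    def offer(self, payload: bytes) -> bool:
        """Enqueue only while the upstream side is still allowed to run."""
        if self.back_pressure_applied():
            return False           # upstream has to wait until the queue drains
        self.queue.append(payload)
        return True

    def poll(self) -> bytes:
        """Downstream side removes the oldest queued payload."""
        return self.queue.popleft()


conn = Connection()
accepted = [conn.offer(b"x" * 10) for _ in range(5)]  # upstream tries 5 times
conn.poll()                                           # downstream consumes one
accepted_after_drain = conn.offer(b"y" * 10)          # upstream may run again
```

In this sketch the upstream side is refused as soon as the queue holds three items, and is accepted again once the downstream side has drained one.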

Various general configuration options in the Processor Settings tab

  • Penalty Duration

  • Yield Duration

  • Bulletin Level

  • Relationship Termination

Processor Configuration Settings

Various options in the Processor Scheduling tab

  • Different Scheduling Strategy

  • Relationship between latency and throughput

  • Concurrent task configuration

  • Run Schedule configuration

  • Execution mode

Processor Configuration Scheduling
  • Managing properties of various processors

  • Customizing mandatory and non-mandatory properties

  • Error handling on missing properties

Processor Configuration Property
+ Next Step into Flowfile
4 lectures 22:39
  • Changing the payload attributes

  • Making decisions based on flowfile attributes

  • Logging the attributes in log file

  • Monitoring log attributes

Preview 05:55
  • Customizing log file configuration

  • Logging attributes in separate log file

Log Configuration and Monitoring Logs
  • Failure handling by processor

  • Retry failed Flowfile

  • Monitoring failure queue

  • Check failure message from bulletin

Handling Failures
  • Purpose and use of Templates

  • Creating Templates

  • Managing Templates

  • Uploading Templates

  • Template file structure

  • Handling sensitive information in Templates

Preview 04:06
+ Integrating Apache Nifi with Distributed Messaging System - Apache Kafka
3 lectures 13:12

Understand Apache Kafka

Install Kafka

Create Topic

Publish Message to Topic

Read Message from Topic

Preview 05:27
  • Create message with Flowfile

  • Post message to Kafka Topic

  • Read message using Kafka console consumer

Nifi As Producer
  • Post message using Kafka Producer

  • Read message from Topic using Apache Nifi

  • Convert message to Flowfile

Preview 03:20
+ Process group and Funnel
3 lectures 12:43
  • Purpose of process group

  • Input and Output Ports

  • Create and Use Process groups

Process group - Input and Output ports
  • Create Funnel

  • Understand forking concepts and its use

  • Fork a flowfile to multiple processors

Preview 03:43
  • Understand Combine or Fan-in concept and its use

  • Combine flowfiles from multiple processors

Funnel Combine
+ Monitoring and Provenance
2 lectures 11:26

Monitor various statistical information about

  • Processors

  • Input Ports

  • Output Ports

  • Remote Process Group

  • Connections

  • Process Groups

Observing overall Bulletin

Data Provenance

Nifi history

Nifi Monitoring and Statistics
  • Purpose and Usage of Data Provenance

  • Provenance data lineage

  • Detailed event analysis

  • View/Download input and output claim

  • Replay / Retry events

  • Observe failed queues

  • Observe modified attributes as part of event

Preview 05:44
+ Structured Data Processing
4 lectures 23:37
  • Connect and Read data from MySQL database

  • Use of Avro and JSON

  • Using Connection Pool Controller

Read MySQL Table data as Avro and JSON
  • Using AvroSchemaRegistry

  • Using CSV Reader

  • Using JSONRecordWriter

  • Using Custom Schema of CSV

  • Monitor updated attribute using data provenance

Transform CSV to JSON
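
Outside of NiFi, the same schema-driven CSV to JSON conversion can be sketched in a few lines of plain Python. This is only an illustration of the concept and does not use NiFi's record readers or writers; the `SCHEMA` mapping, the `csv_to_json` helper, and the sample data are hypothetical.

```python
import csv
import io
import json

# Hypothetical schema: field name -> Python type used to cast each CSV value.
SCHEMA = {"id": int, "name": str, "price": float}


def csv_to_json(csv_text: str, schema: dict) -> str:
    """Read CSV text with a header row and emit a JSON array of typed records."""
    reader = csv.DictReader(io.StringIO(csv_text))
    records = [
        {name: cast(row[name]) for name, cast in schema.items()}
        for row in reader
    ]
    return json.dumps(records)


sample = "id,name,price\n1,pen,1.5\n2,book,12.0\n"
result = csv_to_json(sample, SCHEMA)
```

The schema plays the role of telling the reader how to interpret each column, so that numeric fields end up as JSON numbers rather than strings.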
  • Purpose of state management

  • Reading only Delta records from RDBMS table

  • Using maximum-value Column property to manage state

Preview 05:11
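
The idea of reading only delta records by remembering the largest value seen in one column can be sketched with Python's built-in `sqlite3` module. This is a simplified illustration under stated assumptions, not how NiFi stores state; the `orders` table, the `fetch_delta` helper, and the `state` dictionary are hypothetical.

```python
import sqlite3


def fetch_delta(conn, state, max_value_column="id"):
    """Return only rows whose max-value column exceeds the last value seen."""
    last_seen = state.get(max_value_column, 0)
    rows = conn.execute(
        f"SELECT {max_value_column}, item FROM orders "
        f"WHERE {max_value_column} > ? ORDER BY {max_value_column}",
        (last_seen,),
    ).fetchall()
    if rows:
        # Remember the largest value so the next run skips these rows.
        state[max_value_column] = rows[-1][0]
    return rows


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT)")
conn.executemany(
    "INSERT INTO orders (id, item) VALUES (?, ?)", [(1, "pen"), (2, "book")]
)

state = {}                                   # stands in for persisted state
first_run = fetch_delta(conn, state)         # picks up both existing rows
conn.execute("INSERT INTO orders (id, item) VALUES (?, ?)", (3, "lamp"))
second_run = fetch_delta(conn, state)        # picks up only the new row
third_run = fetch_delta(conn, state)         # nothing new since the last run
```

Each run returns only the rows added since the previous run, because the stored maximum acts as a bookmark into the table.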
  • Creating sample data with Mockaroo

  • Using dynamic schema

  • Purpose and real-time use of dynamic schema

Preview 03:36
+ Nifi Registry
2 lectures 09:45
  • Nifi - Registry Introduction

  • Purpose of Nifi Registry

  • Installing Nifi Registry

  • Starting Nifi Registry service

  • Creating and managing buckets

Apache Nifi Registry - Introduction
  • Connecting Nifi Registry with a flow

  • Adding a flow to a bucket

  • Maintaining versions of a flow

  • Committing and changing the version of a flow

  • Checking the version history of a flow

  • Rollback changes

Nifi Registry as Version Control System
+ Nifi Cluster
2 lectures 14:54
  • Install 3 node cluster

  • Understand Primary Node, Cluster Coordinator role and responsibility

  • Using Zookeeper Quorum and configuration

  • Configuration change of nifi.properties and state-management.xml

  • Update Zookeeper connection string

  • Starting nodes and cluster

  • Verification of primary node election

Cluster Installation and Configuration
  • Cluster overview on Nodes, Systems, JVM, Storage and Versions

  • Create sample flow file

  • Monitor status history

  • Execution of processor in Primary node and in All nodes

Preview 05:47