Name: Mastering AWS Analytics (AWS Glue, Kinesis, Athena, EMR)
Rating: 4.0 (47 reviews)

Created by manish tiwari

Last updated 10/2022

English

What you'll learn

Confidently work with AWS serverless services to develop a Data Catalogue, ETL, and analytics on a data lake
Build a serverless data lake on AWS using structured and unstructured data
Develop deep knowledge of Glue, Athena, Kinesis, Kinesis Data Analytics, etc.
Store and process real-time streaming data

Course content

12 sections • 52 lectures • 6h 49m total length

Introduction (0:36)
Join this introductory course to explore data engineering with AWS analytics tools, including AWS Glue, Kinesis, Athena, and EMR, and practice with labs requiring an AWS account and internet.
Course Curriculum (2:35)
Explore the course curriculum with hands-on labs, building Windows and Linux machines, mastering S3 and IAM, and applying Kinesis, AWS Glue crawlers, and catalogs for real-time analytics.

Create your first Windows Machine (9:57)
Learn to launch a Windows machine in the AWS console, configure key pairs and security groups, then connect using the RDP client with the generated password.
Create your first Linux Machine (4:02)
Learn to create a Linux machine on AWS by launching an instance, selecting a Linux image, generating a key pair, configuring security, and connecting via SSH to run commands.
INTRODUCTION to IAM (0:58)
Explore Identity and Access Management (IAM) concepts, including users, groups, roles, and policies, and learn how to create users, assign access, and attach multiple policies.
CREATE IAM USER AND ASSIGN POLICY (8:15)
Learn to create IAM users, assign limited policies, and grant programmatic and web access, demonstrating read-only and admin permissions while avoiding root account use.
IAM GROUP CONCEPT (4:25)
Create a development group, add a user to the group, and attach an access policy to grant S3 permissions, illustrating how group permissions propagate to the user account.
IAM ROLE (6:48)
Learn how to create an IAM role and attach multiple policies to enable AWS Glue to access S3 and trigger Lambda, enabling automated data ETL workflows.
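
The IAM ROLE lecture describes a role that lets AWS Glue access S3. As a rough illustration only (not the course's own lab material), the two JSON documents such a role typically needs can be sketched in Python. The bucket name and the function names are hypothetical; the service principal `glue.amazonaws.com` and the `sts:AssumeRole` action are the standard values for a Glue service role.

```python
import json


def glue_trust_policy() -> dict:
    """Trust policy: allows the AWS Glue service to assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "glue.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def s3_access_policy(bucket: str) -> dict:
    """Permissions policy: list one bucket and read/write its objects.

    `bucket` is a placeholder supplied by the caller (an assumption here).
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
        ],
    }


if __name__ == "__main__":
    # Print both documents so they can be pasted into the IAM console.
    print(json.dumps(glue_trust_policy(), indent=2))
    print(json.dumps(s3_access_policy("example-data-lake-bucket"), indent=2))
```

The trust policy answers "who may use this role", while the permissions policy answers "what the role may do"; a role for Glue needs both.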

CLOUD STORAGE (1:56)
Explore cloud storage and how to upload and access files over a network, with examples like Google Drive and Amazon S3, plus a preview of the upcoming lessons on bucket creation and file identification.
S3 BASIC CONCEPT (3:51)
Master the basics of Amazon S3 as secure, durable, scalable object storage for storing any file type accessed from anywhere on the web, designed for 99.99% availability and 99.999999999% (11 nines) durability.
S3 USE CASE (2:10)
Explore Amazon S3 use cases for data lakes, big data analytics, AI/ML, and scalable cloud-native apps with backup, 99.99% availability, and lifecycle archiving using Glacier tiers.
LAB-1 : AMAZON S3 (3:18)
Create and manage an Amazon S3 bucket by configuring a unique name, region, and access settings, then upload and delete files, and finally delete the bucket.
S3 LIFECYCLE (5:13)
Explore how S3 lifecycle rules transition data between storage classes (Standard, Intelligent-Tiering, and Glacier), balancing cost and access speed, and learn how to configure and view storage class changes in the console.
S3 VERSIONING (3:54)
Learn how to enable versioning on an S3 bucket, upload files, and manage multiple object versions to restore or download previous data.
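
The storage-class transitions covered in the S3 LIFECYCLE lecture are configured in the console during the course. For readers who prefer to see the rule as data, here is a minimal sketch that builds a lifecycle configuration in the shape accepted by boto3's `put_bucket_lifecycle_configuration`. The rule ID, prefix, and day counts are illustrative assumptions, not values from the course.

```python
def lifecycle_configuration(prefix: str,
                            tiering_days: int = 30,
                            glacier_days: int = 90) -> dict:
    """Build a lifecycle configuration with two transitions.

    Objects under `prefix` move to Intelligent-Tiering after `tiering_days`
    and to Glacier after `glacier_days`. All defaults are assumptions.
    """
    if not 0 <= tiering_days < glacier_days:
        raise ValueError("tiering_days must be >= 0 and less than glacier_days")
    return {
        "Rules": [
            {
                "ID": f"archive-{prefix.strip('/') or 'all'}",  # hypothetical ID
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": tiering_days, "StorageClass": "INTELLIGENT_TIERING"},
                    {"Days": glacier_days, "StorageClass": "GLACIER"},
                ],
            }
        ]
    }


if __name__ == "__main__":
    config = lifecycle_configuration("logs/")
    print(config)
    # With AWS credentials configured, the dict could be applied like this
    # (bucket name is a placeholder):
    #   import boto3
    #   boto3.client("s3").put_bucket_lifecycle_configuration(
    #       Bucket="example-bucket", LifecycleConfiguration=config)
```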

Introduction ETL (1:59)
Learn how ETL works with AWS Glue by extracting data from web apps and IoT sensors, transforming it through cleaning, filtering, and joining, and loading it into a target location.
WHAT IS GLUE (2:23)
Explore AWS Glue, a serverless data integration service that combines and prepares data for analytics and machine learning using extract, transform, load, and data crawlers, with visual and coding interfaces.
GLUE BENEFITS (1:25)
Explore the benefits of AWS Glue, a serverless data integration service that enables faster, collaborative data preparation with automated ETL scripts and crawlers, scaling automatically.
GLUE USE CASE (1:24)
Explore use cases of data processing with Lambda-triggered events that automatically run jobs when new data arrives in S3, enable automated scripts, and use a data catalog to discover datasets.
GLUE TERMINOLOGY (2:48)
Understand Glue terminology such as data catalog, crawler, and classifier, and see how crawlers scan S3 data to populate metadata and create tables and databases with triggers.
GLUE ARCHITECTURE (3:14)
Explore the Glue architecture, including data stores, crawlers, data catalogs, and ETL jobs that extract, transform, and load data from sources using automated or scripted workflows.
GLUE DEMO LAB (20:11)
Build an AWS Glue workflow by creating an S3 data store and bucket, crawling data to populate the catalog, and generating ETL scripts and JSON outputs in Glue Studio.
GLUE TRANSFORMATION LAB (22:49)
Master Glue transformations in this lab. Set up S3 input and output buckets, crawl data to create a catalog, and run a Glue Studio job with rename and aggregate operations.
GLUE TRANSFORMATION LAB - MULTIPLE SOURCE (17:06)
Build a multi-source ETL workflow in AWS Glue by joining two inputs, applying transformations, and saving the result as CSV in an S3 output bucket.
PROJECT -1 END TO END REAL TIME PROJECT (GLUE+S3+LAMBDA) (17:57)
Execute an end-to-end AWS analytics project using S3, Glue, and Lambda to crawl CSV data, build an ETL pipeline, transform data, and run jobs.
PROJECT -2 END TO END REAL TIME PROJECT PARTITIONING (13:46)
Learn to implement an end-to-end real-time partitioning workflow using AWS Glue crawler to create a single partitioned table from multi-folder S3 data, and load transformed CSV output via Glue Studio.
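
The Glue use case and Project 1 both rely on a Lambda function that starts a Glue job when new data arrives in S3. The course's own function is not shown on this page, so the following is only a sketch of that pattern. The job name and the `--input_path` argument are hypothetical; `start_job_run` is the boto3 Glue client call for starting a job, and the optional `glue_client` parameter is an addition that allows the handler to be exercised without AWS access.

```python
from urllib.parse import unquote_plus

JOB_NAME = "example-etl-job"  # hypothetical Glue job name


def s3_objects(event: dict):
    """Yield (bucket, key) pairs from an S3 event notification.

    Object keys arrive URL-encoded in the event, so they are decoded here.
    """
    for record in event.get("Records", []):
        s3 = record["s3"]
        yield s3["bucket"]["name"], unquote_plus(s3["object"]["key"])


def lambda_handler(event, context, glue_client=None):
    """Start one Glue job run per uploaded object."""
    if glue_client is None:
        import boto3  # available by default in the Lambda Python runtime
        glue_client = boto3.client("glue")

    run_ids = []
    for bucket, key in s3_objects(event):
        response = glue_client.start_job_run(
            JobName=JOB_NAME,
            # Job arguments are passed as "--name": "value" strings.
            Arguments={"--input_path": f"s3://{bucket}/{key}"},
        )
        run_ids.append(response["JobRunId"])
    return {"started": run_ids}
```

Starting one run per object keeps the sketch simple; a real pipeline might batch several uploads into a single run to avoid hitting concurrent-run limits.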

REAL TIME STREAMING (3:40)
Explore real-time streaming data and how online games, ecommerce actions, and IoT events are processed in real time with Kinesis to drive instant insights.
KINESIS FAMILY (3:46)
Explore real-time streaming with the Kinesis family: Video Streams for live video analytics, Data Streams for real-time data, Data Firehose for near-real-time storage, and Data Analytics with Apache Flink.
KINESIS OVERVIEW (8:02)
Learn how Kinesis enables real-time streaming, analytics, and storage via Kinesis Data Streams and Kinesis Data Firehose, processing IoT, clickstream, and video streams with Amazon S3 and a data warehouse.
LAB 1 : KINESIS DATA FIREHOSE LAB (GENERATE REAL TIME DATA) (11:40)
Generate real-time data with Kinesis Data Firehose to stream into Amazon S3, using input and output buckets, a delivery stream, and demo data for testing.
project : kinesis firehose + data analytics + s3 (22:53)
This end-to-end project demonstrates real-time streaming with Kinesis Firehose, routing data to S3 buckets and analytics results to S3 via Kinesis Data Analytics, using two pipelines.
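
The Firehose lab uses console-generated demo data. As a hedged illustration of what sending your own events could look like, the sketch below prepares records in the shape expected by the boto3 Firehose calls `put_record` and `put_record_batch`. The event fields and the delivery stream name in the comment are made up; the limit of 500 records per `put_record_batch` call is the documented service limit.

```python
import json

MAX_BATCH_RECORDS = 500  # PutRecordBatch accepts at most 500 records per call


def to_record(event: dict) -> dict:
    """Encode one event as newline-delimited JSON bytes.

    The trailing newline keeps events on separate lines once Firehose
    concatenates them into a single S3 object.
    """
    line = json.dumps(event, separators=(",", ":")) + "\n"
    return {"Data": line.encode("utf-8")}


def batches(events, size: int = MAX_BATCH_RECORDS):
    """Split events into lists of records no longer than `size`."""
    records = [to_record(e) for e in events]
    return [records[i:i + size] for i in range(0, len(records), size)]


if __name__ == "__main__":
    demo = [{"sensor": "s1", "reading": n} for n in range(3)]  # toy data
    for batch in batches(demo):
        print(batch)
        # With AWS credentials configured, each batch could be sent like this
        # (stream name is a placeholder):
        #   import boto3
        #   boto3.client("firehose").put_record_batch(
        #       DeliveryStreamName="example-stream", Records=batch)
```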

Athena Introduction (2:35)
Learn how Amazon Athena provides a serverless, interactive SQL query service to analyze data directly on S3 without moving it. It is not a database or data warehouse.
how athena works (1:18)
Learn how Athena analyzes data stored in an S3 bucket without moving it, using an external table to query and analyze data directly in S3.
who is athena for and prerequisite (1:45)
Identify who can use Athena to analyze logs stored in S3, including cloud, flow, app, and IoT data, using basic SQL knowledge to run queries.
difference between sql server and athena (1:51)
Compare SQL Server and Athena, contrasting SQL Server's DML operations and database management with Athena's serverless, external-table analytics, and note Athena's lack of user-defined functions and DDL support.
LAB-1 : athena create table by crawler (11:34)
Demonstrate how to use an AWS Glue crawler to create a table from S3 data, build a data catalog, and run queries in the Athena editor.
Lab -2 create table without crawler and directly from s3 (3:39)
Create a table in Athena directly from an S3 bucket without a crawler, configure the database and S3 path, define CSV columns, and query the data.
Project -1 superstore data analysis by using athena (11:22)
Execute an end-to-end AWS Athena project on S3 superstore data, building a data catalog with a crawler, then query total sales, total profit, and top locations by state or city.
Project -2 Partitioning using athena (7:16)
Set up a Glue crawler to catalog daily S3 data and create partitions in a table. Query the partitioned data in Athena by date to analyze daily files.
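
Lab 2 and Project 2 above create tables over CSV files in S3 and query them by date partition. The statements used in the course are not reproduced on this page, so the following is a sketch of what such Athena DDL might look like, generated from Python. The database, table, column names, bucket path, and the partition column `dt` are all hypothetical.

```python
def create_table_ddl(database: str, table: str, columns, location: str,
                     partition_col: str = "dt") -> str:
    """Build a CREATE EXTERNAL TABLE statement for headered CSV data in S3."""
    cols = ",\n  ".join(f"`{name}` {sql_type}" for name, sql_type in columns)
    base = location.rstrip("/")
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {database}.{table} (\n"
        f"  {cols}\n"
        ")\n"
        f"PARTITIONED BY (`{partition_col}` string)\n"
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY ','\n"
        f"LOCATION '{base}/'\n"
        "TBLPROPERTIES ('skip.header.line.count'='1')"
    )


def add_partition_ddl(database: str, table: str, location: str, value: str,
                      partition_col: str = "dt") -> str:
    """Build an ALTER TABLE statement registering one daily partition.

    Assumes Hive-style folder names such as .../dt=2022-10-01/ in S3.
    """
    base = location.rstrip("/")
    return (
        f"ALTER TABLE {database}.{table} ADD IF NOT EXISTS "
        f"PARTITION ({partition_col} = '{value}') "
        f"LOCATION '{base}/{partition_col}={value}/'"
    )


if __name__ == "__main__":
    cols = [("order_id", "string"), ("amount", "double")]
    print(create_table_ddl("sales_db", "daily_sales", cols,
                           "s3://example-bucket/sales"))
    print(add_partition_ddl("sales_db", "daily_sales",
                            "s3://example-bucket/sales", "2022-10-01"))
```

A query that filters on the partition column, for example `WHERE dt = '2022-10-01'`, lets Athena read only that day's folder rather than the whole table. A Glue crawler, as used in the project, registers such partitions automatically.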

INTRODUCTION (1:22)
Understand Amazon EMR, Elastic MapReduce, for big data analytics with Hadoop frameworks. See when to use EMR for computing and real-time data processing with S3 and Kinesis.
EMR BASIC (1:30)
Learn how EMR simplifies big data deployment by creating master and slave nodes, launching multi-node clusters, and one-click deployment of Spark and other frameworks.
BENEFITS (1:53)
Explore cost-efficient pay-as-you-go pricing and integration with other services via IAM policies, then deploy, scale, and monitor EMR clusters with CloudWatch and log center, ensuring 99.99% availability and security.
ARCHITECTURE (3:26)
Understand EMR architecture with storage options like S3, cluster resource management with MapReduce, and installable applications; plan storage, framework, and cluster deployment and monitoring.
USECASE AND PRICING (2:32)
Identify real-time streaming, interactive analysis, and genomics use cases in AWS analytics, and learn pay-per-second pricing with a one-minute minimum, plus terminate resources to avoid charges.
LAB1 : EMR CLUSTER CONFIGURATION (QUICK OPTION) (5:43)
Learn to launch and configure an EMR cluster in the AWS console, selecting software, hardware, and security settings for quick, scalable analytics workloads.
LAB12 : EMR CLUSTER CONFIGURATION (ADVANCE OPTION) (6:11)
Explore EMR cluster configuration via the advanced option, customize software, select multiple master nodes, configure data catalog, and tune hardware and networking for optimized cluster utilization.
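
The pricing lecture mentions per-second billing with a one-minute minimum. The arithmetic can be sketched as follows; the hourly rate and node count are invented for illustration and are not real AWS prices, and the function names are hypothetical.

```python
import math


def billed_seconds(run_seconds: float) -> int:
    """Seconds charged for one run: whole seconds, minimum 60."""
    if run_seconds < 0:
        raise ValueError("run_seconds must not be negative")
    if run_seconds == 0:
        return 0  # nothing ran, nothing billed
    return max(60, math.ceil(run_seconds))


def cluster_cost(run_seconds: float, hourly_rate_per_node: float,
                 node_count: int) -> float:
    """Estimated charge for a cluster of identical nodes.

    `hourly_rate_per_node` is a placeholder figure supplied by the caller.
    """
    hours = billed_seconds(run_seconds) / 3600
    return hours * hourly_rate_per_node * node_count


if __name__ == "__main__":
    # A 10-second test run is still billed as a full minute.
    print(billed_seconds(10))
    # Three nodes for one hour at a made-up rate of 0.10 per node-hour.
    print(round(cluster_cost(3600, 0.10, 3), 2))
```

This is also why the lecture stresses terminating clusters after each lab: an idle cluster keeps accruing billed seconds on every node.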

Requirements

Basic working knowledge of any SQL style query language
Course includes demo of all the labs. An AWS Account would be required to try labs hands-on.

Description

In this course, we will learn the following:

1) We will start with the basics of serverless computing.

2) We will learn schema discovery, ETL, scheduling, and tools integration using the serverless AWS Glue engine, which is built on a Spark environment.

3) We will learn to develop a centralized Data Catalogue, also using the serverless AWS Glue engine.

4) We will learn to query the data lake using the serverless Athena engine, which is built on top of Presto and Hive.

5) We will learn about the Kinesis family and how to handle real-time data and run analytics on it.

Businesses have always wanted to manage less infrastructure and more solutions. Big data challenges continuously push infrastructure boundaries. Having serverless storage, serverless ETL, serverless analytics, and serverless reporting all on one cloud platform sounded too good to be true for a very long time, but now it's a reality on the AWS platform. AWS is the only cloud provider that has all the native serverless components for a true Serverless Data Lake Analytics solution.

This course understands your time is important, and so the course is designed to be laser-sharp on lecture timings, where all the trivial details are kept at a minimum and focus is kept on core content for experienced AWS Developers / Architects / Administrators. By the end of this course, you can feel assured and confident that you are future-proof for the next change and disruption sweeping the cloud industry.

I am very passionate about AWS serverless computing on the data and analytics platform, and I cover all the topics discussed in this course from A to Z.

So if you are excited and ready to get trained on the AWS Serverless Analytics platform, I am ready to welcome you to my class!

Who this course is for:

Anyone who wants to learn AWS Serverless technologies for data and analytics should take this course
Data Professionals seeking to learn Serverless Storage, Serverless ETL, Serverless Data Analysis

Course content

Introduction: 2 lectures • 3min

DEMO PROJECT: 1 lecture • 16min

BASIC EC2 AND IAM: 6 lectures • 34min

AMAZON S3: 6 lectures • 20min

AWS GLUE: 11 lectures • 1hr 45min

Kinesis: 5 lectures • 50min

ATHENA: 8 lectures • 41min

EMR (ELASTIC MAP REDUCE): 7 lectures • 23min

project 1 : Real time end to end data engineer project: 1 lecture • 32min

PROJECT 2 : S3 + GLUE + ATHENA: 1 lecture • 22min
