AWS Athena - Interactive SQL Interface

Name: AWS Athena - Interactive SQL Interface
Rating: 3.6 (11 reviews)

Learn to use AWS Athena to query S3 and other data sources with SQL

Created by Jim Macaulay

Last updated 5/2022

English

What you'll learn

AWS Athena
An Interactive SQL Interface for various data sources
SQL queries on S3
AWS Data Engineering

Course content

5 sections • 16 lectures • 1h 1m total length

About the Course (0:53)
Explore AWS Athena, an interactive service to analyze data directly, covering creating databases and tables, loading data from S3, and working with partitions, bucketing, and structured, unstructured, and semi-structured data.
Introduction and Navigation of Athena Interface (6:04)
Navigate the Athena interface to write and run SQL queries, view results, and save or download results. Manage data sources and recent queries, plus view execution stats and encryption settings.

Creating Database (1:44)
Create a database in AWS Athena using the employees dataset housed in the history bucket, including employees, departments, countries, job history, jobs, and locations, and verify the database creation.
Creating Table (9:49)
Learn to create an external table from an S3 bucket in the employees data database by defining columns and data types, configuring encryption, handling headers, and validating results.
CTAS (Create Table As Select) (3:14)
Learn CTAS in AWS Athena by creating a table from a select statement that filters employees with salaries greater than or equal to 10,000, resulting in 19 records.
Creating View (4:14)
Create and use views in AWS Athena, building a view from a template, concatenating names, and querying salary and job data.
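As a rough illustration of the table-creation lecture, the sketch below builds the kind of CREATE EXTERNAL TABLE statement Athena accepts for comma-delimited files with a header row. The helper name, the column list, the database name employees_data, and the bucket path are all hypothetical; the course's actual DDL may differ.

```python
def external_table_ddl(database, table, columns, location, skip_header=True):
    """Build an Athena CREATE EXTERNAL TABLE statement for comma-delimited text files.

    All names passed in are illustrative assumptions, not taken from the course.
    """
    cols = ",\n  ".join(f"{name} {dtype}" for name, dtype in columns)
    ddl = (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {database}.{table} (\n  {cols}\n)\n"
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY ','\n"
        "STORED AS TEXTFILE\n"
        f"LOCATION '{location}'"
    )
    if skip_header:
        # Tell Athena to ignore the first line of each file (the CSV header).
        ddl += "\nTBLPROPERTIES ('skip.header.line.count'='1')"
    return ddl


ddl = external_table_ddl(
    "employees_data",                       # hypothetical database name
    "employees",
    [("employee_id", "INT"), ("first_name", "STRING"), ("salary", "DOUBLE")],
    "s3://example-bucket/employees/",       # hypothetical S3 location
)
print(ddl)
```

The printed statement can be pasted into the Athena query editor once the names are replaced with real ones.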

CTAS - Text (1:41)
CTAS - Parquet (4:21)
CTAS - JSON (2:07)
Save a table and its data in JSON format, rename the table to employees_json, and download and verify the generated JSON file after executing the query.
CTAS - ORC (1:36)
CTAS - Avro (1:40)
Learn to store data in Avro format, rename the table to employees_avro, view the data schema, and download the output file, which is not in a human-readable format.
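The CTAS lectures above differ mainly in the storage format passed to the WITH clause. A minimal sketch, assuming a source table named employees and the salary filter from the earlier CTAS lecture; the helper name and the generated table names (other than employees_json) are hypothetical:

```python
# Storage formats covered in this section of the course.
CTAS_FORMATS = ("TEXTFILE", "PARQUET", "JSON", "ORC", "AVRO")


def ctas_statement(table, select_sql, fmt):
    """Build an Athena CTAS statement that stores the result in the given format."""
    fmt = fmt.upper()
    if fmt not in CTAS_FORMATS:
        raise ValueError(f"unsupported format: {fmt}")
    return f"CREATE TABLE {table}\nWITH (format = '{fmt}')\nAS {select_sql}"


query = "SELECT * FROM employees WHERE salary >= 10000"

# One statement per format, e.g. employees_json for JSON.
statements = {
    fmt: ctas_statement(f"employees_{fmt.lower()}", query, fmt)
    for fmt in CTAS_FORMATS
}
print(statements["JSON"])
```

Only the format property changes between the statements; the SELECT stays the same.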

Partition (8:40)
Bucketing (2:45)
Learn bucketing as an alternative to partitioning for high-cardinality columns: records are grouped into a fixed number of buckets (for example, four) to speed up queries.
Partition and Bucketing (4:14)
Explore partitioning and bucketing in AWS Athena, using partitions for low-cardinality fields and buckets for high-cardinality fields. Build an employees table with four buckets to illustrate combining the two.
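Partitioning and bucketing can be combined in a single CTAS statement. The sketch below assumes a low-cardinality column department_id for partitioning and a high-cardinality column employee_id for bucketing; the helper name, the column names, and the Parquet format choice are illustrative assumptions, not the course's own code.

```python
def partitioned_bucketed_ctas(table, source, columns, partition_col, bucket_col,
                              bucket_count=4):
    """Build an Athena CTAS statement that partitions by one column and buckets by another."""
    if partition_col == bucket_col:
        raise ValueError("partition and bucket columns must differ")
    # Athena expects partition columns to come last in the SELECT list.
    select_cols = [c for c in columns if c != partition_col] + [partition_col]
    return (
        f"CREATE TABLE {table}\n"
        "WITH (\n"
        "  format = 'PARQUET',\n"
        f"  partitioned_by = ARRAY['{partition_col}'],\n"
        f"  bucketed_by = ARRAY['{bucket_col}'],\n"
        f"  bucket_count = {bucket_count}\n"
        ")\n"
        f"AS SELECT {', '.join(select_cols)} FROM {source}"
    )


sql = partitioned_bucketed_ctas(
    "employees_part_bucket",                               # hypothetical table name
    "employees",
    ["employee_id", "department_id", "first_name", "salary"],
    partition_col="department_id",                         # low cardinality
    bucket_col="employee_id",                              # high cardinality
)
print(sql)
```

Partitioning writes one S3 prefix per distinct value, which is why it suits low-cardinality columns, while bucketing spreads the rows of a high-cardinality column across a fixed number of files.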

Requirements

SQL knowledge is required

Description

Amazon Athena is an interactive query service that makes it easy to analyze data in Amazon S3 using standard SQL. Athena is serverless, so there is no infrastructure to manage, and you pay only for the queries that you run.

In this course you will work with:

• Creating a database

• Creating tables

• Creating a table from a file

• Querying data in an S3 bucket

• CTAS (Create Table As Select)

• Partitions and Bucketing

• Interacting with structured, unstructured, and semi-structured data

• Storing data in TEXTFILE, PARQUET, JSON, ORC, and AVRO formats

Athena is easy to use. Simply point to your data in Amazon S3, define the schema, and start querying using standard SQL. Most results are delivered within seconds. With Athena, there’s no need for complex ETL jobs to prepare your data for analysis. This makes it easy for anyone with SQL skills to quickly analyze large-scale datasets.

Athena integrates out of the box with the AWS Glue Data Catalog, which lets you create a unified metadata repository across various services, crawl data sources to discover schemas, populate your catalog with new and modified table and partition definitions, and maintain schema versioning.

Amazon Athena uses Presto with ANSI SQL support and works with a variety of standard data formats, including CSV, JSON, ORC, Avro, and Parquet. Athena is ideal for interactive querying and can also handle complex analysis, including large joins, window functions, and arrays. Amazon Athena is highly available and executes queries using compute resources across multiple facilities and multiple devices in each facility. Amazon Athena uses Amazon S3 as its underlying data store, making your data highly available and durable.

Who this course is for:

ETL Developers
Data Analysts
Data Architects
ETL Architects
Business Analysts
Database developers

Course content

Introduction: 2 lectures • 7min

Database and Tables: 4 lectures • 19min

CTAS: 5 lectures • 11min

Partition and Bucketing: 3 lectures • 16min

DMLs: 2 lectures • 9min
