Apache Flink Relational Programming using Table API and SQL

Rating: 4.1 (104 reviews)

Learn Apache Flink Table and SQL Interfaces via Python to process batch and streaming data workloads at scale

Created by Adam McQuistan

Last updated 11/2022

English

What you'll learn

Apache Flink Table API
Apache Flink SQL Interface
Apache Flink with Python (PyFlink)
Batch Data Processing
Stream Data Processing

Course content

4 sections • 42 lectures • 4h 13m total length

Introduction (0:17)
Why this Course is Important (0:41)
Focus of Course (1:15)
About Instructor (2:11)
Course Prerequisites (0:40)

Comparing Static Bounded Tables and Unbounded Stream Oriented Tables (6:08)
Building a Mental Model of Data Flow in Batch and Stream Data Processing (9:22)
Overview of Basic Operations (4:11)
Demo: Table Projections with select(...) and SELECT ... (10:23)
Demo: Filtering Tables with where(...), filter(...) and WHERE ... (9:36)
Demo: Joining Tables (10:52)
Demo: Aggregations on Tables with group_by(...), GROUP BY ... and Calculations (8:49)
Time Component of Stream Processing (8:31)
Streaming Aggregations and Windows (3:33)
Tumbling Windows (2:36)
Sliding Windows (3:45)
Session Windows (3:29)
Streaming Demos Common Setup (10:06)
Demo: Tumbling Windows with Table API (15:02)
Demo: Tumbling Windows with SQL (10:14)
Demo: Sliding Windows with Table API (10:35)
Demo: Sliding (Hopping) Windows with SQL (9:41)
Demo: Session Windows with Table API (7:50)
Demo: Session Windows with SQL (8:45)
Demo: Event Time Processing (10:50)
Demo: Row Operations and UDFs (9:51)
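
The three window types named in the lectures above (tumbling, sliding/hopping, and session) can be illustrated without Flink at all. The following is a minimal plain-Python sketch of how an event timestamp might be assigned to windows. It is not Flink's implementation or API: the function names are hypothetical, timestamps are assumed to be non-negative integers in arbitrary time units, windows are treated as half-open intervals [start, end), and the handling of events that land exactly on a session boundary is a simplifying assumption.

```python
from typing import List, Tuple

Window = Tuple[int, int]  # half-open interval [start, end)


def tumbling_window(ts: int, size: int) -> Window:
    """Fixed-size, non-overlapping windows: each timestamp falls in exactly one."""
    start = ts - (ts % size)
    return (start, start + size)


def sliding_windows(ts: int, size: int, slide: int) -> List[Window]:
    """Windows of length `size` that start every `slide` units and contain ts."""
    windows = []
    start = ts - (ts % slide)  # latest window start that is <= ts
    while start > ts - size:   # keep going back while the window still covers ts
        windows.append((start, start + size))
        start -= slide
    return sorted(windows)


def session_windows(timestamps: List[int], gap: int) -> List[Window]:
    """Group events into sessions that close after `gap` units of inactivity."""
    sessions: List[Window] = []
    for ts in sorted(timestamps):
        if sessions and ts < sessions[-1][1]:
            # Event arrived before the open session expired: extend it.
            sessions[-1] = (sessions[-1][0], ts + gap)
        else:
            # Inactivity exceeded the gap (or first event): start a new session.
            sessions.append((ts, ts + gap))
    return sessions


print(tumbling_window(7, size=5))                   # (5, 10)
print(sliding_windows(7, size=10, slide=5))         # [(0, 10), (5, 15)]
print(session_windows([1, 3, 10, 11, 30], gap=5))   # [(1, 8), (10, 16), (30, 35)]
```

Note that a timestamp belongs to exactly one tumbling window, to size / slide sliding windows when size is a multiple of slide, and to a session whose length depends on the data rather than on a fixed size.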

Requirements

Previous experience with Python programming
Basic understanding of operating systems and Docker
Basic understanding of distributed computing

Description

Apache Flink is rapidly growing in popularity for its ability to perform advanced stateful computations in a way that scales to meet the demands of both high-throughput and high-performance use cases. Not only is Apache Flink very scalable and performant, it also integrates with a wide variety of source and sink data systems such as flat files (CSV, TXT, TSV), databases, and message queues (Kafka, AWS Kinesis, GCP Pub/Sub, RabbitMQ).

In this course, students will learn to harness the power of Apache Flink, a modern distributed computing framework that provides a unified approach to both batch and streaming data processing workloads. This course focuses specifically on the relational programming paradigm exposed through Apache Flink's Table API and SQL interface (with examples in Python), which offer intuitive yet powerful abstractions for processing vast amounts of data from either bounded (batch) or unbounded (streaming) sources.
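
The difference between bounded and unbounded sources can be sketched in a few lines of plain Python. This is only a conceptual illustration, assuming a simple per-key count; the function names and sample data are hypothetical and none of this is Flink's API. A bounded input is fully available, so one final result is produced, whereas an unbounded input has no end, so an updated result is emitted as each event arrives.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, List, Tuple


def batch_count(rows: List[str]) -> Dict[str, int]:
    """Bounded input: every row is available, so a single final result is returned."""
    return dict(Counter(rows))


def streaming_count(events: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Unbounded input: yield the updated count for a key after each event."""
    counts: Counter = Counter()
    for key in events:
        counts[key] += 1
        yield key, counts[key]


clicks = ["home", "cart", "home"]  # hypothetical sample data
print(batch_count(clicks))            # {'home': 2, 'cart': 1}
print(list(streaming_count(clicks)))  # [('home', 1), ('cart', 1), ('home', 2)]
```

The streaming version never needs to see the end of its input, which is why the same relational query can be applied to either kind of source.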

Students learn batch processing with Flink through many examples of consuming, processing, and producing results from/to the filesystem in CSV format.
Students also learn stream processing with Flink through several examples of consuming, processing, and producing results from/to Apache Kafka running in a local Dockerized Kafka cluster.

Apache Flink supports developing applications with the Table API and SQL interface in Java, Scala, and Python. However, this course focuses on the Python bindings for Apache Flink. Python was chosen because of its popularity, particularly in the big data engineering ecosystem, and because it is underrepresented in existing Apache Flink courses, which primarily cover Flink's Java and Scala APIs.

Who this course is for:

Data-centric Python developers


Course content

Introduction (5 lectures • 5min)
Introduction to Apache Flink Table API and SQL Interface (7 lectures • 13min)
TableEnvironment, Table Sources and Table Sinks (9 lectures • 1hr 1min)
Operations on the Table Object using Table API and SQL (21 lectures • 2hr 54min)
