
Discover Snowflake’s cloud-native architecture, a hybrid of shared-disk and shared-nothing designs with three layers (cloud services, compute, and storage), along with scalable warehouses and their varying pricing.
Explore how Snowflake objects are organized in a hierarchy from organization to accounts, databases, and schemas, and how tables, views, stages, procedures, and user-defined functions fit within this structure.
Learn how Snowflake uses credits to bill storage, compute, cloud services, and serverless features, and how to estimate costs across on-demand versus pre-purchased storage and multi-cluster warehouses.
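A rough sketch of the kind of estimate this lesson refers to, under stated assumptions. The credits-per-hour table follows the usual doubling for standard warehouse sizes. The price per credit and the storage rate per terabyte are placeholder values, not actual Snowflake prices, and the function name is hypothetical. The multi-cluster figure is an upper bound that assumes every cluster runs for the whole period.

```python
# Credits consumed per hour by a standard warehouse of each size.
CREDITS_PER_HOUR = {"XS": 1, "S": 2, "M": 4, "L": 8, "XL": 16}


def estimate_monthly_cost(size, hours_per_day, clusters=1, days=30,
                          price_per_credit=3.00, storage_tb=0.0,
                          price_per_tb=23.00):
    """Return (compute_credits, total_cost) for one month.

    price_per_credit and price_per_tb are assumed placeholder rates;
    real rates depend on edition, region, and contract type.
    """
    # Upper bound: assumes all clusters run for every billed hour.
    compute_credits = CREDITS_PER_HOUR[size] * hours_per_day * days * clusters
    compute_cost = compute_credits * price_per_credit
    storage_cost = storage_tb * price_per_tb
    return compute_credits, compute_cost + storage_cost


# A Medium warehouse with 2 clusters, 8 hours a day, plus 5 TB of storage.
credits, total = estimate_monthly_cost("M", hours_per_day=8, clusters=2, storage_tb=5)
print(credits, total)
```

Swapping in a different `price_per_credit` or `price_per_tb` is one way to compare on-demand and pre-purchased rates side by side.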
Learn how Snowflake's search optimization accelerates point lookup queries on non-clustered tables by pruning partitions and reducing scans, while understanding cost and maintenance implications.
Learn how to ingest CSV data from S3 into Snowflake by creating a development schema, a line item table, a file format, and a stage, then copying the data into the table.
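The order of those steps can be sketched as a list of SQL statements built in Python. This is only an illustration: the schema, table, column, file format, stage, and bucket names are all hypothetical, the table has just three sample columns, and a private bucket would also need credentials or a storage integration on the stage.

```python
def build_ingest_statements(schema="dev", table="lineitem",
                            bucket_url="s3://example-bucket/lineitem/"):
    """Return the ingestion steps as an ordered list of SQL strings.

    All object names and the bucket URL are hypothetical placeholders.
    """
    file_format = f"{schema}.csv_format"
    stage = f"{schema}.lineitem_stage"
    return [
        # 1. Development schema to hold everything.
        f"CREATE SCHEMA IF NOT EXISTS {schema}",
        # 2. Target table (only a few sample columns shown).
        f"CREATE TABLE IF NOT EXISTS {schema}.{table} "
        "(l_orderkey NUMBER, l_quantity NUMBER, l_shipdate DATE)",
        # 3. File format describing the CSV layout.
        f"CREATE FILE FORMAT IF NOT EXISTS {file_format} "
        "TYPE = 'CSV' FIELD_DELIMITER = ',' SKIP_HEADER = 1",
        # 4. External stage pointing at the S3 location.
        f"CREATE STAGE IF NOT EXISTS {stage} URL = '{bucket_url}' "
        f"FILE_FORMAT = (FORMAT_NAME = '{file_format}')",
        # 5. Load the staged files into the table.
        f"COPY INTO {schema}.{table} FROM @{stage}",
    ]


statements = build_ingest_statements()
for sql in statements:
    print(sql + ";")
```

The statements could be pasted into a worksheet or passed one at a time to whichever Snowflake client is in use.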
Learn how to extract and unload data from Snowflake to S3 using a storage integration, with options for partitioned data and file formats, including JSON via object_construct.
Learn how Snowflake streams become stale due to data retention limits, time travel, and offsets, and how unconsumed streams affect billing and behavior.
Understand push down in Snowflake and other big data tools, contrasting load-first filter-later with early filtering. Learn how push down boosts performance and reduces memory use, while recognizing confidentiality risks.
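The contrast between the two approaches can be shown with a toy example that does not touch Snowflake at all. Here `scan` is a hypothetical stand-in for the storage layer, and the row data is made up; the point is only how many rows reach the processing engine in each case.

```python
# Made-up data: 1000 rows, one in four belonging to region "EU".
ROWS = [{"id": i, "region": "EU" if i % 4 == 0 else "US"} for i in range(1000)]


def scan(rows, predicate=None):
    """Stand-in for the storage layer; filters at the source if given a predicate."""
    for row in rows:
        if predicate is None or predicate(row):
            yield row


def load_then_filter(rows):
    """Load first, filter later: every row is transferred and held in memory."""
    loaded = list(scan(rows))
    result = [r for r in loaded if r["region"] == "EU"]
    return result, len(loaded)


def filter_pushed_down(rows):
    """Pushdown: the predicate runs at the source, so only matches are transferred."""
    loaded = list(scan(rows, lambda r: r["region"] == "EU"))
    return loaded, len(loaded)


late_result, late_transferred = load_then_filter(ROWS)
early_result, early_transferred = filter_pushed_down(ROWS)
print(late_transferred, early_transferred)
```

Both functions return the same rows; the difference is the number of rows that had to be moved and held before the filter was applied.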
Create an IAM role to grant Snowflake access to database components through the API gateway, naming it currency conversion external role and linking it to the prior S3–Snowflake integration.
Set up a managed Airflow cluster on AWS for Snowflake data pipelines, including creating an S3 bucket and configuring a CloudFormation VPC with subnets.
Set up an Airflow DAG to copy data into Snowflake with the Snowflake operator, then trigger a Spark/Glue job, while managing stages, formats, and credentials.
Course update as of Feb 2023: This course has been updated with the Snowpark API, covering UDFs and stored procedures for ETL as well as machine learning use-case deployments. This course will help you clear the SnowPro Advanced certifications.
Snowflake is the next big thing, and it is becoming a full-blown data ecosystem. With its scalability and efficiency in handling massive volumes of data, and with a number of new concepts to learn, this is the right time to wrap your head around Snowflake and add it to your toolkit. This course not only covers the core features of Snowflake but also teaches you how to deploy Python/PySpark jobs in AWS Glue and Airflow that communicate with Snowflake, which is one of the most important aspects of building pipelines.
Anyone who has a basic understanding of the cloud and belongs to one of the following backgrounds can benefit from this course:
- Data Scientists / Analysts
- Data Engineers / Software Developers
- SQL Programmers or DBAs
- Aspiring data analysts and scientists who are learning SQL and Python
This course covers:
- What Snowflake is
- The most crucial aspects of Snowflake, in a very practical manner
- Writing Python/Spark jobs in AWS Glue for data transformation
- Real-time streaming using Kafka and Snowflake
- Interacting with external functions and their use cases
- Security features in Snowflake
Prerequisites for this course are:
- Knowledge of SQL, or at least some prior experience writing queries
- Scripting in Python (or any other language)
- Willingness to explore, learn, and put in the extra effort to succeed
- An active AWS account and familiarity with basic cloud fundamentals
Important note: You need an active AWS account to perform the tasks in the sections related to Python and PySpark. For the rest of the course, a free trial Snowflake account should suffice.
Some tips:
- Try to watch the videos at 1.2x speed
- Read the reference links and the official Snowflake documentation as much as possible