Snowflake cloud database with ELT (Airflow + Python + Talend)
What you'll learn
- Leveraging the Snowflake cloud data warehouse using Talend.
- Talend basics.
- Airflow basics.
- Building Airflow DAGs.
- Connecting Snowflake with Python.
- Auditing Snowflake commands.
- Capturing cost and performance metrics.
- Using a virtual machine with preconfigured Airflow.
- An end-to-end project to load NYC traffic data (250+ GB).
Requirements
- This course uses VirtualBox virtualization software.
- About 30 GB of free disk space is required to run the virtual machine.
Description
In our previous course, Snowflake Masterclass [Real time demos+Best practices+Labs], we deep-dived into the fundamentals of Snowflake, worked through many assignments, and learned best practices for loading and unloading data.
We also closely evaluated most of Snowflake's features to understand how they work under the hood, and through those discussions you saw how to use Snowflake efficiently.
One piece was missing: how to build and orchestrate ELT workflows on Snowflake. This course is about exactly that.
In this course, we are going to learn the following.
We will build workflows in Airflow.
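To give a flavor of what these workflows look like, here is a minimal sketch of an Airflow DAG; the DAG name, task names, and callables are illustrative placeholders rather than the exact code from the course:

```python
# Minimal Airflow DAG sketch: two dependent tasks that could wrap
# Snowflake ingestion and load steps. All names here are illustrative.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    print("extract raw files")  # placeholder for the real ingestion logic


def load():
    print("load into Snowflake")  # placeholder for the real load logic


with DAG(
    dag_id="nyc_traffic_pipeline",  # hypothetical DAG name
    start_date=datetime(2023, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    load_task = PythonOperator(task_id="load", python_callable=load)

    extract_task >> load_task  # extract must finish before load starts
```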
We will leverage Talend's capabilities to build generic jobs to ingest and process data in Snowflake.
We will build audit tables and record every command we fire against Snowflake, capturing the time consumed by each task and the Snowflake credits used.
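As a rough illustration of how this auditing can work, here is a sketch built on the snowflake-connector-python package; the connection parameters and the audit_log table are hypothetical, and the timing and credit figures come from Snowflake's built-in QUERY_HISTORY table function:

```python
# Sketch of command auditing: run a statement, then look up its elapsed
# time and credits in QUERY_HISTORY and record them in an audit table.
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account",  # hypothetical credentials
    user="my_user",
    password="my_password",
    warehouse="my_wh",
    database="my_db",
    schema="public",
)
cur = conn.cursor()

cur.execute("COPY INTO nyc_traffic FROM @my_stage")  # the command to audit
query_id = cur.sfqid  # Snowflake assigns an ID to every query it runs

# QUERY_HISTORY exposes elapsed time and cloud-services credits per query.
cur.execute(
    "SELECT total_elapsed_time, credits_used_cloud_services "
    "FROM TABLE(INFORMATION_SCHEMA.QUERY_HISTORY()) WHERE query_id = %s",
    (query_id,),
)
elapsed_ms, credits = cur.fetchone()

# Record the command and its metrics in a hypothetical audit table.
cur.execute(
    "INSERT INTO audit_log (query_id, elapsed_ms, credits) VALUES (%s, %s, %s)",
    (query_id, elapsed_ms, credits),
)
conn.close()
```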
Once the framework is in place, we will build a workflow to process and transform 250+ GB of NYC traffic data.
Finally, we will connect to Snowflake from Python and write code to capture statistics about the data we loaded.
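For a taste of this step, here is a minimal sketch of connecting to Snowflake from Python and pulling a basic row count; the connection parameters and table name are placeholders:

```python
# Sketch of capturing simple load statistics over a Snowflake connection.
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account",  # hypothetical credentials
    user="my_user",
    password="my_password",
    warehouse="my_wh",
    database="my_db",
    schema="public",
)
cur = conn.cursor()

# Row count of the loaded table as a basic sanity check on the load.
cur.execute("SELECT COUNT(*) FROM nyc_traffic")
print("rows loaded:", cur.fetchone()[0])

conn.close()
```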
You will also get access to a preconfigured Jupyter notebook to run your Python code against Snowflake.
If you have not worked with Talend, Airflow, or Python before, don't worry: they are straightforward tools, and I will provide the necessary introduction.
I am sure you will learn a lot from this journey. See you in the course!!
Who this course is for:
- Developers interested in learning how to build workflows for Snowflake.
Instructor
Pradeep has over 7 years of experience working in ETL and data warehousing.
He has deep expertise in the Snowflake enterprise database, Python, AWS cloud services (EC2, EMR, S3, Lambda, Kinesis), Hadoop, Talend, and Informatica.
He has extensive experience building data pipelines and creating data products and services for customers using advanced analytics.
He has a solid knowledge of data science and big data technologies, and he is currently building data pipelines using Spark.