Data Integration & ETL with Talend Open Studio Zero to Hero
What you'll learn
- connect a wide range of data sources, such as files, databases, XML, web services, and Google Drive
- build your own integration processes using practical examples and comprehensive scenarios
- master the most important transformations like mappings, joins, aggregations and sorting
- orchestrate processes into larger units using preJobs, postJobs, variables and hierarchies
Requirements
- Interest in data and bringing it together
- Computer/laptop with 8+ GB RAM and Java 11
- A database of your choice to use in the course (MySQL, PostgreSQL, or similar)
Description
Data. Everywhere. All well-behaved in their own environment. But who actually lets them talk to each other? You do. With data integration. Become a data savant and add value with ETL and your new knowledge!
Talend Open Studio is an open, flexible data integration solution. You build your processes in a graphical editor, and over 600 components give you the flexibility you need.
Each section comes with a practical example, and you will receive this complete material at the beginning of the course. That way you can not only watch each section, but also compare it with your own solution. Extensive practical scenarios are included as well, so you'll be well equipped for real-world work!
What are the biggest topics you can expect?
Installation on different operating systems (Windows, Linux, Mac)
Understanding and using important data types
Reading from and writing to databases
Processing different file formats, like Excel, XML, JSON, delimited and positional
Creating and using metadata
Building schemas
Using helpful keyboard shortcuts
Retrieving data from web services / REST
Connecting to Google Drive and fetching data
Using iterations and loops
Converting data flows into iterations
Building and understanding job hierarchies
All major transformations: mapping, joining, normalizing, pivoting and aggregating data
Creating and extracting XML and JSON
Using regular expressions
Orchestrating components in processes
Checking and improving data quality
Using fuzzy matching and interval matching
Using variables for different environments
Performing schema validation
Handling reject data separately
Finding and fixing errors quickly
Writing meaningful logs
Including and reacting to warnings and aborts
Building job hierarchies and passing data between different levels
Implementing and testing your own assumptions
Configuring your project for logging, versioning and context loading
Learning best practices and establishing your own
Documenting items and generating documentation
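To give a flavor of the kind of logic these topics cover, here is a small, hand-written Java sketch (not Talend-generated code) of a typical ETL step: reading delimited rows, validating a field with a regular expression, and routing failures to a separate reject flow. The `id;email` schema and the email pattern are illustrative assumptions, not course material.

```java
import java.util.regex.Pattern;

public class EtlSketch {
    // Hypothetical validation rule: the second field must look like an email address.
    static final Pattern EMAIL = Pattern.compile("^[\\w.+-]+@[\\w-]+\\.[\\w.]+$");

    // Returns true if a delimited row (assumed schema: id;email) passes validation.
    static boolean accept(String row) {
        String[] fields = row.split(";");
        return fields.length == 2 && EMAIL.matcher(fields[1]).matches();
    }

    // A simple transformation applied to accepted rows.
    static String transform(String row) {
        return row.split(";")[1].toUpperCase();
    }

    public static void main(String[] args) {
        for (String row : new String[]{"1;alice@example.com", "2;not-an-email"}) {
            // Valid rows go to the main flow, invalid ones to the reject flow.
            System.out.println(accept(row) ? "main: " + transform(row) : "reject: " + row);
        }
    }
}
```

In Talend itself you would wire this up graphically instead, with components providing the main and reject flows out of the box.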
What are you waiting for? See you in the course!
Who this course is for:
- you want to bring different data sets together quickly and easily
- you struggle to make your data sources talk to each other
- you are interested in a career in the data space
- future ETL developers
Instructor
I love data! And I enjoy creating value from it!
My top skills are Talend • SQL • Java • Linux • Cloud Computing
Since 2009 I have implemented solid, long-term oriented solutions mainly in the financial sector, but also in industries such as retail, logistics and e-commerce. This covers a wide range of use cases, stakeholders and team constellations.
Originally, I grew up in the "traditional" world of business intelligence and data integration. But over time I got deeper into coding and newer technologies, such as data lakes.
Since 2020 I'm almost completely devoted to the cloud with AWS and Google Cloud. The sky has no limit (-;
During my career I have lived in 3 countries for 3 years and speak 2 foreign languages.
Besides coding, I enjoy being with my family and ultra running is also important to me. I have successfully completed distances of up to 108 km.