Data Integration & ETL with Talend Open Studio Zero to Hero
What you'll learn
- connect your data sources, such as files, databases, XML, web services, Google Drive, and more
- build your own integration processes using practical examples and comprehensive scenarios
- master the most important transformations like mappings, joins, aggregations and sorting
- orchestrate processes into larger units using pre-jobs, post-jobs, variables and hierarchies
Requirements
- Interest in data and in bringing it together
- Computer/laptop with 4+ GB RAM and a current Java Runtime
Description
Data. Everywhere. All well-behaved in their own environment. But who actually lets them talk to each other? You do. With data integration. Become a data savant and add value with ETL and your new knowledge!
Talend Open Studio is an open, flexible data integration solution. You build your processes in a graphical editor, and over 600 components provide the flexibility you need.
Each section has a practical example, and you will receive the complete material at the beginning of the course. So you can not only follow each section, but also compare it with your own solution. Extensive practical scenarios are also included, so you'll be well equipped for real-world work!
What are the biggest topics you can expect?
- Install Talend on different operating systems (Windows, Linux, Mac)
- Understand and use important data types
- Read from and write to databases
- Process different file formats, such as Excel, XML, JSON, delimited and positional
- Create and use metadata
- Build schemas
- Use helpful keyboard shortcuts
- Retrieve data from web services / REST
- Connect to Google Drive and fetch data
- Use iterations and loops
- Convert data flows into iterations
- Build and understand job hierarchies
- Apply all major transformations: map, join, normalize, pivot and aggregate data
- Create and extract XML and JSON
- Use regular expressions
- Orchestrate components in processes
- Check and improve data quality
- Use fuzzy matching and interval matching
- Use variables for different environments
- Perform schema validation
- Handle rejected data separately
- Find and fix errors quickly
- Write meaningful logs
- Catch and react to warnings and aborts
- Build job hierarchies and pass data between different levels
- Implement and test your own assumptions
- Configure your project for logging, versioning and context loading
- Learn best practices and establish your own
- Document items and have documentation generated
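Talend Open Studio generates plain Java under the hood, so the topics above ultimately map onto ordinary code. As a rough, hypothetical sketch (not Talend's actual generated code) of what a delimited-read-plus-aggregate flow, say tFileInputDelimited feeding tAggregateRow, boils down to:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative plain-Java analogue of reading delimited rows and
// aggregating them by key. All names here are made up for the sketch.
public class AggregateSketch {

    // Sum the numeric second column per distinct value of the first column.
    static Map<String, Integer> sumByKey(String[] rows, String delimiter) {
        Map<String, Integer> totals = new LinkedHashMap<>();
        for (String row : rows) {
            String[] fields = row.split(delimiter);
            // merge() adds to an existing total or starts a new one
            totals.merge(fields[0], Integer.parseInt(fields[1]), Integer::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        String[] rows = {"books;10", "films;5", "books;7"};
        System.out.println(sumByKey(rows, ";")); // {books=17, films=5}
    }
}
```

In the course you will build this kind of flow graphically instead, by wiring components together; the sketch only shows the sort of logic each component encapsulates.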
What are you waiting for? See you in the course!
Who this course is for:
- Anyone who wants to bring together different data sets quickly and easily
- Anyone facing the challenge of making their data talk to each other
- Anyone interested in a career in the data space
- Future ETL developers
Instructor
Hello, my name is Samuel.
I am a data solution developer and started working in this field in 2009. Since then I have built many challenging and exciting data warehouse projects.
My specialities are ETL, data integration, databases, SQL and Java. Over the last few years, my tool stack has continued to grow. It now includes AWS, Python, JSON, Big Data and NoSQL as well.
This allows me to combine data from many sources in a structured and useful manner, so that ultimately companies can make better decisions.
I'm happy to share some of my knowledge here with you now.
Happy learning!