Data Architecture for Data Scientists
What you'll learn
- Data architecture in general, so you can navigate your organization's data landscape
- Develop an understanding of topics like data lakes, data warehousing, and data lakehouses so you can communicate with data engineering teams
- Understand the principles of data governance topics like data mesh to better navigate the data governance paradigm
- Get introduced to machine learning-specific data infrastructure technologies like feature stores and vector databases
- What is data architecture? What is a data warehouse (DWH)? What is a data lake? What is a data lakehouse? What is a data mesh?
- How is streaming data used in data science? What is a feature store? How is a feature store used in machine learning? What are vector databases?
Requirements
- Basic understanding of the data science project workflow, such as model training and model deployment
- Basic understanding of why data is needed for training and deploying models
- Understanding of the difference between batch and real-time use cases
Description
Machine learning models are only as good as the data they are trained on, which is why understanding data architecture is critical for data scientists building machine learning models.
This course will teach you:
- The fundamentals of data architecture
- A refresher on data types, including structured, unstructured, and semi-structured data
- Data warehouse fundamentals
- Data lake fundamentals
- The differences between data warehouses and data lakes
- Data lakehouse fundamentals
- Data mesh fundamentals for decentralized data governance, including topics like data catalogs, data contracts, and data fabric
- The challenges of incorporating streaming data in data science
- Some machine learning-specific data infrastructure, such as feature stores and vector databases
The course will help you:
- Make informed decisions about the architecture of your data infrastructure to improve the accuracy and effectiveness of your models
- Adopt modern technologies and practices to improve workflows
- Develop a better understanding of, and empathy for, data engineers
- Improve your reputation as an all-around data scientist
Think of data architecture as the framework that supports the construction of a machine learning model. Just as a building needs a strong framework to support its structure, a machine learning model needs a solid data architecture to support its accuracy and effectiveness. Without a strong framework, the building is at risk of collapsing, and without a strong data architecture, machine learning models are at risk of producing inaccurate or biased results. By understanding the principles of data architecture, data scientists can ensure that their data infrastructure is robust, reliable, and capable of supporting the training and deployment of accurate and effective machine learning models.
By the end of this course, you'll have the knowledge to help guide your team and organization in creating the right data architecture for deploying data science use cases.
Who this course is for:
- Data scientists who are transitioning from academia or business domains
- Junior data scientists who would like to understand the topics surrounding data infrastructure
- Citizen data scientists who wish to deploy machine learning models in production
- Anyone who wishes to learn the basics of data architecture in a very short time
- BI Analysts and BI developers who would like a quick overview of the enterprise data landscape
- Folks who wish to get a quick overview of data architecture components in an enterprise
Instructor
I have over 20 years of experience helping enterprises manage data, more than half of it spent building scalable platforms for analytics and machine learning.
Call it timing, luck, or destiny, but I was able to gain exposure to a variety of areas like Linux, Unix, Networking, Data Integration, Analytics, Big Data, and Machine Learning. My exposure to such a varied set of technology areas helps me craft solution architectures that are not only easy to maintain but also economical.
I'm originally from Mumbai, India, but I have had the opportunity to work in Malaysia, Singapore, Indonesia, Thailand, Sri Lanka, Finland, Sweden, Denmark, Norway, the UK, and Germany. My customers have been in virtually every country. I have gained immensely from this international exposure, which is reflected not only in my work but also in my personality.
Currently I'm based in Munich, and when I'm not working, I enjoy hiking and snowboarding in the mountains nearby.
If you like my courses, you are welcome to follow me on LinkedIn, where I regularly post