Machine Learning experiments and engineering with DVC
What you'll learn
- What the Data Version Control (DVC) tool is and how to use it
- How to build reproducible Machine Learning experiments
- How to automate pipeline execution with DVC
- How to manage data and model versioning
- How to organize code in Machine Learning projects
- Basics of building, testing, deploying and monitoring Machine Learning models (CI/CD and MLOps)
- How to start using DVC in your projects, step by step
Course content
- Before start (01:14)
- Resources and links (00:11)
Requirements
- Python
- Basic knowledge of the CLI and Git is a plus
- Linux / macOS
Description
An online video course that teaches the basics of Machine Learning experiment management, pipeline automation and CI/CD for delivering ML solutions into production. In these lessons you’ll discover the core features of Data Version Control (DVC), how it works and how it can benefit your Machine Learning and Data Science projects.
During this course, listeners learn engineering approaches in ML through a few practical examples. The course includes screencast videos and repositories with examples and templates, so you can get your hands dirty and more easily apply the best features in your own projects.
After this course you will be able to:
- Use DVC for data and artifact version control
- Build reproducible machine learning pipelines
- Manage Machine Learning experiments
- Automate pipeline configuration
- Organize code in Machine Learning projects
- Set up CI/CD pipelines with GitLab / GitHub and DVC
Who this course is for:
- Data Scientists
- Machine Learning Engineers
- Data Engineers
- DevOps / MLOps Engineers
Instructors
I have a background in different fields, including data science/machine learning, robotics, product development and marketing. For around 6 years I have worked on data science projects as a developer and team lead. My interests cover Data Science & Machine Learning project development processes, automated pipelines, reproducibility, experiments and model management. I’m also the creator of the ML REPA community (Machine Learning REPA: Reproducibility, Experimentation and Pipeline Automation). We organize meetups and workshops to learn and share knowledge on Machine Learning Engineering and Management topics.
Marcel holds a Computer Engineering degree, a Big Data Graduate Certificate, and a Master of Science degree in Bioinformatics, all three obtained at the Federal University of Rio Grande do Norte in Brazil, and is currently a Ph.D. candidate at Sorbonne Université in France.
He is an Early Stage Researcher at Institut Curie where he conducts research on Causal Inference Analysis applied to cancer patient records. He's been doing research on health informatics for over 10 years, using different methodologies to study a large set of human diseases.
Elle is a data scientist at Iterative, a startup building open source software tools for machine learning, and a lecturer at the University of Michigan School of Information. She completed her PhD at the University of Washington where she conducted research on speech and hearing using mathematical models. Elle is broadly interested in developing methods, standards, and educational resources for anyone who works with data.