Testing and Monitoring Machine Learning Model Deployments
What you'll learn
- Machine Learning System Unit Testing
- Machine Learning System Integration Testing
- Machine Learning System Differential Testing
- Shadow Deployments (also known as Dark/Decoy launches)
- Statistical Techniques for Assessing Shadow Deployments
- Monitoring ML Systems with Metrics (Prometheus & Grafana)
- Monitoring ML Systems with Logs (Kibana & the Elastic Stack)
- The Theory Around Continuous Delivery for Machine Learning
Requirements
- Comfortable with Python
- Familiar with Scikit-Learn, Pandas, Numpy
- Comfortable with Data Science Fundamentals
- Can use Git version control
- Basic knowledge of Docker
- This is an advanced course
Learn how to test & monitor production machine learning models.
What is model testing?
You’ve taken your model from a Jupyter notebook and rewritten it in your production system. Are you sure there weren’t any mistakes when you moved from the research environment to the production system? How can you control the risk before your deployment? ML-specific unit, integration and differential tests can help you to minimize the risk.
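As a rough illustration of the differential testing idea mentioned above, here is a minimal sketch that compares a research implementation against a production rewrite on the same inputs. The two predict functions, the feature names and the tolerance are hypothetical stand-ins invented for this example, not code from the course.

```python
import math


def research_predict(features):
    """Hypothetical stand-in for the model as written in the notebook."""
    return 2.0 * features["area"] + 10.0 * features["rooms"]


def production_predict(features):
    """Hypothetical stand-in for the same model rewritten for production."""
    return features["area"] * 2.0 + features["rooms"] * 10.0


def differential_test(old_predict, new_predict, inputs, rel_tol=1e-6):
    """Return every input on which the two implementations disagree."""
    mismatches = []
    for row in inputs:
        expected = old_predict(row)
        actual = new_predict(row)
        if not math.isclose(expected, actual, rel_tol=rel_tol):
            mismatches.append((row, expected, actual))
    return mismatches


# Assumed sample inputs; in practice these would come from a held-out dataset.
sample_inputs = [
    {"area": 50.0, "rooms": 2},
    {"area": 120.5, "rooms": 4},
    {"area": 0.0, "rooms": 1},
]

mismatches = differential_test(research_predict, production_predict, sample_inputs)
```

An empty `mismatches` list suggests the rewrite behaves like the original on these inputs; any entry points to a row worth investigating before deployment.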
What is model monitoring?
You’ve deployed your model to production. OK, now what? Is it working as you expect? How do you know? By monitoring our models, we can check for unexpected changes.
When we think about data science, we think about how to build machine learning models, which algorithm will be more predictive, how to engineer our features and which variables to use to make the models more accurate. However, how we are actually going to test & monitor these models in a production system is often neglected. Only when we can effectively monitor our production models can we determine if they are performing as we expect.
Why take this course?
This is the first and only online course where you can learn how to test & monitor machine learning models. The course is comprehensive, and yet easy to follow. Throughout this course you will learn all the steps and techniques required to effectively test & monitor machine learning models professionally.
In this course, you will have at your fingertips the sequence of steps that you need to follow to test & monitor a machine learning model, plus a project template with full code, that you can adapt to your own models.
What is the course structure?
Part 1: Testing
The course begins from the most common starting point for the majority of data scientists: a Jupyter notebook with a machine learning model trained in it. We gradually build up the complexity, testing the model first in the Jupyter notebook and then in a realistic production code base. Hands-on exercises are interspersed with relevant and actionable theory.
Part 2: Shadow Mode
We explain the theory & purpose of deploying a model in shadow mode to minimize your risk, and walk you through an example project setup.
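To make the shadow mode idea concrete, here is a minimal sketch under simple assumptions: the live model's prediction is returned to the caller, while the shadow model's prediction is only recorded for later comparison. The two models, the `amount` feature and the in-memory `shadow_log` are hypothetical placeholders, not the course's project setup.

```python
# Hypothetical in-memory store; a real system would write to a database or logs.
shadow_log = []


def live_model(features):
    """Hypothetical model currently serving users."""
    return 1 if features["amount"] > 1000 else 0


def shadow_model(features):
    """Hypothetical candidate model being evaluated in shadow mode."""
    return 1 if features["amount"] > 800 else 0


def handle_request(features):
    """Serve the live prediction and record the shadow prediction on the side."""
    live_prediction = live_model(features)
    record = {"features": features, "live": live_prediction}
    try:
        record["shadow"] = shadow_model(features)
    except Exception as exc:
        # A failure in the shadow model must never affect the user-facing response.
        record["shadow_error"] = repr(exc)
    shadow_log.append(record)
    return live_prediction


responses = [handle_request({"amount": amount}) for amount in (500, 900, 1500)]

# Records where the shadow model disagreed with the live model (or failed).
disagreements = [r for r in shadow_log if r.get("shadow") != r["live"]]
```

Because callers only ever see the live prediction, the candidate model can be assessed on real traffic before it is trusted with real decisions.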
Part 3: Monitoring
We take you through the theory & practical application of monitoring metrics & logs for ML systems.
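As a small, tool-agnostic sketch of what monitoring metrics and logs can mean in practice, the example below counts requests and errors and emits one structured log line per prediction. It uses only the Python standard library. The metric names, the `predict` function and the log fields are assumptions made for illustration, and a real setup would export these to tools such as Prometheus or the Elastic Stack rather than keep them in memory.

```python
import json
import time
from collections import Counter

# Hypothetical in-process metric store standing in for a metrics backend.
metrics = Counter()


def predict(features):
    """Hypothetical model: a fixed linear rule."""
    return 0.5 * features["x"]


def monitored_predict(features):
    """Wrap a prediction with request/error counters and a structured log line."""
    start = time.perf_counter()
    metrics["prediction_requests_total"] += 1
    try:
        prediction = predict(features)
    except Exception:
        metrics["prediction_errors_total"] += 1
        raise
    latency_ms = (time.perf_counter() - start) * 1000
    # One JSON object per line is easy for log shippers to parse.
    print(json.dumps({
        "event": "prediction",
        "inputs": features,
        "prediction": prediction,
        "latency_ms": round(latency_ms, 3),
    }))
    return prediction


result = monitored_predict({"x": 4.0})
```

Counters like these answer "how often and how reliably is the model called?", while the per-prediction log lines make it possible to inspect inputs and outputs later.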
This course does not cover model deployment (we have a separate course dedicated to that topic)
Who are the instructors?
We have gathered a fantastic team to teach this course. Sole is a leading data scientist in finance and insurance, with 3+ years of experience in building and implementing machine learning models in the field, and multiple IT awards and nominations. Chris is a tech lead & ML software engineer with extensive experience in building APIs and deploying machine learning models, allowing businesses to extract full benefit from their implementation and decisions.
Who is this course for?
Data Scientists who want to know how to test & monitor their models in production
Software engineers who want to learn about Machine Learning engineering
Machine Learning engineers who want to improve their testing & monitoring skills
Data Engineers looking to transition to ML engineering
Lovers of open source technologies
How advanced is this course?
This is an advanced-level course, and it requires you to have experience with Python programming and Git. How much experience? It depends on how much time you would like to set aside to learn the concepts that are new to you. To give you an example, we will work with Python environments, we will work with object-oriented programming, we will work with the command line to run our scripts, and we will check out code at different stages with Git. You don’t need to be an expert in all of these topics, but you need a reasonable working knowledge. We also work with Docker a lot, though we will provide a recap of this tool.
For those relatively new to software engineering, the course will be challenging. We have added detailed lecture notes and references, so we believe that those missing some of the prerequisites can take the course, but keep in mind that you will need to put in the hours to read up on unfamiliar concepts. On this point, the course slowly increases in complexity, so you can see how we move gradually from the familiar Jupyter notebook to the less familiar production code, using a project-based approach which we believe is optimal for learning. It is important that you follow the code as we gradually build it up.
Still not sure if this is the right course for you?
Here are some rough guidelines:
Never written a line of code before: This course is unsuitable
Never written a line of Python before: This course is unsuitable
Never trained a machine learning model before: This course is unsuitable. Ideally, you have already built a few machine learning models, either at work, or for competitions or as a hobby.
Never used Docker before: The second part of the course will be very challenging. You need to be ready to read up on lecture notes & references.
Have only ever operated in the research environment: This course will be challenging, but if you are ready to read up on some of the concepts we will show you, the course will offer you a great deal of value.
Have a little experience writing production code: There may be some unfamiliar tools which we will show you, but generally you should get a lot from the course.
Non-technical: You may get a lot from just the theory lectures, so that you get a feel for the challenges of ML testing & monitoring, as well as the lifecycle of ML models. The rest of the course will be a stretch.
To sum up:
With more than 70 lectures and 8 hours of video, this comprehensive course covers every aspect of model testing & monitoring. Throughout the course you will use Python as your main language and other open source technologies that will allow you to host and make calls to your machine learning models.
We hope you enjoy it and we look forward to seeing you on board!
My name is Chris. I'm a professional software engineer from the UK. I've been writing code for over a decade, and for the past five years I've focused on scaling machine learning applications. I've done this at fintech and healthtech companies in London, where I've worked on and grown production machine learning applications used by millions of people. I've built and maintained machine learning systems which make credit-risk and fraud detection judgements on over a billion dollars of personal loans per year for the challenger bank Zopa. I previously worked on systems for predicting health risks for patients around the world at Babylon Health.
In the past, I've worn a variety of hats. I worked at a global healthcare company, Bupa, which included being a core developer on their flagship website, and three years working in Beijing setting up mobile, web and IT for medical centers in China. Whilst in Beijing, I ran the Python meetup group, mentored a lot of junior developers, and ate a lot of dumplings. I enjoy giving talks at engineering meetups, building systems that create value, and writing software development tutorials and guides. I've written on topics ranging from wearable development, to internet security, to Python web frameworks.
I'm passionate about teaching in a way that minimizes the time between "ah hah" moments, but doesn't leave you Googling every other word. Complexity is necessary for application in the real world, but too much complexity is overwhelming and counter-productive. I will help you find the right balance.
Feel free to connect on LinkedIn (very active) or Twitter (getting more active in 2022).
Soledad Galli is a lead data scientist and founder of Train in Data. She has experience in finance and insurance, received a Data Science Leaders Award in 2018 and was selected as “LinkedIn’s voice” in data science and analytics in 2019. Sole is passionate about sharing knowledge and helping others succeed in data science.
As a data scientist in finance and insurance companies, Sole researched, developed and put into production machine learning models to assess credit risk and insurance claims and to prevent fraud, leading the adoption of machine learning in those organizations.
Sole is passionate about empowering people to step into and excel in data science. She mentors data scientists, writes articles online, speaks at data science meetings, and teaches online courses on machine learning.
Sole has recently created Train In Data, with the mission to facilitate and empower people and organizations worldwide to step into and excel in data science and analytics.
Sole has an MSc in Biology, a PhD in Biochemistry and 8+ years of experience as a research scientist in well-known institutions like University College London and the Max Planck Institute. She has scientific publications in various fields such as Cancer Research and Neuroscience, and her research was covered by the media on multiple occasions.
Soledad has 4+ years of experience as an instructor in Biochemistry at the University of Buenos Aires, has taught seminars and tutorials at University College London, and has mentored MSc and PhD students at universities.
Feel free to contact her on LinkedIn.