Python for Data Science & Machine Learning Foundations

Master NumPy, Pandas, Matplotlib, Scikit-Learn and PyTorch with real African datasets — before your first ML model.
Last updated 6/2026
English

What you'll learn

  • Write clean Python for data science: comprehensions, OOP, file I/O, and *args/**kwargs
  • Use NumPy arrays, broadcasting, and vectorisation instead of slow Python loops
  • Wrangle messy real-world data using Pandas groupby, merge, and feature engineering
  • Produce publication-quality EDA charts with Matplotlib and Seaborn
  • Build production-ready Scikit-Learn pipelines that prevent data leakage
  • Write a PyTorch training loop from scratch: tensors, autograd, nn.Module, DataLoader
  • Apply hypothesis testing and distributions to make better modelling decisions
  • Set up a full Colab + Google Drive environment for any data science project
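
To make the NumPy bullet above concrete, here is a minimal sketch of swapping a Python loop for broadcasting. The farm yield figures and both function names are made up for illustration and are not taken from the course materials.

```python
import numpy as np

# Hypothetical data: yields (tonnes/ha) for 3 farms over 4 seasons.
yields = np.array([
    [2.0, 2.5, 3.0, 2.5],
    [1.0, 1.5, 1.0, 1.5],
    [4.0, 3.5, 4.5, 4.0],
])


def centre_with_loops(data):
    """Subtract each row's mean using explicit Python loops."""
    result = []
    for row in data:
        row_mean = sum(row) / len(row)
        result.append([value - row_mean for value in row])
    return np.array(result)


def centre_vectorised(data):
    """Same computation in one expression.

    data.mean(axis=1, keepdims=True) has shape (3, 1); broadcasting
    stretches it across the (3, 4) array during the subtraction.
    """
    return data - data.mean(axis=1, keepdims=True)


looped = centre_with_loops(yields)
vectorised = centre_vectorised(yields)
```

Both versions give the same centred array; the vectorised one pushes the per-element work into NumPy's compiled routines, which is typically much faster on large arrays.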

Course content

7 sections • 32 lectures • 6h 33m total length
  • Python Patterns You Will Use Every Day (8:24)
  • File I/O and JSON — Reading Real Data Files (9:02)
  • OOP Essentials — Why Sklearn Works the Way It Does (21:48)
  • Virtual Environments and Colab Setup (5:53)
  • Python Refresher — Test Your Knowledge
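
As a rough illustration of the idea behind the OOP lecture title, the sketch below shows a toy class that follows the fit/transform convention Scikit-Learn transformers use. MeanCentrer is a hypothetical class written for this example; it is not course code and not part of Scikit-Learn.

```python
class MeanCentrer:
    """Toy transformer following the fit/transform convention (hypothetical)."""

    def __init__(self):
        # Learned state; the trailing underscore mirrors Scikit-Learn's
        # naming for attributes that are set during fit().
        self.mean_ = None

    def fit(self, values):
        """Learn the mean from the data passed in, and nothing else."""
        self.mean_ = sum(values) / len(values)
        return self  # returning self allows MeanCentrer().fit(...) chaining

    def transform(self, values):
        """Apply the previously learned mean to any data."""
        if self.mean_ is None:
            raise ValueError("Call fit() before transform().")
        return [v - self.mean_ for v in values]


train = [10.0, 20.0, 30.0]
test = [40.0]

centrer = MeanCentrer().fit(train)
train_centred = centrer.transform(train)  # [-10.0, 0.0, 10.0]
test_centred = centrer.transform(test)    # [20.0]
```

Keeping the learned state on the object is what lets the same mean be reused on new data, which is the property Scikit-Learn pipelines rely on.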

Requirements

  • Basic Python knowledge: you should know what functions, loops, and lists are
  • No prior data science or ML experience needed
  • A Google account (all work is done in free Google Colab — no local setup required)
  • Willingness to run real code on real datasets every lesson

Description

Most students fail their first ML course not because the algorithms are hard, but because they can't read the data, clean it, or understand what the model is operating on.

This course fixes that. You'll build the exact Python foundation that every professional data scientist uses before touching a single algorithm: NumPy arrays, Pandas wrangling, Matplotlib visualisations, Scikit-Learn pipelines, PyTorch training loops, and statistical thinking.

Every dataset in this course comes from real-world problems in agriculture, finance, and public health, so the skills feel immediately practical rather than textbook-abstract and you're never practising on made-up numbers. You'll know how to handle the kind of messy, incomplete data that actually shows up on the job.

By the end of this course you will be able to: load any real-world dataset, clean and wrangle it with Pandas, visualise it for EDA, build a full Scikit-Learn preprocessing pipeline, write a PyTorch training loop from scratch, and apply the right statistical test to support your modelling decisions.
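
For a sense of what a leakage-safe Scikit-Learn preprocessing pipeline can look like, here is a minimal sketch. The data is synthetic and generated only for this example (the course uses its own datasets), and the step names "scale" and "model" are arbitrary labels.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (an assumption for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

# Split first, so the test rows never influence the preprocessing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# The scaler lives inside the pipeline: fit() learns its mean and
# scale from the training rows only.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("model", LogisticRegression()),
])

pipeline.fit(X_train, y_train)

# score() applies the already-fitted scaler to the test rows
# before predicting, then reports accuracy.
accuracy = pipeline.score(X_test, y_test)
```

Fitting the scaler on the full dataset before splitting would let test-set statistics leak into training; bundling it into the pipeline avoids that by construction.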

This is not a detour from machine learning; it is the infrastructure that ML work rests on. Students who complete this course are better placed to finish ML courses than those who skip it.

Each module comes with a downloadable cheatsheet and a hands-on Colab notebook with exercises and solutions — so you are not just watching videos, you are building a personal reference library you will use for years. Everything runs in free Google Colab. No paid software, no complex local setup, no excuses.

Who this course is for:

  • Python developers who want to transition into data science or ML
  • Students who have tried an ML course and felt lost when the data got messy
  • Self-taught programmers building a formal data science foundation
  • Anyone working with agricultural, financial, or survey data
  • Engineers enrolling in the companion ML & Deep Learning course