Machine Learning & Data Science: The Complete Visual Guide

Rating: 4.8 (693 reviews)

Learn data science & machine learning topics with simple, step-by-step demos and user-friendly Excel models (NO code!)

Created by Maven Analytics • 1,500,000 Learners, Chris Dutton, Joshua MacCarty

Last updated 11/2025

English

What you'll learn

Build foundational machine learning & data science skills WITHOUT writing complex code
Play with interactive, user-friendly Excel models to learn how machine learning techniques actually work
Enrich datasets using feature engineering techniques like one-hot encoding, scaling and discretization
Predict categorical outcomes using classification models like K-nearest neighbors, naïve Bayes, and decision trees
Build accurate forecasts and projections using linear and non-linear regression models
Apply powerful techniques for clustering, association mining, outlier detection, and dimensionality reduction
Learn how to select and tune models to optimize performance, reduce bias, and minimize drift
Explore unique, hands-on case studies to simulate how machine learning can be applied to real-world cases

Course content

22 sections • 182 lectures • 8h 51m total length

Course Structure & Outline • 2:31
Explore a beginner-friendly visual guide to machine learning and data science with interactive Excel models, covering data profiling, classification, regression, unsupervised learning, and models like logistic regression and decision trees.
READ ME: Important Notes for New Students • 2:13
DOWNLOAD: Course Resources • 0:11
Setting Expectations • 2:39
Demystify essential machine learning topics using Excel as a teaching tool with no coding required. Cover data profiling, linear and logistic regression, forecasting, and unsupervised learning for analysts and BI professionals.

Intro to Machine Learning • 1:00
Define machine learning as using statistical methods to find patterns and make predictions, and apply contextual inference beyond programmed rules across churn, sales, cross-selling, and sentiment.
When is ML the right fit? • 1:08
The Machine Learning Process • 2:28
The Machine Learning Landscape • 1:49
Explore the machine learning landscape, distinguishing supervised and unsupervised methods, and preview foundational techniques like classification, regression, clustering, k-NN, logistic regression, and sentiment analysis for business intelligence.

Introduction • 2:35
Learn how to perform preliminary data quality assurance in machine learning, including handling missing values, variable types, and outliers, to ensure error-free data and reliable analyses.
Why QA? • 2:13
Practice rigorous data QA to ensure error-free data, proper encoding, and unbiased capture so ML models are reliable and you avoid wasted time, money, and reputational risk.
Variable Types • 2:42
Analyze variable types to ensure data quality and readiness, distinguishing numeric, discrete, and categorical variables, recognizing when to treat zip codes as strings, recode into buckets, and handle empty values.
Empty Values • 3:43
Examine empty values in data, distinguish zeros and blanks, and decide to keep, remove, or impute using methods like mean or linear interpolation to avoid bias.
Range Calculations • 1:39
Explore range calculations to verify min and max values, reveal outliers, and validate data realism across variables like age, income, and height; prepare for count calculation checks.
Count Calculations • 1:39
Left & Right Censored Data • 2:08
Identify left and right censored data by recognizing when min or max values do not reflect the range, with examples from surveys of people over 18 and censored repurchase rates.
Table Structure • 2:24
Explore table structure by comparing long and wide formats, and learn how pivoting and unpivoting transform data rows into columns for exploratory data analysis and modeling.
CASE STUDY: Preliminary QA • 13:13
BEST PRACTICES: Preliminary QA • 1:27
Master preliminary QA by reviewing all fields, configuring variable types, handling missing values and zeros, and applying basic diagnostics and checks for censored data to support modeling.
QUIZ: Preliminary Data QA
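For readers who also work in Python, the two imputation options named under Empty Values (mean and linear interpolation) could be sketched roughly as below. This is not taken from the course, which uses Excel; the function names `impute_mean` and `impute_linear` and the sample values are hypothetical.

```python
from statistics import mean


def impute_mean(values):
    """Replace each missing value (None) with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def impute_linear(values):
    """Fill gaps between two observed values by linear interpolation.

    Leading or trailing gaps have no neighbor on one side, so they stay None.
    """
    result = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(known, known[1:]):
        gap = right - left
        for i in range(left + 1, right):
            weight = (i - left) / gap
            result[i] = values[left] + weight * (values[right] - values[left])
    return result


# Hypothetical ordered series with blanks (None), not zeros
series = [10.0, None, 30.0, None, None, 60.0]
print(impute_mean(series))
print(impute_linear(series))
```

Mean imputation ignores row order, while interpolation only makes sense when the rows are ordered (for example by date), which is one reason the choice depends on the dataset.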

Introduction • 2:15
Advance from quality assurance to univariate profiling, performing descriptive analysis of each variable before multivariate modeling, covering categorical and numerical distributions, histograms, kernel densities, and mean, median, and mode.
Categorical Variables • 1:33
Understand categorical variables and how categories serve as values and dimensions to filter numerical data, with examples like product type, country, gender, and binary 1/0 flags.
Discretization • 1:38
Discretize a numerical variable to create a categorical price level using rules that label values as low, medium, or high, enabling better modeling and analysis.
Nominal vs. Ordinal • 2:22
Learn to distinguish nominal and ordinal categorical variables, understand when order matters, and explore their distributions to build foundations for predictive modeling.
Categorical Distributions • 3:22
Numerical Variables • 1:44
Explore numerical variables and their distributions, distinguishing quantitative data from categorical, and apply histograms and kernel densities to metrics like page views and revenue for machine learning and business intelligence.
Histograms & Kernel Densities • 4:47
Visualize numerical distributions with histograms and kernel densities by binning data and smoothing the shape. Use both to spot outliers, compare bin sensitivity, and apply Sturges' rule for bin counts.
CASE STUDY: Histograms • 4:41
Normal Distribution • 2:48
Present the normal distribution as a symmetric bell curve centered at the mean, also called Gaussian, guiding ML and statistics and enabling testing differences and comparisons across distributions.
CASE STUDY: Normal Distribution • 4:54
Analyze female athlete heights from the 2016 Rio Olympics against the general population using histograms and kernel density to reveal near-normal distributions and a notable height difference.
Univariate Data Profiling • 1:41
Mode • 2:39
Identify the mode as the most frequent value, shown by Houston and 24 sessions. It is not very useful on its own, but guides multivariate profiling for deeper insight.
Mean • 1:33
Learn how the mean defines the central value for numerical data and serves as a basic predictive estimate, with a quick example and notes on outliers and skewness.
Median • 1:14
Apply the median to numerical data to identify the center of a distribution and resist outliers. For an even number of observations, calculate the median as the average of the two middle values in ordered data.
Percentile • 1:23
Variance • 3:11
Explore variance as the measure of how far observations lie from the mean, describing distribution width and enabling comparison of numerical groups, with examples and a path to standard deviation.
Standard Deviation • 1:22
Relate standard deviation to variance by taking the square root, which returns the measure to the variable's scale. Apply the empirical rule: for normal distributions, about 68%, 95%, and 99.7% of values fall within one, two, and three standard deviations of the mean.
Skewness • 1:30
Explore how skewness measures deviations from a normal distribution in univariate profiling. Visualize left and right skew, compare mean, mode, and median, and learn how skewness identifies non-normal distributions.
BEST PRACTICES: Univariate Profiling • 1:48
Apply univariate profiling tools to distinguish categorical and numerical variables, use distributions for exploration and quality assurance, and ensure metrics like mean, median, mode, and variance support predictive insights.
QUIZ: Univariate Profiling
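Two of the ideas in this section lend themselves to a short calculation: Sturges' rule for histogram bin counts and the share of values within a given number of standard deviations. The sketch below is an illustration in Python rather than anything from the course's Excel files. It assumes the common statement of Sturges' rule, k = ceil(log2 n) + 1, and the helper names `sturges_bins` and `share_within` are hypothetical.

```python
import math
from statistics import mean, pstdev


def sturges_bins(n):
    """Suggested number of histogram bins for n observations (Sturges' rule)."""
    return math.ceil(math.log2(n)) + 1


def share_within(values, k):
    """Fraction of values lying within k population standard deviations of the mean."""
    mu, sigma = mean(values), pstdev(values)
    return sum(abs(v - mu) <= k * sigma for v in values) / len(values)


# Hypothetical sample: mean is 5 and population standard deviation is 2
sessions = [2, 4, 4, 4, 5, 5, 7, 9]
print(sturges_bins(len(sessions)))   # bins suggested for 8 observations
print(share_within(sessions, 1))     # share within one standard deviation
print(share_within(sessions, 2))     # share within two standard deviations
```

The sample here is tiny and not normally distributed, so its shares need not match the 68% and 95% figures; those hold only approximately, and only for data that is close to normal.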

Introduction • 3:03
Categorical-Categorical • 3:02
Learn multivariate profiling of two categorical variables using frequency and proportion tables and heatmaps to visualize joint distributions, with examples on design and size and notes on Naive Bayes classification.
CASE STUDY: Heat Maps • 4:55
Analyze heat maps of NYC traffic accidents by time of day and day of week, using frequency tables, counts, proportions, and conditional formatting with a red–yellow–green color scale in Excel.
Categorical-Numerical • 2:07
Explore categorical-numerical distributions to compare numerical data across categories using histograms, kernel densities, violin plots, and box plots for multivariate profiling of key business metrics.
Multivariate Kernel Densities • 2:47
Visualize categorical-numerical distributions by applying per-class kernel densities on the same plot, compare means and variances, and prepare to contrast with violin and box plots.
Violin Plots • 1:38
Explore violin plots as mirrored kernel densities that you visualize for each category, turning the density on its side to show distribution without overlap and aid machine learning insights.
Box Plots • 1:20
Discover how box plots, like violin plots, compare a numerical variable across categories, showing the median, min and max (excluding outliers), 25th and 75th percentiles, and outliers.
Limitations of Categorical Distributions • 2:12
Numerical-Numerical • 1:28
Correlation • 3:07
Explore how correlation reveals linear relationships between two numeric variables through multivariate profiling, covariance, and standard deviations, using scatter plots to visualize variance.
Correlation vs. Causation • 1:53
Explain why correlation does not imply causation, using an ice cream sales and drowning example, and highlight the role of a common unobserved variable like warm weather.
Visualizing Third Dimension • 2:05
CASE STUDY: Correlation • 5:25
Analyze correlations among weekly digital media spend, site traffic, offline spend, and sales using scatter plots to reveal positive, negative, and diminishing returns patterns.
BEST PRACTICES: Multivariate Profiling • 1:25
QUIZ: Multivariate Profiling
Looking Ahead to Part 2 • 1:12
Build on data profiling and quality assurance to explore supervised learning and classification techniques, such as k-nearest neighbors, Naive Bayes, decision trees, logistic regression, and sentiment analysis for business intelligence.
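The Correlation lecture in this section relates correlation to covariance and standard deviations. As a rough illustration outside the course's Excel models, the Pearson correlation coefficient can be computed as covariance divided by the product of the two standard deviations. The function name `pearson_r` and the spend and sales figures are made up for the example.

```python
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by the product of standard deviations."""
    n = len(xs)
    mx, my = mean(xs), mean(ys)
    covariance = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    sd_x = (sum((x - mx) ** 2 for x in xs) / n) ** 0.5
    sd_y = (sum((y - my) ** 2 for y in ys) / n) ** 0.5
    return covariance / (sd_x * sd_y)


# Hypothetical weekly figures
media_spend = [1, 2, 3, 4, 5]
sales = [12, 15, 21, 24, 30]
print(round(pearson_r(media_spend, sales), 3))
```

A value near +1 or -1 indicates a strong linear relationship, but, as the Correlation vs. Causation lecture notes, it says nothing about which variable drives the other.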

Supervised vs. Unsupervised Learning • 1:54
Explore supervised and unsupervised learning, including classification and regression techniques like k-nearest neighbors, Naive Bayes, and decision trees, plus clustering and outlier detection; supervised predicts labels, unsupervised reveals structure.
Classification vs. Regression • 2:05
Explore supervised learning by comparing classification and regression, focusing on predicting categorical targets versus numerical values, with examples like churn, sentiment, and revenue forecasts.
RECAP: Key Concepts • 3:28
Review rows and columns, categorical and numerical variables, including binary 1/0, to understand data; explore conditioning for better predictions, quality assurance, data profiling, and classification with machine learning.
Classification 101 • 3:55
Explore how classification predicts a dependent variable from independent variables using a CRM example, training a model on observed churn data to predict future churn for new customers.
Classification Workflow • 3:01
Map the classification workflow from scoping the business challenge and stakeholders to feature engineering, data splitting, iterative training, and tuned model selection.
Feature Engineering • 3:40
Enrich data with new independent variables through feature engineering, including one-hot encoding, scaling, log transformation, discretization, date component extraction, and Boolean flags to boost predictive power for model validation.
Data Splitting • 1:41
Overfitting • 3:39
Intro to Classification
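Two of the feature engineering techniques listed above, one-hot encoding and scaling, are simple enough to show in a few lines. The following is a minimal Python sketch under stated assumptions: scaling is taken to mean min-max scaling to the 0-1 range (the course may use a different method), and the names `one_hot` and `min_max_scale` are hypothetical.

```python
def one_hot(values):
    """Turn one categorical column into a 1/0 flag column per category."""
    categories = sorted(set(values))
    return [{f"is_{c}": int(v == c) for c in categories} for v in values]


def min_max_scale(values):
    """Rescale numbers to the 0-1 range (assumes the column is not constant)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical columns from a customer table
plan = ["basic", "premium", "basic"]
income = [10, 20, 30]
print(one_hot(plan))
print(min_max_scale(income))
```

Scaling matters most for distance-based models such as k-nearest neighbors, where a variable measured in large units would otherwise dominate the distance.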

Common Classification Models • 1:15
Explore common foundational classification methods, including k-nearest neighbors, naive Bayes, decision trees, and random forests, plus logistic regression and sentiment analysis for classifying categorical outcomes.
Intro to K-Nearest Neighbors (KNN) • 1:06
Explore k-nearest neighbors, a classification technique that predicts an observation's class from the closest points in a scatter plot, with k guiding the prediction for applications like customer segmentation.
KNN Examples • 4:02
Explore how k-nearest neighbors classifies purchases by comparing a new customer's age and income to nearby examples, selecting the best k and resolving ties with distances.
CASE STUDY: KNN • 9:26
Explore a KNN case study predicting a Spotify track outcome (listen, skip, or favorite) from scaled features. See how Excel visualizes with a scatter plot and computes prediction confidence.
Intro to Naïve Bayes • 1:38
Naïve Bayes | Frequency Tables • 2:04
Train a naive Bayes classifier by building frequency tables for each independent variable and the purchase outcome, then calculate conditional probabilities to predict purchases.
Naïve Bayes | Conditional Probability • 5:02
Build intuition for naïve Bayes by deriving conditional probabilities from frequency tables, predicting purchase likelihood for new observations, and embracing computer-assisted, rapid probability calculations.
CASE STUDY: Naïve Bayes • 7:28
Explore a Naïve Bayes case study using a small binary dataset to predict purchase probability from three interactions: newsletter, Facebook, and website visits, with frequency tables and conditional probabilities.
Intro to Decision Trees • 1:52
Decision Trees | Entropy 101 • 2:43
Entropy & Information Gain • 4:38
Calculate entropy using P1 and P2 from class counts with log base two, producing a curve between 0 and 1, and show entropy guiding splits on churn and login days.
Decision Tree Examples • 4:56
Explore decision trees for churn prediction, covering root, decision, and leaf nodes, information gain, hyperparameters, overfitting, and the intro to random forests.
Random Forests • 1:17
Random forests use random subsets of observations and variables at each split to explore many options. Each tree votes, and the forest prediction is the mode of all tree predictions.
CASE STUDY: Decision Trees • 7:46
Explore a practical case study on building a simple decision tree to predict paid subscriptions using binary customer features, entropy, and information gain.
Intro to Logistic Regression • 2:05
Logistic Regression Example • 2:45
False Positives vs. False Negatives • 3:02
Logistic Regression Equation • 2:00
The Likelihood Function • 4:27
Maximize the likelihood that the logistic regression s-curve predicts probabilities closest to the actual training data, then adjust beta zero and beta one to minimize distance.
Multivariate Logistic Regression • 2:47
CASE STUDY: Logistic Regression • 7:52
Explore logistic regression to predict unsubscribe probability from weekly email frequency, maximizing likelihood with a univariate model, visualizing the curve, and identifying the 50% decision threshold for optimal email cadence.
Intro to Sentiment Analysis • 2:09
Explore sentiment analysis as a classification approach in supervised machine learning, using bag-of-words features, labeled training data, and data preparation to predict emotions, noting word clouds' limitations for market research.
Cleaning Text Data • 1:51
Clean text data for sentiment analysis by removing noise, punctuation and special characters, and stopwords, while applying stemming or lemmatizing and proper encoding to preserve key sentiment words.
"Bag of Words" Analysis • 4:11
CASE STUDY: Sentiment Analysis • 6:06
Classification Models
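The Entropy & Information Gain lecture above describes computing entropy from class proportions with log base two and using it to guide decision tree splits. A small Python sketch of that calculation follows. It is an illustration only, not the course's Excel model, and the names `entropy` and `information_gain` and the churn labels are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy in bits: sum of p * log2(1 / p) over the class proportions."""
    total = len(labels)
    proportions = [count / total for count in Counter(labels).values()]
    return sum(p * math.log2(1 / p) for p in proportions)


def information_gain(labels, groups):
    """Entropy of the parent node minus the size-weighted entropy of the child groups."""
    total = len(labels)
    weighted_children = sum(len(g) / total * entropy(g) for g in groups)
    return entropy(labels) - weighted_children


# Hypothetical churn labels, split on a made-up "logged in this week" flag
parent = ["churn", "churn", "stay", "stay"]
perfect_split = [["churn", "churn"], ["stay", "stay"]]
useless_split = [["churn", "stay"], ["churn", "stay"]]
print(entropy(parent))
print(information_gain(parent, perfect_split))
print(information_gain(parent, useless_split))
```

With two classes, entropy peaks at 1 bit for a 50/50 mix and falls to 0 for a pure node, so a tree prefers the split with the largest information gain.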

Intro to Selection & Tuning • 0:57
Select the best model for a problem and tune hyperparameters to maximize predictive power, address imbalanced classes, and interpret confusion matrices while monitoring drift over time.
Hyperparameters • 3:00
Imbalanced Classes • 3:24
Learn how imbalanced classes bias predictive models toward the majority class and balance data with upsampling, downsampling, and weighting for rare event detection, using confusion matrices for evaluation.
Confusion Matrix • 2:18
Explore how a confusion matrix compares predicted to actual classes and identifies true positives, true negatives, false positives, and false negatives, linking these counts to accuracy, precision, and recall.
Accuracy, Precision & Recall • 2:46
Explore the confusion matrix and define accuracy, precision, and recall, showing how true positives, true negatives, false positives, and false negatives influence model performance.
Multi-class Confusion Matrix • 2:27
Explore multi-class confusion matrices for predictions across products A, B, C, D, focusing on the diagonal for accuracy and diagnosing misclassifications between B and C to guide precision, recall, and feature engineering.
Multi-class Scoring • 4:41
Explore multi-class confusion matrices and compute per-class and weighted-average accuracy, precision, and recall to evaluate and compare predictive models.
Model Selection • 1:50
Train multiple classification models quickly and select the best using context-specific metrics. Evaluate recall, precision, and accuracy via the confusion matrix to choose the most important metric for your challenge.
Model Drift • 1:09
Model drift degrades predictions over time as relationships change. Retrain with newer data, benchmark day one, and use feature engineering to counter drift.
Model Selection & Tuning
Looking ahead to Part 3 • 0:34
Finish part two and advance from supervised machine learning foundations to regression and forecasting, predicting numeric variables with linear regression, intervention analysis, and Markov chains.
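The confusion matrix lectures in this section link true and false positives and negatives to accuracy, precision, and recall. For a binary outcome, those links can be written out as below. This is a hedged Python sketch rather than course material; `confusion_counts`, `scores`, and the sample labels are hypothetical.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count true positives, false positives, true negatives, and false negatives."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1
            else:
                fp += 1
        else:
            if a == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def scores(tp, fp, tn, fn):
    """Standard definitions; assumes at least one predicted and one actual positive."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": tp / (tp + fp),  # of predicted positives, how many were right
        "recall": tp / (tp + fn),     # of actual positives, how many were found
    }


# Hypothetical churn labels: 1 = churned, 0 = stayed
actual = [1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 1, 0, 0, 0, 0]
print(scores(*confusion_counts(actual, predicted)))
```

With imbalanced classes, accuracy alone can look high while recall on the rare class stays poor, which is why the choice of metric depends on the business context.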

Requirements

This is a beginner-friendly course (no prior knowledge or math/stats background required)
We'll use Microsoft Excel (Office 365) for some course demos, but participation is optional

Description

This course is for everyday people looking for an intuitive, beginner-friendly introduction to the world of machine learning and data science.

Build confidence with guided, step-by-step demos, and learn foundational skills from the ground up. Instead of memorizing complex math or learning a new coding language, we'll break down and explore machine learning techniques to help you understand exactly how and why they work.

Follow along with simple, visual examples and interact with user-friendly, Excel-based models to learn topics like linear and logistic regression, decision trees, KNN, naïve Bayes, hierarchical clustering, sentiment analysis, and more – without writing a SINGLE LINE of code.

This course combines 4 best-selling courses from Maven Analytics into a single masterclass:

PART 1: Univariate & Multivariate Profiling
PART 2: Classification Modeling
PART 3: Regression & Forecasting
PART 4: Unsupervised Learning

PART 1: Univariate & Multivariate Profiling

In Part 1 we’ll introduce the machine learning workflow and common techniques for cleaning and preparing raw data for analysis. We’ll explore univariate analysis with frequency tables, histograms, kernel densities, and profiling metrics, then dive into multivariate profiling tools like heat maps, violin & box plots, scatter plots, and correlation:

Section 1: Machine Learning Intro & Landscape
Machine learning process, definition, and landscape
Section 2: Preliminary Data QA
Variable types, empty values, range & count calculations, left/right censoring, etc.
Section 3: Univariate Profiling
Histograms, frequency tables, mean, median, mode, variance, skewness, etc.
Section 4: Multivariate Profiling
Violin & box plots, kernel densities, heat maps, correlation, etc.

Throughout the course, we’ll introduce real-world scenarios to solidify key concepts and simulate actual data science and business intelligence cases. You’ll use profiling metrics to clean up product inventory data for a local grocery, explore Olympic athlete demographics with histograms and kernel densities, visualize traffic accident frequency with heat maps, and more.

PART 2: Classification Modeling

In Part 2 we’ll introduce the supervised learning landscape, review the classification workflow, and address key topics like dependent vs. independent variables, feature engineering, data splitting and overfitting. From there we'll review common classification models like K-Nearest Neighbors (KNN), Naïve Bayes, Decision Trees, Random Forests, Logistic Regression and Sentiment Analysis, and share tips for model scoring, selection, and optimization:

Section 1: Intro to Classification
Supervised learning & classification workflow, feature engineering, splitting, overfitting & underfitting
Section 2: Classification Models
K-nearest neighbors, naïve Bayes, decision trees, random forests, logistic regression, sentiment analysis
Section 3: Model Selection & Tuning
Hyperparameter tuning, imbalanced classes, confusion matrices, accuracy, precision & recall, model drift

You’ll help build a simple recommendation engine for Spotify, analyze customer purchase behavior for a retail shop, predict subscriptions for an online travel company, extract sentiment from a sample of book reviews, and more.
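For R or Python users curious how one of the Part 2 models looks in code, k-nearest neighbors is a convenient example: find the k closest training points and take a majority vote. The sketch below assumes Euclidean distance on features already scaled to the 0-1 range, and the function name `knn_predict` and the customer data are invented for illustration.

```python
import math
from collections import Counter


def knn_predict(train, new_point, k=3):
    """Predict a label by majority vote among the k nearest training points.

    train is a list of (features, label) pairs; features are assumed to be scaled.
    If the vote is tied, the tied label whose point is closest wins.
    """
    nearest = sorted(train, key=lambda row: math.dist(row[0], new_point))[:k]
    votes = Counter(label for _features, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical customers: (scaled age, scaled income) and purchase outcome
customers = [
    ((0.10, 0.20), "no purchase"),
    ((0.20, 0.10), "no purchase"),
    ((0.15, 0.15), "no purchase"),
    ((0.80, 0.90), "purchase"),
    ((0.90, 0.80), "purchase"),
    ((0.85, 0.85), "purchase"),
]
print(knn_predict(customers, (0.80, 0.80)))
```

The value of k is a hyperparameter: small values follow local noise, while large values blur class boundaries.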

PART 3: Regression & Forecasting

In Part 3 we’ll introduce core building blocks like linear relationships and least squared error, and practice applying them to univariate, multivariate, and non-linear regression models. We'll review diagnostic metrics like R-squared, mean error, F-significance, and P-Values, then use time-series forecasting techniques to identify seasonality, predict nonlinear trends, and measure the impact of key business decisions using intervention analysis:

Section 1: Intro to Regression
Supervised learning landscape, regression vs. classification, prediction vs. root-cause analysis
Section 2: Regression Modeling 101
Linear relationships, least squared error, univariate & multivariate regression, nonlinear transformation
Section 3: Model Diagnostics
R-squared, mean error, null hypothesis, F-significance, T & P-values, homoskedasticity, multicollinearity
Section 4: Time-Series Forecasting
Seasonality, autocorrelation, linear trending, non-linear models, intervention analysis

You’ll see how regression analysis can be used to estimate property prices, forecast seasonal trends, predict sales for a new product launch, and even measure the business impact of a new website design.
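As a rough picture of the least squared error and R-squared ideas named in Part 3, a univariate linear regression can be fitted in closed form: the slope is the covariance of x and y divided by the variance of x. The Python below is a sketch under that textbook formulation, not the course's Excel workbook; `fit_line`, `r_squared`, and the property figures are hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    intercept = my - slope * mx
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Share of the variance in y explained by the fitted line."""
    my = mean(ys)
    ss_residual = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_total = sum((y - my) ** 2 for y in ys)
    return 1 - ss_residual / ss_total


# Hypothetical data: size in hundreds of square feet vs. price in $10k
size = [8, 10, 12, 15, 20]
price = [18, 21, 26, 30, 41]
slope, intercept = fit_line(size, price)
print(round(slope, 2), round(intercept, 2), round(r_squared(size, price, slope, intercept), 3))
```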

PART 4: Unsupervised Learning

In Part 4 we’ll explore the differences between supervised and unsupervised machine learning and introduce several common unsupervised techniques, including cluster analysis, association mining, outlier detection and dimensionality reduction. We'll break down each model in simple terms and help you build an intuition for how they work, from K-means and apriori to outlier detection, principal component analysis, and more:

Section 1: Intro to Unsupervised Machine Learning
Unsupervised learning landscape & workflow, common unsupervised techniques, feature engineering
Section 2: Clustering & Segmentation
Clustering basics, K-means, elbow plots, hierarchical clustering, dendrograms
Section 3: Association Mining
Association mining basics, apriori, basket analysis, minimum support thresholds, Markov chains
Section 4: Outlier Detection
Outlier detection basics, cross-sectional outliers, nearest neighbors, time-series outliers, residual distribution
Section 5: Dimensionality Reduction
Dimensionality reduction basics, principal component analysis (PCA), scree plots, advanced techniques

You'll see how K-means can help identify customer segments, how apriori can be used for basket analysis and recommendation engines, and how outlier detection can spot anomalies in cross-sectional or time-series datasets.
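To give a sense of how K-means finds segments, the loop below alternates two steps: assign each point to its nearest centroid, then move each centroid to the mean of its assigned points. It is a simplified Python sketch, not the course's implementation. The starting centroids are fixed by hand here so the result is repeatable (real implementations usually choose them randomly), and `k_means` and the sample points are hypothetical.

```python
import math


def k_means(points, centroids, iterations=10):
    """Run a fixed number of assign-then-update rounds and return centroids and clusters."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its cluster (unchanged if empty)
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical customers: (visits per month, average basket size)
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = k_means(points, centroids=[(0, 0), (10, 10)])
print(centroids)
print(clusters)
```

The number of clusters is chosen by the analyst, which is where tools like elbow plots come in.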

__________

Ready to dive in? Join today and get immediate, LIFETIME access to the following:

9+ hours of on-demand video
ML Foundations ebook (350+ pages)
Downloadable Excel project files
Expert Q&A forum
30-day money-back guarantee

If you're an analyst or aspiring data professional looking to build the foundation for a successful career in machine learning or data science, you've come to the right place.

Happy learning!

-Josh & Chris

__________

Looking for our full business intelligence stack? Search for "Maven Analytics" to browse our full course library, including Excel, Power BI, MySQL, Tableau and Machine Learning courses!

See why our courses are among the TOP-RATED on Udemy:

"Some of the BEST courses I've ever taken. I've studied several programming languages, Excel, VBA and web dev, and Maven is among the very best I've seen!" Russ C.

"This is my fourth course from Maven Analytics and my fourth 5-star review, so I'm running out of things to say. I wish Maven was in my life earlier!" Tatsiana M.

"Maven Analytics should become the new standard for all courses taught on Udemy!" Jonah M.

Who this course is for:

Anyone looking to learn the foundations of machine learning through interactive, beginner-friendly demos
Data Analysts or BI experts looking to transition into data science or build a fundamental understanding of machine learning
R or Python users seeking a deeper understanding of the models and algorithms behind their code
Excel users who want to learn and apply powerful tools for predictive analytics

Course content

Getting Started • 4 lectures • 8min

Intro to the ML Landscape • 4 lectures • 6min

PART 1: QA & Data Profiling • 1 lecture • 2min

Preliminary Data QA • 10 lectures • 34min

Univariate Profiling • 19 lectures • 46min

Multivariate Profiling • 15 lectures • 38min

PART 2: Classification Modeling • 1 lecture • 2min

Intro to Classification • 8 lectures • 23min

Classification Models • 25 lectures • 1hr 34min

Model Selection & Tuning • 10 lectures • 23min