Hyperparameter Optimization for Machine Learning

Rating: 4.7 (855 reviews)

Learn grid and random search, Bayesian optimization, multi-fidelity models, Optuna, Hyperopt, Scikit-Optimize & more.

Created by Soledad Galli, Train in Data Team

Last updated 9/2024

English

What you'll learn

Hyperparameter tuning and why it matters
Cross-validation and nested cross-validation
Hyperparameter tuning with Grid and Random search
Bayesian Optimisation
Tree-Structured Parzen Estimators, Population Based Training and SMAC
Hyperparameter tuning tools, e.g., Hyperopt, Optuna, Scikit-optimize, Keras Tuner and others

Course content

11 sections • 95 lectures • 9h 24m total length

Introduction • 3:29
Discover how to boost machine learning performance by tuning hyperparameters with grid and random search, Bayesian optimization, and open-source tools like scikit-learn, Optuna, Hyperopt, and Keras Tuner.
Course curriculum • 6:37
Explore hyperparameter optimization for machine learning, covering performance metrics, cross-validation, Bayesian optimization with Gaussian processes and tree-structured Parzen estimators, and practical implementations in scikit-learn, Optuna, Hyperopt, and more.
Course aim and knowledge requirements • 2:24
Explore hyperparameter optimization from linear regression to neural networks, using grid search, random search, and Bayesian methods with Python open-source packages on pre-cleaned datasets; familiarity with deep learning makes the course more valuable.
Course material • 1:45
Access the hyperparameter optimization course material, including Jupyter notebooks, videos, presentations, and online datasets, and use the three articles to download code, presentations, and datasets before you start.
Jupyter notebooks • 0:33
Presentations • 0:19
Datasets • 0:11
Set up your computer - required packages • 0:13
Resources to learn machine learning skills • 0:40

Parameters and Hyperparameters • 11:14
Learn how learning algorithms optimize model parameters to minimize a training criterion, while hyperparameters, which are set outside of training, control model capacity and improve generalization, for example through regularization.
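To make the distinction concrete, here is a minimal sketch using closed-form ridge regression in NumPy. The synthetic data, the helper name `fit_ridge`, and the two `alpha` values are assumptions chosen for illustration and are not taken from the course material. The weights are parameters learned from the data, while `alpha` is a hyperparameter fixed before training.

```python
import numpy as np


def fit_ridge(X, y, alpha):
    """Closed-form ridge regression: w = (X^T X + alpha * I)^-1 X^T y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)


# Synthetic regression problem (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + rng.normal(scale=0.1, size=50)

# alpha is the hyperparameter: it is chosen before training.
# The returned weights are the parameters: they are learned from X and y.
w_weak = fit_ridge(X, y, alpha=0.01)     # weak regularization
w_strong = fit_ridge(X, y, alpha=100.0)  # strong regularization shrinks the weights

print("weights with alpha=0.01:", w_weak)
print("weights with alpha=100 :", w_strong)
```

A larger `alpha` reduces model capacity by shrinking the learned weights toward zero, which is the kind of effect hyperparameter tuning tries to balance.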
Hyperparameter Optimization • 8:52
Learn how to optimize hyperparameters to minimize generalization error and balance model performance with training cost. Explore hyperparameter space, sampling methods, cross-validation, and strategies like grid, random, and Bayesian optimization.

Performance Metrics - Introduction • 1:17
Explore performance metrics essential for hyperparameter optimization, including classification and regression metrics, cross-validation, and sampling methods. Learn how scikit-learn metrics are used and how to create custom metrics for optimization.
Classification Metrics (Optional) • 8:08
Explore classification metrics for supervised models, including accuracy, precision, recall, F1 score, false positive rate, false negative rate, and ROC-AUC using a confusion matrix.
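As a worked example, the sketch below derives most of these metrics from the four cells of a binary confusion matrix. The counts and the function name `classification_metrics` are hypothetical. ROC-AUC is left out because it needs predicted scores across thresholds rather than a single confusion matrix.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute common metrics from binary confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "false_positive_rate": fp / (fp + tn),
        "false_negative_rate": fn / (fn + tp),
    }


# Hypothetical counts for 100 observations.
metrics = classification_metrics(tp=40, fp=10, fn=5, tn=45)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```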
Regression Metrics (Optional) • 3:41
Explore regression metrics to evaluate continuous targets, including mean squared error, RMSE, MAE, and R-squared. See how these metrics measure prediction errors and explain variance to guide optimization with scikit-learn.
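The following sketch computes the four metrics by hand with NumPy on a tiny made-up example, so the formulas are visible. The target and prediction values are assumptions for illustration only.

```python
import numpy as np

# Hypothetical targets and predictions.
y_true = np.array([3.0, 5.0, 2.5, 7.0])
y_pred = np.array([2.5, 5.0, 4.0, 8.0])

errors = y_true - y_pred
mse = np.mean(errors**2)            # mean squared error
rmse = np.sqrt(mse)                 # root mean squared error, in target units
mae = np.mean(np.abs(errors))       # mean absolute error
ss_res = np.sum(errors**2)          # residual sum of squares
ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
r2 = 1.0 - ss_res / ss_tot          # fraction of variance explained

print(f"MSE={mse:.3f} RMSE={rmse:.3f} MAE={mae:.3f} R2={r2:.3f}")
```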
Creating your own metrics • 9:05
Create a custom false negative rate metric with the confusion matrix, use make_scorer, and optimize a random forest's hyperparameters by grid search on the breast cancer dataset.
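A minimal sketch of this workflow with scikit-learn might look as follows. The parameter grid, the number of folds, and the choice of label 1 as the positive class are assumptions made here for brevity and may differ from the course notebook.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix, make_scorer
from sklearn.model_selection import GridSearchCV


def false_negative_rate(y_true, y_pred):
    """FNR = FN / (FN + TP), treating label 1 as the positive class."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    return fn / (fn + tp)


# Lower FNR is better, so tell scikit-learn to negate it when ranking models.
fnr_scorer = make_scorer(false_negative_rate, greater_is_better=False)

X, y = load_breast_cancer(return_X_y=True)

# Small illustrative grid (assumed values).
param_grid = {"n_estimators": [10, 50], "max_depth": [2, None]}

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    scoring=fnr_scorer,
    cv=3,
)
search.fit(X, y)

print("best params:", search.best_params_)
print("best FNR   :", -search.best_score_)  # undo the sign flip
```

Because `greater_is_better=False` flips the sign, `best_score_` is the negated false negative rate.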
Using Scikit-learn metrics • 1:56
Learn how to use scikit-learn metrics with grid search to optimize hyperparameters on the breast cancer dataset, by passing a metric string like accuracy or roc_auc.

Cross-Validation • 9:15
Explore cross-validation to prevent overfitting and assess generalization by splitting data into train, test, and folds, averaging performance across folds and comparing hyperparameter spaces with error estimates.
Bias vs Variance (Optional) • 0:04
Cross-Validation schemes • 13:55
Explore bias, variance, and generalization error, and learn multiple cross-validation schemes (k-fold, leave-one-out, leave-p-out, repeated k-fold, and stratified k-fold) applied to model selection and hyperparameter tuning.
Estimating the model generalization error with CV - Demo • 8:35
Estimate a model's generalization error using multiple cross-validation schemes (k-fold, repeated k-fold, leave-one-out, leave-p-out, and stratified k-fold) on a logistic regression breast cancer model with scikit-learn.
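The sketch below compares three of these schemes with scikit-learn's `cross_val_score`. The scaling step, the number of splits, and the random seeds are assumptions for illustration. Leave-one-out and leave-p-out are omitted only because they require many more model fits.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import (
    KFold,
    RepeatedKFold,
    StratifiedKFold,
    cross_val_score,
)
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

# Scaling inside the pipeline keeps the scaler from seeing the validation fold.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

schemes = {
    "k-fold": KFold(n_splits=5, shuffle=True, random_state=0),
    "repeated k-fold": RepeatedKFold(n_splits=5, n_repeats=3, random_state=0),
    "stratified k-fold": StratifiedKFold(n_splits=5, shuffle=True, random_state=0),
}

results = {}
for name, cv in schemes.items():
    scores = cross_val_score(model, X, y, scoring="accuracy", cv=cv)
    results[name] = (scores.mean(), scores.std())
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```

The mean across folds estimates the generalization performance, and the standard deviation gives a rough idea of how much that estimate varies between splits.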
Cross-Validation for Hyperparameter Tuning - Demo • 7:33
Explore cross-validation for hyperparameter tuning in logistic regression using grid search on the cancer dataset, comparing ridge and lasso penalties across five-fold cross-validation.
Special Cross-Validation schemes • 7:07
Apply tailored cross-validation for grouped data, using Group K-Fold and Leave One Group Out to ensure the model is always tested on unseen subjects. Use TimeSeriesSplit for time series to forecast future observations.
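The two guarantees described above can be checked directly on a toy dataset. In this sketch the twelve rows and the four subject groups are made up. It verifies that `GroupKFold` never places a group in both train and test, and that `TimeSeriesSplit` always trains on earlier rows than it tests on.

```python
import numpy as np
from sklearn.model_selection import GroupKFold, TimeSeriesSplit

X = np.arange(12).reshape(-1, 1)      # 12 rows, in time order
groups = np.repeat([0, 1, 2, 3], 3)   # 4 hypothetical subjects, 3 rows each

# GroupKFold: every test fold contains only subjects unseen during training.
group_overlaps = []
for train_idx, test_idx in GroupKFold(n_splits=4).split(X, groups=groups):
    group_overlaps.append(set(groups[train_idx]) & set(groups[test_idx]))

# TimeSeriesSplit: training rows always precede the test rows.
time_ordered = []
for train_idx, test_idx in TimeSeriesSplit(n_splits=3).split(X):
    time_ordered.append(train_idx.max() < test_idx.min())

print("shared groups per fold:", group_overlaps)
print("train precedes test   :", time_ordered)
```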
Group Cross-Validation - Demo • 5:03
Apply group cross-validation and leave-one-group-out validation to estimate logistic regression generalization error, then perform a grid search to optimize its hyperparameters on the breast cancer dataset.

Basic Search Algorithms - Introduction • 5:10
Explore basic hyperparameter search methods for machine learning, including manual, grid, and random search, and learn to balance search breadth with computational cost, hyperparameter types, and performance metrics.
Manual Search • 6:35
Use manual search to identify promising hyperparameter regions, benchmark models, and prepare for grid search, while illustrating 5-fold cross-validation with logistic regression and random forests.
Grid Search • 3:21
Explore how grid search exhaustively tests all combinations of specified hyperparameters using the Cartesian product, highlighting limitations like the curse of dimensionality and manual value selection, while enabling parallel execution.
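To see how quickly the Cartesian product grows, here is a small sketch using only the standard library. The hyperparameter names and candidate values are hypothetical.

```python
from itertools import product

# Hypothetical grid: 4 * 5 * 3 candidate values.
param_grid = {
    "n_estimators": [10, 50, 100, 200],
    "max_depth": [1, 2, 3, 4, 5],
    "learning_rate": [0.01, 0.1, 1.0],
}

names = list(param_grid)
combinations = [
    dict(zip(names, values)) for values in product(*param_grid.values())
]

print(len(combinations))   # 60
print(combinations[0])
```

Adding one more hyperparameter with four candidate values would multiply the count to 240, which is the curse of dimensionality in practice. Each combination is independent of the others, which is why grid search parallelizes easily.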
Grid Search - Demo • 7:50
Apply grid search in scikit-learn to tune a gradient boosting classifier on the cancer dataset, evaluating 60 parameter combinations with 5-fold cross-validation to maximize ROC-AUC and identify the best settings.
Grid Search with different hyperparameter spaces • 2:18
Explore grid search over two hyperparameter spaces for a support vector classifier, comparing linear and RBF kernels. Identify the best parameters by accuracy with 3-fold cross-validation, notably a linear kernel with C=100.
Random Search • 7:34
Random search selects hyperparameter combinations at random from the hyperparameter space, using independent draws from a uniform distribution, and targets high-dimensional, continuous spaces more efficiently than grid search.
Random Search with Scikit-learn • 5:37
Apply random search to optimize hyperparameters for a gradient boosting classifier on the breast cancer dataset using scikit-learn and scipy.stats distributions, evaluating with cross-validation and ROC-AUC.
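A minimal sketch of this setup with `RandomizedSearchCV` is shown below. The distributions, their ranges, the number of iterations, and the number of folds are assumptions kept deliberately small so the example runs quickly; they are not the values used in the course.

```python
from scipy import stats
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import RandomizedSearchCV

X, y = load_breast_cancer(return_X_y=True)

# Each hyperparameter is sampled independently from its own distribution.
param_distributions = {
    "n_estimators": stats.randint(10, 60),            # integers in [10, 60)
    "max_depth": stats.randint(1, 4),                 # integers in [1, 4)
    "learning_rate": stats.loguniform(1e-3, 1.0),     # log-uniform floats
    "min_samples_split": stats.uniform(0.01, 0.49),   # floats in [0.01, 0.50]
}

search = RandomizedSearchCV(
    GradientBoostingClassifier(random_state=0),
    param_distributions,
    n_iter=5,            # number of random configurations to evaluate
    scoring="roc_auc",
    cv=3,
    random_state=0,
)
search.fit(X, y)

print("best params :", search.best_params_)
print("best ROC-AUC:", search.best_score_)
```

Unlike grid search, the cost is set by `n_iter` rather than by the size of the space, so continuous ranges can be explored without enumerating values by hand.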
Random Search with Scikit-Optimize • 7:30
Use randomized search with scikit-optimize's dummy_minimize to optimize a gradient boosting classifier on the breast cancer dataset, defining real, integer, and categorical hyperparameters and evaluating via cross-validated accuracy.
Random Search with Hyperopt • 11:06
Explore Hyperopt for hyperparameter optimization using random search, defining fmin-based objective functions and flexible spaces, tracking trials, and optimizing XGBoost on the breast cancer dataset for cross-validated accuracy.
More examples • 0:09

Sequential Search • 5:49
Sequential search in hyperparameter optimization iteratively evaluates a few hyperparameter settings. Bayesian optimization guides where to sample next, treating the hyperparameter response surface φ(λ) as a black-box, non-differentiable function.
Bayesian Optimization • 5:10
Explore Bayesian optimization for global optimization of costly black-box objective functions. Apply priors, posteriors, and acquisition functions, using Gaussian processes for hyperparameter tuning.
Bayesian Inference - Introduction • 7:11
Learn the core of Bayesian inference: reallocate probability from priors to posteriors using Bayes' rule, updating beliefs with data across models and hyperparameters.
Joint and Conditional Probabilities • 7:40
Explore joint and conditional probabilities with a dog breed hip dysplasia data set, computing marginal, joint, and conditional probabilities and introducing prior, posterior, and Bayes' rule.
Bayes Rule • 12:02
Explore Bayes' rule, linking prior and posterior probabilities via conditional and joint forms, with fraud detection examples, and apply these ideas to hyperparameters via Bayesian optimization and Gaussian processes.
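As a worked example of Bayes' rule in a fraud detection setting, the sketch below updates a prior fraud probability after a transaction is flagged. All three probabilities and the function name `posterior` are made up for illustration.

```python
def posterior(prior, likelihood, likelihood_given_not):
    """Bayes' rule: P(H | E) = P(E | H) * P(H) / P(E)."""
    evidence = likelihood * prior + likelihood_given_not * (1 - prior)
    return likelihood * prior / evidence


# Hypothetical numbers:
#   P(fraud) = 0.01, P(flag | fraud) = 0.90, P(flag | legitimate) = 0.05
p_fraud_given_flag = posterior(prior=0.01, likelihood=0.90, likelihood_given_not=0.05)

print(round(p_fraud_given_flag, 4))   # 0.1538
```

Even with a detector that catches 90% of fraud, the posterior stays near 15% because the prior is so low. The same prior-to-posterior update is what Bayesian optimization applies to beliefs about the objective function.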
Sequential Model-Based Optimization • 15:54
Learn how sequential model-based optimization for hyperparameter optimization uses Bayes' rule to infer a posterior over a black-box objective, guided by a Gaussian process surrogate and an acquisition function.
Gaussian Distribution • 7:28
Explore the Gaussian distribution’s bell shape, centered at the mean μ with spread σ and variance σ², including the standard normal N(0,1), and preview multivariate forms and a Gaussian process.
Multivariate Gaussian Distribution • 16:22
Generalize the univariate Gaussian to multivariate distributions by modeling a vector x with mean μ and a covariance matrix, capturing diagonal variances and off-diagonal covariances, and underpinning Gaussian processes.
Gaussian Process • 14:47
Explore how Gaussian processes model distributions over functions to estimate the hyperparameter response function, using mean, covariance, and multivariate Gaussian rules to derive priors and posteriors.
Kernels • 6:41
Explore how kernels measure similarity in Gaussian processes to predict hyperparameter responses, adjust smoothness, and guide Bayesian optimization with squared exponential and Matérn kernels.
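To illustrate how a kernel turns a few observations into a posterior over functions, here is a minimal NumPy sketch of a zero-mean Gaussian process with a squared exponential kernel. The function names, the training points, and the length scale are assumptions for illustration. A Matérn kernel would replace `squared_exponential` while the rest of the computation stays the same.

```python
import numpy as np


def squared_exponential(a, b, length_scale=1.0):
    """Squared exponential (RBF) kernel: similarity decays with distance."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * sq_dist / length_scale**2)


def gp_posterior(x_train, y_train, x_test, length_scale=1.0, noise=1e-8):
    """Posterior mean and standard deviation of a zero-mean GP."""
    K = squared_exponential(x_train, x_train, length_scale)
    K += noise * np.eye(len(x_train))            # jitter for numerical stability
    K_s = squared_exponential(x_train, x_test, length_scale)
    K_ss = squared_exponential(x_test, x_test, length_scale)

    # Conditioning rules of the multivariate Gaussian.
    mean = K_s.T @ np.linalg.solve(K, y_train)
    cov = K_ss - K_s.T @ np.linalg.solve(K, K_s)
    std = np.sqrt(np.clip(np.diag(cov), 0.0, None))
    return mean, std


# Three hypothetical evaluations of a response function.
x_train = np.array([0.0, 1.0, 3.0])
y_train = np.sin(x_train)

# One observed point, one point between observations, one point far away.
x_test = np.array([1.0, 2.0, 10.0])
mean, std = gp_posterior(x_train, y_train, x_test)

print("posterior mean:", mean)
print("posterior std :", std)
```

The uncertainty collapses at the observed point and returns to the prior far from the data. A smaller `length_scale` makes the similarity decay faster, which yields a less smooth surrogate.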
Acquisition Functions • 13:44
Learn how acquisition functions guide Bayesian optimization for hyperparameter search by balancing exploration and exploitation, using methods like probability of improvement, expected improvement, and upper and lower confidence bounds.
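The sketch below writes three of these acquisition functions for a maximization problem, given a surrogate's posterior mean and standard deviation at each candidate. The candidate values, the exploration parameters `xi` and `kappa`, and the function names are assumptions for illustration. Libraries that minimize instead, such as scikit-optimize, use the mirrored forms, for example a lower confidence bound.

```python
import numpy as np
from scipy.stats import norm


def probability_of_improvement(mean, std, best, xi=0.01):
    """Probability that a candidate beats the best score so far by at least xi."""
    return norm.cdf((mean - best - xi) / std)


def expected_improvement(mean, std, best, xi=0.01):
    """Expected amount by which a candidate exceeds the best score so far."""
    improvement = mean - best - xi
    z = improvement / std
    return improvement * norm.cdf(z) + std * norm.pdf(z)


def upper_confidence_bound(mean, std, kappa=1.96):
    """Optimistic estimate: larger kappa favors exploration."""
    return mean + kappa * std


# Hypothetical surrogate predictions for three candidates (std must be > 0).
mean = np.array([0.90, 0.88, 0.80])   # predicted accuracy
std = np.array([0.01, 0.05, 0.15])    # predictive uncertainty
best = 0.90                           # best accuracy observed so far

pi = probability_of_improvement(mean, std, best)
ei = expected_improvement(mean, std, best)
ucb = upper_confidence_bound(mean, std)

print("PI :", pi, "-> pick", np.argmax(pi))
print("EI :", ei, "-> pick", np.argmax(ei))
print("UCB:", ucb, "-> pick", np.argmax(ucb))
```

With these numbers, probability of improvement prefers the second candidate, while expected improvement and the upper confidence bound prefer the most uncertain third candidate. This shows how the choice of acquisition function shifts the balance between exploitation and exploration.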
Additional Reading Resources • 0:13
Scikit-Optimize - 1-Dimension • 14:11
Learn to implement one-dimensional Bayesian optimization with scikit-optimize to tune n_estimators for a gradient boosting classifier, using gp_minimize, and visualize convergence and Gaussian-process posteriors.
Scikit-Optimize - Manual Search • 5:20
Explore Bayesian optimization with scikit-optimize to tune a gradient boosting classifier by optimizing a multi-parameter space using gp_minimize, cross-validation, and convergence plots.
Scikit-Optimize - Automatic Search • 4:03
Explore how BayesSearchCV enables automatic hyperparameter search for a scikit-learn estimator using Bayesian optimization, selecting best parameters for a regression task and evaluating with mean squared error.
Scikit-Optimize - Alternative Kernel • 3:24
Explore scikit-optimize gp_minimize with alternative kernels like RBF and Matérn to optimize a gradient boosting classifier on the breast cancer dataset using a Gaussian process regressor and 3-fold cross-validation.
Scikit-Optimize - Neural Networks • 14:17
Optimize convolutional neural network hyperparameters for MNIST using Bayesian optimization with scikit-optimize, defining the search space for learning rate, dense layers, neurons, and activations, and evaluating model performance.
Scikit-Optimize - CNN - Search Analysis • 6:00
Explore how scikit-optimize visualizes Bayesian CNN hyperparameter searches, showing learning rate and dense nodes effects on accuracy via plots, evaluations, test set performance, and confusion matrix.

Other SMBO Algorithms • 4:11
Explore sequential model-based optimization in machine learning using random forests, gradient boosting machines, and tree-structured Parzen estimators to approximate the hyperparameter response function f(x) and guide acquisition-driven sampling decisions.
SMAC • 6:14
Explore SMAC, a Bayesian hyperparameter optimization method that uses random forests or gradient boosted trees to approximate the objective f(x) and guide sampling via acquisition functions like expected improvement.
SMAC Demo • 11:04
Optimize the convolutional neural network for MNIST digits using SMAC and Scikit-Optimize, exploring hyperparameters like convolutional and dense layers, neurons, activation, and learning rate to achieve high accuracy.
Tree-structured Parzen Estimators - TPE • 4:00
Explore Tree-structured Parzen Estimators, which model hyperparameters given the score using two densities l(x) and g(x), guiding sampling via expected improvement to focus on promising regions of the prior space.
TPE Procedure • 8:08
Explore tree-structured Parzen estimators (TPE) for hyperparameter optimization by sampling configurations, splitting observations into best and rest, and modeling per-hyperparameter distributions with Parzen windows to guide expected improvement.
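The procedure can be sketched for a single hyperparameter as follows. This is a simplified illustration, not Hyperopt's implementation: it uses SciPy's `gaussian_kde` as the Parzen window estimator, a made-up quadratic loss, and assumed values for the number of samples and the quantile `gamma`.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)


def objective(x):
    """Hypothetical loss to minimize, with its minimum at x = 2."""
    return (x - 2.0) ** 2


# 1. Sample initial configurations at random from the prior (uniform on [-5, 5]).
x_obs = rng.uniform(-5.0, 5.0, size=40)
loss_obs = objective(x_obs)

# 2. Split the observations into "best" and "rest" at the gamma quantile of the loss.
gamma = 0.25
threshold = np.quantile(loss_obs, gamma)
best = x_obs[loss_obs <= threshold]
rest = x_obs[loss_obs > threshold]

# 3. Model each group with a Parzen window (kernel density) estimator.
l = gaussian_kde(best)   # density of good configurations, l(x)
g = gaussian_kde(rest)   # density of the remaining configurations, g(x)

# 4. Draw candidates from l(x) and keep the one that maximizes l(x) / g(x),
#    the ratio that TPE uses as a proxy for expected improvement.
candidates = l.resample(size=100, seed=0).ravel()
scores = l(candidates) / g(candidates)
next_x = candidates[np.argmax(scores)]

print("next configuration to evaluate:", next_x)
```

In a full search, `next_x` would be evaluated, appended to the observations, and the split and density estimates recomputed on every iteration.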
TPE hyperparameters • 4:39
Explore the default hyperparameters for tree-structured Parzen estimators (TPE) in Hyperopt, including sampling, gamma quantile split, Parzen density estimation, and candidate selection.
TPE - why tree-structured? • 4:29
Explore why tree-structured Parzen estimators sample hyperparameters conditionally, per hyperparameter, and support nested spaces across models like an SVM and a decision tree.
TPE with Hyperopt • 6:02
Explore hyperparameter optimization for a convolutional neural network using tree-structured Parzen estimators with Hyperopt, including learning rate, number of convolutional and dense layers, neurons, and activations.
Discussion: Bayesian Optimization and Basic Search • 13:30
Explore when to use basic search methods (manual, grid, random) versus Bayesian optimization (GP, SMAC, TPE) for hyperparameter tuning, including parallelization, dimensionality, and resource trade-offs.

Scikit-Optimize • 5:45
Explore Scikit-Optimize, an open-source Python package for random and Bayesian optimization, detailing Gaussian processes, forest_minimize, gbrt_minimize, and dummy_minimize, and its hyperparameter space sampling and acquisition functions.
Section content • 2:10
Explore hyperparameter optimization with scikit-optimize notebooks, including randomized search and sequential methods using Gaussian processes, random forests, gradient boosting, and XGBoost, plus parallel Bayesian optimization and kernel choices.
Hyperparameter Distributions • 4:37
Learn to define hyperparameter spaces with skopt.space, sampling real, integer, and categorical parameters from uniform and log-uniform distributions for machine learning models.
Defining the hyperparameter space • 2:36
Define a hyperparameter space with scikit-optimize by sampling integers, real numbers, and categorical options in a param_grid for a gradient boosting machine.
Defining the objective function • 1:59
Define a customizable objective function to minimize with Scikit-Optimize, passing hyperparameters to a gradient boosting classifier via named_args and set_params, using cross-validated mean performance and a negative value for accuracy.
Random search • 5:12
Explore random search with scikit-optimize to tune a gradient boosting classifier, defining the hyperparameter space and objective, using dummy_minimize and 50 samples, with visualizations of convergence and evaluations.
Bayesian search with Gaussian processes • 5:14
Explore Bayesian optimization with Gaussian processes using scikit-optimize to tune hyperparameters, visualize with partial dependence plots, and compare against random search on the Breast Cancer dataset.
Bayesian search with Random Forests • 2:53
Perform Bayesian optimization with scikit-optimize using random forests as the surrogate via forest_minimize, including initial points, acquisition function, and analysis of estimators and learning rate.
Bayesian search with GBMs • 3:03
Perform Bayesian optimization for machine learning using gradient boosting machines as the surrogate with gbrt_minimize. Identify how estimators, depths, learning rate, and minimum samples per split affect performance.
Parallelizing a Bayesian search • 2:53
Learn to perform Bayesian optimization in parallel using scikit-optimize and joblib, with a Gaussian-process surrogate to sample hyperparameters and improve accuracy.
Bayesian search with Scikit-learn wrapper • 4:03
Demonstrates Bayesian optimization of scikit-learn models using BayesSearchCV to tune a gradient boosting regressor on the Boston housing dataset, comparing with grid and random search.
Changing the kernel of a Gaussian Process • 3:24
Explore how gp_minimize uses the Matérn kernel by default and how to switch to kernels like the RBF to better infer the Gaussian process.
Optimizing xgboost • 0:07
Optimizing Hyperparameters of a CNN • 14:17
Optimize CNN hyperparameters for MNIST with Bayesian optimization using Scikit-Optimize, building a convolutional neural network in Keras, preparing data, defining search space and objective, and evaluating best model and convergence.
Analyzing the CNN search • 6:00
Analyze a Bayesian optimization run with scikit-optimize's plot_objective to see how learning rate and dense layer size affect CNN accuracy, and evaluate with a confusion matrix.

Hyperopt • 8:05
Explore Hyperopt, an open-source Python package for hyperparameter optimization, offering rand.suggest, tpe.suggest, and anneal.suggest via fmin, with nested hp spaces and optional MongoDB parallelization.
Section content • 1:50
Explore hyperparameter optimization with Hyperopt, covering distributions, the three sampling algorithms (random search, annealing, and tree-structured Parzen estimators), and nested hyperparameters across random forests, logistic regression, gradient boosting, and neural networks.
Search space configuration and distributions • 14:48
Learn to sample hyperparameters from Hyperopt distributions, configure search spaces, and combine or nest spaces using hp.choice and hp.pchoice.
Sampling from nested spaces • 4:28
Navigate nested hyperparameter spaces with Hyperopt, sampling naive Bayes, SVM, and decision tree algorithms and tuning C, kernel, width, depth, and minimum samples per split to find the best model.
Search algorithms • 7:52
Explore hyperparameter optimization with Hyperopt, using fmin to perform randomized, annealing, and TPE searches on XGBoost models, evaluating with cross-validated roc_auc.
Evaluating the search • 8:34
Explore how to capture and analyze hyperparameter search progress using Hyperopt, define a hyperparameter space with hp, implement objective functions, and evaluate using cross-validation, losses, and trials.
Optimizing multiple ML models simultaneously • 9:31
Use Hyperopt to jointly optimize model choice and hyperparameters, exploring logistic regression, random forest, and gradient boosting with nested parameter spaces and cross-validated loss.
Optimizing Hyperparameters of a CNN • 0:10
References • 0:03

Optuna • 4:58
Explore Optuna, a Python package for hyperparameter optimization, covering grid and random search, nested spaces, objective functions, and analysis tools with Pandas and SQL-like storage.
Optuna main functions • 7:45
Explore Optuna's main functions to set up a hyperparameter search within the objective using a trial, define the space with suggest_int for n_estimators and max_depth, and optimize with a study.
Section content • 1:00
Explore hyperparameter optimization using Optuna, implementing various search algorithms, tuning scikit-learn models and neural networks, and using Optuna plotting functions to analyze search characteristics across notebooks.
Search algorithms • 7:38
Explore how Optuna selects and compares hyperparameter search algorithms, including grid search, random search, TPE, and CMA-ES, and tune a random forest on breast cancer data.
Optimizing multiple ML models simultaneously • 7:21
Optimize hyperparameters for logistic regression, random forest, and gradient boosting classifiers with a single objective using Optuna and nested hyperparameters on the breast cancer dataset.
Optimizing hyperparameters of a CNN • 9:52
Optimize hyperparameters with Optuna in a Keras TensorFlow workflow on MNIST, defining an objective function, sampling convolutional layers and dense layers, and evaluating accuracy to select the best model.
Optimizing a CNN - extended • 4:48
Explore per-layer hyperparameter sampling for a CNN with Optuna and Keras Sequential, varying conv and dense layer counts and tuning filters, kernel sizes, strides, and activations to maximize accuracy.
Evaluating the search with Optuna's built-in functions • 9:41
Explore Optuna's plotting capabilities for neural network hyperparameter searches, using Matplotlib or Plotly backends to visualize optimization history, contour plots, EDF, intermediate values, and parameter importances.
References • 0:01
More examples • 0:03

Requirements

Python programming, including knowledge of NumPy, Pandas and Scikit-learn
Familiarity with basic machine learning algorithms, e.g., regression, support vector machines and nearest neighbours
Familiarity with decision tree algorithms and Random Forests
Familiarity with gradient boosting machines, e.g., XGBoost and LightGBM
Understanding of machine learning model evaluation metrics
Familiarity with Neural Networks

Description

Welcome to Hyperparameter Optimization for Machine Learning. In this course, you will learn multiple techniques to select the best hyperparameters and improve the performance of your machine learning models.

If you regularly train machine learning models, as a hobby or for your organization, and want to improve their performance, if you are keen to climb the leaderboard of a data science competition, or if you simply want to learn more about how to tune the hyperparameters of machine learning models, this course will show you how.

We'll take you step-by-step through engaging video tutorials and teach you everything you need to know about hyperparameter tuning. Throughout this comprehensive course, we cover almost every available approach to optimizing hyperparameters, discussing the rationale, advantages and shortcomings of each technique, the considerations to keep in mind when using it, and its implementation in Python.

Specifically, you will learn:

What hyperparameters are and why tuning matters
The use of cross-validation and nested cross-validation for optimization
Grid search and Random search for hyperparameters
Bayesian Optimization
Tree-structured Parzen estimators
SMAC, Population Based Optimization and other SMBO algorithms
How to implement these techniques with available open source packages including Hyperopt, Optuna, Scikit-optimize, Keras Tuner and others

By the end of the course, you will be able to decide which approach you would like to follow and carry it out with available open-source libraries.

This comprehensive machine learning course includes 95 lectures spanning more than 9 hours of video, and ALL topics include hands-on Python code examples which you can use for reference and for practice, and re-use in your own projects.

So what are you waiting for? Enroll today, learn how to tune the hyperparameters of your models and build better machine learning models.

Who this course is for:

Students who want to know more about hyperparameter optimization algorithms
Students who want to understand advanced techniques for hyperparameter optimization
Students who want to learn to use multiple open source libraries for hyperparameter tuning
Students interested in building better performing machine learning models
Students interested in participating in data science competitions
Students seeking to expand their breadth of knowledge on machine learning

Course content

Introduction • 9 lectures • 16min
Hyperparameter Tuning - Overview • 2 lectures • 20min
Performance metrics • 5 lectures • 24min
Cross-Validation • 7 lectures • 52min
Basic Search Algorithms • 10 lectures • 57min
Bayesian Optimization • 18 lectures • 2hr 40min
Other SMBO Algorithms • 9 lectures • 1hr 2min
Scikit-Optimize • 15 lectures • 1hr 4min
Hyperopt • 9 lectures • 55min
Optuna • 10 lectures • 53min
