
Explore the basics of machine learning and classification models with scikit-learn, covering data analysis, regression and classification algorithms, and how to build and validate models from first principles.
Learn how to install Anaconda 3.6, a data science platform that includes Python libraries and Jupyter Notebook, with step-by-step guidance for macOS and system-wide vs user installation.
Explore how the support vector machine, a binary classifier, uses a linear hyperplane to separate spam from ham and how transforming data to a higher dimensional space enables nonlinear separation.
Learn to build a support vector machine classifier using the Adult dataset to predict whether annual income exceeds $50,000, with data loading, preprocessing, and visualization in Python.
Build and tune a support vector classifier in scikit-learn using a train-test split, with features like education, occupation, age, and gender, plus correlation heatmaps, achieving about 80 percent accuracy.
Learn how decision trees classify outcomes by splitting inputs from root to leaves, using feature vectors and labels learned through recursive partitioning in supervised learning with scikit-learn.
Explore information gain and entropy in decision trees for classification, compare information content across outcomes, and note Gini impurity as used by scikit-learn.
Explore how to view and tweak a scikit-learn decision tree, inspect feature importances and alcohol content splits, and experiment with max depth, max features, and information gain versus Gini impurity.
Explore how cross-validation and regularization mitigate overfitting by balancing bias and variance, with data splits, multiple models, and feature combinations.
Explore ensemble learning as a strategy to mitigate overfitting by combining results from multiple models. Discover how random forests leverage diverse decision trees and majority voting to improve predictions.
Build a random forest classifier by ensembling decision trees with the Portuguese wine quality dataset; reuse decision-tree code and tune n_estimators, max_depth, and max_features to compare with a single tree.
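As a rough illustration of the kind of comparison the random forest lab describes, the sketch below scores a single decision tree against a random forest using 5-fold cross-validation. It is not the course's own code: it assumes scikit-learn's bundled load_wine dataset as a stand-in for the Portuguese wine quality data, and the hyperparameter values (n_estimators=100, max_depth=3, max_features="sqrt") are arbitrary starting points chosen for illustration.

```python
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Stand-in data (assumption): scikit-learn's bundled wine dataset,
# not the Portuguese wine quality dataset used in the course.
X, y = load_wine(return_X_y=True)

# A single, shallow decision tree as the baseline.
tree = DecisionTreeClassifier(max_depth=3, random_state=0)

# A random forest: many trees, each trained on a bootstrap sample
# and restricted to a random subset of features at every split.
forest = RandomForestClassifier(
    n_estimators=100,      # number of trees in the ensemble
    max_depth=3,           # same depth limit as the single tree
    max_features="sqrt",   # features considered at each split
    random_state=0,
)

# 5-fold cross-validation gives a less optimistic accuracy estimate
# than scoring on the training data.
tree_scores = cross_val_score(tree, X, y, cv=5)
forest_scores = cross_val_score(forest, X, y, cv=5)

print(f"Decision tree mean accuracy: {tree_scores.mean():.3f}")
print(f"Random forest mean accuracy: {forest_scores.mean():.3f}")
```

Varying n_estimators, max_depth, and max_features in this sketch and re-running the comparison is one simple way to see how the ensemble behaves relative to a single tree.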
This course will give you a fundamental understanding of Machine Learning overall, with a focus on building classification models. Basic ML concepts are explained, including Supervised and Unsupervised Learning; Regression and Classification; and Overfitting. Three lab sections focus on building classification models with Support Vector Machines, Decision Trees, and Random Forests on real data sets. The implementation will be performed using the scikit-learn library for Python.
The Intro to ML Classification Models course is meant for developers, data scientists, or anybody else who knows basic Python programming and wishes to learn about Machine Learning, with a focus on solving the problem of classification.