What you'll learn
- From dataset to implementation: one case study covering five machine learning models
- Understanding the dataset
- Data analysis (missing values, outliers, outlier detection techniques, correlation)
- Feature engineering
- Selecting algorithms
- Training the baseline
- Understanding the testing matrix (ROC, AUC, Accuracy, Kappa...)
- Testing the baseline model
- Problems with the existing approach
- Cross-validation, grid search, model parameter tuning
- Model optimization, ensembles
- And much more
Requirements
- Intermediate knowledge of Python
- Prior exposure to Machine Learning algorithms
- Curiosity and Interest in Data
- Basic statistics
Description
One case study and five models, taken from data preprocessing through to implementation in Python, with some examples that require no coding.
We will cover the following topics in this case study:
Problem Statement
Data
Data Preprocessing 1
Understanding Dataset
Data change and Data Statistics
Data Preprocessing 2
Missing values
Replacing missing values
Correlation Matrix
Data Preprocessing 3
Outliers
Outliers Detection Techniques
Percentile-based outlier detection
Mean Absolute Deviation (MAD)-based outlier detection
Standard Deviation (STD)-based outlier detection
Majority-vote based outlier detection
Visualizing outliers
Data Preprocessing 4
Handling outliers
Feature Engineering
Models Selected
·K-Nearest Neighbors (KNN)
·Logistic regression
·AdaBoost
·GradientBoosting
·RandomForest
Performing the Baseline Training
Understanding the testing matrix
·The Mean accuracy of the trained models
·The ROC-AUC score
ROC
AUC
Performing the Baseline Testing
Problems with this Approach
Optimization Techniques
·Understanding key concepts to optimize the approach
Cross-validation
The approach of using CV
Hyperparameter tuning
Grid search parameter tuning
Random search parameter tuning
Optimized Parameters Implementation
·Implementing a cross-validation based approach
·Implementing hyperparameter tuning
·Implementing and testing the revised approach
·Understanding problems with the revised approach
Implementation of the revised approach
·Implementing the best approach
Log transformation of features
Voting-based ensemble ML model
·Running ML models on real test data
Best approach & Summary
Examples with No Code
Downloads – Full Code
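As a taste of the outlier detection techniques listed under Data Preprocessing 3, here is a minimal sketch of how percentile-based, MAD-based and STD-based checks might be combined by majority vote. It is an illustration only, not the course's own code: the function names, the thresholds (2.5/97.5 percentiles, 3.5 and 3.0) and the reading of MAD as the median absolute deviation are all assumptions made for this example.

```python
import numpy as np

# Hypothetical helper names and thresholds, chosen for illustration only.

def percentile_outliers(x, lower=2.5, upper=97.5):
    """Flag values that fall outside the [lower, upper] percentile band."""
    x = np.asarray(x, dtype=float)
    lo, hi = np.percentile(x, [lower, upper])
    return (x < lo) | (x > hi)

def mad_outliers(x, threshold=3.5):
    """Flag values whose modified z-score exceeds the threshold.

    Assumption: MAD is taken here as the median absolute deviation.
    """
    x = np.asarray(x, dtype=float)
    median = np.median(x)
    mad = np.median(np.abs(x - median))
    if mad == 0:
        # No spread around the median, so nothing can be scored.
        return np.zeros(x.shape, dtype=bool)
    modified_z = 0.6745 * (x - median) / mad
    return np.abs(modified_z) > threshold

def std_outliers(x, threshold=3.0):
    """Flag values more than `threshold` standard deviations from the mean."""
    x = np.asarray(x, dtype=float)
    std = x.std()
    if std == 0:
        return np.zeros(x.shape, dtype=bool)
    return np.abs(x - x.mean()) > threshold * std

def majority_vote_outliers(x, min_votes=2):
    """Flag a value when at least `min_votes` of the three detectors agree."""
    votes = (
        percentile_outliers(x).astype(int)
        + mad_outliers(x).astype(int)
        + std_outliers(x).astype(int)
    )
    return votes >= min_votes

if __name__ == "__main__":
    # Made-up sample: values clustered around 11 plus one extreme value.
    sample = [10, 11, 12, 11, 10, 12, 11, 10, 12, 11,
              10, 11, 12, 11, 10, 12, 11, 10, 12, 100]
    print(majority_vote_outliers(sample))
```

Requiring agreement from at least two detectors is one way to reduce false alarms from any single rule; the vote threshold is a tunable choice rather than a fixed standard.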
Who this course is for:
- Students who want to pursue a career in machine learning
Instructor
I have 22 years of mixed consulting experience in cybersecurity, machine learning and blockchain. I have worked with IBM, Cisco, EMC-RSA and others, and I was an academic for a couple of years. I have worked on four continents and travelled extensively.
I have a PhD in Engineering, an MSc in AI and an MBA, and I am also a Certified Blockchain Expert.
I have written two books:
1. IT Security Fundamentals - From The Firewall to Quantum Cryptography
2. Corporate Information Security