Kaggle Masterclass - build a Machine Learning Portfolio

Name: Kaggle Masterclass - build a Machine Learning Portfolio
Rating: 3.8 (57 reviews)

Become a Kaggle Grandmaster. Build a Portfolio of Machine Learning Projects, and take your Career to the Next Level.

Created by Taimur Zahid

Last updated 4/2021

English

What you'll learn

Machine Learning
Deep Learning
Data Analytics
Exploratory Data Analysis
Kaggle
Data Science

Course content

12 sections • 61 lectures • 3h 31m total length

Kaggle Categories and Performance Tiers • 4:55
Kaggle categories and performance tiers are tracked independently across competitions, kernels, and discussions, with five tiers (Novice, Contributor, Expert, Master, and Grandmaster) awarded based on medals earned.
Kaggle Medals • 2:05
Understand how Kaggle medals recognize top competition results, popular topics, and insightful comments with bronze, silver, and gold awards based on votes and post popularity.
More on Kaggle Progression • 2:03
Explore how Kaggle progression uses live category leaderboards, profiles and follows, with decay-based competition points, upvotes, and discussion points influencing rankings.

Kaggle Competitions • 9:16
Explore Kaggle competitions across featured, research, getting started, playground, recruitment, annual, and limited participation formats. For example, the Zillow Prize and Jigsaw Toxic Comment Classification illustrate real-world, diverse predictive challenges.
Competition Formats • 4:08
Explore Kaggle competition formats (simple, two-stage, and kernels-only) within the Kaggle framework, and learn how data access, rules, and submissions shape model development and evaluation.
Joining a Competition • 2:37
Navigate the competition listing to join an active Kaggle competition, read and accept the rules, review the data, kernels, and overview tabs, and submit using the evaluation metrics.
Forming a Team • 2:23
Form a Kaggle team, collaborate to improve solutions, name your team, and manage the team leader, invites, merge requests, and merges within daily submission limits and deadlines.
Making a Submission • 3:14
Submit your model predictions to Kaggle to appear on public and private leaderboards, with up to five submissions; build a kernel, generate a solution file, and select the best score.
Data Leakage • 2:21
Explore data leakage in machine learning, where test or future information in training data inflates performance, with examples from the prostate cancer data and a competition relaunch to correct leaks.

Types of Datasets on Kaggle • 4:21
Explore Kaggle's dataset formats, including CSV, JSON, SQLite, and archives such as zip and 7z, and learn how to upload non-proprietary data with clear kernels.
Searching for Datasets • 3:28
Search and filter datasets on Kaggle by size, format, licenses, and tags. Explore the community to discuss datasets, learn coding techniques, and build projects in kernels for your portfolio.
Creating a Dataset • 5:10
Create and share datasets on Kaggle to grow your data science portfolio, uploading files, choosing private or public access, and adding metadata, descriptions, and licenses for reproducible research.
Organizations and Dataset Collaborations • 2:51
Publish datasets under a personal account or as part of an organization, assign owners, invite collaborators with view or edit privileges, and manage private or public datasets, journals, and kernels.
Kaggle Datasets - Technical Specifications • 1:12
Understand the technical specifications of Kaggle datasets, including 20 GB per dataset, a 20 GB private cap, and a limit of 50 top-level files; archives and data types appear in Data Explorer.

Types of Kaggle Kernels • 2:56
Kaggle kernels provide a cloud-based environment for data exploration, machine learning, and collaboration. Choose script or notebook, work in Python or R, and run, edit, and share code with markdown.
Searching for Kernels • 3:43
Explore Kaggle kernels, a collaborative, open-source repository of reproducible data science and machine learning code. Filter by category, language, or engagement, and reuse notebooks to build your portfolio.
Kernel Editor • 1:43
Operate the kernel editor, editing window, console, and settings to run scripts or notebooks for analysis and competition submissions. Share kernels, add data, and manage packages.
Data Sources • 1:10
Launch a kernel from a dataset or competition, or use add data to search; accept competition rules, mix data sources, and save up to 5 GB of output for reuse.
Collaborating on Kernels • 1:12
Collaborate on kernels by inviting others to view or edit, set public or private access in settings, grant privileges, and click save to trigger email notifications.
More on Kaggle Kernels • 4:13
Learn how Kaggle kernels run in Docker containers with specific Docker images and kernel versions, manage packages, and enable GPU environments to accelerate building a machine learning portfolio.
Kaggle Kernels - Technical Specifications • 1:37
Discover the technical specifications of Kaggle kernels, including six hours of execution time, autosave and temporary storage limits, RAM and dataset size caps, and commit-and-run rules.

Installing and Authenticating • 1:33
Install the Kaggle public API with Python and pip via the command line on Windows, Mac, or Linux. Authenticate by generating an API token from your account.
Kaggle API with Competitions • 1:06
Master the Kaggle API and its command line interface to interact with competitions, including authenticated setup, and learn which rules must be accepted on the competition page.
Listing Competitions and Competition Files • 2:25
List competitions and their files using a command-driven interface; filter by group and category, search terms, and sort by prize or deadlines, and export results as CSV.
Downloading a Competition • 1:14
Learn to download a Kaggle competition via the API using the download command, with help and competition arguments, specifying file name or suffix to target an item, after accepting rules.
Kaggle API with Datasets • 1:11
Learn to use the Kaggle API to download, create, and update datasets, schedule automatic updates with third-party tools, and consult GitHub for the latest CLI commands.
Listing and Downloading Datasets • 1:51
Learn to list, search, and download Kaggle datasets via the command line, with sorting, CSV-formatted results, and filters by type, license, tags, and owner.
Creating and Maintaining Datasets • 2:12
Learn to create and maintain datasets on Kaggle by organizing files into folders, generating and updating a metadata file, and using commands to publish, version, and manage dataset options.
Kaggle API with Kernels • 1:11
Explore the Kaggle API with kernels to search, download, and run kernels using Kaggle compute resources, with installation, authentication, and command line interface commands described in official docs.
Listing Kaggle Kernels • 2:05
List Kaggle kernels with customizable filters, including competition or dataset, user, language, kernel type, and output type, then sort by hotness, score, or view count.
Kernel Metadata File • 1:11
Initialize the kernel-metadata.json file via the command line or API to upload and run a kernel, using a data folder and metadata for new or existing kernels.
Push and Pull a Kernel • 1:12
Push and pull a kernel using command line instructions. Specify the target folder and download location, and use the url and wp arguments to control metadata.
Checking the Status and Output of a Kernel • 1:11
Check the status and retrieve the output of a kernel using command options, specify target identifiers, choose download directories, enable force updates, and suppress progress messages.
Creating and Running a New Kernel • 1:12
Create and run a kernel on Kaggle by organizing a folder of Python scripts and notebooks, generating a metadata file with the title and id, and running it on Kaggle.
Creating and Running a New Kernel Version • 1:11
Download the latest code and metadata for your kernel, rename it if needed, update the id and title fields before your next push, then run the updated kernel.
Configurations • 2:09
Learn to configure the Kaggle API with the command line tool: view, set, and clear config values, download files, and consult GitHub docs for current commands.
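The kernel metadata lectures above revolve around a kernel-metadata.json file whose id and title fields identify the kernel being pushed. The following is a minimal sketch of what such a file can look like, built in Python. The field names follow the public Kaggle API documentation as best understood here; the username, slug, title, and file name are hypothetical placeholders, not values taken from the course.

```python
import json


def build_kernel_metadata(username, slug, title, code_file):
    """Return a dict shaped like a kernel-metadata.json file.

    All argument values are supplied by the caller; nothing here
    contacts Kaggle or validates that the kernel exists.
    """
    return {
        "id": f"{username}/{slug}",        # "<owner>/<kernel-slug>"
        "title": title,                    # human-readable kernel title
        "code_file": code_file,            # script or notebook to upload
        "language": "python",
        "kernel_type": "notebook",         # or "script"
        "is_private": True,
        "enable_gpu": False,
        "enable_internet": False,
        "dataset_sources": [],             # e.g. ["owner/dataset-slug"]
        "competition_sources": ["titanic"],
        "kernel_sources": [],
    }


# Hypothetical values, for illustration only.
metadata = build_kernel_metadata(
    username="your-username",
    slug="titanic-random-forest",
    title="Titanic Random Forest",
    code_file="titanic.ipynb",
)
metadata_json = json.dumps(metadata, indent=2)
print(metadata_json)
```

In practice the JSON text would be saved as kernel-metadata.json in the same folder as the code file before pushing. For a new kernel version, the id and title are the two fields to double-check first.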

Introduction to the Titanic - Machine Learning from Disaster Competition • 9:18
Explore building a binary machine learning model to predict Titanic survivors using features like gender, age, and class, with training and testing data and feature engineering for submission.
Implementing a Model to Predict Survival of Titanic Passengers • 14:15
Develop a Titanic survival model in Python using feature engineering and one hot encoding, train a random forest, and generate a Kaggle submission file.
Titanic - Machine Learning from Disaster Project Notebook • 0:05
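Two of the mechanical steps named in the Titanic lectures, one hot encoding and generating a submission file, can be sketched with the standard library alone. This is an illustrative sketch rather than the course's notebook code: the helper names one_hot and write_submission are hypothetical, the predictions are made up, and the random forest itself is left out. The two-column PassengerId,Survived layout is the format the Titanic competition expects.

```python
import csv
import io


def one_hot(values):
    """One hot encode a list of categorical values into 0/1 indicator dicts."""
    categories = sorted(set(values))
    return [{f"is_{c}": int(v == c) for c in categories} for v in values]


def write_submission(passenger_ids, predictions, out):
    """Write a Titanic-style submission (PassengerId,Survived) to a text stream."""
    if len(passenger_ids) != len(predictions):
        raise ValueError("passenger_ids and predictions must have equal length")
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["PassengerId", "Survived"])
    for pid, pred in zip(passenger_ids, predictions):
        writer.writerow([pid, int(pred)])


# Encode a small, made-up "Sex" column.
encoded = one_hot(["male", "female", "female"])

# Made-up predictions; in the course these would come from the trained model.
buffer = io.StringIO()
write_submission([892, 893, 894], [0, 1, 0], buffer)
submission_text = buffer.getvalue()
print(submission_text)
```

Writing to a stream keeps the sketch easy to test; passing an open file handle instead would produce the CSV file that gets uploaded.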

Introduction to the Iris Species Dataset • 1:20
Analyze the iris species dataset to build algorithms that classify three iris species. Note the Fisher 1936 study, the UCI repository, six attributes, and 50 samples per class.
Coding an Iris Species Classifier in Sci-kit Learn • 8:21
Import Python packages and explore the iris dataset with pandas and seaborn to reveal species patterns, encode labels, split data, and evaluate a random forest and a support vector classifier.
Iris Species Classifier Project Notebook • 0:05

Introduction to House Prices - Advanced Regression Techniques Competition • 3:16
Predict house sale prices using 79 attributes in a regression competition, using train and test data, and submit in the provided format with rules and editorial resources.
House Prices Dataset Description • 6:06
Explore the house prices dataset, detailing the sale price target and a wide range of property features such as neighborhood, overall quality, year remodeled, foundation, basement, and heating.
House Price Prediction using Random Forest • 19:16
Explore the dataset, handle missing values, engineer features, and train a random forest to predict house prices, preparing Kaggle submissions.
House Price Prediction using Random Forest Project Notebook • 0:05

Introduction to Survival Analysis • 2:38
Survival analysis estimates the time to an event and lifespan, starting at time zero, and uses stratified sampling and simple random sampling to study births, deaths, and lifetimes.
Censorship • 3:49
Explore censorship in survival analysis, showing how censored survival times (right censoring, left censoring, and left truncation, also called late entry) are handled to avoid bias when studying churn and other events.
The Survival Function and the Hazard Function • 2:36
Understand the survival function as the probability of no event by time t and the hazard intensity function as the instantaneous risk, with delta t approaching zero.
Kaplan-Meier Estimate and Nelson-Aalen Fitter • 3:09
Explore Kaplan-Meier estimation to model survival probabilities and at risk populations, and apply the Nelson-Aalen fitter to obtain an average view of survival using life tables and survival curves.
Survival Regression - Cox Proportional Hazard Regression Model • 6:39
Explore survival regression with covariates to model time-to-event using the Cox proportional hazards model, estimating regression coefficients via partial likelihood and addressing censoring and stratification.
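For reference, the quantities named in this section can be written out in their standard textbook form. The notation is an assumption made here, not taken from the lectures: T is the event time, x a covariate vector, beta the regression coefficients, h_0 the baseline hazard, delta_i the event indicator, and R(t_i) the set of subjects still at risk at time t_i. The partial likelihood is shown for the simple case of no tied event times.

```latex
% Survival function: probability that no event has occurred by time t
S(t) = P(T > t)

% Hazard function: instantaneous event rate, given survival up to t
h(t) = \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t}
     = -\frac{d}{dt} \log S(t)

% Cox proportional hazards model
h(t \mid x) = h_0(t) \exp\left(\beta^{\top} x\right)

% Partial likelihood over observed events (no ties)
L(\beta) = \prod_{i : \delta_i = 1}
  \frac{\exp\left(\beta^{\top} x_i\right)}
       {\sum_{j \in R(t_i)} \exp\left(\beta^{\top} x_j\right)}
```

The baseline hazard h_0 cancels out of the partial likelihood, which is why the coefficients can be estimated without specifying it.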

Introduction to Telco Customer Churn Dataset • 3:04
Explore telco churn as a regression task to estimate tenure, using a dataset of 7,043 customers with 21 columns where Churn is the churn target.
Telco Customer Churn Pipeline: Data Analysis and Visualization • 7:52
Analyze and visualize telco customer churn with a machine learning pipeline, importing data, inspecting columns in pandas, and exploring distributions of gender, senior citizens, dependents, and internet services.
Telco Customer Churn Pipeline: Data Preprocessing • 4:19
Encode telco churn data by converting categorical features to numeric values, handle missing data with mean imputation, and prepare a pandas-driven ML-ready dataset.
Survival Analysis to predict Churn using Kaplan-Meier Estimate • 7:08
Explore survival analysis with the Kaplan-Meier estimate to predict churn, comparing groups by age, partner status, dependents, service types, and payment methods while interpreting median survival times.
Survival Regression Analysis using Cox Proportional Hazard Regression Model • 10:31
Learn to apply the Cox proportional hazards model for survival analysis, including data preparation, dummy encoding, and fitting with tenure and event columns.
Survival Regression Analysis to Predict Churn Project Notebook • 0:05
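To make the Kaplan-Meier estimate used in this project concrete, here is a minimal pure-Python sketch of the product-limit calculation, with tenure as the duration and churn as the event indicator. It is not the course's notebook code, which may well rely on a survival analysis library; the function name kaplan_meier and the five-customer sample are hypothetical.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: observed time for each subject (e.g. tenure in months)
    events:    1 if the event (churn) was observed, 0 if right-censored
    """
    if len(durations) != len(events):
        raise ValueError("durations and events must have equal length")

    event_counts = Counter(t for t, e in zip(durations, events) if e)
    exit_counts = Counter(durations)  # events and censorings both leave the risk set

    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exit_counts):
        d = event_counts.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= exit_counts[t]
    return curve


# Made-up sample: tenure in months and whether the customer churned.
tenure = [1, 2, 2, 3, 5]
churned = [1, 1, 0, 1, 0]
km_curve = kaplan_meier(tenure, churned)
for t, s in km_curve:
    print(f"month {t}: S(t) = {s:.2f}")
```

Customers censored at a given time are still counted as at risk at that time, which is the usual convention. Comparing groups, as the lecture describes, amounts to running the same calculation separately on each subgroup.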

Requirements

Intermediate Python Programming Skills.

Description

This career-ready Masterclass is designed to give you hands-on, in-depth exposure to Data Science through a learn-by-doing approach. The best way to land your dream job is to build a portfolio of projects, and the best platform for a Data Scientist to do that is Kaggle!

Over the years, Kaggle has become the most popular community for Data Scientists. Kaggle not only helps you learn new skills and apply new techniques, but it now plays a crucial role in your career as a Data Professional.

This course will give you in-depth hands-on experience with a variety of projects that include the necessary components to become a proficient data scientist. By completing the projects in this course, you will gain hands-on experience with these components and have a set of projects to reflect what you have learned. These components include the following:

Data Analysis and Wrangling using NumPy and Pandas.
Exploratory Data Analysis using Matplotlib and Seaborn.
Machine Learning using Scikit Learn.
Deep Learning using TensorFlow.
Time Series Forecasting using Facebook Prophet.
Time Series Forecasting using Scikit-Time.

This course primarily focuses on helping you stand out by building a portfolio comprising a series of Jupyter Notebooks in Python that use Competitions and Public Datasets hosted on the Kaggle platform. You will also set up a Kaggle profile to support future employment opportunities.

Who this course is for:

Beginner Python Developers who want to get into Data Science.
Data Scientists looking to expand their skillset.
Data Scientists and Aspiring Data Scientists who wish to create a strong portfolio for potential career opportunities.

Course content

Kaggle Progression System • 3 lectures • 9min

Kaggle Competitions • 6 lectures • 24min

Kaggle Datasets • 5 lectures • 17min

Kaggle Kernels • 7 lectures • 17min

Kaggle Public API • 15 lectures • 23min

Project: Titanic - Machine Learning from Disaster • 3 lectures • 24min

Project: Iris Species Classification • 3 lectures • 10min

Project: House Price Prediction • 4 lectures • 29min

Theory: Survival Regression Analysis • 5 lectures • 19min

Project: Telco Customer Churn • 6 lectures • 33min
