
Kaggle categories and performance tiers are tracked independently across competitions, kernels, and discussions, with five tiers (novice, contributor, expert, master, and grandmaster) awarded based on medals earned.
Understand how Kaggle medals recognize top competition results, popular topics, and insightful comments with bronze, silver, and gold awards based on votes and post popularity.
Explore how Kaggle progression uses live category leaderboards, profiles and follows, with decay-based competition points, upvotes, and discussion points influencing rankings.
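As a rough illustration of what "decay-based competition points" means, the sketch below follows the points formula Kaggle has published on its progression page, as best recalled here: points shrink with rank, team size, and time since the deadline, and grow slowly with the number of competing teams. The function name and parameter names are hypothetical, and the constants are assumptions that Kaggle may change.

```python
import math


def competition_points(rank, n_teams, n_teammates=1, days_since_deadline=0.0):
    """Approximate decayed points for a single competition result.

    Assumed formula (constants may differ from Kaggle's current ones):
    100000 / sqrt(teammates) * rank^-0.75 * log10(1 + log10(teams)) * e^(-t/500)
    """
    if rank < 1 or n_teams < 1 or n_teammates < 1:
        raise ValueError("rank, n_teams and n_teammates must be at least 1")
    if days_since_deadline < 0:
        raise ValueError("days_since_deadline must not be negative")

    team_share = 100000.0 / math.sqrt(n_teammates)    # split across teammates
    rank_factor = rank ** -0.75                        # better rank, more points
    size_factor = math.log10(1 + math.log10(n_teams))  # larger fields count more
    decay = math.exp(-days_since_deadline / 500.0)     # points fade over time
    return team_share * rank_factor * size_factor * decay


if __name__ == "__main__":
    fresh = competition_points(rank=10, n_teams=1000)
    aged = competition_points(rank=10, n_teams=1000, days_since_deadline=365)
    print(f"at deadline: {fresh:.0f} points, one year later: {aged:.0f} points")
```

Under this formula an old result never drops to zero, but it steadily loses weight against recent ones, which is why rankings favour active competitors.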
Explore Kaggle competitions across featured, research, getting started, playground, recruitment, annual, and limited participation formats. For example, the Zillow Prize and Jigsaw Toxic Comment Classification illustrate real-world, diverse predictive challenges.
Explore Kaggle competition formats (simple, two-stage, and kernels-only) within the Kaggle framework, and learn how data access, rules, and submissions shape model development and evaluation.
Navigate the competition listing to join an active Kaggle competition, read and accept the rules, review the data, kernels, and overview tabs, and submit using the evaluation metrics.
Form a Kaggle team, collaborate to improve solutions, name your team, and manage the team leader, invites, merge requests, and merges within daily submission limits and deadlines.
Submit your model predictions to Kaggle to appear on public and private leaderboards, with up to five submissions; build a kernel, generate a solution file, and select the best score.
Explore data leakage in machine learning, where test or future information in training data inflates performance, with examples from the prostate cancer data and competition relaunch to correct leaks.
Explore Kaggle's dataset formats, including CSV, JSON, SQLite, and archives like ZIP and 7z, and learn how to upload non-proprietary data with clear kernels.
Search and filter datasets on Kaggle by size, format, licenses, and tags. Explore the community to discuss datasets, learn coding techniques, and build projects in kernels for your portfolio.
Create and share datasets on Kaggle to grow your data science portfolio, uploading files, choosing private or public access, and adding metadata, descriptions, and licenses for reproducible research.
Publish datasets under a personal account or as part of an organization, assign owners, invite collaborators with view or edit privileges, and manage private or public datasets, journals, and kernels.
Understand Kaggle datasets technical specifications, including 20 gigabytes per dataset, a 20 gigabyte private cap, and a limit of 50 top-level files; archives and data types appear in Data Explorer.
Kaggle kernels provide a cloud-based environment for data exploration, machine learning, and collaboration. Choose script or notebook, work in Python or R, and run, edit, and share code with markdown.
Explore Kaggle kernels, a collaborative, open-source repository of reproducible data science and machine learning code. Filter by category, language, or engagement, and reuse notebooks to build your portfolio.
Operate the kernel editor, editing window, console, and settings to run scripts or notebooks for analysis and competition submissions. Share kernels, add data, and manage packages.
Launch a kernel from a dataset or competition, or use Add Data to search; accept competition rules, mix data sources, and save up to 5 GB of output for reuse.
Collaborate on kernels by inviting others to view or edit, set public or private access in settings, grant privileges, and click save to trigger email notifications.
Learn how Kaggle kernels run in Docker containers with specific Docker images and kernel versions, manage packages, and enable GPU environments to accelerate building a machine learning portfolio.
Discover Kaggle kernels technical specifications, including six hours of execution time, autosave and temporary storage limits, RAM and dataset size caps, and commit-and-run rules.
Install the Kaggle public API with Python and pip via the command line on Windows, Mac, or Linux. Authenticate by generating an API token from your account.
Master the Kaggle API and its command line interface to interact with competitions, including authenticated setup, and learn which rules must be accepted on the competition page.
List competitions and their files using a command-driven interface; filter by group and category, search terms, and sort by prize or deadlines, and export results as CSV.
Learn to download a Kaggle competition via the API using the download command, with help and competition arguments, specifying file name or suffix to target an item, after accepting rules.
Learn to use the Kaggle API to download, create, and update datasets, schedule automatic updates with third-party tools, and consult GitHub for the latest CLI commands.
Learn to list, search, and download Kaggle datasets via the command line, with sorting, CSV-formatted results, and filters by type, license, tags, and owner.
Learn to create and maintain datasets on Kaggle by organizing files into folders, generating and updating a metadata file, and using commands to publish, version, and manage dataset options.
Explore the Kaggle API with kernels to search, download, and run kernels using Kaggle compute resources, with installation, authentication, and command line interface commands described in official docs.
List Kaggle kernels with customizable filters, including competition or dataset, user, language, kernel type, and output type, then sort by hotness, score, or view count.
Initialize the kernel-metadata.json file via the command line or API to upload and run a kernel, using a data folder and metadata for new or existing kernels.
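To make the metadata step more concrete, here is a minimal sketch that builds a kernel-metadata.json file by hand with Python's standard library. The field names follow the Kaggle API's kernel metadata format as commonly documented, but treat the exact set of fields and their accepted values as assumptions and check the official docs; the helper names, the username, and the slug are hypothetical.

```python
import json
from pathlib import Path


def build_kernel_metadata(username, slug, title, code_file, dataset_sources=None):
    """Return a dict shaped like kernel-metadata.json (fields are assumptions)."""
    kernel_type = "notebook" if code_file.endswith(".ipynb") else "script"
    return {
        "id": f"{username}/{slug}",          # owner/slug identifier of the kernel
        "title": title,
        "code_file": code_file,              # path relative to the kernel folder
        "language": "python",
        "kernel_type": kernel_type,
        "is_private": True,
        "enable_gpu": False,
        "enable_internet": False,
        "dataset_sources": list(dataset_sources or []),
        "competition_sources": [],
        "kernel_sources": [],
    }


def write_kernel_metadata(folder, metadata):
    """Write the metadata next to the code file and return the file path."""
    path = Path(folder) / "kernel-metadata.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(metadata, indent=2), encoding="utf-8")
    return path


if __name__ == "__main__":
    # "example-user" and "titanic-starter" are placeholder values.
    meta = build_kernel_metadata(
        "example-user", "titanic-starter", "Titanic Starter", "titanic.ipynb"
    )
    print(json.dumps(meta, indent=2))
```

Generating the file from code rather than editing it by hand keeps the id and title consistent when the same folder is pushed repeatedly.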
Push and pull a kernel using command line instructions. Specify the target folder and download location, and use the url and wp arguments to control metadata.
Check the status and retrieve the output of a kernel using command options, specify target identifiers, choose download directories, enable force updates, and suppress progress messages.
Create and run a kernel on Kaggle by organizing a folder of Python scripts and notebooks, generating a metadata file with the title and id, and running it on Kaggle.
Download the latest code and metadata for your kernel, rename it if needed, update the id field and the title field in your next push, then run the updated kernel.
Learn to configure the Kaggle API with the command line tool: view, set, and clear config values, download files, and consult GitHub docs for current commands.
Explore building a binary machine learning model to predict Titanic survivors using features like gender, age, and class, with training and testing data and feature engineering for submission.
Develop a Titanic survival model in Python using feature engineering and one hot encoding, train a random forest, and generate a Kaggle submission file.
Analyze the Iris species dataset to build algorithms that classify three iris species. Note the Fisher 1936 study, the UCI repository, six attributes, and 50 samples per class.
Import Python packages and explore the Iris dataset with pandas and seaborn to reveal species patterns, encode labels, split data, and evaluate a random forest and a support vector classifier.
Predict house sale prices using 79 attributes in a regression competition, using train and test data, and submit in the provided format with rules and tutorial resources.
Explore the house prices dataset, detailing the sale price target and a wide range of property features such as neighborhood, overall quality, year remodeled, foundation, basement, and heating.
Explore the dataset, handle missing values, engineer features, and train a random forest to predict house prices, preparing Kaggle submissions.
Survival analysis estimates the time to an event and lifespan, starting at time zero, and uses stratified sampling and simple random sampling to study births, deaths, and lifetimes.
Explore censoring in survival analysis, showing how censored survival times (right censoring, left censoring, and left truncation, also called late entry) are handled to avoid bias when studying churn and other events.
Understand the survival function as the probability of no event by time t and the hazard intensity function as the instantaneous risk, with delta t approaching zero.
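Written out, with T denoting the random time of the event, the standard textbook definitions of these two functions and the identity linking them are as follows. The notation is a conventional choice, not necessarily the one used in the lecture.

```latex
S(t) = P(T > t), \qquad
h(t) = \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t}, \qquad
S(t) = \exp\!\left(-\int_0^t h(u)\,du\right)
```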
Explore Kaplan-Meier estimation to model survival probabilities and at-risk populations, and apply the Nelson-Aalen fitter to obtain an average view of survival using life tables and survival curves.
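The Kaplan-Meier idea can be sketched in a few lines of plain Python: at each observed event time, multiply the running survival probability by the fraction of the at-risk population that did not experience the event. This is a from-scratch illustration with hypothetical function names, not the fitter the course uses, and it handles right censoring only.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: observed time for each subject
    events:    1/True if the event was observed, 0/False if right-censored
    """
    if len(durations) != len(events):
        raise ValueError("durations and events must have the same length")

    event_counts = Counter(t for t, e in zip(durations, events) if e)
    exit_counts = Counter(durations)   # events and censorings leaving at each time
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exit_counts):
        d = event_counts.get(t, 0)
        if d:
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exit_counts[t]      # everyone observed at t leaves the risk set
    return curve


def median_survival_time(curve):
    """First time at which estimated survival drops to 0.5 or below, else None."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


if __name__ == "__main__":
    curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
    for t, s in curve:
        print(f"t={t}: S(t)={s:.2f}")
    print("median survival time:", median_survival_time(curve))
```

Censored subjects never reduce the survival estimate directly; they only shrink the at-risk count used at later event times.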
Explore survival regression with covariates to model time-to-event using the Cox proportional hazards model, estimating regression coefficients via partial likelihood and addressing censoring and stratification.
Explore telco churn as a regression task to estimate tenure, using a dataset of 7,043 customers with 21 columns where Churn is the churn target.
Analyze and visualize telco customer churn with a machine learning pipeline, importing data, inspecting columns in pandas, and exploring distributions of gender, senior citizens, dependents, and internet services.
Encode telco churn data by converting categorical features to numeric values, handle missing data with mean imputation, and prepare a pandas-driven ML-ready dataset.
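The two preparation steps named here, turning categories into numbers and filling gaps with the column mean, can be illustrated without any libraries. The lecture works in pandas; the standard-library sketch below only shows the underlying logic, with hypothetical helper names and made-up sample values.

```python
from statistics import mean


def label_encode(values):
    """Map each distinct category to an integer code (codes follow sorted order)."""
    codes = {category: i for i, category in enumerate(sorted(set(values)))}
    return [codes[v] for v in values], codes


def mean_impute(values):
    """Replace None entries with the mean of the observed (non-missing) values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    fill = mean(observed)
    return [fill if v is None else v for v in values]


if __name__ == "__main__":
    # Made-up sample values for illustration only.
    encoded, mapping = label_encode(["Yes", "No", "Yes"])
    print(encoded, mapping)
    print(mean_impute([10.0, None, 30.0]))
```

Mean imputation is simple but pulls missing rows toward the average, so it is worth checking how many values are missing before relying on it.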
Explore survival analysis with the Kaplan-Meier estimate to predict churn, comparing groups by age, partner status, dependents, service types, and payment methods while interpreting median survival times.
Learn to apply the Cox proportional hazards model for survival analysis, including data preparation, dummy encoding, and fitting with tenure and event columns.
This career-ready Masterclass is designed to give you hands-on, in-depth exposure to Data Science through a learn-by-doing approach. The best way to land your dream job is to build a portfolio of projects, and the best platform for a Data Scientist to do so is Kaggle!
Over the years, Kaggle has become the most popular community for Data Scientists. Kaggle not only helps you learn new skills and apply new techniques, but it now plays a crucial role in your career as a Data Professional.
This course gives you in-depth, hands-on experience through a variety of projects covering the components you need to become a proficient data scientist. By completing them, you will have a set of projects that reflect what you have learned. These components include the following:
Data Analysis and Wrangling using NumPy and Pandas.
Exploratory Data Analysis using Matplotlib and Seaborn.
Machine Learning using scikit-learn.
Deep Learning using TensorFlow.
Time Series Forecasting using Facebook Prophet.
Time Series Forecasting using Scikit-Time.
This course primarily focuses on helping you stand out by building a portfolio comprising a series of Jupyter Notebooks in Python that use Competitions and Public Datasets hosted on the Kaggle platform. You will also set up a Kaggle profile that helps you stand out for future employment opportunities.