CRISP-ML(Q)-Data Pre-processing Using Python(2026)

Rating: 4.7 (8 reviews)

Data Science - Data Pre-processing Using Python

Created by 360DigiTMG Elearning

Last updated 3/2026

English

What you'll learn

Understand Project Management Methodology to Handle Data-Related Projects in a Structured Manner.
Understand Business Problem Definition, Setting Objectives & Constraints.
Understand Data Types as well as Data Collection Mechanisms.
Understand Exploratory Data Analysis (EDA) / Descriptive Statistics as well as Graphical Representation.
Understand the various Data Cleansing / Pre-Processing Tasks using Python.

Course content

17 sections • 85 lectures • 19h 47m total length

Introduction to Project Management Methodology CRISP-ML(Q) • 0:57
Agenda & Stages of Analytics • 3:18
What is Diagnostic Analytics? • 1:15
What is Predictive Analytics? • 1:52
Explore predictive analytics by forecasting future values from current data, such as COVID-19 cases or vaccination rates. Assess the validity of predictions amid changing conditions and define appropriate time horizons.
What is Prescriptive Analytics? • 11:35
What is CRISP-ML(Q)? • 3:02

Agenda: Data Understanding • 0:43
Introduction to Data Understanding • 6:12
Data Types: Continuous Data vs Discrete Data • 11:11
Categorical Data vs Count Data • 6:39
Practical Data Understanding Using a Real-Time Example • 11:09
Scale of Measurement • 3:29
Quantitative vs Qualitative • 4:57
Structured vs Unstructured Data • 12:58
Big Data vs Not Big Data • 9:24
Explore big data versus non-big data through the three V's (volume, velocity, and variety) and learn storage and compute decisions for structured, semi-structured, and unstructured data using SQL, NoSQL, and Hadoop.
Cross-Sectional vs Time Series vs Panel/Longitudinal Data • 6:53
Identify cross-sectional, time series, and panel data and know when date and time matter. Distinguish data structures (multiple columns versus a single time column) to determine appropriate techniques.
Balanced vs Imbalanced (or Rare Events) • 15:29
Batch Data (Offline) vs Live Streaming Data (Online) • 17:17

What is Data Collection? • 4:11
Identify primary and secondary data sources, distinguish their roles, map input variables to output (dependent) variables, and convert unstructured data into structured formats for machine learning analysis.
Understanding Secondary Data Sources • 13:26
Explore secondary data sources and how they differ from primary data sources. See telecom customer data, open data like Google Maps, drone analytics, and syndicated data to enrich insights.
Understanding Primary Data Sources • 22:17
Harness primary data sources by combining bank data with external data from social media to improve loan default predictions, and distinguish primary data from secondary data in IoT contexts.
Understanding Data Collection Using Surveys • 6:43
Understanding Data Collection Using DoE • 7:09
Understanding Possible Errors in the Data Collection Stage • 16:15
Understanding Bias & Fairness • 5:11

Recap of Preliminary Concepts • 2:09
Understanding Normal Distribution • 15:36
Understanding Standard Normal Distribution & What is a Z-Score • 28:09
Understanding Measures of Central Tendency (First Moment Business Decision) • 26:36
Understanding Measures of Dispersion (Second Moment Business Decision) • 10:46
Understanding Box Plot (Difference Between Percentile, Quantile, and Quartile) • 6:09
Explore how box plots use percentiles, quartiles, and quantiles to display results, with the minimum and maximum, the 25th, 50th, and 75th percentiles (Q1, Q2, Q3), and the 100th percentile (fourth quartile).
Understanding Graphical Techniques: Q-Q Plot • 8:34
Assess normality with graphical techniques, including Q-Q plots, histograms, and box plots, and understand standardized values and theoretical quantiles to determine if data are normally distributed.
Understanding Bivariate Scatter Plot • 35:31
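
As an illustration of the Z-score and box plot lectures listed above, here is a minimal sketch that standardizes a small sample and computes the numbers a box plot draws. The sample values and the function names `z_scores` and `box_plot_summary` are assumptions made for this example, not material from the course, and the sketch uses only Python's standard library so it runs without extra installs.

```python
import statistics

def z_scores(values):
    """Standardize values: z = (x - mean) / standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # population standard deviation
    return [(x - mean) / sd for x in values]

def box_plot_summary(values):
    """Return the five numbers a box plot draws, plus the 1.5 * IQR fences."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return {
        "min": min(values),
        "q1": q1,
        "median": q2,
        "q3": q3,
        "max": max(values),
        "lower_fence": q1 - 1.5 * iqr,
        "upper_fence": q3 + 1.5 * iqr,
    }

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up sample data
z = z_scores(scores)
summary = box_plot_summary(scores)

print(z)
print(summary)
```

Values beyond the fences are the points a box plot would typically draw as individual outliers. Different quartile conventions exist, so other tools may report slightly different Q1 and Q3 values for the same data.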

Recap of Concepts until Phase-2 • 16:05
Understanding 1st & 2nd Moment Business Decision Using Python • 24:28
Understanding 3rd Moment Business Decision Using Python • 20:58
Explore the third moment, skewness, and its formula using (x minus mean) cubed over sigma cubed to reveal non-normal, positively or negatively skewed data; learn through histogram examples.
Understanding 4th Moment Business Decision Using Python • 20:34
Understanding Univariate (Bar Plot & Histogram) Using Python • 14:14
Explore univariate visualizations, including bar plots and histograms, to interpret single-variable data through bins, frequency distributions, and differences between normal and non-normal patterns.
Understanding Univariate Plots Using Python • 34:30
Explore univariate plots in Python with histograms, density plots, and box plots, and interpret skewness and distribution using pandas, NumPy, seaborn, and Matplotlib.
Understanding Univariate Box Plot Using Python • 12:13
Understanding Univariate Q-Q Plot Using Python • 8:00
Understanding Bivariate Scatter Plot Using Python • 32:12
Load a dataset in Python, read a CSV with pandas, and create a bivariate scatter plot of waist circumference versus adipose tissue, interpreting correlation and covariance for direction and strength.
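
The lectures above describe skewness as (x minus mean) cubed over sigma cubed and use covariance and correlation to read a scatter plot. The sketch below shows one way these quantities could be computed, assuming the population form (averaging over all n observations). The function names and every data value, including the waist and adipose tissue numbers, are hypothetical and not taken from the course dataset.

```python
import math

def moments(values):
    """Return mean, population variance, skewness, and excess kurtosis."""
    n = len(values)
    mean = sum(values) / n                                   # 1st moment
    variance = sum((x - mean) ** 2 for x in values) / n      # 2nd moment
    sigma = math.sqrt(variance)
    # 3rd moment: average of (x - mean)^3, divided by sigma^3
    skewness = sum((x - mean) ** 3 for x in values) / (n * sigma ** 3)
    # 4th moment: average of (x - mean)^4 over sigma^4, minus 3 (normal = 0)
    kurtosis = sum((x - mean) ** 4 for x in values) / (n * sigma ** 4) - 3
    return mean, variance, skewness, kurtosis

def covariance_and_correlation(xs, ys):
    """Return population covariance (direction) and Pearson r (strength)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs) / n)
    sy = math.sqrt(sum((y - my) ** 2 for y in ys) / n)
    return cov, cov / (sx * sy)

symmetric = [1, 2, 3, 4, 5]          # made-up data, no skew expected
right_skewed = [1, 1, 2, 2, 3, 10]   # made-up data with a long right tail

# Hypothetical measurements, for illustration only
waist = [70, 80, 90, 100, 110]
adipose = [30, 60, 100, 130, 180]

print(moments(symmetric))
print(moments(right_skewed))
print(covariance_and_correlation(waist, adipose))
```

A positive skewness indicates a longer right tail and a negative value a longer left tail. Library routines such as the pandas `skew()` and `kurt()` methods apply a sample-size correction, so their results can differ slightly from these population formulas.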

Requirements

No Programming and No Statistics knowledge is needed because everything is taught right from scratch.
Basic Computer Knowledge and Primary School Mathematics Knowledge is sufficient.

Description

This program helps aspirants entering the field of data science understand project management methodology, which provides a structured approach to handling data science projects. You will learn the importance of understanding the business problem alongside its objectives and constraints, and of defining success criteria, which cover business, ML, and economic aspects. You will also learn about the project charter, the first document created on any project.

The various data types and the four measures of data are explained alongside data collection mechanisms, so that appropriate data is obtained for further analysis. Primary data collection techniques, including surveys and experiments, are explained in detail. Exploratory Data Analysis (EDA), or descriptive analytics, is explained with a focus on all four moments of business decision as well as graphical representations, including univariate, bivariate, and multivariate plots. Box plots, histograms, scatter plots, and Q-Q plots are covered.

The prime focus is on understanding data preprocessing techniques using Python, which ensures that appropriate data is given as input for model building. Preprocessing techniques, including outlier analysis, imputation, and scaling, are discussed using practice-oriented datasets.
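
To make the preprocessing steps named in this description concrete, here is a minimal sketch of median imputation, IQR-based outlier capping (winsorization), and min-max scaling on a single column. The specific techniques, the function names, and the `raw_ages` values are assumptions chosen for illustration. The course may use other variants and works with pandas, while this sketch sticks to the standard library so it runs without extra installs.

```python
import statistics

def impute_median(values):
    """Replace missing entries (None) with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]

def winsorize_iqr(values):
    """Cap values lying outside the 1.5 * IQR fences at the fence itself."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [min(max(v, lower), upper) for v in values]

def min_max_scale(values):
    """Rescale values linearly onto the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical column with one missing value and one extreme value
raw_ages = [23, 25, None, 31, 29, 27, 120]

imputed = impute_median(raw_ages)   # fill the gap first
capped = winsorize_iqr(imputed)     # then tame the outlier
scaled = min_max_scale(capped)      # finally bring values onto a common scale

print(imputed)
print(capped)
print(scaled)
```

The order matters in this sketch: the outlier is capped before scaling because min-max scaling is sensitive to extreme values, which would otherwise compress the remaining data into a narrow band near zero.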

Who this course is for:

Beginner, intermediate, and advanced learners
Freshers who are new to data science and want to start a career in the field
Working professionals from different industries
Lecturers, professors, and teachers whose primary role is to teach students data-related concepts

Course content

Introduction • 6 lectures • 22min

Business Understanding Phase • 3 lectures • 37min

Data Understanding Phase | Data Types • 12 lectures • 1hr 46min

Data Understanding Phase | Data Collection • 7 lectures • 1hr 15min

Understanding Basic Statistics • 5 lectures • 43min

Data Preparation Phase | Exploratory Data Analysis (EDA) • 8 lectures • 2hr 14min

Python Installation & Set-up • 4 lectures • 46min

Data Preparation Phase | EDA Using Python • 9 lectures • 3hr 3min

Data Preparation Phase | Data Cleansing - Type Casting • 3 lectures • 30min

Data Preparation Phase | Data Cleansing - Handling Duplicates • 3 lectures • 42min
