CRISP-ML(Q) - Business Understanding and Data Understanding

Name: CRISP-ML(Q) - Business Understanding and Data Understanding
Rating: 4.8 (35 reviews)

Data Science - Business Understanding and Data Understanding

Created by AISPRY TUTOR

Last updated 2/2024

English

What you'll learn

Understand the project management methodology used to handle data-related projects.
Understand business problem definition.
Understand data types as well as data collection mechanisms.
Understand exploratory data analysis (EDA) and descriptive statistics.
Understand the various data cleaning and pre-processing tasks using Python.

Course content

4 sections • 28 lectures • 4h 3m total length

Introduction about Tutor • 3:14
The tutor introduces himself, highlighting a data science and consulting background across global companies and a humorous, engaging approach to teaching CRISP-ML(Q) data understanding.
Agenda and Stages of Analytics • 1:02
Explore the agenda and stages of analytics within a data science training program, and learn how a project management methodology governs real-world analytics from a high-level overview to details.
What is Diagnostic Analytics? • 1:21
Apply diagnostic analytics to explain why something happened, such as an increase in COVID-19 cases. Tag factors like lockdowns and vaccination to account for the increase and drop in cases.
What is Predictive Analytics? • 1:57
Predictive analytics forecasts future outcomes using current data, such as COVID-19 cases and vaccination rates. Assess horizon validity and adapt predictions as conditions change.
What is Prescriptive Analytics? • 11:41
Explore prescriptive analytics through what-if scenarios that translate predictions into actions. Learn the four analytics stages—descriptive, diagnostic, predictive, and prescriptive—with real-world examples.
What is CRISP-ML(Q)? • 3:08
Explore the CRISP-ML(Q) framework and its six phases: business and data understanding, data preparation, model building and tuning, evaluation, deployment, and monitoring and maintenance, focused on ongoing data science projects.
Quiz Questions

Business Understanding - Define the Scope of Application • 18:44
Define the scope of application and the business objective of minimizing loan defaulters under constraints. Use inputs x and output y with survival analytics to predict default risk and balance profits.
Business Understanding - Define Success Criteria • 8:13
Define the business success criteria by aligning KPIs such as loan default rate with machine learning and ROI, balancing accuracy and performance with practical constraints.
Business Understanding - Use Cases • 9:59
Explore business understanding and use cases, balancing fraud minimization with customer convenience. Learn drone-driven precision farming, multispectral sensing, and cost-aware optimization.
Quiz Question

Agenda Data Understanding • 0:49
Explore the data understanding phase by identifying data types and scales of measurement, and define key terms while describing primary and secondary data collection techniques.
Introduction to Data Understanding • 6:18
Learn how data understanding drives analysis, modeling, predictions, and optimization to support management decisions, with examples of sales data, leading to what-if analysis and strategic levers.
Data Types - Continuous Data (vs) Discrete Data • 11:18
Explore the differences between continuous and discrete data, identifying decimal-representable values versus counts, with examples like time, money, height, and weight.
Categorical Data Vs Count Data • 6:45
Contrast categorical data with count data, highlighting binary (boolean) and multiple categorical types, then apply to churn, defaults, claims, and other business examples.
Practical Data Understanding using Realtime Examples • 11:15
Understand practical data concepts through real-time examples. Distinguish nominal data like flight numbers, ordinal data like gate numbers, interval data like temperatures, and ratio data like money with absolute zero.
Quantitative (vs) Qualitative • 5:04
Contrast quantitative and qualitative data by illustrating numeric versus descriptive measures, including continuous and count data alongside categorical data, and explain which data best informs decision making.
Scales of Measurement • 3:34
Explore the scales of measurement from nominal to ratio data, where nominal supports counts and frequencies, ordinal enables ranking, interval allows addition and subtraction, and ratio permits multiplication and division.
Structured Vs Unstructured Data • 13:04
Explain structured data in tabular form versus unstructured data like videos, images, audio, and text; show transformation into structured data and discuss semi-structured formats such as HTML, XML, and JSON.
Big Data Vs Not Big Data • 9:44
Compare big data and non-big data through the three Vs: volume, velocity, and variety, and choose appropriate storage and processing with SQL vs NoSQL and Hadoop.
Cross Sectional Vs Time Series Vs Panel/Longitudinal Data • 7:01
Compare cross-sectional, time series, and panel (longitudinal) data, noting that cross-sectional data ignores date and time while time series data emphasizes them; panel data blends both properties with multiple columns and observations.
Balanced Vs Imbalanced Vs Rare Events • 15:36
Learn how balanced, imbalanced, and rare events affect data understanding. Explore two-class and multi-class examples such as employee attrition and fraud, and normal vs non-normal distributions.
Batch Data (Offline) Vs Live Streaming Data (Online) • 17:39
Compare batch offline processing with live streaming online data, illustrating loan default predictions and fraud detection through dashboards, rules, and automated versus manual decisions.
Quiz

What is Data Collection? • 4:12
Master data collection concepts by distinguishing primary and secondary data sources, and learn dataset terms from input and output variables to structured and semi-structured data.
Understanding Secondary Data Sources • 13:31
Understand secondary and primary data sources, and see how combining internal data, free Google Maps, and drone analytics yields insights for telecom 5G planning; primary data fills gaps.
Understanding Primary Data Sources • 22:15
Explore how primary data sources, including social media and IoT sensor data, augment loan default predictions and data quality, while addressing data privacy and secondary data considerations.
Understanding Data Collection Using Survey • 6:46
Learn to design end-to-end data collection with surveys, linking business reality to root-cause analysis, form research objectives, and craft multidimensional constructs and targeted questions.
Understanding Data Collection Using DoE • 7:15
Apply design of experiments to data collection, comparing discount expiry, distance radius, and timing to reveal how customers redeem coupons and respond to promotions.
Understanding Possible Error in Data Collection Stage • 16:21
Identify possible data collection errors, including random, systematic, and device-related biases from harsh environments. Apply gauge R&R and SOPs to ensure data quality and representativeness.
Understanding Bias & Fairness • 5:17
Learn to identify bias and ensure fairness by using representative data, avoiding race or gender as inputs in loan default models, and preventing biased outcomes in applications like facial recognition.

Requirements

No programming or statistics knowledge required.
Everything will be taught here from the very beginning.
Basic computer knowledge and primary school mathematics are sufficient.

Description

This course covers the basics of Data Science and EDA using Python, and takes a deep dive into the project management methodology CRISP-ML(Q), short for Cross-Industry Standard Process for Machine Learning with Quality Assurance. Data Science is used in every sector. Its purpose is to find trends and patterns in the available data through various techniques, and Data Scientists are also responsible for drawing insights from their analysis. Data Science is a multidisciplinary field that draws on mathematics, statistics, computer science, Python, machine learning, and more, and Data Scientists need to be adept in these topics. This course will provide you with an understanding of all the aforementioned topics.

A detailed explanation of the 6 stages of CRISP-ML(Q) will be provided. These 6 stages are as follows:

Business and Data Understanding
Data Preparation
Model Building
Evaluation
Model Deployment
Monitoring & Maintenance

The importance of business objectives and constraints, business success criteria, economic success criteria, and the project charter will be covered thoroughly. The course gives elaborate descriptions of the various data types: continuous and discrete; qualitative and quantitative; structured, semi-structured, and unstructured; big and non-big data; cross-sectional, time series, and panel data; balanced and unbalanced data; and offline and live streaming data. Various aspects of data collection will also be examined, including primary and secondary data sources, data version control, data description, data requirements, and data verification.
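
As an illustration of the balanced versus unbalanced distinction mentioned above, the following is a minimal Python sketch that labels a target column by the share of its least frequent class. The function name `class_balance`, the two thresholds, and the sample loan outcomes are all hypothetical choices made for this example; they are not taken from the course and are not fixed industry standards.

```python
from collections import Counter


def class_balance(labels, imbalance_ratio=0.4, rare_ratio=0.05):
    """Label a target column by the share of its least frequent class.

    Both thresholds are illustrative assumptions, not fixed standards.
    """
    counts = Counter(labels)
    minority_share = min(counts.values()) / len(labels)
    if minority_share < rare_ratio:
        kind = "rare event"
    elif minority_share < imbalance_ratio:
        kind = "imbalanced"
    else:
        kind = "balanced"
    return minority_share, kind


# Hypothetical loan outcomes: 1 = default, 0 = repaid.
loan_outcomes = [0] * 97 + [1] * 3
share, kind = class_balance(loan_outcomes)
print(f"minority share = {share:.2f}, dataset is {kind}")
```

In practice the cut-off between "imbalanced" and "rare event" depends on the business problem, so the thresholds are exposed as parameters rather than hard-coded.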

Data Preparation, which involves data cleansing, EDA (descriptive statistics) using Python, and feature engineering, will be explained in detail. Data cleansing involves numerous methods such as typecasting, handling duplicates, outlier treatment, handling zero and near-zero variance features, missing values, discretization, dummy variables, transformation, standardization, and string manipulation. EDA using Python covers measures of central tendency (mean, median, and mode), measures of dispersion (variance, standard deviation, and range), skewness, and kurtosis, which are also termed the first, second, third, and fourth moment business decisions. Bar plots, Q-Q plots, box plots, histograms, scatter plots, and other visualizations will also be covered. Feature engineering, the last part of data preparation, will also be given enough coverage.
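
The four moment business decisions named above can be sketched in a few lines of Python. This is a minimal example using only the standard library; the helper name `four_moments` and the sample values are hypothetical, and it assumes the population (divide by n) formulas, whereas libraries such as pandas apply sample corrections and may return slightly different numbers.

```python
import statistics


def four_moments(values):
    """Return mean, variance, skewness, and excess kurtosis of a sample.

    Uses population (divide-by-n) formulas for all four moments.
    """
    n = len(values)
    mean = statistics.fmean(values)                    # first moment
    variance = statistics.pvariance(values, mu=mean)   # second moment
    std = variance ** 0.5
    # Third standardized moment: asymmetry of the distribution.
    skewness = sum((x - mean) ** 3 for x in values) / (n * std ** 3)
    # Fourth standardized moment minus 3: tail weight relative to a normal curve.
    kurtosis = sum((x - mean) ** 4 for x in values) / (n * std ** 4) - 3
    return {
        "mean": mean,
        "variance": variance,
        "skewness": skewness,
        "kurtosis": kurtosis,
    }


# Hypothetical sample, for illustration only.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
moments = four_moments(sample)
print(moments)
```

A positive skewness indicates a longer right tail, and a negative excess kurtosis indicates lighter tails than a normal distribution.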

Further, model building, also known as data mining or machine learning, will be discussed thoroughly. Model building involves supervised learning, unsupervised learning, and forecasting, all of which will be explored, along with several model-building techniques such as simple linear regression, multiple linear regression, logistic regression, decision trees, and Naive Bayes.
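
To give a flavor of the simplest technique listed above, here is a minimal sketch of simple linear regression fitted by ordinary least squares in plain Python. The function name `fit_simple_linear_regression` and the data points are hypothetical; a real project would more likely use a library such as scikit-learn or statsmodels.

```python
def fit_simple_linear_regression(x, y):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data that lies exactly on the line y = 2x + 1.
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]
slope, intercept = fit_simple_linear_regression(x, y)
print(f"y = {slope:.2f} * x + {intercept:.2f}")
```

The input x plays the role of the predictor and y the outcome, matching the inputs x and output y framing used in the business understanding lectures.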

The last few steps of CRISP-ML(Q) are Evaluation, Model Deployment, and Monitoring & Maintenance.

The learning journey covers CRISP-ML(Q), Data Science, and EDA using Python. Having a thorough understanding of these topics will enable you to build a career in the field of data science.

Who this course is for:

Beginner, intermediate, and advanced learners.
Freshers who are new to data science and want to enter the field.
Working professionals from different industries.
Lecturers, professors, and teachers whose primary role is to teach students data-related concepts.

Course content

Introduction • 6 lectures • 22min

Business Understanding Phase • 3 lectures • 37min

Data Understanding Phase | Data Types • 12 lectures • 1hr 48min

Data Understanding Phase • 7 lectures • 1hr 16min
