Credit Risk Modeling in Python

Rating: 4.5 (8061 reviews)

A complete data science case study: preprocessing, modeling, model validation and maintenance in Python

Bestseller

Created by 365 Careers

Last updated 1/2026

English

Czech [Auto], English [Auto]

What you'll learn

Improve your Python modeling skills
Differentiate your data science portfolio with a hot topic
Fill up your resume with in-demand data science skills
Build a complete credit risk model in Python
Impress interviewers by showing practical knowledge
How to preprocess real data in Python
Learn credit risk modeling theory
Apply state-of-the-art data science techniques
Solve a real-life data science task
Be able to evaluate the effectiveness of your model
Perform linear and logistic regressions in Python

Course content

13 sections • 75 lectures • 6h 51m total length

What does the course cover • 5:46
Explore credit risk modeling in Python, from fundamentals to building PD, LGD, and EAD models. Learn preprocessing, scorecard creation, and Basel II/III compliance to estimate expected loss.
What is credit risk and why is it important? • 4:44
Learn how lenders assess credit risk to protect profits, using collateral and risk-based pricing to manage defaults on credit cards, home loans, and asset financing.
Expected loss (EL) and its components: PD, LGD and EAD • 4:12
Explore how lenders estimate expected loss from credit risk using PD, LGD, and EAD. See how these components determine exposure and potential losses in a loan example.
Capital adequacy, regulations, and the Basel II accord • 4:32
Examine capital adequacy and Basel II regulations, focusing on capital requirements and risk-weighted assets. Explore Basel II credit risk approaches: standardized, foundation IRB, and advanced IRB.
Basel II approaches: SA, F-IRB, and A-IRB • 9:32
Basel II offers the standardized approach, the foundation internal ratings-based approach, and the advanced internal ratings-based approach to model expected loss from PD, LGD, and EAD.
Different facility types (asset classes) and credit risk modeling approaches • 9:22
Explore how facility types influence credit risk modeling in Python, using logistic regression for PD and beta regression for LGD and EAD, with risk-based pricing insights.
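
The expected loss relationship introduced in this section is usually written as EL = PD × LGD × EAD. A minimal Python sketch follows; the function name and the loan figures are illustrative assumptions, not values taken from the course.

```python
def expected_loss(pd_, lgd, ead):
    """Expected loss as the product of its three components.

    pd_ : probability of default, between 0 and 1
    lgd : loss given default, as a share of the exposure (0 to 1)
    ead : exposure at default, in currency units
    """
    if not 0.0 <= pd_ <= 1.0:
        raise ValueError("PD must be between 0 and 1")
    if not 0.0 <= lgd <= 1.0:
        raise ValueError("LGD must be between 0 and 1")
    return pd_ * lgd * ead


# Hypothetical loan: 25% chance of default, 50% of the exposure lost
# if the borrower defaults, and 400,000 outstanding at default.
el = expected_loss(0.25, 0.5, 400_000)
print(el)  # 50000.0
```

Keeping the three components as separate inputs mirrors how the course builds one model per component before combining them.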

Setting up the environment - Do not skip, please! • 0:49
Set up the Python data science environment by installing Anaconda, Python 3, Jupyter Notebook, and the relevant packages, and learn the coding environment we will use throughout the course.
Why Python and why Jupyter • 4:53
Discover why Python and Jupyter power data science: the benefits of an open-source, general-purpose language, and the IPython notebook workflow built on kernels and notebooks.
Installing Anaconda • 3:03
Install Anaconda to get Python, Jupyter Notebook, and data science packages. Choose Windows, Mac, or Linux, then verify the Python version and run the installer to open the Jupyter dashboard.
Jupyter Dashboard - Part 1 • 2:27
Navigate the Jupyter dashboard to manage files and folders with checkboxes, upload notebooks, create new Python notebooks or text files, and run code in an interactive shell.
Jupyter Dashboard - Part 2 • 5:14
Jupyter dashboard part 2 covers working with input and output cells, code and markdown cells, and essential shortcuts for executing, inserting, and deleting cells.
Installing the sklearn package • 1:29
Install key libraries for credit risk modeling (scikit-learn, matplotlib, seaborn, and pickle) via the Anaconda prompt and pip, with notes on numpy, scipy, and pandas.

Our example: consumer loans. A first look at the dataset • 3:11
Explore a Lending Club consumer loan dataset to build initial expected loss models, starting with data exploration in Excel before transitioning to Python for preprocessing.
Dependent variables and independent variables • 6:26
Learn to build PD, LGD, and EAD models in Python using logistic and beta regression, and preprocess data with dummy encoding and coarse and fine classing of variables.

Importing the data into Python • 4:24
Import data into Python using numpy and pandas, load a CSV into a dataframe, create backups and copies for preprocessing, inspect data types, and prepare for preprocessing challenges.
Preprocessing few continuous variables • 13:28
Convert employment length and term to numeric by cleaning text formats. Compute months since earliest credit line and impute negative values with the maximum observed.
Preprocessing few continuous variables: Homework • 0:33
Preprocessing few discrete variables • 7:09
Preprocess discrete variables by turning categorical features into dummy variables with get_dummies. Prefix names and concatenate the resulting dummies to the loan data frame for modeling.
Check for missing values and clean • 3:20
Learn to detect and clean missing data using pandas isnull, count gaps by variable, and impute values with fill, exemplified on total revolving limit using funded amount.
Check for missing values and clean: Homework • 0:17
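
The general preprocessing steps described in this section (cleaning a text column into numbers, creating dummies, and filling missing values) might look roughly like the pandas sketch below. The toy data, the column names (`emp_length`, `grade`, `funded_amnt`, `total_rev_hi_lim`), the `:` prefix separator, and the choice to treat a missing employment length as 0 are all assumptions made for illustration.

```python
import pandas as pd

# Toy data; column names and values are assumptions for illustration.
loans = pd.DataFrame({
    "emp_length": ["10+ years", "< 1 year", "3 years", None],
    "grade": ["A", "B", "A", "B"],
    "funded_amnt": [5000.0, 12000.0, 8000.0, 3000.0],
    "total_rev_hi_lim": [20000.0, None, 15000.0, None],
})

# Count missing values per variable before cleaning anything.
missing_counts = loans.isnull().sum()

# Strip the text around the number, treating "< 1 year" as 0 years.
emp = (
    loans["emp_length"]
    .str.replace("< 1 year", "0", regex=False)
    .str.replace(r"\+? years?", "", regex=True)
)
# Assumption: a missing employment length is imputed with 0.
loans["emp_length_int"] = pd.to_numeric(emp).fillna(0).astype(int)

# Impute the missing revolving limit with the funded amount of the same loan.
loans["total_rev_hi_lim"] = loans["total_rev_hi_lim"].fillna(loans["funded_amnt"])

# One dummy column per category, prefixed with the variable name,
# then concatenated to the original data frame.
grade_dummies = pd.get_dummies(loans["grade"], prefix="grade", prefix_sep=":")
loans = pd.concat([loans, grade_dummies], axis=1)

print(loans[["emp_length_int", "total_rev_hi_lim", "grade:A", "grade:B"]])
```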

How is the PD model going to look like? • 3:50
Learn to build a credit risk model by calculating expected loss from PD, LGD, and EAD using logistic regression with dummy variables, and define default as 90 days past due.
Dependent variable: Good/ Bad (default) definition • 5:18
Define the dependent variable for default by creating a good_bad indicator from loan status. Use the default statuses to assign zero or one for logistic regression modeling.
Fine classing, weight of evidence, and coarse classing • 6:24
Explore fine classing and coarse classing to convert discrete and continuous variables into effective dummies, using weight of evidence to gauge each category's predictive power for credit risk modeling.
Information value • 4:59
Compute information value from weight of evidence to measure how well a variable explains the dependent variable and to support variable pre-selection, with category weights and practical calculation examples.
Data preparation. Splitting data • 8:27
Learn data preparation for credit risk modeling by performing train-test splits to prevent overfitting and underfitting, configure random state, and evaluate logistic regression with sklearn.
Data preparation. An example • 8:20
Explore data preparation for a discrete grade variable by computing weight of evidence and information value for credit risk modeling in Python, with steps for grouping, proportions, and pre-processing.
Data preparation. Preprocessing discrete variables: automating calculations • 5:57
Automate weight of evidence and information value calculations for discrete variables by building a reusable pandas function that handles any categorical variable and its outcome.
Data preparation. Preprocessing discrete variables: visualizing results • 9:35
Learn to visualize weight of evidence for discrete variables using matplotlib and seaborn with a plot_by_weight_of_evidence function that plots categories versus weight of evidence and rotates x-axis labels.
Data preparation. Preprocessing discrete variables: creating dummies (Part 1) • 7:12
Apply weight of evidence to discrete variables, create and combine dummy variables for the PD model, designate worst-risk references, and preserve grade A-G as separate dummies.
Data preparation. Preprocessing discrete variables: creating dummies (Part 2) • 11:15
Preprocess discrete variables by converting the address state into dummy variables using weight of evidence, then plot results to inform category groupings and the regression reference category.
Data preparation. Preprocessing discrete variables. Homework. • 0:42
Data preparation. Preprocessing continuous variables: Automating calculations • 4:35
Automate preprocessing of continuous variables by fine classing, calculate weight of evidence for each category, and plot results, reusing the discrete-variable code with minimal changes for ordered categories.
Data preparation. Preprocessing continuous variables: creating dummies (Part 1) • 7:20
Create dummy variables for term and employment length to prepare the model in Python. Use np.where, isin, and range to form categories and establish the reference category based on weight of evidence.
Data preparation. Preprocessing continuous variables: creating dummies (Part 2) • 14:01
Preprocess continuous variables with fine and coarse classing using pandas cut; create dummies and weight of evidence plots for months since issue date and interest rate; skip funded amount.
Data preparation. Preprocessing continuous variables: creating dummies. Homework • 1:01
Data preparation. Preprocessing continuous variables: creating dummies (Part 3) • 12:31
Learn how to preprocess continuous variables for credit risk modeling in Python by creating dummies and weight of evidence, with examples on annual income and months since last delinquency.
Data preparation. Preprocessing continuous variables: creating dummies. Homework • 0:47
Data preparation. Preprocessing the test dataset • 4:11
Prepare the test set by applying the same dummy variables used for training, copying preprocessing steps from training to test, and saving the preprocessed data as CSV files for modeling.
PD model: data preparation notebooks • 0:03
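
Weight of evidence (WoE) and information value (IV), which drive most of this section, can be sketched with one small pandas function. This is an illustrative version under stated assumptions, not the course's own function: the name `woe_iv`, the toy data, and the convention that the target is 1 for good borrowers are assumptions. For each category, WoE = ln(share of all goods / share of all bads), and IV sums (share of goods − share of bads) × WoE over the categories.

```python
import numpy as np
import pandas as pd


def woe_iv(df, feature, target):
    """WoE per category and total IV for a discrete feature.

    `target` is assumed to be 1 for good borrowers and 0 for bad ones.
    Categories with no goods or no bads yield infinite WoE and need
    to be merged with a neighbour first (coarse classing).
    """
    grouped = df.groupby(feature)[target].agg(n_obs="count", n_good="sum")
    grouped["n_bad"] = grouped["n_obs"] - grouped["n_good"]
    # Share of all goods / all bads that fall into each category.
    grouped["prop_good"] = grouped["n_good"] / grouped["n_good"].sum()
    grouped["prop_bad"] = grouped["n_bad"] / grouped["n_bad"].sum()
    grouped["woe"] = np.log(grouped["prop_good"] / grouped["prop_bad"])
    iv = ((grouped["prop_good"] - grouped["prop_bad"]) * grouped["woe"]).sum()
    # Sorting by WoE orders categories from riskiest to safest.
    return grouped.sort_values("woe"), iv


# Toy data: grade A has 9 goods and 1 bad, grade B has 6 goods and 4 bads.
toy = pd.DataFrame({
    "grade": ["A"] * 10 + ["B"] * 10,
    "good_bad": [1] * 9 + [0] * 1 + [1] * 6 + [0] * 4,
})
table, iv = woe_iv(toy, "grade", "good_bad")
print(table[["n_obs", "prop_good", "prop_bad", "woe"]])
print(round(iv, 4))
```

In the toy data, grade A holds 60% of the goods and 20% of the bads, so its WoE is ln(3); grade B holds 40% and 80%, giving ln(0.5). Sorting by WoE puts the lowest-WoE category first, which is a natural candidate for the reference category.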

The PD model. Logistic regression with dummy variables • 8:21
Explore the probability of default model built with logistic regression and dummy variables, interpreting log-odds and odds to show how factors like income affect default probability.
Loading the data and selecting the features • 5:31
Load and prepare data for a credit risk PD model by selecting relevant dummy variables, removing reference categories to avoid the dummy trap, and readying inputs for model estimation.
PD model estimation • 3:44
Estimate a PD model using logistic regression in Python with sklearn, fit inputs and targets, and create a summary table of feature names with coefficients plus the intercept.
Build a logistic regression model with p-values • 10:44
Explore building a logistic regression model with multivariate p-values to identify which borrower attributes (dummy variables) truly explain default, using a custom p-value-aware class and cutoff thresholds.
Interpreting the coefficients in the PD model • 5:57
Interpret the coefficients of a PD logistic model by comparing dummy categories to the reference, using odds ratios to show how A–F increase the odds of being good over G.
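
The estimation and interpretation steps above can be sketched with scikit-learn. Everything in this example is synthetic and assumed for illustration: the feature names, the simulated data, and the convention that the target is 1 for a good borrower. It does not reproduce the course's custom class for p-values, which scikit-learn's `LogisticRegression` does not provide.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 500

# Synthetic dummy variables; reference categories are left out.
X = pd.DataFrame({
    "grade:A": rng.integers(0, 2, n),
    "home_ownership:OWN": rng.integers(0, 2, n),
})

# Synthetic target (1 = good borrower) drawn from known log-odds.
log_odds = -0.5 + 1.5 * X["grade:A"] + 0.5 * X["home_ownership:OWN"]
y = (rng.random(n) < 1 / (1 + np.exp(-log_odds))).astype(int)

model = LogisticRegression()
model.fit(X, y)

# Summary table: intercept first, then one row per dummy variable.
summary = pd.DataFrame({
    "feature": ["Intercept"] + list(X.columns),
    "coefficient": [model.intercept_[0]] + list(model.coef_[0]),
})
# exp(coefficient) is the odds ratio relative to the reference category
# (for the intercept row it is the baseline odds instead).
summary["odds_ratio"] = np.exp(summary["coefficient"])
print(summary)
```

An odds ratio above 1 for a dummy means that borrowers in that category have higher odds of being good than borrowers in the omitted reference category, holding the other dummies fixed.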

Out-of-sample validation (test) • 6:56
Assess out-of-sample performance of a credit risk model by applying the trained PD model to test data, using predict_proba to obtain good-borrower probabilities and analyze metrics.
Evaluation of model performance: accuracy and area under the curve (AUC) • 11:00
Assess credit risk model performance by using confusion matrices and thresholds, then evaluate accuracy and ROC AUC to compare models and balance false positives and true positives.
Evaluation of model performance: Gini and Kolmogorov-Smirnov • 9:59
Explore how Gini and Kolmogorov-Smirnov assess classification model performance in credit risk, comparing good and bad borrowers using predicted probabilities and cumulative distributions.
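
The three validation metrics named in this section are closely related, which a small pure-Python sketch can make concrete. The function name and the six toy borrowers are assumptions for illustration; the pairwise loop is quadratic and only meant for small examples, whereas a real project would more likely rely on a library routine.

```python
def auc_gini_ks(y_true, p_good):
    """AUROC, Gini and KS for a good/bad classifier.

    y_true : 1 for good borrowers, 0 for bad ones
    p_good : predicted probability of being good
    """
    goods = [p for y, p in zip(y_true, p_good) if y == 1]
    bads = [p for y, p in zip(y_true, p_good) if y == 0]

    # AUROC: share of (good, bad) pairs ranked correctly; ties count half.
    wins = sum(
        1.0 if g > b else 0.5 if g == b else 0.0
        for g in goods for b in bads
    )
    auc = wins / (len(goods) * len(bads))

    # Gini is a linear rescaling of the AUROC.
    gini = 2 * auc - 1

    # KS: largest gap between the cumulative score distributions
    # of bad and good borrowers.
    ks = max(
        abs(sum(b <= t for b in bads) / len(bads)
            - sum(g <= t for g in goods) / len(goods))
        for t in sorted(set(p_good))
    )
    return auc, gini, ks


# Toy example: four good and two bad borrowers.
y_true = [1, 1, 1, 0, 1, 0]
p_good = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
print(auc_gini_ks(y_true, p_good))  # (0.875, 0.75, 0.75)
```

Seven of the eight good/bad pairs are ordered correctly, hence an AUROC of 0.875 and a Gini of 0.75. The KS gap peaks at the score 0.6, where all bad borrowers but only a quarter of the good ones fall at or below the threshold.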

Calculating probability of default for a single customer • 4:31
Learn to compute a borrower's probability of default from a PD model, convert log-odds to probability, and see how scorecards standardize credit risk models.
Creating a scorecard • 12:54
Create a scorecard from the PD model to produce interpretable credit scores between 300 and 850, by aligning coefficients, including reference categories, and applying a careful rounding adjustment.
Calculating credit score • 6:03
Learn to compute credit scores using a scorecard by summing dummy-variable coefficients, including the intercept, and applying a dot product with test data to score borrowers.
From credit score to PD • 3:06
Learn to transform credit scores into probability of default using PD model coefficients in credit risk modeling, compare direct PD with score-based estimates, and examine rounding effects for loan decisions.
Setting cut-offs • 8:38
Set loan approval cut-offs using probability of default or probability of being good and credit scores, balancing approval rates with loan quality via ROC curves.
Setting cut-offs. Homework • 0:38
PD model: logistic regression notebooks • 0:02
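
One way to turn PD-model coefficients into a 300 to 850 score, and to map a score back to a probability of default, is a min-max rescaling of the log-odds. The sketch below assumes that approach; the coefficients, variable names, and helper names are hypothetical, the model is assumed to predict the probability of being good, and the per-category rounding mentioned in the lecture descriptions is left out.

```python
import math

MIN_SCORE, MAX_SCORE = 300, 850

# Hypothetical PD-model coefficients, grouped by original variable.
# Reference categories are included with a coefficient of 0.
intercept = -1.0
coefficients = {
    "grade": {"A": 2.0, "B": 1.0, "C": 0.0},      # C is the reference
    "home_ownership": {"OWN": 0.5, "RENT": 0.0},  # RENT is the reference
}

# Lowest and highest log-odds any borrower can reach.
min_sum = intercept + sum(min(c.values()) for c in coefficients.values())
max_sum = intercept + sum(max(c.values()) for c in coefficients.values())
factor = (MAX_SCORE - MIN_SCORE) / (max_sum - min_sum)


def credit_score(borrower):
    """Map a borrower's categories to a score in [MIN_SCORE, MAX_SCORE]."""
    log_odds = intercept + sum(
        coefficients[var][cat] for var, cat in borrower.items()
    )
    return MIN_SCORE + (log_odds - min_sum) * factor


def score_to_pd(score):
    """Invert the scaling to log-odds of being good, then convert to PD."""
    log_odds = min_sum + (score - MIN_SCORE) / factor
    p_good = 1 / (1 + math.exp(-log_odds))
    return 1 - p_good


print(credit_score({"grade": "A", "home_ownership": "OWN"}))   # 850.0
print(credit_score({"grade": "C", "home_ownership": "RENT"}))  # 300.0
print(credit_score({"grade": "B", "home_ownership": "RENT"}))  # 520.0
print(score_to_pd(520.0))                                      # 0.5
```

A cut-off can then be expressed either as a minimum score or as a maximum PD, since the two are tied together by the same scaling.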

PD model monitoring via assessing population stability • 4:59
Assess your PD model over time by comparing original and new populations with the population stability index (PSI). Use PSI thresholds to decide when to keep or rebuild the model.
Population stability index: preprocessing • 11:42
Preprocess data to compute population stability index by aligning train and 2015 datasets, building dummy variables, and scoring with the scorecard for both inputs and the credit score.
Population stability index: calculation and interpretation • 10:46
Calculate the population stability index for all features using dummy variable categories and proportions, then interpret PSI against thresholds; detect shifts in initial list status and score, prompting retraining.
Homework: building an updated PD model • 0:31
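
The population stability index compares the share of observations in each category between the original and the new population: PSI = Σ (new share − original share) × ln(new share / original share). A minimal sketch follows, with a hypothetical function name and made-up proportions. The thresholds in the closing note are a commonly cited rule of thumb, not values taken from this page.

```python
import math


def population_stability_index(expected, actual):
    """PSI between two category-share distributions.

    expected : proportions per category in the original (training) data
    actual   : proportions per category in the new data
    Both are assumed to list the same categories in the same order,
    to sum to 1, and to contain no zero shares.
    """
    return sum(
        (a - e) * math.log(a / e)
        for e, a in zip(expected, actual)
    )


# Identical populations give a PSI of exactly 0.
print(population_stability_index([0.5, 0.5], [0.5, 0.5]))  # 0.0

# A shift from a 50/50 split to 25/75 gives a PSI of roughly 0.27.
print(round(population_stability_index([0.5, 0.5], [0.25, 0.75]), 2))  # 0.27
```

A frequently quoted guideline reads a PSI below 0.1 as little or no shift, 0.1 to 0.25 as a moderate shift worth investigating, and above 0.25 as a large shift that suggests rebuilding the model.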

LGD and EAD models: independent variables • 6:22
Learn how to prepare data for LGD and EAD models using the same independent variables as the PD model, selecting charged-off accounts and applying dummy and continuous variables with zero imputation.
LGD and EAD models: dependent variables • 4:51
Define the LGD and EAD dependent variables, recovery rate and credit conversion factor: recovery rate = recoveries / funded amount, and credit conversion factor = (funded amount - recovered principal) / funded amount.
LGD and EAD models: distribution of recovery rates and credit conversion factors • 5:35
Explore LGD and EAD modeling with a two-stage LGD approach (logistic regression for zero versus nonzero recovery, then linear regression for the amount), and use multiple linear regression for CCF.
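
The two dependent variables defined above translate directly into code. In the sketch below the function names and the loan figures are hypothetical, and capping the recovery rate to the range 0 to 1 is an added assumption. The last two lines use the standard relationships LGD = 1 − recovery rate and EAD = funded amount × CCF, which the lecture descriptions imply but do not spell out.

```python
def recovery_rate(recoveries, funded_amount):
    """Share of the funded amount recovered after charge-off.

    Assumption: the ratio is capped to the range [0, 1].
    """
    return min(max(recoveries / funded_amount, 0.0), 1.0)


def credit_conversion_factor(funded_amount, recovered_principal):
    """Share of the funded amount still outstanding at default."""
    return (funded_amount - recovered_principal) / funded_amount


# Hypothetical charged-off loan.
funded = 10_000
rr = recovery_rate(recoveries=1_500, funded_amount=funded)
ccf = credit_conversion_factor(funded_amount=funded, recovered_principal=4_000)

lgd = 1 - rr        # loss given default as a share of the exposure
ead = funded * ccf  # exposure at default in currency units
print(rr, ccf, round(lgd, 2), round(ead, 2))
```

For this loan the recovery rate is 0.15 and the credit conversion factor is 0.6, which gives an LGD of 0.85 and an EAD of 6,000.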

Requirements

No prior experience is required. We will start from the very basics.
You’ll need to install Anaconda and Python. We will show you how to do that step by step

Description

Hi! Welcome to Credit Risk Modeling in Python. This is the only online course that teaches you how banks use data science modeling in Python to improve their performance and comply with regulatory requirements. This is the perfect course for you, if you are interested in a data science career. Here’s why:

· The instructor is a proven expert, holding a PhD from the Norwegian Business School and having taught at world-renowned universities such as HEC, the University of Texas, and the Norwegian Business School

· The course is suitable for beginners. We start with theory and initial data pre-processing and gradually solve a complete exercise in front of you

· Everything we cover is up-to-date and relevant in today’s development of Python models for the banking industry

· This is the only online course that provides a complete picture of credit risk in Python, using state-of-the-art techniques to model all three components of the expected loss equation (PD, LGD, and EAD) and creating a scorecard from scratch

· We show you how to create models that comply with Basel II and Basel III regulations, a topic that other courses rarely touch upon

· We are not going to work with fake data. The dataset used in this course is an actual real-world example

· You get to differentiate your data science portfolio by showing skills that are in high demand on the job market

· Most importantly, you get to see first-hand how a data science task is solved in the real world

Most data science courses cover several frameworks but skip the pre-processing and theoretical part. This is like learning how to taste wine before being able to open a bottle of wine.

We don’t do that. Our goal is to help you build a solid foundation. We want you to study the theory and learn how to pre-process data that does not necessarily come in the "friendliest" format. Only then will we show you how to build a state-of-the-art model and how to evaluate its effectiveness.

Throughout the course, we will cover several important data science techniques:

- Weight of evidence

- Information value

- Fine classing

- Coarse classing

- Linear regression

- Logistic regression

- Area Under the Curve

- Receiver Operating Characteristic Curve

- Gini Coefficient

- Kolmogorov-Smirnov

- Assessing Population Stability

- Maintaining a model

Along with the video lessons you will receive several valuable resources that will help you learn as much as possible:

· Lectures

· Notebook files

· Homework

· Quiz questions

· Slides

· Downloads

· Access to Q&A, where you can reach out to the course tutor

Signing up for the course today could be a great step towards your career in data science. Make sure that you take full advantage of this amazing opportunity!

See you on the inside!

Who this course is for:

You should take this course if you are a data science student interested in improving your skills
You should take this course if you want to specialize in credit risk modeling
The course is also ideal for beginners, as it starts from the fundamentals and gradually builds up your skills
This course is for you if you want a great career

Course content

Introduction • 6 lectures • 38min

Setting up the working environment • 6 lectures • 18min

Dataset description • 2 lectures • 10min

General preprocessing • 6 lectures • 29min

PD Model: Data Preparation • 19 lectures • 1hr 56min

PD model estimation • 5 lectures • 34min

PD model validation • 3 lectures • 28min

Applying the PD Model for decision making • 7 lectures • 36min

PD model monitoring • 4 lectures • 28min

LGD and EAD Models: Preparing the data • 3 lectures • 17min
