
Explore credit risk modeling in Python, from fundamentals to building PD, LGD, and EAD models. Learn preprocessing, scorecard creation, and Basel II/III compliance to estimate expected loss.
Learn how lenders assess credit risk to protect profits, using collateral and risk-based pricing to manage defaults on credit cards, home loans, and asset financing.
Explore how lenders estimate expected loss from credit risk using PD, LGD, and EAD. See how these components determine exposure and potential losses in a loan example.
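As a rough illustration of how the three components combine, expected loss is the product PD × LGD × EAD. The sketch below uses a hypothetical loan with made-up figures; the function name and the input checks are assumptions for illustration, not taken from the course.

```python
def expected_loss(pd_, lgd, ead):
    """Expected loss = PD * LGD * EAD.

    pd_ : probability of default, between 0 and 1
    lgd : loss given default as a share of the exposure, between 0 and 1
    ead : exposure at default, in currency units
    """
    for name, value in (("pd_", pd_), ("lgd", lgd)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    if ead < 0:
        raise ValueError("ead must be non-negative")
    return pd_ * lgd * ead


# Hypothetical loan: 25% chance of default, 20% of the exposure lost
# if default happens, and 400,000 outstanding at the time of default.
loss = expected_loss(pd_=0.25, lgd=0.2, ead=400_000)
print(round(loss, 2))
```

Keeping the three inputs separate mirrors how each component is estimated by its own model before the results are multiplied together.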
Examine capital adequacy and Basel II regulations, focusing on capital requirements and risk weighted assets. Explore Basel II credit risk approaches: standardized, foundation IRB, and advanced IRB.
Basel II offers the standardized approach, the foundation internal ratings-based approach, and the advanced internal ratings-based approach to model expected loss from PD, LGD, and EAD.
Explore how facility types influence credit risk modeling in Python, using logistic regression for PD and beta regression for LGD and EAD, with risk based pricing insights.
Set up the Python data science environment by installing Anaconda, Python 3, Jupyter Notebook, and the relevant packages, and learn the coding environment we will use throughout the course.
Discover why Python and Jupyter power data science with open source, general purpose language benefits, and the IPython notebook workflow using kernels and notebooks.
Install Anaconda to get Python, Jupyter Notebook, and data science packages. Choose Windows, Mac, or Linux, then verify Python version and run the installer to open the Jupyter dashboard.
Navigate the Jupyter dashboard to manage files and folders with checkboxes, upload notebooks, create new Python notebooks or text files, and run code in an interactive shell.
Jupyter dashboard part 2 covers working with input and output cells, code and markdown cells, and essential shortcuts for executing, inserting, and deleting cells.
Install key libraries for credit risk modeling, including scikit-learn, matplotlib, and seaborn, via the Anaconda prompt and pip, with notes on pickle (part of the Python standard library), numpy, scipy, and pandas.
Explore a Lending Club consumer loan dataset to build initial expected loss models, starting with data exploration in Excel before transitioning to Python for preprocessing.
Learn to build PD, LGD, and EAD models in Python using logistic and beta regression, and preprocess data with dummy encoding and coarse and fine classing of variables.
Import data into Python using numpy and pandas, load a CSV into a dataframe, create backups and copies for preprocessing, inspect data types, and prepare for preprocessing challenges.
Convert employment length and term to numeric by cleaning text formats. Compute months since earliest credit line and impute negative values with the maximum observed.
Preprocess discrete variables by turning categorical features into dummy variables with get_dummies. Prefix names and concatenate the resulting dummies to the loan data frame for modeling.
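A minimal sketch of that dummy-encoding step, assuming a tiny made-up data frame. The column names and the ":" separator are illustrative choices, not necessarily the ones used in the course.

```python
import pandas as pd

# Hypothetical miniature stand-in for the loan data frame.
loan_data = pd.DataFrame({
    "grade": ["A", "B", "A", "C"],
    "home_ownership": ["RENT", "OWN", "MORTGAGE", "RENT"],
})

# Build one set of dummies per discrete variable, prefixing each new
# column with the name of the variable it came from.
dummy_frames = [
    pd.get_dummies(loan_data[col], prefix=col, prefix_sep=":")
    for col in ["grade", "home_ownership"]
]
loan_data_dummies = pd.concat(dummy_frames, axis=1)

# Attach the dummies to the original frame column-wise.
loan_data = pd.concat([loan_data, loan_data_dummies], axis=1)
print(loan_data.columns.tolist())
```

The prefix keeps it obvious which original variable each dummy belongs to, which helps later when reference categories have to be dropped.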
Learn to detect and clean missing data using pandas isnull, count gaps by variable, and impute values with fillna, exemplified on total revolving limit using funded amount.
Learn to build a credit risk model by calculating expected loss from PD, LGD, and EAD using logistic regression with dummy variables, and define default by 90 days past due.
Define the dependent variable for default by creating a good_bad indicator from loan status. Use the default statuses to assign zero or one for logistic regression modeling.
Explore fine classing and coarse classing to convert discrete and continuous variables into effective dummies, using weight of evidence to gauge each category's predictive power for credit risk modeling.
Compute information value from weight of evidence to measure how a variable explains the dependent variable and support pre-selection with category weights and practical calculation examples.
Learn data preparation for credit risk modeling by performing train-test splits to prevent overfitting and underfitting, configure random state, and evaluate logistic regression with sklearn.
Explore data preparation for a discrete grade variable by computing weight of evidence and information value for credit risk modeling in Python, with steps for grouping, proportions, and pre-processing.
Automate weight of evidence and information value calculations for discrete variables by building a reusable pandas function that handles any categorical variable and its outcome.
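One possible shape for such a reusable function is sketched below. It assumes the target is coded 1 for good and 0 for bad borrowers, and uses the common definitions WoE = ln(share of goods / share of bads) and IV = Σ (share of goods − share of bads) × WoE. The function and column names are hypothetical.

```python
import numpy as np
import pandas as pd


def woe_discrete(df, variable, target):
    """Return a WoE table and the information value for one categorical variable.

    Assumes target == 1 means a good borrower and target == 0 a bad one.
    A category with no goods or no bads would give an infinite WoE, so such
    categories should be merged with a neighbour before calling this.
    """
    table = (
        df.groupby(variable)[target]
        .agg(n_obs="count", prop_good="mean")
        .reset_index()
    )
    table["n_good"] = table["prop_good"] * table["n_obs"]
    table["n_bad"] = table["n_obs"] - table["n_good"]
    # Share of all goods / all bads that fall into each category.
    table["share_good"] = table["n_good"] / table["n_good"].sum()
    table["share_bad"] = table["n_bad"] / table["n_bad"].sum()
    table["woe"] = np.log(table["share_good"] / table["share_bad"])
    # Sort from the riskiest category (lowest WoE) to the safest.
    table = table.sort_values("woe").reset_index(drop=True)
    iv = ((table["share_good"] - table["share_bad"]) * table["woe"]).sum()
    return table, iv


# Made-up example: 12 loans across three grades.
toy = pd.DataFrame({
    "grade": ["A"] * 4 + ["B"] * 4 + ["C"] * 4,
    "good_bad": [1, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0],
})
woe_table, information_value = woe_discrete(toy, "grade", "good_bad")
print(woe_table[["grade", "n_obs", "woe"]])
print(round(information_value, 4))
```

Because the variable and target names are passed in as arguments, the same function can be applied to any discrete variable without copying the calculation.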
Learn to visualize weight of evidence for discrete variables using matplotlib and seaborn with a plot_by_weight_of_evidence function that plots categories versus weight of evidence and rotates x axis labels.
Apply weight of evidence to discrete variables, create and combine dummy variables for the PD model, designate worst-risk references, and preserve grade A-G as separate dummies.
Preprocess discrete variables by converting the address state into dummy variables using weight of evidence, then plot results to inform category groupings and the regression reference category.
Automate preprocessing of continuous variables by fine classing, calculate weight of evidence for each category, and plot results, reusing the discrete-variable code with minimal changes for ordered categories.
Create dummy variables for term and employment length to prepare the PD model in Python. Use np.where, isin, and range to form categories and establish a reference category for weight of evidence.
Preprocess continuous variables with fine and coarse classing using pandas cut; create dummies and weight of evidence plots for months since issue date and interest rate; skip funded amount.
Learn how to preprocess continuous variables for credit risk modeling in Python by creating dummies and weight of evidence, with examples on annual income and months since last delinquency.
Prepare the test set by applying the same dummy variables used for training, copying preprocessing steps from training to test, and saving the preprocessed data as csv files for modeling.
Explore the probability of default model built with logistic regression and dummy variables, interpreting log-odds and odds to show how factors like income affect default probability.
Load and prepare data for a credit risk PD model by selecting relevant dummy variables, removing reference categories to avoid the dummy trap, and readying inputs for model estimation.
Estimate a PD model using logistic regression in Python with sklearn, fit inputs and targets, and create a summary table of feature names with coefficients plus the intercept.
Explore building a logistic regression model with multivariate p-values to identify which borrower attributes (dummy variables) truly explain default, using a custom p-value aware class and cutoff thresholds.
Interpret the coefficients of a PD logistic model by comparing dummy categories to the reference, using odds ratios to show how A–F increase the odds of being good over G.
Assess out-of-sample performance of a credit risk model by applying the trained PD model to test data, using predict_proba to obtain good-borrower probabilities and analyze metrics.
Assess credit risk model performance by using confusion matrices and thresholds, then evaluate accuracy and ROC AUC to compare models and balance false positives and true positives.
Explore how Gini and Kolmogorov-Smirnov assess classification model performance in credit risk, comparing good and bad borrowers using predicted probabilities and cumulative distributions.
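To make the two measures concrete, here is a small standard-library sketch. It assumes the model outputs the probability of being a good borrower, computes Gini through the identity Gini = 2 × AUC − 1, and takes Kolmogorov-Smirnov as the largest gap between the cumulative distributions of bad and good borrowers. The data and function names are made up for illustration.

```python
from itertools import groupby


def auc_pairwise(p_good, is_good):
    """Share of (good, bad) pairs in which the good borrower scores higher.

    Ties count as half a win. Quadratic in sample size, so only for small demos.
    """
    goods = [p for p, y in zip(p_good, is_good) if y == 1]
    bads = [p for p, y in zip(p_good, is_good) if y == 0]
    wins = sum((g > b) + 0.5 * (g == b) for g in goods for b in bads)
    return wins / (len(goods) * len(bads))


def gini(p_good, is_good):
    return 2 * auc_pairwise(p_good, is_good) - 1


def ks_statistic(p_good, is_good):
    """Largest distance between cumulative shares of bads and goods."""
    n_good = sum(is_good)
    n_bad = len(is_good) - n_good
    cum_good = cum_bad = 0
    ks = 0.0
    ranked = sorted(zip(p_good, is_good))  # ascending by predicted probability
    # Process tied probabilities together so the gap is measured between steps.
    for _, tied in groupby(ranked, key=lambda pair: pair[0]):
        for _, y in tied:
            if y == 1:
                cum_good += 1
            else:
                cum_bad += 1
        ks = max(ks, abs(cum_bad / n_bad - cum_good / n_good))
    return ks


# Made-up predictions for six borrowers (1 = good, 0 = bad).
p_good = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
is_good = [1, 1, 0, 1, 0, 0]
gini_value = gini(p_good, is_good)
ks_value = ks_statistic(p_good, is_good)
print(round(gini_value, 4), round(ks_value, 4))
```

Both measures grow as the model separates good and bad borrowers more cleanly, and both fall toward zero for a model that ranks borrowers at random.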
Learn to compute a borrower's probability of default from a PD model, convert log odds to probability, and see how scorecards standardize credit risk models.
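The log-odds conversion can be sketched in a few lines. The coefficients below are invented, and the sketch assumes (in line with the lecture descriptions) that the model predicts the probability of being a good borrower, so PD is one minus that probability.

```python
import math


def probability_from_log_odds(log_odds):
    """Logistic (sigmoid) transformation: p = 1 / (1 + e^(-log_odds))."""
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical model: an intercept plus one coefficient per dummy variable.
intercept = -1.0
coefficients = {"grade:A": 1.5, "term:36": 0.5}

# Hypothetical borrower: 1 if the dummy applies, 0 otherwise.
borrower = {"grade:A": 1, "term:36": 1}

log_odds = intercept + sum(coefficients[name] * borrower[name] for name in coefficients)
p_good = probability_from_log_odds(log_odds)
p_default = 1 - p_good
print(round(p_good, 4), round(p_default, 4))
```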
Create a scorecard from the PD model to produce interpretable credit scores between 300 and 850, by aligning coefficients, including reference categories, and applying a careful rounding adjustment.
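One common way to rescale coefficients into a 300 to 850 range is a linear mapping from the smallest and largest possible sums of coefficients. The sketch below illustrates that idea with invented coefficients; it is not necessarily the exact procedure used in the course, and it leaves out the rounding adjustment.

```python
MIN_SCORE, MAX_SCORE = 300, 850

# Hypothetical PD model coefficients, grouped by original variable.
# Reference categories are included with a coefficient of 0.
intercept = -1.0
coefficients = {
    "grade": {"A": 1.2, "B": 0.6, "C": 0.0},
    "term": {"36": 0.4, "60": 0.0},
}

# Worst and best possible sums: the intercept plus the lowest (highest)
# coefficient within each original variable.
min_sum = intercept + sum(min(cats.values()) for cats in coefficients.values())
max_sum = intercept + sum(max(cats.values()) for cats in coefficients.values())
scale = (MAX_SCORE - MIN_SCORE) / (max_sum - min_sum)

# Each category's score is its coefficient stretched by the common scale;
# the intercept is shifted so the worst borrower lands exactly on MIN_SCORE.
scorecard = {
    var: {cat: coef * scale for cat, coef in cats.items()}
    for var, cats in coefficients.items()
}
intercept_score = (intercept - min_sum) * scale + MIN_SCORE


def credit_score(borrower):
    """borrower maps each original variable to its category."""
    return intercept_score + sum(scorecard[var][cat] for var, cat in borrower.items())


best = credit_score({"grade": "A", "term": "36"})
worst = credit_score({"grade": "C", "term": "60"})
print(round(best), round(worst))
```

Rounding each category score to a whole number makes the scorecard easier to read, but the rounded maximum can then drift slightly from 850, which is why a rounding adjustment is needed in practice.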
Learn to compute credit scores using a scorecard by summing dummy-variable coefficients, including the intercept, and applying a dot-product with test data to score borrowers.
Learn to transform credit scores into probability of default using PD model coefficients in credit risk modeling, compare direct PD with score-based estimates, and examine rounding effects for loan decisions.
Set loan approval cutoffs using probability of default or probability of being good and credit scores, balancing approval rates with loan quality via ROC curves.
Assess your PD model over time by comparing original and new populations with the population stability index (PSI). Use PSI thresholds to decide when to keep or rebuild the model.
Preprocess data to compute population stability index by aligning train and 2015 datasets, building dummy variables, and scoring with the scorecard for both inputs and the credit score.
Calculate the population stability index for all features using dummy variable categories and proportions, then interpret PSI against thresholds; detect shifts in initial list status and score, prompting retraining.
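For a single variable, the calculation can be sketched as PSI = Σ (new share − original share) × ln(new share / original share), summed over its categories. The proportions below are made up, and the thresholds in the comment are a commonly quoted rule of thumb rather than values taken from the course.

```python
import math


def population_stability_index(original_props, new_props):
    """PSI for one variable, given the share of observations per category.

    Both arguments map category -> proportion and must use the same categories.
    A proportion of zero would make the logarithm undefined, so empty
    categories should be merged or floored beforehand.
    """
    psi = 0.0
    for category, original in original_props.items():
        new = new_props[category]
        psi += (new - original) * math.log(new / original)
    return psi


# Hypothetical shares of a two-category variable in the original
# (training) population and in a newer population.
original = {"f": 0.6, "w": 0.4}
new = {"f": 0.4, "w": 0.6}
psi = population_stability_index(original, new)
print(round(psi, 4))

# Rule of thumb often quoted in practice (an assumption here):
# below 0.1 little change, 0.1 to 0.25 worth investigating,
# above 0.25 a substantial shift.
```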
Learn how to prepare data for LGD and EAD models using the same independent variables as the PD model, selecting charged-off accounts and applying dummy and continuous variables with zero imputation.
Define the LGD and EAD dependent variables: recovery rate and credit conversion factor. Recovery rate = recoveries / funded amount; credit conversion factor = (funded amount - recovered principal) / funded amount.
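The two dependent variables can be worked through on a single made-up charged-off loan. The function names are hypothetical, and clipping the recovery rate to the 0 to 1 range is an assumption added here to handle recoveries that exceed the funded amount.

```python
def recovery_rate(recoveries, funded_amount):
    """Share of the funded amount recovered after charge-off, kept within [0, 1]."""
    rate = recoveries / funded_amount
    return min(max(rate, 0.0), 1.0)


def credit_conversion_factor(funded_amount, recovered_principal):
    """Share of the funded amount still outstanding at the time of default."""
    return (funded_amount - recovered_principal) / funded_amount


# Hypothetical charged-off loan.
funded_amount = 10_000
recoveries = 1_500
recovered_principal = 4_000

rr = recovery_rate(recoveries, funded_amount)
ccf = credit_conversion_factor(funded_amount, recovered_principal)

# LGD is the share that is not recovered; EAD is the outstanding amount.
lgd = 1 - rr
ead = funded_amount * ccf
print(rr, ccf, lgd, ead)
```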
Explore LGD and EAD modeling with a two-stage LGD approach—logistic regression for zero versus nonzero recovery, then linear regression for the amount—and use multiple linear regression for CCF.
Hi! Welcome to Credit Risk Modeling in Python. This is the only online course that teaches you how banks use data science modeling in Python to improve their performance and comply with regulatory requirements. This is the perfect course for you, if you are interested in a data science career. Here’s why:
· The instructor is a proven expert, holding a PhD from the Norwegian Business School and having taught at world-renowned universities such as HEC, the University of Texas, and the Norwegian Business School
· The course is suitable for beginners. We start with theory and initial data pre-processing and gradually solve a complete exercise in front of you
· Everything we cover is up-to-date and relevant in today’s development of Python models for the banking industry
· This is the only online course that provides a complete picture of credit risk in Python (using state of the art techniques to model all three aspects of the expected loss equation - PD, LGD, and EAD) including creating a scorecard from scratch
· Here we show you how to create models that are compliant with Basel II and Basel III regulations, a topic that other courses rarely touch upon
· We are not going to work with fake data. The dataset used in this course is an actual real-world example
· You get to differentiate your data science portfolio by showing skills that are highly demanded in the job marketplace
· Most importantly, you get to see first-hand how a data science task is solved in the real world
Most data science courses cover several frameworks but skip the pre-processing and theoretical part. This is like learning how to taste wine before being able to open a bottle of wine.
We don’t do that. Our goal is to help you build a solid foundation. We want you to study the theory and learn how to pre-process data that does not necessarily come in the “friendliest” format. Only then will we show you how to build a state-of-the-art model and how to evaluate its effectiveness.
Throughout the course, we will cover several important data science techniques.
- Weight of evidence
- Information value
- Fine classing
- Coarse classing
- Linear regression
- Logistic regression
- Area Under the Curve
- Receiver Operating Characteristic Curve
- Gini Coefficient
- Kolmogorov-Smirnov
- Assessing Population Stability
- Maintaining a model
Along with the video lessons you will receive several valuable resources that will help you learn as much as possible:
· Lectures
· Notebook files
· Homework
· Quiz questions
· Slides
· Downloads
· Access to Q&A, where you can reach out to the course tutor
Signing up for the course today could be a great step towards your career in data science. Make sure that you take full advantage of this amazing opportunity!
See you on the inside!