Credit Scoring Models
Expert-defined terms from the Certificate in Credit Risk Analytics in Python course at LearnUNI. Free to read, free to share, paired with a professional course.
Acquisition Cost – the total expense incurred to obtain a new borrower, i… #
Example: A bank spends $150 on advertising and $50 on staff time for each approved loan, resulting in an acquisition cost of $200 per customer. Understanding acquisition cost helps firms set pricing strategies and evaluate the profitability of credit products.
Alternative Data – non‑traditional information sources used to supplement… #
Practical application: Machine‑learning models ingest alternative data to improve score accuracy for thin‑file applicants. Challenge: Ensuring data privacy compliance and mitigating bias introduced by unconventional variables.
Annual Percentage Rate (APR) – the yearly cost of borrowing expressed as… #
Example: A loan with a 6% nominal rate and $100 in fees on a $1,000 principal may have an APR of approximately 7.2%. APR is a key input when estimating borrower affordability, and risk‑based pricing often sets it from credit scoring model outputs.
Balance Sheet Ratios – financial metrics derived from a borrower’s balanc… #
Application: These ratios serve as predictive variables in logistic regression or tree‑based models to assess default risk. Challenge: Ratios can be distorted by accounting policies, requiring careful preprocessing.
Bootstrap Aggregating (Bagging) – an ensemble technique that builds multi… #
Example: Random Forests are a bagging method applied to decision trees for credit scoring. Bagging improves model stability, especially when dealing with noisy financial datasets.
Calibration – the process of adjusting predicted probabilities from a sco… #
Technique: Platt scaling and isotonic regression are common calibration methods. Accurate calibration ensures that a score of 0.02 truly reflects a 2% default probability, which is crucial for risk‑based pricing.
Coefficient of Determination (R²) – a statistical measure indicating the… #
In credit scoring, R² is less informative than classification metrics but can be used for regression‑based loss‑given‑default models. Note: High R² does not guarantee good discrimination between good and bad borrowers.
Confusion Matrix – a tabular representation of classification outcomes: true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN)
From this matrix, metrics such as accuracy, precision, recall, and F1‑score are derived. Practical use: Evaluating a logistic‑regression credit score on a validation set.
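The step from the four counts to the derived metrics can be sketched in plain Python. This is a minimal illustration, assuming labels are coded 1 for default (the positive class) and 0 otherwise; the function names and the toy labels are hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN, treating 1 (default) as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def classification_metrics(tp, fp, tn, fn):
    """Derive accuracy, precision, recall and F1 from the four counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy validation set (made-up labels, for illustration only).
y_true = [1, 0, 0, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 0, 0]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
metrics = classification_metrics(tp, fp, tn, fn)
print(tp, fp, tn, fn, metrics)
```

With heavily imbalanced default data, accuracy alone can look high even when few defaulters are caught, which is why precision and recall are usually read alongside it.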
Cross‑Validation – a resampling technique that partitions data into K fol… #
Benefit: Provides robust performance estimates and guards against overfitting. In credit risk analytics, stratified K‑fold cross‑validation is preferred to preserve default rates across folds.
Cut‑off Score – the threshold probability or score above which a loan app… #
Determining the optimal cut‑off involves balancing acceptance rates, expected loss, and profitability. Example: A bank may set a cut‑off of 0.30, meaning applicants with a predicted default probability below 30% are approved.
Decision Tree – a hierarchical model that splits data based on feature th… #
Application: CART (Classification and Regression Trees) are widely used for interpretable credit scoring models. Challenge: Trees can overfit; pruning and limiting depth are essential safeguards.
Default Probability (PD) – the likelihood that a borrower will fail to me… #
Empirical estimate: PD = number of defaults / total number of borrowers in the cohort. PD estimates feed into capital allocation, pricing, and provisioning.
Delinquency – the condition of a loan being past due, commonly measured i… #
(e.g., 30‑day delinquency). Relevance: Delinquency status is an early indicator of credit deterioration and is often used as a target variable for short‑term scoring models.
Discriminatory Power – the ability of a credit scoring model to separate… #
Measured by the Gini coefficient or the Area Under the Receiver Operating Characteristic Curve (AUC‑ROC). Higher discriminatory power implies more effective risk differentiation.
Distributional Shift – changes in the statistical properties of input fea… #
Example: Economic downturns can alter default rates, causing a model trained on pre‑crisis data to under‑predict risk. Monitoring and periodic model retraining mitigate this issue.
Elastic Net – a regularization technique that combines L1 (Lasso) and L2… #
Use case: In high‑dimensional credit datasets, Elastic Net helps prevent overfitting while retaining important predictors.
Empirical Bayes – a statistical approach that leverages observed data to… #
Application: Improves PD estimates for small loan portfolios by borrowing strength from larger, related groups.
Ensemble Modeling – the practice of combining multiple predictive models… #
Techniques include bagging, boosting, and stacking. In credit scoring, ensembles can achieve higher AUC‑ROC than any individual model.
Feature Engineering – the process of creating, transforming, and selectin… #
Common techniques include binning continuous variables, generating interaction terms, and encoding categorical data. Challenge: Excessive feature creation can lead to multicollinearity and overfitting.
Feature Importance – a measure indicating how much each predictor contrib… #
In tree‑based models, importance can be derived from impurity reduction; in linear models, from absolute coefficient values. Understanding importance aids interpretability and regulatory compliance.
Fine‑Tuning – the adjustment of hyperparameters (e.g., learning rate, max depth) after initial model training to optimize performance
Grid search, random search, and Bayesian optimization are common fine‑tuning strategies for credit scoring algorithms.
Gini Coefficient – a metric derived from the Lorenz curve that quantifies… #
In credit scoring, the Gini is expressed as twice the area between the ROC curve and the diagonal, ranging from 0 (no discrimination) to 1 (perfect discrimination). It is frequently reported alongside AUC.
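Because the area between the ROC curve and the diagonal is AUC − 0.5, the Gini described above equals 2 × AUC − 1. The sketch below computes both from scratch, using the pairwise interpretation of AUC (the share of defaulter/non-defaulter pairs in which the defaulter receives the higher risk score). The function names and toy data are hypothetical, and the quadratic pair loop is only suitable for small samples.

```python
def auc_roc(y_true, scores):
    """AUC as the fraction of (defaulter, non-defaulter) pairs ranked correctly.

    Ties between a defaulter and a non-defaulter count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one defaulter and one non-defaulter")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def gini(y_true, scores):
    """Gini coefficient derived from AUC: Gini = 2 * AUC - 1."""
    return 2 * auc_roc(y_true, scores) - 1


# Toy data: 1 = default, scores are predicted default probabilities.
y_true = [0, 0, 1, 0, 1, 1]
scores = [0.10, 0.30, 0.35, 0.40, 0.70, 0.80]

auc_value = auc_roc(y_true, scores)
gini_value = gini(y_true, scores)
print(round(auc_value, 4), round(gini_value, 4))
```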
Gradient Boosting Machine (GBM) – an ensemble method that builds sequenti… #
Popular implementations include XGBoost, LightGBM, and CatBoost. GBMs often achieve state‑of‑the‑art performance on credit scoring datasets but require careful regularization to avoid overfitting.
Imbalanced Data – a situation where the number of good borrowers vastly e… #
Mitigation: Techniques such as SMOTE oversampling, under‑sampling, and cost‑sensitive learning help balance the training process.
Information Value (IV) – a statistic that quantifies the predictive power… #
Calculated as Σ (WOE_i × (Distribution_good_i – Distribution_bad_i)). IV > 0.3 is considered strong; IV < 0.02 is weak. IV guides variable selection and transformation.
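The IV sum can be worked through on a small binned variable. This sketch assumes the conventional definition WOE_i = ln(Distribution_good_i / Distribution_bad_i) and that every bin contains at least one good and one bad account (otherwise the logarithm is undefined and some smoothing would be needed). The bin counts are made up for illustration.

```python
import math


def woe_and_iv(good_counts, bad_counts):
    """Return per-bin WoE values and the total Information Value.

    Assumes WOE_i = ln(dist_good_i / dist_bad_i) and no empty bins.
    """
    total_good = sum(good_counts)
    total_bad = sum(bad_counts)
    woe_values = []
    iv_total = 0.0
    for good, bad in zip(good_counts, bad_counts):
        dist_good = good / total_good
        dist_bad = bad / total_bad
        woe_i = math.log(dist_good / dist_bad)
        woe_values.append(woe_i)
        iv_total += woe_i * (dist_good - dist_bad)
    return woe_values, iv_total


# Hypothetical counts of good and bad borrowers in three bins of one variable.
good_counts = [100, 300, 600]
bad_counts = [50, 30, 20]

woe, iv = woe_and_iv(good_counts, bad_counts)
print([round(w, 3) for w in woe], round(iv, 3))
```

Each term of the sum is non-negative, because WOE_i and the distribution difference always share the same sign, so IV can only grow as bins separate goods from bads more sharply.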
Interpretability – the degree to which a model’s predictions can be under… #
Linear models and shallow decision trees are highly interpretable, while deep neural networks are less so. Regulatory frameworks often demand transparent scoring models, making interpretability a key design consideration.
K‑Fold Cross‑Validation – a specific cross‑validation technique where the… #
Each fold serves as a validation set once, and performance metrics are averaged across K runs. Commonly K=5 or K=10 in credit risk projects.
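A stratified variant, which keeps the default rate roughly equal in every fold, can be sketched without any libraries. The function name, seed, and toy labels below are hypothetical; this is an illustration of the idea rather than a replacement for an established implementation.

```python
import random


def stratified_k_fold(labels, k, seed=0):
    """Return k lists of row indices, each with roughly the same class mix."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Deal the shuffled indices of this class round-robin across folds.
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds


# Toy portfolio: 10 defaults (1) among 100 loans, i.e. a 10% default rate.
labels = [1] * 10 + [0] * 90
folds = stratified_k_fold(labels, k=5)

for fold_id, val_idx in enumerate(folds):
    val_set = set(val_idx)
    # Every index outside the validation fold is used for training.
    train_idx = [i for i in range(len(labels)) if i not in val_set]
    default_rate = sum(labels[i] for i in val_idx) / len(val_idx)
    print(fold_id, len(train_idx), len(val_idx), default_rate)
```

In practice, scikit-learn's `StratifiedKFold` provides the same kind of split.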
Logistic Regression – a statistical classification method that models the… #
Widely used in credit scoring for its simplicity, interpretability, and ease of calibration. Coefficients represent the change in log‑odds per unit increase in the predictor.
Loss Given Default (LGD) – the proportion of exposure that is not recover… #
Expressed as a percentage, LGD = (Exposure – Recovery) / Exposure. LGD models often employ regression techniques on collateral, industry, and macroeconomic variables.
Macro‑Economic Variables – external factors such as unemployment rate, GD… #
Incorporating macro variables into scoring models improves predictive power during economic cycles. Challenge: Timely data acquisition and lag effects must be accounted for.
Margin of Error – the range within which a sample estimate is expected to… #
In credit scoring, margin of error informs the reliability of PD estimates derived from limited samples.
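As a rough illustration, the margin of error of an empirical PD can be approximated with the normal approximation to the binomial, z × sqrt(p(1 − p)/n). The sketch assumes a 95% confidence level (z = 1.96) and independent borrowers; the approximation is crude when the number of defaults is small, and the cohort sizes are made up.

```python
import math


def pd_with_margin(defaults, borrowers, z=1.96):
    """Empirical PD and its approximate margin of error (normal approximation)."""
    pd_hat = defaults / borrowers
    margin = z * math.sqrt(pd_hat * (1 - pd_hat) / borrowers)
    return pd_hat, margin


# Same 3% default rate observed in a small and a large cohort.
pd_small, moe_small = pd_with_margin(defaults=3, borrowers=100)
pd_large, moe_large = pd_with_margin(defaults=300, borrowers=10_000)

print(pd_small, round(moe_small, 4))
print(pd_large, round(moe_large, 4))
```

A hundred times more borrowers shrinks the margin by a factor of ten, which is why PD estimates from small portfolios need to be treated with caution.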
Model Drift – the gradual degradation of model performance over time due… #
Continuous monitoring of key performance indicators (KPIs) like AUC and KS statistic helps detect drift early.
Multicollinearity – a condition where two or more predictors are highly c… #
Detection methods include variance inflation factor (VIF) analysis. Remedies involve removing redundant variables or applying dimensionality reduction.
Neural Network – a set of algorithms inspired by biological neurons, capa… #
Deep learning architectures (e.g., feed‑forward, recurrent) are increasingly explored for credit scoring, especially when large, unstructured data sources are available. Challenge: Balancing predictive gains against interpretability and regulatory acceptance.
One‑Hot Encoding – a technique for converting categorical variables into… #
Essential for algorithms that cannot handle non‑numeric inputs, such as linear regression and many tree‑based methods.
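A minimal sketch of the encoding, using a hypothetical housing-status variable. Categories are sorted alphabetically so that the column order is reproducible.

```python
def one_hot_encode(values):
    """Map each categorical value to a 0/1 indicator row, one column per category."""
    categories = sorted(set(values))
    rows = [[1 if value == cat else 0 for cat in categories] for value in values]
    return categories, rows


# Hypothetical housing status of four applicants.
housing = ["rent", "own", "mortgage", "own"]
categories, rows = one_hot_encode(housing)

print(categories)
for row in rows:
    print(row)
```

For linear models with an intercept, one indicator column is usually dropped, since the full set of columns always sums to one and is therefore perfectly collinear with the intercept.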
Out‑of‑Bag (OOB) Error – an internal estimate of model error for bagging… #
OOB error provides a convenient validation metric without needing a separate hold‑out set.
Partial Dependence Plot (PDP) – a visual tool that shows the marginal eff… #
PDPs aid interpretation of complex models like GBMs by illustrating how changes in a variable influence default probability.
Performance Monitoring – the ongoing assessment of a deployed credit scor… #
Key metrics include AUC, KS statistic, population stability index (PSI), and calibration curves. Alerts trigger model review or retraining when thresholds are breached.
Probability of Default (PD) Curve – a graphical representation of predict… #
A well‑calibrated curve will closely align predicted and observed values.
Quantile Binning – a method of discretizing continuous variables by divid… #
(e.g., deciles). Binning reduces noise, facilitates Weight of Evidence (WoE) calculation, and improves model stability.
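Equal-frequency binning can be sketched by ranking the observations and cutting the ranks into equal-sized groups. The function name and data are hypothetical. This rank-based version may place tied values in different bins, whereas production code usually bins on quantile cut points instead.

```python
def quantile_bins(values, n_bins):
    """Assign each value a bin index 0..n_bins-1 with (almost) equal bin sizes."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        # Integer arithmetic maps ranks 0..n-1 evenly onto the bin indices.
        bins[i] = rank * n_bins // len(values)
    return bins


# Twenty distinct toy values in scrambled order, split into quartiles.
values = [(i * 7) % 20 for i in range(20)]
bins = quantile_bins(values, n_bins=4)

for b in range(4):
    members = sorted(v for v, assigned in zip(values, bins) if assigned == b)
    print(b, members)
```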
Receiver Operating Characteristic (ROC) Curve – a plot of true positive r… #
The area under the ROC curve (AUC) quantifies overall discriminative ability. In credit scoring, AUC values above 0.70 are generally considered acceptable.
Regularization – a set of techniques (L1, L2, Elastic Net) that penalize… #
Regularization is especially important in high‑dimensional credit datasets where the number of predictors may approach or exceed the number of observations.
Risk‑Adjusted Return on Capital (RAROC) – a metric that compares expected… #
RAROC = (Expected Return – Expected Loss) / Economic Capital. Credit scoring models feed PD and LGD estimates into RAROC calculations for portfolio optimization.
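The RAROC formula can be traced with a small worked example. It assumes the usual decomposition of expected loss as PD × LGD × EAD and uses the glossary's LGD definition, (Exposure – Recovery) / Exposure. All monetary figures are made up.

```python
def loss_given_default(exposure, recovery):
    """LGD = (Exposure - Recovery) / Exposure."""
    return (exposure - recovery) / exposure


def expected_loss(prob_default, lgd, ead):
    """Expected loss, assuming the standard PD x LGD x EAD decomposition."""
    return prob_default * lgd * ead


def raroc(expected_return, exp_loss, economic_capital):
    """RAROC = (Expected Return - Expected Loss) / Economic Capital."""
    return (expected_return - exp_loss) / economic_capital


# Hypothetical single loan.
lgd = loss_given_default(exposure=10_000.0, recovery=6_000.0)
el = expected_loss(prob_default=0.02, lgd=lgd, ead=10_000.0)
ratio = raroc(expected_return=500.0, exp_loss=el, economic_capital=1_500.0)

print(lgd, el, round(ratio, 4))
```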
Sample Weighting – assigning different importance to observations during… #
In logistic regression, weights can be set equal to the loan amount to prioritize high‑value accounts.
Segmentation – dividing a borrower population into homogeneous groups bas… #
Segmented models may capture distinct risk patterns and enable targeted pricing strategies.
Shapley Values – a game‑theoretic method for attributing contribution of… #
Shapley values provide consistent, locally accurate explanations, making them valuable for interpreting complex models like gradient‑boosted trees.
Stability Index (PSI) – a statistical measure that compares the distribut… #
PSI > 0.25 typically signals a significant shift, prompting model review.
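A minimal sketch of the calculation, assuming the common formulation PSI = Σ (actual_i − expected_i) × ln(actual_i / expected_i) over score bins, where both terms are bin proportions and no bin is empty. The bin counts are made up.

```python
import math


def population_stability_index(expected_counts, actual_counts):
    """PSI between a baseline (expected) and a recent (actual) binned distribution."""
    total_expected = sum(expected_counts)
    total_actual = sum(actual_counts)
    psi = 0.0
    for exp_n, act_n in zip(expected_counts, actual_counts):
        exp_share = exp_n / total_expected
        act_share = act_n / total_actual
        psi += (act_share - exp_share) * math.log(act_share / exp_share)
    return psi


# Hypothetical score-band counts at development time and in a recent month.
baseline = [100, 200, 400, 200, 100]
recent = [50, 100, 300, 300, 250]

psi_same = population_stability_index(baseline, baseline)
psi_shift = population_stability_index(baseline, recent)
print(round(psi_same, 4), round(psi_shift, 4))
```

The shifted population in this toy example lands above the 0.25 level, so it would prompt a review.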
Stratified Sampling – a technique that ensures each class (e.g., default vs. non‑default) is proportionally represented in training and test splits
This approach preserves the original default rate, leading to more reliable performance estimates.
Supervised Learning – a class of machine‑learning algorithms that learn a… #
(e.g., default flag). Credit scoring is a classic supervised learning problem, with labels derived from historical loan outcomes.
Target Leakage – the inadvertent inclusion of information that would not… #
Leakage inflates apparent model performance and must be eliminated during data preparation.
Temporal Validation – evaluating a model on a hold‑out period that chrono… #
Temporal validation reveals how well the model generalizes to future data, accounting for macro‑economic trends.
Threshold Optimization – the process of selecting a cut‑off score that ma… #
Optimization may involve solving a simple profit equation or running a grid search over possible thresholds.
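The grid-search variant can be sketched as follows. The profit equation here is a deliberately simple assumption: each approved loan earns a fixed `margin` if it repays and costs a fixed `loss` if it defaults. Both parameters, the PD values, and the grid are hypothetical.

```python
def expected_profit(pds, cutoff, margin, loss):
    """Expected profit from approving every applicant with PD below the cutoff."""
    return sum((1 - pd_i) * margin - pd_i * loss for pd_i in pds if pd_i < cutoff)


def best_cutoff(pds, margin, loss, grid):
    """Return the grid value with the highest expected profit (first one on ties)."""
    return max(grid, key=lambda c: expected_profit(pds, c, margin, loss))


# Predicted default probabilities for six hypothetical applicants.
pds = [0.02, 0.04, 0.08, 0.16, 0.32, 0.64]
grid = [i / 20 for i in range(1, 20)]  # 0.05, 0.10, ..., 0.95

cutoff = best_cutoff(pds, margin=100.0, loss=400.0, grid=grid)
print(cutoff, expected_profit(pds, cutoff, margin=100.0, loss=400.0))
```

With these numbers an applicant is profitable only when PD is below margin / (margin + loss) = 0.20, and the grid search recovers that break-even point.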
Tree‑Based Models – algorithms that partition the feature space into rect… #
Examples include CART, Random Forest, and Gradient Boosting. Tree‑based models handle nonlinearities and interactions automatically, making them popular for credit scoring.
Underwriting – the assessment process through which lenders evaluate borr… #
Automated underwriting systems rely on scoring models to make rapid, consistent decisions.
Validation Set – a subset of data reserved for tuning model hyperparamete… #
In credit risk projects, a separate validation set helps guard against overfitting to the training data.
Weight of Evidence (WoE) – a transformation of categorical or binned nume… #
WoE encoding preserves monotonic relationships and simplifies logistic regression coefficient interpretation.
Yield Curve – a graph showing the relationship between interest rates and… #
While not a direct scoring input, yield curve movements affect borrower cost of capital and can be incorporated into macro‑economic features.
Z‑Score – a statistical measure representing the number of standard devia… #
In credit risk, Z‑scores are used for outlier detection and for standardizing variables before modeling.
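A short sketch of both uses, standardizing a variable and flagging outliers. It uses the population standard deviation from Python's `statistics` module; the income figures and the flagging threshold of 2 are arbitrary choices for this tiny sample.

```python
import statistics


def z_scores(values):
    """Standardize values: (value - mean) / population standard deviation."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


# Hypothetical monthly incomes (in hundreds), with one extreme value.
incomes = [30, 32, 35, 31, 33, 34, 200]
scores = z_scores(incomes)

threshold = 2.0  # illustrative cut-off for flagging outliers
outlier_positions = [i for i, z in enumerate(scores) if abs(z) > threshold]
print([round(z, 2) for z in scores], outlier_positions)
```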
Zero‑Inflated Models – statistical techniques that handle datasets with a… #
Zero‑inflated Poisson or negative binomial models can be employed when modeling count‑based loss events.
Algorithmic Bias – systematic discrimination that arises when a model’s p… #
(e.g., based on gender or ethnicity). Mitigation strategies include fairness constraints, re‑weighting, and careful variable selection.
Back‑Testing – the retrospective evaluation of a scoring model by applyin… #
Back‑testing validates model assumptions and informs adjustments before deployment.
Bootstrapping – a resampling method that creates multiple datasets by sam… #
Used for estimating confidence intervals of model metrics and for generating OOB error estimates in ensemble methods.
Business Rule Engine – a rule‑based system that applies deterministic cri… #
(e.g., minimum income, maximum debt‑to‑income) before or alongside statistical scoring. Business rules provide a safety net for extreme cases where the statistical model may be unreliable.
Calibration Plot – a visual tool that compares predicted probability bins… #
A perfectly calibrated model lies on the 45‑degree line.
Data Imputation – the process of filling missing values in a dataset #
Common techniques include mean/median substitution, k‑nearest neighbors, and model‑based imputation. Proper imputation prevents loss of valuable observations and reduces bias.
Environmental Stress Testing – scenario analysis that evaluates model pe…
(e.g., recession, high unemployment). Stress testing ensures that scoring models remain robust during economic shocks.
Feature Selection – the identification of a subset of predictors that con… #
Methods include recursive feature elimination, mutual information ranking, and regularization paths. Reducing feature count improves interpretability and reduces computational cost.
Hyperparameter Tuning – the adjustment of algorithm‑specific settings (e.g., number of trees, learning rate) that are not learned from the data
Proper tuning can significantly enhance model accuracy and generalization.
In‑Sample vs. Out‑of‑Sample – in‑sample performance refers to metrics calculated on the data used to train the model, while out‑of‑sample performance uses unseen data
A large gap indicates overfitting; stable models exhibit similar metrics across both.
Jensen’s Inequality – a mathematical principle stating that the transform… #
In credit risk, this underlies the bias introduced when aggregating predicted probabilities without proper weighting.
KPI Dashboard – a visual interface displaying key performance indicators… #
Dashboards enable risk managers to monitor model health in real time and trigger alerts when thresholds are breached.
Log‑Odds – the natural logarithm of the odds of default; central to logis… #
Converting predicted probabilities to log‑odds simplifies linear interpretation of coefficient effects.
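The conversion in both directions, and the effect of a coefficient on the odds, can be checked numerically. The coefficient value `beta` below is hypothetical.

```python
import math


def log_odds(p):
    """Log-odds (logit) of a probability p, with 0 < p < 1."""
    return math.log(p / (1 - p))


def probability(z):
    """Inverse of the logit: map log-odds z back to a probability."""
    return 1 / (1 + math.exp(-z))


base_pd = 0.02
base = log_odds(base_pd)

# Adding a coefficient beta to the log-odds multiplies the odds by exp(beta).
beta = 0.7  # hypothetical logistic-regression coefficient
shifted_pd = probability(base + beta)
odds_ratio = (shifted_pd / (1 - shifted_pd)) / (base_pd / (1 - base_pd))

print(round(base, 4), round(shifted_pd, 4), round(odds_ratio, 4))
```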
Monte Carlo Simulation – a computational technique that generates a large… #
(e.g., portfolio loss). Monte Carlo methods incorporate PD, LGD, and exposure at default (EAD) to estimate capital requirements.
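A deliberately simplified sketch of such a simulation is shown below. It assumes that defaults are independent across loans and that LGD and EAD are fixed, which real capital models do not assume (they typically add default correlation). The portfolio, seed, and number of simulations are hypothetical.

```python
import random


def simulate_portfolio_losses(loans, n_sims, seed=42):
    """Simulate total portfolio loss n_sims times.

    Each loan is a (prob_default, lgd, ead) tuple; defaults are drawn
    independently, which is a simplifying assumption.
    """
    rng = random.Random(seed)
    losses = []
    for _ in range(n_sims):
        loss = 0.0
        for prob_default, lgd, ead in loans:
            if rng.random() < prob_default:
                loss += lgd * ead
        losses.append(loss)
    return losses


# 100 identical hypothetical loans: PD 5%, LGD 40%, EAD 1,000.
loans = [(0.05, 0.4, 1_000.0)] * 100
losses = simulate_portfolio_losses(loans, n_sims=5_000)

mean_loss = sum(losses) / len(losses)
loss_99 = sorted(losses)[int(0.99 * len(losses))]  # empirical 99th percentile
print(round(mean_loss, 1), loss_99)
```

The simulated mean should sit close to the analytical expected loss (100 × 0.05 × 0.4 × 1,000 = 2,000), while the gap between the 99th percentile and the mean gives a rough sense of the unexpected loss that capital is meant to cover.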
Noise Ratio – the proportion of random error relative to the signal in a… #
High noise reduces model predictability and may necessitate dimensionality reduction or more robust algorithms.
Outlier Detection – the identification of observations that deviate marke… #
Techniques include Z‑score thresholds, isolation forests, and robust Mahalanobis distance. Handling outliers prevents distortion of model coefficients.
Performance Attribution – the decomposition of portfolio results into com… #
(e.g., pricing, selection, risk) to understand the contribution of the scoring model to overall profitability.
Quantitative Risk Management – the systematic application of statistical… #
Credit scoring is a core quantitative tool within this discipline.
Random Forest – an ensemble of decision trees built on bootstrapped sampl… #
Random Forests provide strong predictive performance and built‑in measures of feature importance, while reducing overfitting compared to single trees.
Sample Bias – distortion arising when the training data are not represent… #
Correcting sample bias may involve re‑weighting or augmenting the dataset.
Scorecard – a tabular representation that assigns points to each predicto… #
Traditional scorecards translate logistic‑regression coefficients into integer points, facilitating manual underwriting and regulatory reporting.
Segmentation‑Specific Models – separate scoring models built for distinct… #
(e.g., SME vs. consumer). Tailored models capture unique risk drivers and improve discrimination within each segment.
Threshold‑Based Alerts – automated notifications triggered when a borrowe… #
Threshold alerts support proactive risk management.
Unsupervised Learning – algorithms that discover patterns without labeled… #
While not directly used for scoring, unsupervised techniques can inform feature engineering and segmentation.
Variance Inflation Factor (VIF) – a diagnostic metric that quantifies the… #
VIF > 10 often signals problematic correlation requiring remedial action.
Weighted Average Cost of Capital (WACC) – the average rate a company is e… #
In credit risk, WACC informs the discount rate used in expected loss calculations and profitability analysis.
eXtreme Gradient Boosting (XGBoost) – a high‑performance implementation o… #
XGBoost is a popular choice for credit scoring competitions due to its speed and accuracy.
Yield‑to‑Maturity (YTM) – the total return anticipated on a bond if held… #
Though more relevant to fixed‑income analysis, YTM trends can be incorporated as macro indicators influencing borrower creditworthiness.
Zero‑Coupon Bond – a bond that pays no periodic interest and is sold at a… #
Its pricing dynamics provide insight into long‑term interest rate expectations, which can affect credit risk modeling.
Zero‑Inflated Poisson (ZIP) Model – a statistical model for count data wi… #
In credit risk, ZIP models may be applied to the count of missed payments before default.