Credit Scoring Models — Glossary · Certificate in Credit Risk Analytics in Python

Acquisition Cost – the total expense incurred to obtain a new borrower, i… #

Example: A bank spends $150 on advertising and $50 on staff time for each approved loan, resulting in an acquisition cost of $200 per customer. Understanding acquisition cost helps firms set pricing strategies and evaluate the profitability of credit products.

Alternative Data – non‑traditional information sources used to supplement… #

Practical application: Machine‑learning models ingest alternative data to improve score accuracy for thin‑file applicants. Challenge: Ensuring data privacy compliance and mitigating bias introduced by unconventional variables.

Annual Percentage Rate (APR) – the yearly cost of borrowing expressed as… #

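Example: A loan with a 6% nominal rate and $100 in fees on a $1,000 principal may have an APR of approximately 7.2%, depending on the repayment term, because the fees are effectively spread over the life of the loan. APR feeds into estimates of borrower affordability and is often set through risk‑based pricing that draws on credit scores.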
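
The effect of fees on APR can be sketched numerically. The snippet below is a minimal illustration, assuming a fully amortizing loan with level monthly payments and fees deducted from the proceeds upfront; the function names and the two loan terms are illustrative assumptions, not a regulatory APR calculation.

```python
def monthly_payment(principal, annual_rate, n_months):
    """Level payment on a fully amortizing loan at the nominal rate."""
    r = annual_rate / 12
    return principal * r / (1 - (1 + r) ** -n_months)

def apr(principal, annual_rate, fees, n_months):
    """Annualized monthly rate at which the payments discount back to
    the net proceeds (principal minus upfront fees), found by bisection."""
    pay = monthly_payment(principal, annual_rate, n_months)
    net = principal - fees
    lo, hi = 0.0, 1.0  # assumed bracket for the monthly rate
    for _ in range(200):
        mid = (lo + hi) / 2
        pv = pay * (1 - (1 + mid) ** -n_months) / mid
        if pv > net:
            lo = mid  # rate too low: present value exceeds net proceeds
        else:
            hi = mid
    return 12 * (lo + hi) / 2

apr_30y = apr(1000, 0.06, 100, 360)  # fees spread over 30 years
apr_1y = apr(1000, 0.06, 100, 12)    # same fees over 1 year
print(f"30-year APR: {apr_30y:.2%}, 1-year APR: {apr_1y:.2%}")
```

Under these assumptions the same $100 fee adds far more to the APR of a short loan than of a long one, which is why an APR figure is only meaningful together with the term.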

Balance Sheet Ratios – financial metrics derived from a borrower’s balanc… #

Application: These ratios serve as predictive variables in logistic regression or tree‑based models to assess default risk. Challenge: Ratios can be distorted by accounting policies, requiring careful preprocessing.

Bootstrap Aggregating (Bagging) – an ensemble technique that builds multi… #

Example: Random Forests are a bagging method applied to decision trees for credit scoring. Bagging improves model stability, especially when dealing with noisy financial datasets.

Calibration – the process of adjusting predicted probabilities from a sco… #

Technique: Platt scaling and isotonic regression are common calibration methods. Accurate calibration ensures that a score of 0.02 truly reflects a 2% default probability, which is crucial for risk‑based pricing.

Coefficient of Determination (R²) – a statistical measure indicating the… #

In credit scoring, R² is less informative than classification metrics but can be used for regression‑based loss‑given‑default models. Note: High R² does not guarantee good discrimination between good and bad borrowers.

Confusion Matrix – a tabular representation of classification outcomes #

It tallies true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN). From this matrix, metrics such as accuracy, precision, recall, and F1‑score are derived. Practical use: Evaluating a logistic‑regression credit score on a validation set.
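
As a minimal sketch, the four derived metrics can be computed directly from the cell counts. The counts below are made up for illustration and treat "default" as the positive class.

```python
def classification_metrics(tp, fp, tn, fn):
    """Derive standard metrics from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)   # share of flagged loans that defaulted
    recall = tp / (tp + fn)      # share of defaults that were flagged
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical validation set: 60 defaults among 1,000 loans.
metrics = classification_metrics(tp=40, fp=60, tn=880, fn=20)
print(metrics)
```

With these illustrative counts accuracy is high while precision is low, a typical pattern when defaults are rare.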

Cross‑Validation – a resampling technique that partitions data into K fol… #

Benefit: Provides robust performance estimates and guards against overfitting. In credit risk analytics, stratified K‑fold cross‑validation is preferred to preserve default rates across folds.

Cut‑off Score – the threshold probability or score above which a loan app… #

Determining the optimal cut‑off involves balancing acceptance rates, expected loss, and profitability. Example: A bank may set a cut‑off of 0.30, meaning applicants with predicted default probability below 30% are approved.

Decision Tree – a hierarchical model that splits data based on feature th… #

Application: CART (Classification and Regression Trees) are widely used for interpretable credit scoring models. Challenge: Trees can overfit; pruning and limiting depth are essential safeguards.

Default Probability (PD) – the likelihood that a borrower will fail to me… #

Formula (empirical estimate): PD = Number of defaults / Total number of borrowers in the cohort, measured over a fixed horizon. PD estimates feed into capital allocation, pricing, and provisioning.

Delinquency – the condition of a loan being past due, commonly measured i… #

Example: 30‑day delinquency. Relevance: Delinquency status is an early indicator of credit deterioration and is often used as a target variable for short‑term scoring models.

Discriminatory Power – the ability of a credit scoring model to separate… #

Measured by the Gini coefficient or the Area Under the Receiver Operating Characteristic Curve (AUC‑ROC). Higher discriminatory power implies more effective risk differentiation.

Distributional Shift – changes in the statistical properties of input fea… #

Example: Economic downturns can alter default rates, causing a model trained on pre‑crisis data to under‑predict risk. Monitoring and periodic model retraining mitigate this issue.

Elastic Net – a regularization technique that combines L1 (Lasso) and L2… #

Use case: In high‑dimensional credit datasets, Elastic Net helps prevent overfitting while retaining important predictors.

Empirical Bayes – a statistical approach that leverages observed data to… #

Application: Improves PD estimates for small loan portfolios by borrowing strength from larger, related groups.

Ensemble Modeling – the practice of combining multiple predictive models… #

Techniques include bagging, boosting, and stacking. In credit scoring, ensembles can achieve higher AUC‑ROC than any individual model.

Feature Engineering – the process of creating, transforming, and selectin… #

Common techniques include binning continuous variables, generating interaction terms, and encoding categorical data. Challenge: Excessive feature creation can lead to multicollinearity and overfitting.

Feature Importance – a measure indicating how much each predictor contrib… #

In tree‑based models, importance can be derived from impurity reduction; in linear models, from absolute coefficient values. Understanding importance aids interpretability and regulatory compliance.

Fine‑Tuning – the adjustment of hyperparameters (e.g., learning rate, max depth) after initial model training to optimize performance #

Grid search, random search, and Bayesian optimization are common fine‑tuning strategies for credit scoring algorithms.

Gini Coefficient – a metric derived from the Lorenz curve that quantifies… #

In credit scoring, the Gini is expressed as twice the area between the ROC curve and the diagonal, ranging from 0 (no discrimination) to 1 (perfect discrimination). It is frequently reported alongside AUC.
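
Because the area between the ROC curve and the diagonal is AUC − 0.5, the Gini can be written as 2 × AUC − 1. The sketch below computes AUC by pairwise comparison, assuming label 1 marks a defaulter and higher scores mean higher risk; the data are invented for illustration.

```python
def auc(scores, labels):
    """Probability that a randomly chosen bad (label 1) receives a higher
    score than a randomly chosen good (label 0); ties count one half."""
    bads = [s for s, y in zip(scores, labels) if y == 1]
    goods = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((b > g) + 0.5 * (b == g) for b in bads for g in goods)
    return wins / (len(bads) * len(goods))

def gini(scores, labels):
    """Gini coefficient as twice the area between ROC curve and diagonal."""
    return 2 * auc(scores, labels) - 1

scores = [0.05, 0.10, 0.20, 0.35, 0.50, 0.80]  # predicted default risk
labels = [0, 0, 1, 0, 0, 1]                    # 1 = defaulted
print(auc(scores, labels), gini(scores, labels))
```

The pairwise loop is quadratic in sample size, so it suits small examples; rank-based implementations are the usual choice for real portfolios.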

Gradient Boosting Machine (GBM) – an ensemble method that builds sequenti… #

Popular implementations include XGBoost, LightGBM, and CatBoost. GBMs often achieve state‑of‑the‑art performance on credit scoring datasets but require careful regularization to avoid overfitting.

Imbalanced Data – a situation where the number of good borrowers vastly e… #

Mitigation: Techniques such as SMOTE oversampling, under‑sampling, and cost‑sensitive learning help balance the training process.

Information Value (IV) – a statistic that quantifies the predictive power… #

Calculated as Σ (WOE_i × (Distribution_good_i – Distribution_bad_i)). IV > 0.3 is considered strong; IV < 0.02 is weak. IV guides variable selection and transformation.
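
A minimal sketch of the IV sum, assuming the common definition WOE_i = ln(Distribution_good_i / Distribution_bad_i) and bins that each contain at least one good and one bad. The bin counts are hypothetical.

```python
import math

def woe_iv(good_counts, bad_counts):
    """Return per-bin WoE values and the total Information Value."""
    total_good, total_bad = sum(good_counts), sum(bad_counts)
    woe, iv = [], 0.0
    for g, b in zip(good_counts, bad_counts):
        dist_good, dist_bad = g / total_good, b / total_bad
        w = math.log(dist_good / dist_bad)  # assumed WoE definition
        woe.append(w)
        iv += w * (dist_good - dist_bad)
    return woe, iv

# Three hypothetical bins of one variable, ordered from low to high risk.
woe, iv = woe_iv(good_counts=[400, 350, 250], bad_counts=[10, 30, 60])
print([round(w, 3) for w in woe], round(iv, 3))
```

Each term of the sum is non-negative, since WOE_i and the distribution difference always share a sign, so IV grows as the good and bad distributions diverge across bins.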

Interpretability – the degree to which a model’s predictions can be under… #

Linear models and shallow decision trees are highly interpretable, while deep neural networks are less so. Regulatory frameworks often demand transparent scoring models, making interpretability a key design consideration.

K‑Fold Cross‑Validation – a specific cross‑validation technique where the… #

Each fold serves as a validation set once, and performance metrics are averaged across K runs. Commonly K=5 or K=10 in credit risk projects.

Logistic Regression – a statistical classification method that models the… #

Widely used in credit scoring for its simplicity, interpretability, and ease of calibration. Coefficients represent the change in log‑odds per unit increase in the predictor.
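
The log‑odds interpretation can be checked with a few lines. This sketch uses a hypothetical coefficient of 0.5 and a hypothetical baseline default probability of 5%; a one‑unit increase in the predictor adds the coefficient to the log‑odds, which multiplies the odds by exp(coefficient).

```python
import math

def log_odds(p):
    """Logit: natural log of the odds p / (1 - p)."""
    return math.log(p / (1 - p))

def probability(z):
    """Inverse logit (sigmoid): maps log-odds back to a probability."""
    return 1 / (1 + math.exp(-z))

beta = 0.5  # hypothetical coefficient
p0 = 0.05   # hypothetical baseline default probability
p1 = probability(log_odds(p0) + beta)  # after a one-unit increase

odds_ratio = (p1 / (1 - p1)) / (p0 / (1 - p0))
print(round(p1, 4), round(odds_ratio, 4))
```

The odds ratio is constant across baselines, while the change in probability itself depends on where the borrower starts.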

Loss Given Default (LGD) – the proportion of exposure that is not recover… #

Expressed as a percentage, LGD = (Exposure – Recovery) / Exposure. LGD models often employ regression techniques on collateral, industry, and macroeconomic variables.

Macro‑Economic Variables – external factors such as unemployment rate, GD… #

Incorporating macro variables into scoring models improves predictive power during economic cycles. Challenge: Timely data acquisition and lag effects must be accounted for.

Margin of Error – the range within which a sample estimate is expected to… #

In credit scoring, margin of error informs the reliability of PD estimates derived from limited samples.
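
As an illustration, a margin of error for an observed default rate can be approximated with the normal approximation to the binomial. The 95% level (z = 1.96) and the sample sizes are assumptions of this sketch, and the approximation is rough when defaults are very few.

```python
import math

def pd_margin_of_error(defaults, n, z=1.96):
    """Half-width of an approximate confidence interval for a default rate."""
    p = defaults / n
    return z * math.sqrt(p * (1 - p) / n)

moe_small = pd_margin_of_error(defaults=20, n=1_000)       # 2% observed PD
moe_large = pd_margin_of_error(defaults=2_000, n=100_000)  # same PD, 100x data
print(f"{moe_small:.4f} vs {moe_large:.4f}")
```

With the same observed rate, 100 times more borrowers shrink the margin of error by a factor of 10, which is why PD estimates from small segments deserve caution.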

Model Drift – the gradual degradation of model performance over time due… #

Continuous monitoring of key performance indicators (KPIs) like AUC and KS statistic helps detect drift early.

Multicollinearity – a condition where two or more predictors are highly c… #

Detection methods include variance inflation factor (VIF) analysis. Remedies involve removing redundant variables or applying dimensionality reduction.

Neural Network – a set of algorithms inspired by biological neurons, capa… #

Deep learning architectures (e.g., feed‑forward, recurrent) are increasingly explored for credit scoring, especially when large, unstructured data sources are available. Challenge: Balancing predictive gains against interpretability and regulatory acceptance.

One‑Hot Encoding – a technique for converting categorical variables into… #

Essential for algorithms and implementations that cannot handle non‑numeric inputs, such as linear regression and many tree‑based libraries.

Out‑of‑Bag (OOB) Error – an internal estimate of model error for bagging… #

OOB error provides a convenient validation metric without needing a separate hold‑out set.

Partial Dependence Plot (PDP) – a visual tool that shows the marginal eff… #

PDPs aid interpretation of complex models like GBMs by illustrating how changes in a variable influence default probability.

Performance Monitoring – the ongoing assessment of a deployed credit scor… #

Key metrics include AUC, KS statistic, population stability index (PSI), and calibration curves. Alerts trigger model review or retraining when thresholds are breached.

Probability of Default (PD) Curve – a graphical representation of predict… #

A well‑calibrated curve will closely align predicted and observed values.

Quantile Binning – a method of discretizing continuous variables by divid… #

Example: deciles. Binning reduces noise, facilitates Weight of Evidence (WoE) calculation, and improves model stability.

Receiver Operating Characteristic (ROC) Curve – a plot of true positive r… #

The area under the ROC curve (AUC) quantifies overall discriminative ability. In credit scoring, AUC values above 0.70 are generally considered acceptable.

Regularization – a set of techniques (L1, L2, Elastic Net) that penalize… #

Regularization is especially important in high‑dimensional credit datasets where the number of predictors may approach or exceed the number of observations.

Risk‑Adjusted Return on Capital (RAROC) – a metric that compares expected… #

RAROC = (Expected Return – Expected Loss) / Economic Capital. Credit scoring models feed PD and LGD estimates into RAROC calculations for portfolio optimization.
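
A minimal worked example of the formula, assuming the standard decomposition Expected Loss = PD × LGD × EAD (EAD being exposure at default). All figures are hypothetical.

```python
def expected_loss(prob_default, lgd, ead):
    """Expected loss as PD x LGD x EAD (assumed decomposition)."""
    return prob_default * lgd * ead

def raroc(expected_return, el, economic_capital):
    """Risk-adjusted return on capital, as defined in the entry above."""
    return (expected_return - el) / economic_capital

el = expected_loss(prob_default=0.02, lgd=0.45, ead=100_000)
ratio = raroc(expected_return=3_000, el=el, economic_capital=8_000)
print(round(el, 2), round(ratio, 4))
```

Here a 2% PD and 45% LGD on a 100,000 exposure give an expected loss of 900, so 2,100 of risk-adjusted return is earned on 8,000 of capital.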

Sample Weighting – assigning different importance to observations during… #

In logistic regression, weights can be set equal to the loan amount to prioritize high‑value accounts.

Segmentation – dividing a borrower population into homogeneous groups bas… #

Segmented models may capture distinct risk patterns and enable targeted pricing strategies.

Shapley Values – a game‑theoretic method for attributing contribution of… #

Shapley values provide consistent, locally accurate explanations, making them valuable for interpreting complex models like gradient‑boosted trees.

Stability Index (PSI) – a statistical measure that compares the distribut… #

PSI > 0.25 typically signals a significant shift, prompting model review.
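
The sketch below uses a commonly cited formulation, PSI = Σ (actual_i − expected_i) × ln(actual_i / expected_i) over score bins. That formula, the five bins, and the shares are assumptions of this example, and every bin must have a non-zero share.

```python
import math

def psi(expected_share, actual_share):
    """Population stability index over matching bins of population shares."""
    return sum((a - e) * math.log(a / e)
               for e, a in zip(expected_share, actual_share))

baseline = [0.10, 0.20, 0.40, 0.20, 0.10]  # development sample shares
current = [0.05, 0.15, 0.35, 0.25, 0.20]   # recent applicants' shares
shift = psi(baseline, current)
print(round(shift, 4))
```

In this illustration the index comes out below the 0.25 review threshold even though the population has visibly moved toward the highest bin, so a PSI reading is best viewed alongside the bin-level shares.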

Stratified Sampling – a technique that ensures each class (e.g., default vs. non‑default) is proportionally represented in training and test splits #

This approach preserves the original default rate, leading to more reliable performance estimates.

Supervised Learning – a class of machine‑learning algorithms that learn a… #

Example: a default flag. Credit scoring is a classic supervised learning problem, with labels derived from historical loan outcomes.

Target Leakage – the inadvertent inclusion of information that would not… #

Leakage inflates apparent model performance and must be eliminated during data preparation.

Temporal Validation – evaluating a model on a hold‑out period that chrono… #

Temporal validation reveals how well the model generalizes to future data, accounting for macro‑economic trends.

Threshold Optimization – the process of selecting a cut‑off score that ma… #

Optimization may involve solving a simple profit equation or running a grid search over possible thresholds.
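
Both routes can be sketched together. The example assumes a fixed gain on each repaid loan and a fixed loss on each default, with applicants approved when their predicted PD is below the threshold; the amounts, the PD list, and the function names are hypothetical.

```python
def expected_profit(threshold, applicant_pds, gain_good, loss_bad):
    """Total expected profit from approving applicants with PD < threshold."""
    return sum((1 - p) * gain_good - p * loss_bad
               for p in applicant_pds if p < threshold)

def best_threshold(applicant_pds, gain_good, loss_bad, grid):
    """Grid search: first threshold in the grid with maximal expected profit."""
    return max(grid, key=lambda t: expected_profit(
        t, applicant_pds, gain_good, loss_bad))

pds = [0.01, 0.03, 0.08, 0.15, 0.25, 0.40, 0.60]
grid = [i / 100 for i in range(1, 100)]
cutoff = best_threshold(pds, gain_good=100, loss_bad=500, grid=grid)

# Profit equation: approve while (1 - p) * gain > p * loss,
# i.e. while p < gain / (gain + loss).
break_even = 100 / (100 + 500)
print(cutoff, round(break_even, 4))
```

The grid search settles just above the riskiest applicant who is still profitable, which agrees with the break-even PD from the profit equation; on real portfolios the grid approach extends more easily to costs that vary by loan.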

Tree‑Based Models – algorithms that partition the feature space into rect… #

Examples include CART, Random Forest, and Gradient Boosting. Tree‑based models handle nonlinearities and interactions automatically, making them popular for credit scoring.

Underwriting – the assessment process through which lenders evaluate borr… #

Automated underwriting systems rely on scoring models to make rapid, consistent decisions.

Validation Set – a subset of data reserved for tuning model hyperparamete… #

In credit risk projects, a separate validation set helps guard against overfitting to the training data.

Weight of Evidence (WoE) – a transformation of categorical or binned nume… #

WoE encoding preserves monotonic relationships and simplifies logistic regression coefficient interpretation.

Yield Curve – a graph showing the relationship between interest rates and… #

While not a direct scoring input, yield curve movements affect borrower cost of capital and can be incorporated into macro‑economic features.

Z‑Score – a statistical measure representing the number of standard devia… #

In credit risk, Z‑scores are used for outlier detection and for standardizing variables before modeling.
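
A small sketch of Z‑score outlier flagging with the standard library. The income figures (in thousands) and the 2.5 threshold are assumptions chosen for illustration.

```python
import statistics

def z_scores(values):
    """Standardize values: distance from the mean in standard deviations."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # population standard deviation
    return [(v - mean) / sd for v in values]

incomes = [32, 35, 38, 40, 41, 43, 45, 47, 50, 250]
outliers = [v for v, z in zip(incomes, z_scores(incomes)) if abs(z) > 2.5]
print(outliers)
```

Note that an extreme value inflates the standard deviation it is measured against, so in small samples even a gross outlier may not reach a threshold of 3; robust alternatives based on the median are sometimes preferred for that reason.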

Zero‑Inflated Models – statistical techniques that handle datasets with a… #

Zero‑inflated Poisson or negative binomial models can be employed when modeling count‑based loss events.

Algorithmic Bias – systematic discrimination that arises when a model’s p… #

Example: bias based on gender or ethnicity. Mitigation strategies include fairness constraints, re‑weighting, and careful variable selection.

Back‑Testing – the retrospective evaluation of a scoring model by applyin… #

Back‑testing validates model assumptions and informs adjustments before deployment.

Bootstrapping – a resampling method that creates multiple datasets by sam… #

Used for estimating confidence intervals of model metrics and for generating OOB error estimates in ensemble methods.

Business Rule Engine – a rule‑based system that applies deterministic cri… #

Example criteria: minimum income and maximum debt‑to‑income, applied before or alongside statistical scoring. Business rules provide a safety net for extreme cases where the statistical model may be unreliable.

Calibration Plot – a visual tool that compares predicted probability bins… #

A perfectly calibrated model lies on the 45‑degree line.

Data Imputation – the process of filling missing values in a dataset #

Common techniques include mean/median substitution, k‑nearest neighbors, and model‑based imputation. Proper imputation prevents loss of valuable observations and reduces bias.

Environmental Stress Testing – scenario analysis that evaluates model pe… #

Example scenarios: recession, high unemployment. Stress testing ensures that scoring models remain robust during economic shocks.

Feature Selection – the identification of a subset of predictors that con… #

Methods include recursive feature elimination, mutual information ranking, and regularization paths. Reducing feature count improves interpretability and reduces computational cost.

Gaussian Naïve Bayes – a probabilistic classifier assuming feature indepe… #

Though simplistic, it can serve as a baseline model for credit scoring, especially when data are limited.

Hyperparameter Tuning – the adjustment of algorithm‑specific settings (e.g., number of trees, learning rate) that are not learned from the data #

Proper tuning can significantly enhance model accuracy and generalization.

In‑Sample vs. Out‑of‑Sample – in‑sample performance refers to metrics calculated on the data used to train the model, while out‑of‑sample performance uses unseen data #

A large gap indicates overfitting; stable models exhibit similar metrics across both.

Jensen’s Inequality – a mathematical principle stating that the transform… #

In credit risk, this underlies the bias introduced when aggregating predicted probabilities without proper weighting.

KPI Dashboard – a visual interface displaying key performance indicators… #

Dashboards enable risk managers to monitor model health in real time and trigger alerts when thresholds are breached.

Log‑Odds – the natural logarithm of the odds of default; central to logis… #

Converting predicted probabilities to log‑odds simplifies linear interpretation of coefficient effects.

Monte Carlo Simulation – a computational technique that generates a large… #

Example: portfolio loss. Monte Carlo methods incorporate PD, LGD, and exposure at default (EAD) to estimate capital requirements.
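
A toy simulation of portfolio loss, as a sketch only. It assumes defaults are independent across loans, which real portfolios generally violate, and uses fixed LGD and EAD per loan; the portfolio, the seed, and the 99% quantile are illustrative choices.

```python
import random

def simulate_losses(loans, n_trials=5_000, seed=42):
    """Simulate portfolio losses; each loan is (PD, LGD, EAD).
    Defaults are drawn independently (a simplifying assumption)."""
    rng = random.Random(seed)
    losses = []
    for _ in range(n_trials):
        loss = sum(ead * lgd
                   for prob_default, lgd, ead in loans
                   if rng.random() < prob_default)
        losses.append(loss)
    return sorted(losses)

# Hypothetical portfolio: 100 identical loans.
loans = [(0.02, 0.45, 10_000)] * 100
losses = simulate_losses(loans)
mean_loss = sum(losses) / len(losses)
loss_99 = losses[int(0.99 * len(losses))]  # empirical 99th percentile
print(round(mean_loss), round(loss_99))
```

The simulated mean should sit near the analytical expected loss of 100 × 0.02 × 0.45 × 10,000 = 9,000, while the tail quantile indicates how much capital would cover losses in all but the worst 1% of trials.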

Noise Ratio – the proportion of random error relative to the signal in a… #

High noise reduces model predictability and may necessitate dimensionality reduction or more robust algorithms.

Outlier Detection – the identification of observations that deviate marke… #

Techniques include Z‑score thresholds, isolation forests, and robust Mahalanobis distance. Handling outliers prevents distortion of model coefficients.

Performance Attribution – the decomposition of portfolio results into com… #

Example components: pricing, selection, and risk. The aim is to understand the contribution of the scoring model to overall profitability.

Quantitative Risk Management – the systematic application of statistical… #

Credit scoring is a core quantitative tool within this discipline.

Random Forest – an ensemble of decision trees built on bootstrapped sampl… #

Random Forests provide strong predictive performance and built‑in measures of feature importance, while reducing overfitting compared to single trees.

Sample Bias – distortion arising when the training data are not represent… #

Correcting sample bias may involve re‑weighting or augmenting the dataset.

Scorecard – a tabular representation that assigns points to each predicto… #

Traditional scorecards translate logistic‑regression coefficients into integer points, facilitating manual underwriting and regulatory reporting.
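
One widely used scaling convention, which is an assumption here rather than something stated in this glossary, maps good:bad odds to a total score through "points to double the odds" (PDO). The anchor values below (600 points at odds of 50:1, PDO of 20) are hypothetical.

```python
import math

def score_from_pd(prob_default, base_score=600, base_odds=50, pdo=20):
    """Map a predicted PD to an integer score.
    Score = offset + factor * ln(good:bad odds), with factor = PDO / ln 2."""
    factor = pdo / math.log(2)
    offset = base_score - factor * math.log(base_odds)
    good_bad_odds = (1 - prob_default) / prob_default
    return round(offset + factor * math.log(good_bad_odds))

print(score_from_pd(1 / 51))   # odds of 50:1, the anchor point
print(score_from_pd(1 / 101))  # odds of 100:1, doubled
```

Doubling the odds adds exactly one PDO to the score. In a full scorecard the same factor and offset are distributed across the WoE‑weighted coefficients so that each attribute receives its own points.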

Segmentation‑Specific Models – separate scoring models built for distinct… #

Example: SME vs. consumer. Tailored models capture unique risk drivers and improve discrimination within each segment.

Threshold‑Based Alerts – automated notifications triggered when a borrowe… #

Threshold alerts support proactive risk management.

Unsupervised Learning – algorithms that discover patterns without labeled… #

While not directly used for scoring, unsupervised techniques can inform feature engineering and segmentation.

Variance Inflation Factor (VIF) – a diagnostic metric that quantifies the… #

VIF > 10 often signals problematic correlation requiring remedial action.

Weighted Average Cost of Capital (WACC) – the average rate a company is e… #

In credit risk, WACC informs the discount rate used in expected loss calculations and profitability analysis.

eXtreme Gradient Boosting (XGBoost) – a high‑performance implementation o… #

XGBoost is a popular choice for credit scoring competitions due to its speed and accuracy.

Yield‑to‑Maturity (YTM) – the total return anticipated on a bond if held… #

Though more relevant to fixed‑income analysis, YTM trends can be incorporated as macro indicators influencing borrower creditworthiness.

Zero‑Coupon Bond – a bond that pays no periodic interest and is sold at a… #

Its pricing dynamics provide insight into long‑term interest rate expectations, which can affect credit risk modeling.

Zero‑Inflated Poisson (ZIP) Model – a statistical model for count data wi… #

In credit risk, ZIP models may be applied to the count of missed payments before default.