What’s the Price of Monotonicity? A Multi-Dataset Benchmark of Monotone-Constrained Gradient Boosting for Credit PD

arXiv ID: 2512.17945

Authors: Petr Koklev

Abstract

Financial institutions face a trade-off between predictive accuracy and interpretability when deploying machine learning models for credit risk. Monotonicity constraints align model behavior with domain knowledge, but their performance cost (the price of monotonicity) is not well quantified. This paper benchmarks monotone-constrained against unconstrained gradient boosting models for credit probability of default (PD) across five public datasets and three libraries. We define the Price of Monotonicity (PoM) as the relative change in standard performance metrics when moving from an unconstrained to a constrained model, estimated via paired comparisons with bootstrap uncertainty. In our experiments, PoM in AUC ranges from essentially zero to about 2.9 percent: constraints are almost costless on large datasets (typically less than 0.2 percent, often indistinguishable from zero) and most costly on smaller datasets with extensive constraint coverage (around 2-3 percent). Appropriately specified monotonicity constraints can therefore deliver interpretability at a small accuracy cost, particularly in large-scale credit portfolios.
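The PoM estimator described in the abstract can be sketched in a few lines. This is a minimal illustration, not the paper's code: the function names `auc` and `price_of_monotonicity` are ours, the AUC is computed via the Mann-Whitney pairwise form, and the paired bootstrap resamples test observations so both models are always scored on the same draw.

```python
import numpy as np

def auc(y_true, scores):
    """Mann-Whitney AUC: P(score_pos > score_neg), with tie correction."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    # Pairwise comparison; fine for illustration-sized samples.
    wins = (pos[:, None] > neg[None, :]).sum() + 0.5 * (pos[:, None] == neg[None, :]).sum()
    return wins / (len(pos) * len(neg))

def price_of_monotonicity(y, s_unconstrained, s_constrained, n_boot=1000, seed=0):
    """PoM in AUC with a paired bootstrap over test observations.

    Returns (point_estimate, (lo, hi)): the relative AUC drop from the
    unconstrained to the constrained model, in percent, with a 95% CI.
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y)
    s_u, s_c = np.asarray(s_unconstrained), np.asarray(s_constrained)
    n = len(y)

    def pom(idx):
        a_u, a_c = auc(y[idx], s_u[idx]), auc(y[idx], s_c[idx])
        return 100.0 * (a_u - a_c) / a_u

    point = pom(np.arange(n))
    draws = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)
        # A resample must contain both classes for AUC to be defined.
        if y[idx].min() == y[idx].max():
            continue
        draws.append(pom(idx))
    lo, hi = np.percentile(draws, [2.5, 97.5])
    return point, (lo, hi)
```

Because the bootstrap resamples index sets shared by both score vectors, the resulting interval reflects the uncertainty of the *difference* between paired models rather than of each AUC separately.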

Keywords: Credit Risk, Gradient Boosting, Monotonicity Constraints, Probability of Default, Interpretability

Complexity vs Empirical Score

  • Math Complexity: 3.0/10
  • Empirical Rigor: 7.0/10
  • Quadrant: Street Traders
  • Why: The paper’s core methodology relies on standard statistical metrics (AUC) and bootstrap uncertainty quantification, with relatively light mathematical formalism. However, it demonstrates high empirical rigor through its multi-dataset benchmark across five public credit datasets, three libraries, and a paired evaluation design with bootstrap uncertainty, making it heavily data- and implementation-focused.
```mermaid
flowchart TD
  A["Research Goal<br>Quantify Price of Monotonicity (PoM)<br>in Credit PD Models"] --> B["Methodology<br>Gradient Boosting:<br>Unconstrained vs Monotone-Constrained"]
  B --> C["Data Processing<br>5 Public Credit Datasets<br>(5-Fold Cross-Validation)"]
  C --> D["Computational Process<br>Bootstrap Paired Comparisons<br>Estimate Performance & Uncertainty"]
  D --> E["Key Findings<br>PoM in AUC: 0% to 2.9%<br>Low cost on large datasets<br>Higher cost on small datasets"]
```