
What's the Price of Monotonicity? A Multi-Dataset Benchmark of Monotone-Constrained Gradient Boosting for Credit PD

ArXiv ID: 2512.17945 · Authors: Petr Koklev · Abstract: Financial institutions face a trade-off between predictive accuracy and interpretability when deploying machine learning models for credit risk. Monotonicity constraints align model behavior with domain knowledge, but their performance cost - the price of monotonicity - is not well quantified. This paper benchmarks monotone-constrained versus unconstrained gradient boosting models for credit probability of default across five public datasets and three libraries. We define the Price of Monotonicity (PoM) as the relative change in standard performance metrics when moving from unconstrained to constrained models, estimated via paired comparisons with bootstrap uncertainty. In our experiments, PoM in AUC ranges from essentially zero to about 2.9 percent: constraints are almost costless on large datasets (typically less than 0.2 percent, often indistinguishable from zero) and most costly on smaller datasets with extensive constraint coverage (around 2-3 percent). Thus, appropriately specified monotonicity constraints can often deliver interpretability with small accuracy losses, particularly in large-scale credit portfolios. ...

December 14, 2025 · 2 min · Research Team

Bridging Human Cognition and AI: A Framework for Explainable Decision-Making Systems

ArXiv ID: 2509.02388 · Authors: N. Jean, G. Le Pera · Abstract: Explainability in AI and ML models is critical for fostering trust, ensuring accountability, and enabling informed decision making in high stakes domains. Yet this objective is often unmet in practice. This paper proposes a general purpose framework that bridges state of the art explainability techniques with Malle’s five category model of behavior explanation: Knowledge Structures, Simulation/Projection, Covariation, Direct Recall, and Rationalization. The framework is designed to be applicable across AI assisted decision making systems, with the goal of enhancing transparency, interpretability, and user trust. We demonstrate its practical relevance through real world case studies, including credit risk assessment and regulatory analysis powered by large language models (LLMs). By aligning technical explanations with human cognitive mechanisms, the framework lays the groundwork for more comprehensible, responsible, and ethical AI systems. ...

September 2, 2025 · 2 min · Research Team

Bayesian Estimation of Corporate Default Spreads

ArXiv ID: 2503.02991 · Authors: Unknown · Abstract: Risk-averse investors often wish to exclude stocks from their portfolios that bear high credit risk, which is a measure of a firm’s likelihood of bankruptcy. This risk is commonly estimated by constructing signals from quarterly accounting items, such as debt and income volatility. While such information may provide a rich description of a firm’s credit risk, the low frequency with which the data are released implies that investors may be operating with outdated information. In this paper we circumvent this problem by developing a high-frequency credit risk proxy via corporate default spreads, which are estimated from daily bond price data. We accomplish this by adapting classic yield curve estimation methods to a corporate bond setting, leveraging advances in Bayesian estimation to ensure higher model stability when working with small-sample data, which also allows us to directly model the uncertainty of our predictions. ...
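The "classic yield curve estimation methods" the abstract refers to can be illustrated with a Nelson-Siegel fit, a standard parametric yield curve model. This frequentist sketch on made-up maturities and parameters omits the paper's Bayesian machinery entirely; it only shows the curve-fitting step that a corporate default spread would be measured against:

```python
import numpy as np
from scipy.optimize import curve_fit

def nelson_siegel(t, b0, b1, b2, tau):
    """Nelson-Siegel yield at maturity t: level (b0), slope (b1), curvature (b2)."""
    x = t / tau
    loading = (1 - np.exp(-x)) / x
    return b0 + b1 * loading + b2 * (loading - np.exp(-x))

# Illustrative maturities (years) and noisy observed yields around a known curve.
maturities = np.array([0.5, 1.0, 2.0, 3.0, 5.0, 7.0, 10.0, 20.0])
true_yields = nelson_siegel(maturities, 0.05, -0.02, 0.01, 2.0)
observed = true_yields + np.random.default_rng(0).normal(0, 0.0005, maturities.size)

# Fit the reference curve; a corporate bond's default spread is then its
# observed yield minus the fitted curve at the same maturity.
params, _ = curve_fit(nelson_siegel, maturities, observed, p0=[0.04, -0.01, 0.0, 1.5])
print("recovered level parameter b0:", round(params[0], 4))
```

A Bayesian treatment, as in the paper, would instead place priors on (b0, b1, b2, tau) and sample a posterior, which is what stabilizes the fit on small samples and yields uncertainty bands for the spread.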

March 4, 2025 · 2 min · Research Team

Quantum Powered Credit Risk Assessment: A Novel Approach using hybrid Quantum-Classical Deep Neural Network for Row-Type Dependent Predictive Analysis

ArXiv ID: 2502.07806 · Authors: Unknown · Abstract: The integration of Quantum Deep Learning (QDL) techniques into the landscape of financial risk analysis presents a promising avenue for innovation. This study introduces a framework for credit risk assessment in the banking sector, combining quantum deep learning techniques with adaptive modeling for Row-Type Dependent Predictive Analysis (RTDPA). By leveraging RTDPA, the proposed approach tailors predictive models to different loan categories, aiming to enhance the accuracy and efficiency of credit risk evaluation. While this work explores the potential of integrating quantum methods with classical deep learning for risk assessment, it focuses on the feasibility and performance of this hybrid framework rather than claiming transformative industry-wide impacts. The findings offer insights into how quantum techniques can complement traditional financial analysis, paving the way for further advancements in predictive modeling for credit risk. ...

February 6, 2025 · 2 min · Research Team

Hybrid Quantum Neural Networks with Amplitude Encoding: Advancing Recovery Rate Predictions

ArXiv ID: 2501.15828 · Authors: Unknown · Abstract: Recovery rate prediction plays a pivotal role in bond investment strategies by enhancing risk assessment, optimizing portfolio allocation, improving pricing accuracy, and supporting effective credit risk management. However, accurate forecasting remains challenging due to complex nonlinear dependencies, high-dimensional feature spaces, and limited sample sizes, conditions under which classical machine learning models are prone to overfitting. We propose a hybrid Quantum Machine Learning (QML) model with Amplitude Encoding, leveraging the unitarity constraint of Parametrized Quantum Circuits (PQC) and the exponential data compression capability of qubits. We evaluate the model on a global recovery rate dataset comprising 1,725 observations and 256 features from 1996 to 2023. Our hybrid method significantly outperforms both classical neural networks and QML models using Angle Encoding, achieving a lower Root Mean Squared Error (RMSE) of 0.228, compared to 0.246 and 0.242, respectively. It also performs competitively with ensemble tree methods such as XGBoost. While practical implementation challenges remain for Noisy Intermediate-Scale Quantum (NISQ) hardware, our quantum simulation and preliminary results on noisy simulators demonstrate the promise of hybrid quantum-classical architectures in enhancing the accuracy and robustness of recovery rate forecasting. These findings illustrate the potential of quantum machine learning in shaping the future of credit risk prediction. ...

January 27, 2025 · 2 min · Research Team

Defaultable bond liquidity spread estimation: an option-based approach

ArXiv ID: 2501.11427 · Authors: Unknown · Abstract: This paper extends an option-theoretic approach to estimate liquidity spreads for corporate bonds. Inspired by Longstaff’s equity market framework and subsequent work by Koziol and Sauerbier on risk-free zero-coupon bonds, the model views liquidity as a look-back option. The model accounts for the interplay of risk-free rate volatility and credit risk. A numerical analysis highlights the impact of these factors on the liquidity spread, particularly for bonds with different maturities and credit ratings. The methodology is applied to estimate the liquidity spread for unquoted bonds, with a specific case study on the Republic of Italy’s debt, leveraging market data to calibrate model parameters and classify liquid versus illiquid emissions. This approach provides a robust tool for pricing illiquid bonds, emphasizing the importance of marketability in debt security valuation. ...
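The look-back option view of liquidity can be illustrated with a small Monte Carlo estimate in the spirit of Longstaff's marketability bound: the expected gain a perfectly timed seller captures over the illiquidity horizon. The volatility, horizon, and lognormal price dynamics below are placeholder assumptions, and discounting, rate volatility, and credit risk (the ingredients central to this paper's extension) are deliberately omitted:

```python
import numpy as np

def marketability_value(p0=100.0, daily_vol=0.01, horizon_days=125,
                        n_paths=20_000, seed=0):
    """Monte Carlo estimate of the look-back bound on the liquidity discount:
    E[max path price - terminal price], per 100 of face value."""
    rng = np.random.default_rng(seed)
    log_steps = rng.normal(0.0, daily_vol, size=(n_paths, horizon_days))
    paths = p0 * np.exp(np.cumsum(log_steps, axis=1))
    return float((paths.max(axis=1) - paths[:, -1]).mean())

print(f"liquidity value per 100 face: {marketability_value():.2f}")
```

Longer illiquidity horizons and higher volatility widen the gap between the best achievable sale price and the terminal price, which is why the liquidity spread grows with maturity and deteriorating credit quality in the paper's numerical analysis.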

January 20, 2025 · 2 min · Research Team

Optimizing Fintech Marketing: A Comparative Study of Logistic Regression and XGBoost

ArXiv ID: 2412.16333 · Authors: Unknown · Abstract: As several studies have shown, predicting credit risk is still a major concern for the financial services industry and is receiving a lot of scholarly interest. This area of study is crucial because it aids financial organizations in determining the probability that borrowers will default, which has a direct bearing on lending choices and risk management tactics. Despite the progress made in this domain, there is still a substantial knowledge gap concerning consumer actions that take place prior to the filing of credit card applications. The objective of this study is to predict customer responses to mail campaigns and assess the likelihood of default among those who engage. This research employs advanced machine learning techniques, specifically logistic regression and XGBoost, to analyze consumer behavior and predict responses to direct mail campaigns. By integrating different data preprocessing strategies, including imputation and binning, we enhance the robustness and accuracy of our predictive models. The results indicate that XGBoost consistently outperforms logistic regression across various metrics, particularly in scenarios using categorical binning and custom imputation. These findings suggest that XGBoost is particularly effective in handling complex data structures and provides a strong predictive capability in assessing credit risk. ...
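The comparison described above can be sketched with scikit-learn pipelines. This is an illustrative mock-up, not the study's setup: GradientBoostingClassifier stands in for XGBoost, the data are synthetic with injected missing values, and median imputation plus quantile binning play the role of the paper's preprocessing strategies:

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import KBinsDiscretizer
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
n = 2000
X = rng.normal(size=(n, 5))
# Response depends on an interaction (x0*x1) plus a linear term (x2).
y = (X[:, 0] * X[:, 1] + X[:, 2] + rng.normal(scale=0.5, size=n) > 0).astype(int)
X[rng.random((n, 5)) < 0.1] = np.nan  # simulate missing survey/campaign fields

# Logistic regression on imputed, quantile-binned, one-hot features.
logit = make_pipeline(
    SimpleImputer(strategy="median"),
    KBinsDiscretizer(n_bins=5, encode="onehot", strategy="quantile"),
    LogisticRegression(max_iter=1000),
)
# Boosted trees on imputed raw features.
gbm = make_pipeline(
    SimpleImputer(strategy="median"),
    GradientBoostingClassifier(random_state=0),
)

auc_lr = cross_val_score(logit, X, y, cv=3, scoring="roc_auc").mean()
auc_gbm = cross_val_score(gbm, X, y, cv=3, scoring="roc_auc").mean()
print(f"logistic AUC={auc_lr:.3f}  boosting AUC={auc_gbm:.3f}")
```

Because the binned one-hot logistic model cannot represent the x0*x1 interaction while the trees can, this setup reproduces the qualitative pattern the study reports: boosting pulls ahead when the data contain interactions the linear model misses.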

December 20, 2024 · 2 min · Research Team

A Spatio-Temporal Machine Learning Model for Mortgage Credit Risk: Default Probabilities and Loan Portfolios

ArXiv ID: 2410.02846 · Authors: Unknown · Abstract: We introduce a novel machine learning model for credit risk by combining tree-boosting with a latent spatio-temporal Gaussian process model accounting for frailty correlation. This allows for modeling non-linearities and interactions among predictor variables in a flexible data-driven manner and for accounting for spatio-temporal variation that is not explained by observable predictor variables. We also show how estimation and prediction can be done in a computationally efficient manner. In an application to a large U.S. mortgage credit risk data set, we find that both predictive default probabilities for individual loans and predictive loan portfolio loss distributions obtained with our novel approach are more accurate compared to conventional independent linear hazard models and also linear spatio-temporal models. Using interpretability tools for machine learning models, we find that the likely reasons for this outperformance are strong interaction and non-linear effects in the predictor variables and the presence of spatio-temporal frailty effects. ...

October 3, 2024 · 2 min · Research Team

Economic effects on households of an augmentation of the cash back duration of real estate loan

ArXiv ID: 2409.14748 · Authors: Unknown · Abstract: This article examines the economic effects of an increase in the duration of home loans on households, focusing on the French real estate market. It highlights trends in the property market, existing loan systems in other countries (such as bullet loans in Sweden and Japanese home loans), the current state of the property market in France, the potential effects of an increase in the amortization period of home loans, and the financial implications for households. The article points out that increasing the repayment period on home loans could reduce the amount of monthly instalments to be repaid, thereby facilitating access to credit for the most modest households. However, this measure also raises concerns about overall credit costs, financial stability and the impact on property prices. In addition, it highlights the differences between existing lending systems in other countries, such as the bullet loan in Sweden and Japanese home loans, and the current characteristics of home loans in France, notably interest rates and house price trends. The article proposes a model of the potential effects of an increase in the amortization period of home loans on housing demand, housing supply, property prices and the associated financial risks. In conclusion, the article highlights the crucial importance of household debt for individual and economic financial stability. It points to the distortion between supply and demand for home loans as amortization periods increase, and the significant rise in overall loan costs for households. It also underlines the need to address structural issues such as the sustainable reduction in interest rates, the stabilization of banks’ equity capital and the development of a regulatory framework for intergenerational lending to ensure a properly functioning market. ...
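The core trade-off the article describes - lower monthly instalments versus higher total credit cost - follows directly from the standard annuity formula M = P·r / (1 - (1+r)^(-n)). A quick numerical sketch (loan size and rate are arbitrary illustrative values, not figures from the article):

```python
def monthly_payment(principal, annual_rate, years):
    """Standard annuity instalment: M = P*r / (1 - (1+r)**-n), monthly rate r."""
    r = annual_rate / 12
    n = years * 12
    return principal * r / (1 - (1 + r) ** -n)

principal, rate = 250_000, 0.035
for years in (20, 25):
    m = monthly_payment(principal, rate, years)
    interest = m * years * 12 - principal
    print(f"{years} years: monthly instalment {m:,.0f}, total interest {interest:,.0f}")
```

With these illustrative inputs, extending the term from 20 to 25 years lowers the instalment by roughly 14 percent while raising total interest paid by roughly 28 percent, which is precisely the affordability-versus-overall-cost tension the article models.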

September 23, 2024 · 3 min · Research Team

The TruEnd-procedure: Treating trailing zero-valued balances in credit data

ArXiv ID: 2404.17008 · Authors: Unknown · Abstract: A novel procedure is presented for finding the true but latent endpoints within the repayment histories of individual loans. The monthly observations beyond these true endpoints are false, largely due to operational failures that delay account closure, thereby corrupting some loans. Detecting these false observations is difficult at scale since each affected loan history might have a different sequence of trailing zero (or very small) month-end balances. Identifying these trailing balances requires an exact definition of a “small balance”, which our method informs. We demonstrate this procedure and isolate the ideal small-balance definition using two different South African datasets. Evidently, corrupted loans are remarkably prevalent and have excess histories that are surprisingly long, which ruin the timing of risk events and compromise any subsequent time-to-event model, e.g., survival analysis. Having discarded these excess histories, we demonstrably improve the accuracy of both the predicted timing and severity of risk events, without materially impacting the portfolio. The resulting estimates of credit losses are lower and less biased, which augurs well for raising accurate credit impairments under IFRS 9. Our work therefore addresses a pernicious data error, which highlights the pivotal role of data preparation in producing credible forecasts of credit risk. ...
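The truncation step at the heart of the procedure - cutting a loan's history back to its latent true endpoint - can be sketched as follows. The threshold is a placeholder: isolating the ideal small-balance definition is exactly what the paper's procedure does, so the value 5.0 and the function name here are arbitrary assumptions for illustration:

```python
def truncate_trailing_small(balances, threshold=5.0):
    """Return the history cut at the latent true endpoint: drop the trailing
    run of month-end balances at or below `threshold` (the 'small balance')."""
    end = len(balances)
    while end > 0 and balances[end - 1] <= threshold:
        end -= 1
    return balances[:end]

# A loan whose account closure was delayed: months of near-zero balances
# follow the economically final repayment.
history = [10_000, 6_000, 2_500, 800, 3.2, 0.0, 0.0, 0.0]
print(truncate_trailing_small(history))  # -> [10000, 6000, 2500, 800]
```

Note that only the trailing run is removed: a small balance in the middle of a history (e.g. after a temporary paydown followed by further drawdown) survives, which matches the paper's focus on false observations beyond the true endpoint.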

April 25, 2024 · 2 min · Research Team