Towards modelling lifetime default risk: Exploring different subtypes of recurrent event Cox-regression models

ArXiv ID: 2505.01044

Authors: Arno Botha, Tanja Verster, Bernard Scheepers

Abstract

In the pursuit of modelling a loan’s probability of default (PD) over its lifetime, repeat default events are often ignored when using Cox Proportional Hazard (PH) models. Excluding such events may produce biased and inaccurate PD-estimates, which can compromise financial buffers against future losses. Accordingly, we investigate a few subtypes of Cox-models that can incorporate recurrent default events. Using South African mortgage data, we explore both the Andersen-Gill (AG) and the Prentice-Williams-Peterson (PWP) spell-time models. These models are compared against a baseline that deliberately ignores recurrent events, called the time to first default (TFD) model. Models are evaluated using Harrell’s c-statistic, adjusted Cox-Snell residuals, and a novel extension of time-dependent receiver operating characteristic (ROC) analysis. From these Cox-models, we demonstrate how to derive a portfolio-level term-structure of default risk, which is a series of marginal PD-estimates at each point of the average loan’s lifetime. While the TFD- and PWP-models do not differ significantly across all diagnostics, the AG-model underperformed expectations. Depending on the prevalence of recurrent defaults, one may therefore safely ignore them when estimating lifetime default risk. Accordingly, our work enhances the current practice of using Cox-modelling in producing timeous and accurate PD-estimates under IFRS 9.
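
To make the idea of a term-structure concrete, the sketch below shows one common discrete-time reading of "a series of marginal PD-estimates": given survival probabilities S(1), ..., S(T) for the average loan, the marginal PD at period t is taken as S(t-1) - S(t), with S(0) = 1. This is an illustrative assumption and not necessarily the authors' exact procedure, particularly for models with recurrent events. The function name `marginal_pd` and the survival values are hypothetical and do not come from the paper.

```python
from typing import List


def marginal_pd(survival: List[float]) -> List[float]:
    """Convert survival probabilities S(1), ..., S(T) into marginal PDs.

    Assumes S(0) = 1, so the marginal PD at period t is S(t-1) - S(t).
    """
    pds = []
    prev = 1.0  # S(0): every loan is assumed to be performing at origination
    for s in survival:
        if not 0.0 <= s <= prev:
            raise ValueError("survival probabilities must be non-increasing and lie in [0, 1]")
        pds.append(prev - s)
        prev = s
    return pds


# Hypothetical survival curve over four periods (illustrative values only).
survival_curve = [0.98, 0.95, 0.93, 0.92]
term_structure = marginal_pd(survival_curve)
```

Under this reading, the marginal PDs sum to the cumulative (lifetime) PD over the horizon, which is 1 - S(T).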

Keywords: Cox Proportional Hazard, Default Modeling, Survival Analysis, Recurrent Events, IFRS 9, Credit

Complexity vs Empirical Score

  • Math Complexity: 7.0/10
  • Empirical Rigor: 8.0/10
  • Quadrant: Holy Grail
  • Why: The paper employs advanced survival analysis and recurrent event models (Cox variants) with mathematical derivations, while rigorously testing models on real mortgage data using statistical diagnostics and time-dependent ROC analysis.
  flowchart TD
    A["Research Goal: Model Lifetime PD<br>for loans, including recurrent defaults"] --> B{"Data Source"}
    B --> C["Mortgage Data<br>South African Loans"]
    C --> D["Modeling Strategy"]
    subgraph D ["Methodology: Recurrent Event Cox Models"]
        D1["Time to First Default<br>Baseline"]
        D2["Andersen-Gill AG<br>Recurrent Event"]
        D3["Prentice-Williams-Peterson PWP<br>Spell-Time"]
    end
    D --> E["Model Evaluation"]
    subgraph E ["Validation Metrics"]
        E1["Harrell's c-statistic"]
        E2["Adjusted Cox-Snell Residuals"]
        E3["Time-Dependent ROC"]
    end
    E --> F["Key Outcomes"]
    subgraph F ["Findings"]
        F1["Portfolio-level<br>Term Structure of PD"]
        F2["TFD & PWP models<br>perform comparably"]
        F3["AG model<br>underperformed"]
        F4["Recurrent defaults may be<br>ignored, depending on prevalence"]
    end