Unleashing the power of text for credit default prediction: Comparing human-written and generative AI-refined texts

ArXiv ID: 2503.18029

Authors: Unknown

Abstract

This study explores the integration of a representative large language model, ChatGPT, into lending decision-making with a focus on credit default prediction. Specifically, we use ChatGPT to analyse and interpret loan assessments written by loan officers and generate refined versions of these texts. Our comparative analysis reveals significant differences between generative artificial intelligence (AI)-refined and human-written texts in terms of text length, semantic similarity, and linguistic representations. Using deep learning techniques, we show that incorporating unstructured text data, particularly ChatGPT-refined texts, alongside conventional structured data significantly enhances credit default predictions. Furthermore, we demonstrate how the contents of both human-written and ChatGPT-refined assessments contribute to the models’ prediction and show that the effect of essential words is highly context-dependent. Moreover, we find that ChatGPT’s analysis of borrower delinquency contributes the most to improving predictive accuracy. We also evaluate the business impact of the models based on human-written and ChatGPT-refined texts, and find that, in most cases, the latter yields higher profitability than the former. This study provides valuable insights into the transformative potential of generative AI in financial services.
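
The comparison of human-written and ChatGPT-refined texts by length and semantic similarity can be illustrated with a small sketch. The abstract does not state how similarity is measured, so the bag-of-words cosine similarity below is a simple stand-in chosen for illustration; the function names and the two example sentences are hypothetical and not taken from the paper's data.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between the bag-of-words count vectors of two texts."""
    ca, cb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    # Return 0.0 when either text has no tokens, to avoid division by zero.
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def compare_texts(human, refined):
    """Report token counts for both texts and their lexical similarity."""
    return {
        "human_tokens": len(tokenize(human)),
        "refined_tokens": len(tokenize(refined)),
        "cosine": cosine_similarity(human, refined),
    }


# Made-up example texts (assumption: not drawn from the paper's dataset).
human = "Borrower late on two payments. Stable income."
refined = "The borrower was late on two payments but has a stable income."
result = compare_texts(human, refined)
print(result)
```

A count-based measure like this only captures word overlap; an embedding-based similarity would be needed to capture paraphrases that share meaning but not vocabulary.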

Keywords: Credit default prediction, Large Language Models (LLM), ChatGPT, Lending decision, Natural Language Processing (NLP), Credit / Lending

Complexity vs Empirical Score

Math Complexity: 4.0/10
Empirical Rigor: 7.5/10
Quadrant: Street Traders
Why: The paper uses established deep learning and NLP techniques (like LIME and transformer models) but focuses heavily on empirical validation with a specific loan dataset, backtesting profit impact, and comparing multiple generative AI tools, making it highly practical for implementation.

  flowchart TD
    A["Research Goal<br/>'Unleashing the power of text for credit default prediction:<br/>Comparing human-written and generative AI-refined texts'"] --> B["Data & Inputs"]
    B --> C["Methodology<br/>1. Human-written loan assessments<br/>2. ChatGPT-refined versions<br/>3. Structured loan data"]
    C --> D["Computational Process<br/>Deep Learning Models<br/>(e.g., BERT, LSTM, Fusion Models)"]
    D --> E["Comparative Analysis<br/>Length, Semantic Similarity,<br/>Linguistic Features"]
    E --> F["Key Findings & Outcomes"]
    F --> F1["AI-refined text significantly<br/>enhances prediction accuracy"]
    F --> F2["ChatGPT analysis of borrower<br/>delinquency contributes most"]
    F --> F3["Business Impact:<br/>AI-refined models yield<br/>higher profitability"]
    F --> F4["Context-dependent effect<br/>of essential words"]
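
The pipeline in the flowchart combines structured loan data with text before prediction. A minimal sketch of that idea follows, under strong simplifying assumptions: keyword counts stand in for the learned text representations, a hand-written logistic regression stands in for the deep learning models, and the vocabulary, the feature (a debt-to-income ratio), and the four toy loans are all hypothetical.

```python
import math

# Hypothetical keyword vocabulary standing in for a learned text encoder.
VOCAB = ["overdue", "stable"]


def text_features(text, vocabulary):
    """Count how often each vocabulary word appears in the text."""
    tokens = text.lower().split()
    return [float(tokens.count(word)) for word in vocabulary]


def fuse(structured, text, vocabulary):
    """Concatenate structured features with text features into one vector."""
    return list(structured) + text_features(text, vocabulary)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(x, weights, bias):
    """Predicted default probability for one fused feature vector."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


def train_logistic(rows, labels, lr=0.5, epochs=500):
    """Fit logistic regression with plain stochastic gradient descent."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            err = predict_proba(x, weights, bias) - y
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


# Toy loans: ([debt_to_income], assessment text, default label). All made up.
loans = [
    ([0.20], "stable income no issues", 0),
    ([0.30], "payments overdue twice", 1),
    ([0.25], "stable employer", 0),
    ([0.35], "overdue balance", 1),
]
rows = [fuse(s, t, VOCAB) for s, t, _ in loans]
labels = [y for _, _, y in loans]
weights, bias = train_logistic(rows, labels)

for x, y in zip(rows, labels):
    print(y, round(predict_proba(x, weights, bias), 3))
```

Concatenating the two feature groups before a single classifier is only one way to combine modalities; the fusion models named in the flowchart may be structured differently.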
