CreditARF: A Framework for Corporate Credit Rating with Annual Report and Financial Feature Integration
ArXiv ID: 2508.02738
Authors: Yumeng Shi, Zhongliang Yang, DiYang Lu, Yisi Wang, Yiting Zhou, Linna Zhou
Abstract
Corporate credit rating serves as a crucial intermediary service in the market economy, playing a key role in maintaining economic order. Existing credit rating models rely on financial metrics and deep learning, but they often overlook insights from non-financial data such as corporate annual reports. To address this, this paper introduces a corporate credit rating framework that integrates financial data with features extracted from annual reports using FinBERT, aiming to fully leverage the potential value of unstructured text data. In addition, we have developed a large-scale dataset, the Comprehensive Corporate Rating Dataset (CCRD), which combines traditional financial data with textual data from annual reports. The experimental results show that the proposed method improves the accuracy of rating predictions by 8-12%, making corporate credit ratings more effective and reliable.
Keywords: Credit Rating, FinBERT, Natural Language Processing (NLP), Annual Report Analysis, Deep Learning, Corporate Credit
Complexity vs Empirical Score
- Math Complexity: 4.0/10
- Empirical Rigor: 8.5/10
- Quadrant: Street Traders
- Why: The paper uses standard deep learning architectures (GRU, FinBERT) without complex mathematical derivations, but demonstrates strong empirical rigor with a purpose-built dataset (CCRD), specific accuracy improvements (8-12%), and clear implementation details for NLP feature extraction.
flowchart TD
A["Research Goal: <br> Improve Credit Rating Accuracy <br> by Integrating Annual Report Insights"] --> B
subgraph B["Methodology & Data"]
direction LR
B1["Dataset: CCRD<br>Financial + Annual Reports"] --> B2["Feature Engineering<br>FinBERT for Text Features"]
end
B --> C["Computational Process<br>Deep Learning Model<br>Integrated Financial & Text Features"]
C --> D["Key Findings & Outcomes"]
D --> D1["8-12% Accuracy Improvement"]
D --> D2["Enhanced Reliability of Ratings"]
D --> D3["Unlocks Value of Unstructured Data"]
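The flow above (financial and annual-report data, text features, then a model over the combined features) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the paper's implementation: the rating labels, the two financial metrics, and the 4-dimensional vectors standing in for FinBERT embeddings are made up, and a randomly initialized linear softmax layer is swapped in for the deep learning model, so the predictions carry no meaning. The sketch only shows how standardized financial features and a text embedding might be concatenated and passed to a classifier.

```python
import math
import random

RATING_CLASSES = ["AAA", "AA", "A"]  # hypothetical label set, for illustration only


def zscore(column):
    """Standardize one financial metric across companies (population std)."""
    mean = sum(column) / len(column)
    variance = sum((x - mean) ** 2 for x in column) / len(column)
    std = math.sqrt(variance) or 1.0  # constant column: avoid dividing by zero
    return [(x - mean) / std for x in column]


def standardize(rows):
    """Apply zscore column by column, then return the data row by row again."""
    columns = [zscore(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*columns)]


def softmax(logits):
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(fused, weights, biases):
    """Linear layer plus softmax; returns (class index, class probabilities)."""
    logits = [sum(w * x for w, x in zip(row, fused)) + b
              for row, b in zip(weights, biases)]
    probs = softmax(logits)
    return probs.index(max(probs)), probs


# Made-up inputs: 3 companies x 2 financial metrics.
financial = [[0.42, 0.08], [0.65, 0.03], [0.30, 0.12]]
# Placeholder vectors standing in for text embeddings of each annual report;
# a real encoder would produce much longer vectors.
text_embeddings = [
    [0.1, -0.3, 0.5, 0.0],
    [-0.2, 0.4, 0.1, 0.3],
    [0.6, 0.0, -0.1, 0.2],
]

# Early fusion: concatenate standardized financial features with the embedding.
fused_rows = [fin + emb for fin, emb in zip(standardize(financial), text_embeddings)]

# Untrained, randomly initialized classifier (fixed seed for repeatability).
rng = random.Random(0)
weights = [[rng.uniform(-1, 1) for _ in fused_rows[0]] for _ in RATING_CLASSES]
biases = [0.0] * len(RATING_CLASSES)

predictions = [predict(row, weights, biases) for row in fused_rows]
for label, probs in predictions:
    print(RATING_CLASSES[label], [round(p, 3) for p in probs])
```

Concatenation is the simplest way to combine the two sources. Standardizing the financial columns first keeps metrics with large raw scales from dominating the embedding dimensions.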