FinAI-BERT: A Transformer-Based Model for Sentence-Level Detection of AI Disclosures in Financial Reports
ArXiv ID: 2507.01991
Authors: Muhammad Bilal Zafar
Abstract
The proliferation of artificial intelligence (AI) in financial services has prompted growing demand for tools that can systematically detect AI-related disclosures in corporate filings. While prior approaches often rely on keyword expansion or document-level classification, they fall short in granularity, interpretability, and robustness. This study introduces FinAI-BERT, a domain-adapted transformer-based language model designed to classify AI-related content at the sentence level within financial texts. The model was fine-tuned on a manually curated and balanced dataset of 1,586 sentences drawn from 669 annual reports of U.S. banks (2015 to 2023). FinAI-BERT achieved near-perfect classification performance (accuracy of 99.37 percent, F1 score of 0.993), outperforming traditional baselines such as Logistic Regression, Naive Bayes, Random Forest, and XGBoost. Interpretability was ensured through SHAP-based token attribution, while bias analysis and robustness checks confirmed the model’s stability across sentence lengths, adversarial inputs, and temporal samples. Theoretically, the study advances financial NLP by operationalizing fine-grained, theme-specific classification using transformer architectures. Practically, it offers a scalable, transparent solution for analysts, regulators, and scholars seeking to monitor the diffusion and framing of AI across financial institutions.
Keywords: FinAI-BERT, Transformer Models, Natural Language Processing (NLP), Financial Disclosure, Regulatory Compliance
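The abstract describes fine-tuning a domain-adapted transformer for sentence-level binary classification of AI-related content, evaluated with accuracy and F1. The sketch below illustrates that setup with Hugging Face Transformers under stated assumptions: the base checkpoint (`bert-base-uncased`), the toy sentences, and the hyperparameters are illustrative placeholders, not the authors' actual configuration or data.

```python
# Minimal sketch of sentence-level fine-tuning for AI-disclosure detection.
# Checkpoint name, example sentences, and hyperparameters are assumptions.
import numpy as np
from datasets import Dataset
from sklearn.metrics import accuracy_score, f1_score
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

MODEL_NAME = "bert-base-uncased"  # assumed base checkpoint for domain adaptation

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)

# Hypothetical balanced sample: label 1 = AI-related sentence, 0 = not AI-related
data = Dataset.from_dict({
    "text": ["The bank deployed machine learning models for credit scoring.",
             "Our AI chatbot now handles routine customer inquiries.",
             "Total deposits increased by 4.2 percent year over year.",
             "The board declared a quarterly cash dividend."],
    "label": [1, 1, 0, 0],
})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=128)

data = data.map(tokenize, batched=True)

def compute_metrics(eval_pred):
    # Reports the two metrics quoted in the abstract: accuracy and F1
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return {"accuracy": accuracy_score(labels, preds),
            "f1": f1_score(labels, preds)}

args = TrainingArguments(output_dir="finai-bert-out", num_train_epochs=3,
                         per_device_train_batch_size=16, learning_rate=2e-5)
trainer = Trainer(model=model, args=args, train_dataset=data,
                  eval_dataset=data, compute_metrics=compute_metrics)
trainer.train()
print(trainer.evaluate())
```

In this sketch the same toy set is reused for evaluation only to keep the example self-contained; the paper works with 1,586 manually labeled sentences.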
Complexity vs Empirical Score
- Math Complexity: 3.5/10
- Empirical Rigor: 8.0/10
- Quadrant: Street Traders
- Why: The paper employs standard deep learning fine-tuning and statistical metrics (F1, accuracy) with SHAP for interpretability, resulting in moderate mathematical complexity. It demonstrates high empirical rigor through a detailed, manually curated dataset, explicit implementation instructions (Hugging Face), and robustness checks.
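The interpretability claim above rests on SHAP-based token attribution. A minimal sketch of how such attributions can be produced for a fine-tuned text classifier is shown below; the checkpoint path `finai-bert-out`, the example sentence, and the class index are hypothetical assumptions, not the authors' artifacts.

```python
# Minimal sketch of SHAP token attribution for a fine-tuned sentence classifier.
# The model path, example sentence, and class index are hypothetical placeholders.
import shap
from transformers import pipeline

clf = pipeline("text-classification", model="finai-bert-out",
               top_k=None, truncation=True)  # top_k=None returns scores for all classes

explainer = shap.Explainer(clf)  # SHAP wraps the pipeline with a text masker
sentences = ["The bank is expanding its use of artificial intelligence in fraud detection."]
shap_values = explainer(sentences)

# Visualize per-token contributions toward the AI-related class (index 1 assumed)
shap.plots.text(shap_values[:, :, 1])
```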
flowchart TD
A["Research Goal<br>Detect AI Disclosures<br>in Financial Reports"] --> B
subgraph B ["Methodology & Data"]
direction LR
B1["Dataset<br>1,586 Sentences<br>U.S. Banks (2015-2023)"] --> B2["FinAI-BERT<br>Domain-Adapted Transformer"]
end
B --> C["Computational Process<br>Fine-tuning & Classification"]
C --> D{"Evaluation & Analysis"}
D --> E["Outcomes<br>99.37% Accuracy / F1 0.993<br>SHAP Interpretability"]
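For the "Evaluation & Analysis" step in the flowchart, the paper benchmarks FinAI-BERT against traditional classifiers (Logistic Regression, Naive Bayes, Random Forest, XGBoost). The sketch below shows one plausible TF-IDF baseline setup; the sentences and settings are assumptions, and an XGBoost baseline would follow the same pattern via `xgboost.XGBClassifier`.

```python
# Illustrative TF-IDF baselines of the kind the paper compares against;
# the toy sentences and hyperparameters are assumptions.
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

train_texts = ["We use machine learning models to detect fraudulent transactions.",
               "Our AI chatbot resolves routine customer service requests.",
               "Net interest margin declined slightly during the quarter.",
               "The board declared a quarterly cash dividend."]
train_labels = [1, 1, 0, 0]  # 1 = AI-related sentence, 0 = not AI-related

test_texts = ["Artificial intelligence supports our credit underwriting process.",
              "Total assets grew to 12.4 billion dollars."]
test_labels = [1, 0]

for clf in (LogisticRegression(max_iter=1000),
            MultinomialNB(),
            RandomForestClassifier(n_estimators=200)):
    model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), clf)
    model.fit(train_texts, train_labels)
    preds = model.predict(test_texts)
    print(type(clf).__name__,
          "accuracy:", accuracy_score(test_labels, preds),
          "f1:", f1_score(test_labels, preds, zero_division=0))
```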