Advancing Financial Engineering with Foundation Models: Progress, Applications, and Challenges
ArXiv ID: 2507.18577
Authors: Liyuan Chen, Shuoling Liu, Jiangpeng Yan, Xiaoyu Wang, Henglin Liu, Chuang Li, Kecheng Jiao, Jixuan Ying, Yang Veronica Liu, Qiang Yang, Xiu Li
Abstract
The advent of foundation models (FMs), large-scale pre-trained models with strong generalization capabilities, has opened new frontiers for financial engineering. While general-purpose FMs such as GPT-4 and Gemini have demonstrated promising performance in tasks ranging from financial report summarization to sentiment-aware forecasting, many financial applications remain constrained by unique domain requirements such as multimodal reasoning, regulatory compliance, and data privacy. These challenges have spurred the emergence of financial foundation models (FFMs): a new class of models explicitly designed for finance. This survey presents a comprehensive overview of FFMs, with a taxonomy spanning three key modalities: financial language foundation models (FinLFMs), financial time-series foundation models (FinTSFMs), and financial visual-language foundation models (FinVLFMs). We review their architectures, training methodologies, datasets, and real-world applications. Furthermore, we identify critical challenges in data availability, algorithmic scalability, and infrastructure constraints and offer insights into future research opportunities. We hope this survey can serve as both a comprehensive reference for understanding FFMs and a practical roadmap for future innovation.
Keywords: Financial foundation models (FFMs), Multimodal reasoning, Time-series foundation models, Financial language models, Financial visual-language models, Multi-Asset
Complexity vs Empirical Score
- Math Complexity: 4.0/10
- Empirical Rigor: 3.0/10
- Quadrant: Philosophers
- Why: The paper is a survey that categorizes financial foundation models and discusses high-level architectures and challenges, but contains no mathematical derivations, formulas, or statistical metrics. It lacks specific implementation details, backtesting results, or dataset specifics, focusing instead on a conceptual overview and taxonomy.
flowchart TD
A["Research Goal:<br/>Advancing Financial Engineering with<br/>Foundation Models: Progress, Applications, and Challenges"] --> B
subgraph B ["Methodology: Comprehensive Survey & Taxonomy"]
direction LR
B1["Financial Language<br/>Foundation Models"] --> B2["Financial Time-Series<br/>Foundation Models"] --> B3["Financial Visual-Language<br/>Foundation Models"]
end
B --> C["Computational Processes &<br/>Real-World Applications"]
C --> D["Key Findings & Outcomes:<br/>1. FFMs address domain needs unmet by general-purpose FMs<br/>2. Open Challenges: Data availability, Algorithmic scalability, Infrastructure constraints<br/>3. Future Roadmap: Multimodal reasoning & Infrastructure"]