
Re(Visiting) Large Language Models in Finance

Re(Visiting) Large Language Models in Finance ArXiv ID: ssrn-4963618 “View on arXiv” Authors: Unknown Abstract This study evaluates the effectiveness of specialised large language models (LLMs) developed for accounting and finance. Empirical analysis demonstrates that th... Keywords: Large Language Models, Accounting, Financial Analysis, Natural Language Processing

Complexity vs Empirical Score: Math Complexity: 6.0/10 · Empirical Rigor: 7.5/10 · Quadrant: Holy Grail. Why: The paper demonstrates high empirical rigor through extensive data handling, robustness checks, and a clear backtest-ready methodology (out-of-sample testing, look-ahead bias mitigation). Math complexity is moderate-to-high due to the advanced transformer architectures and the statistical foundations of LLMs, though the focus is on applied implementation rather than deep theoretical derivations.

```mermaid
flowchart TD
    A["Research Goal: Assess effectiveness of specialised LLMs for Accounting & Finance"] --> B["Methodology: Empirical Analysis of FinanceBench & FinEval"]
    B --> C["Computational Process: Instruction-Tuning & In-Context Learning"]
    C --> D{"Key Findings"}
    D --> E["Specialised Models outperform general LLMs"]
    D --> F["Instruction-tuning significantly boosts financial accuracy"]
    D --> G["Task-specific prompting (ICL) improves performance"]
```

January 25, 2026 · 1 min · Research Team

Comparing LLMs for Sentiment Analysis in Financial Market News

Comparing LLMs for Sentiment Analysis in Financial Market News ArXiv ID: 2510.15929 “View on arXiv” Authors: Lucas Eduardo Pereira Teles, Carlos M. S. Figueiredo Abstract This article presents a comparative study of large language models (LLMs) in the task of sentiment analysis of financial market news. This work aims to analyze the performance difference of these models in this important natural language processing task within the context of finance. LLM models are compared with classical approaches, allowing for the quantification of the benefits of each tested model or approach. Results show that large language models outperform classical models in the vast majority of cases. ...
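The comparison the abstract describes boils down to scoring every model on the same labelled headlines and comparing accuracies. A minimal sketch of that evaluation loop, with a toy lexicon classifier standing in for the classical baselines; the cue words and sample headlines are illustrative, not the paper's data:

```python
def lexicon_classifier(text: str) -> str:
    """Toy classical baseline: count positive vs negative cue words."""
    pos = {"beats", "growth", "record", "surge"}
    neg = {"misses", "loss", "decline", "plunge"}
    tokens = set(text.lower().split())
    score = len(tokens & pos) - len(tokens & neg)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

def accuracy(model, dataset):
    """Fraction of (headline, label) pairs the model labels correctly."""
    correct = sum(model(text) == label for text, label in dataset)
    return correct / len(dataset)

sample = [
    ("Company beats earnings with record growth", "positive"),
    ("Shares plunge after quarterly loss", "negative"),
    ("Board announces annual meeting date", "neutral"),
]

print(accuracy(lexicon_classifier, sample))  # 1.0 on this toy set
```

In the study itself the same `accuracy` call would simply be repeated with each LLM (prompted for a one-word label) in place of `lexicon_classifier`.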

October 3, 2025 · 2 min · Research Team

FINCH: Financial Intelligence using Natural language for Contextualized SQL Handling

FINCH: Financial Intelligence using Natural language for Contextualized SQL Handling ArXiv ID: 2510.01887 “View on arXiv” Authors: Avinash Kumar Singh, Bhaskarjit Sarmah, Stefano Pasquali Abstract Text-to-SQL, the task of translating natural language questions into SQL queries, has long been a central challenge in NLP. While progress has been significant, applying it to the financial domain remains especially difficult due to complex schema, domain-specific terminology, and high stakes of error. Despite this, there is no dedicated large-scale financial dataset to advance research, creating a critical gap. To address this, we introduce a curated financial dataset (FINCH) comprising 292 tables and 75,725 natural language-SQL pairs, enabling both fine-tuning and rigorous evaluation. Building on this resource, we benchmark reasoning models and language models of varying scales, providing a systematic analysis of their strengths and limitations in financial Text-to-SQL tasks. Finally, we propose a finance-oriented evaluation metric (FINCH Score) that captures nuances overlooked by existing measures, offering a more faithful assessment of model performance. ...
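The metric gap the authors highlight is easiest to see against the naive baseline most Text-to-SQL work relies on: normalized exact match between predicted and gold SQL. A minimal sketch of that baseline (the actual FINCH Score is deliberately more nuanced than this):

```python
import re

def normalize_sql(sql: str) -> str:
    """Lowercase, collapse whitespace, and drop a trailing semicolon."""
    sql = re.sub(r"\s+", " ", sql.strip().lower())
    return sql.rstrip(";").strip()

def exact_match(pred: str, gold: str) -> bool:
    """Naive equivalence check: equal after normalization."""
    return normalize_sql(pred) == normalize_sql(gold)

print(exact_match("SELECT ticker FROM trades;", "select  ticker\nfrom trades"))  # True
```

Exact match marks semantically identical queries (reordered joins, aliased columns) as wrong, which is precisely the kind of nuance a finance-oriented metric has to capture.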

October 2, 2025 · 2 min · Research Team

News Sentiment Embeddings for Stock Price Forecasting

News Sentiment Embeddings for Stock Price Forecasting ArXiv ID: 2507.01970 “View on arXiv” Authors: Ayaan Qayyum Abstract This paper will discuss how headline data can be used to predict stock prices. The stock price in question is the SPDR S&P 500 ETF Trust, also known as SPY that tracks the performance of the largest 500 publicly traded corporations in the United States. A key focus is to use news headlines from the Wall Street Journal (WSJ) to predict the movement of stock prices on a daily timescale with OpenAI-based text embedding models used to create vector encodings of each headline with principal component analysis (PCA) to extract the key features. The challenge of this work is to capture the time-dependent and time-independent, nuanced impacts of news on stock prices while handling potential lag effects and market noise. Financial and economic data were collected to improve model performance; such sources include the U.S. Dollar Index (DXY) and Treasury Interest Yields. Over 390 machine-learning inference models were trained. The preliminary results show that headline data embeddings greatly benefit stock price prediction by at least 40% compared to training and optimizing a machine learning system without headline data embeddings. ...
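The embed-then-reduce pipeline the abstract describes can be sketched with the standard library alone: power iteration recovers the leading principal component, onto which each headline embedding is projected. In practice one would call the OpenAI embeddings API and scikit-learn's `PCA`; the toy 3-d vectors below are illustrative stand-ins:

```python
import random

def first_principal_component(vectors, iters=200):
    """Power iteration on the (unnormalised) covariance matrix:
    returns the data mean and the leading principal component."""
    dim = len(vectors[0])
    mean = [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
    centered = [[v[i] - mean[i] for i in range(dim)] for v in vectors]
    w = [random.gauss(0, 1) for _ in range(dim)]
    for _ in range(iters):
        # w <- X^T (X w), then renormalise
        projs = [sum(r[i] * w[i] for i in range(dim)) for r in centered]
        w = [sum(p * r[i] for p, r in zip(projs, centered)) for i in range(dim)]
        norm = sum(x * x for x in w) ** 0.5
        w = [x / norm for x in w]
    return mean, w

def project(vector, mean, component):
    """Scalar coordinate of a vector along the principal component."""
    return sum((vector[i] - mean[i]) * component[i] for i in range(len(vector)))

random.seed(0)
embeddings = [[1.0, 0.0, 0.0], [2.0, 0.0, 0.0], [3.0, 0.0, 0.0], [4.0, 0.0, 0.0]]
mean, pc1 = first_principal_component(embeddings)
print([round(project(v, mean, pc1), 2) for v in embeddings])
```

Since the toy data varies only along the first axis, the recovered component is (up to sign) that axis, and the projections are the centered first coordinates.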

June 19, 2025 · 2 min · Research Team

Towards Competent AI for Fundamental Analysis in Finance: A Benchmark Dataset and Evaluation

Towards Competent AI for Fundamental Analysis in Finance: A Benchmark Dataset and Evaluation ArXiv ID: 2506.07315 “View on arXiv” Authors: Zonghan Wu, Congyuan Zou, Junlin Wang, Chenhan Wang, Hangjing Yang, Yilei Shao Abstract Generative AI, particularly large language models (LLMs), is beginning to transform the financial industry by automating tasks and helping to make sense of complex financial information. One especially promising use case is the automatic creation of fundamental analysis reports, which are essential for making informed investment decisions, evaluating credit risks, guiding corporate mergers, etc. While LLMs attempt to generate these reports from a single prompt, the risks of inaccuracy are significant. Poor analysis can lead to misguided investments, regulatory issues, and loss of trust. Existing financial benchmarks mainly evaluate how well LLMs answer financial questions but do not reflect performance in real-world tasks like generating financial analysis reports. In this paper, we propose FinAR-Bench, a solid benchmark dataset focusing on financial statement analysis, a core competence of fundamental analysis. To make the evaluation more precise and reliable, we break this task into three measurable steps: extracting key information, calculating financial indicators, and applying logical reasoning. This structured approach allows us to objectively assess how well LLMs perform each step of the process. Our findings offer a clear understanding of LLMs' current strengths and limitations in fundamental analysis and provide a more practical way to benchmark their performance in real-world financial settings. ...
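The three measurable steps the benchmark separates (extract, calculate, reason) can be sketched end-to-end. The field names, figures, and the liquidity rule of thumb below are illustrative, not FinAR-Bench's actual schema or data:

```python
statement = {  # pretend these line items were parsed from a filing
    "current_assets": 1_200.0,
    "current_liabilities": 800.0,
}

def extract(statement, field):
    """Step 1: pull a named line item out of the parsed statement."""
    return statement[field]

def current_ratio(assets, liabilities):
    """Step 2: compute a standard financial indicator."""
    return assets / liabilities

def assess_liquidity(ratio, threshold=1.0):
    """Step 3: a one-rule logical-reasoning step over the indicator."""
    return "adequate" if ratio >= threshold else "strained"

ratio = current_ratio(extract(statement, "current_assets"),
                      extract(statement, "current_liabilities"))
print(ratio, assess_liquidity(ratio))  # 1.5 adequate
```

Decomposing the task this way is what lets the benchmark score each step objectively: a wrong final judgment can be traced to a bad extraction, a bad calculation, or a bad inference.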

May 22, 2025 · 2 min · Research Team

Words That Unite The World: A Unified Framework for Deciphering Central Bank Communications Globally

Words That Unite The World: A Unified Framework for Deciphering Central Bank Communications Globally ArXiv ID: 2505.17048 “View on arXiv” Authors: Agam Shah, Siddhant Sukhani, Huzaifa Pardawala, Saketh Budideti, Riya Bhadani, Rudra Gopal, Siddhartha Somani, Rutwik Routu, Michael Galarnyk, Soungmin Lee, Arnav Hiray, Akshar Ravichandran, Eric Kim, Pranav Aluru, Joshua Zhang, Sebastian Jaskowski, Veer Guda, Meghaj Tarte, Liqin Ye, Spencer Gosden, Rachel Yuh, Sloka Chava, Sahasra Chava, Dylan Patrick Kelly, Aiden Chiang, Harsit Mittal, Sudheer Chava ...

May 15, 2025 · 2 min · Research Team

Financial Analysis: Intelligent Financial Data Analysis System Based on LLM-RAG

Financial Analysis: Intelligent Financial Data Analysis System Based on LLM-RAG ArXiv ID: 2504.06279 “View on arXiv” Authors: Unknown Abstract In the modern financial sector, the exponential growth of data has made efficient and accurate financial data analysis increasingly crucial. Traditional methods, such as statistical analysis and rule-based systems, often struggle to process and derive meaningful insights from complex financial information effectively. These conventional approaches face inherent limitations in handling unstructured data, capturing intricate market patterns, and adapting to rapidly evolving financial contexts, resulting in reduced accuracy and delayed decision-making processes. To address these challenges, this paper presents an intelligent financial data analysis system that integrates Large Language Models (LLMs) with Retrieval-Augmented Generation (RAG) technology. Our system incorporates three key components: a specialized preprocessing module for financial data standardization, an efficient vector-based storage and retrieval system, and a RAG-enhanced query processing module. Using the NASDAQ financial fundamentals dataset from 2010 to 2023, we conducted comprehensive experiments to evaluate system performance. Results demonstrate significant improvements across multiple metrics: the fully optimized configuration (gpt-3.5-turbo-1106+RAG) achieved 78.6% accuracy and 89.2% recall, surpassing the baseline model by 23 percentage points in accuracy while reducing response time by 34.8%. The system also showed enhanced efficiency in handling complex financial queries, though with a moderate increase in memory utilization. Our findings validate the effectiveness of integrating RAG technology with LLMs for financial analysis tasks and provide valuable insights for future developments in intelligent financial data processing systems. ...
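The retrieval component at the heart of the system's RAG module can be sketched as cosine-similarity search over stored vectors, with the top matches prepended to the LLM prompt. The documents and vectors below are toy illustrations, not the paper's NASDAQ data:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def retrieve(query_vec, store, k=2):
    """Return the texts of the k documents most similar to the query."""
    ranked = sorted(store, key=lambda d: cosine(query_vec, d["vec"]), reverse=True)
    return [d["text"] for d in ranked[:k]]

store = [
    {"text": "AAPL 2023 revenue grew 3%", "vec": [1.0, 0.1, 0.0]},
    {"text": "MSFT 2022 cloud margins",   "vec": [0.0, 1.0, 0.2]},
    {"text": "AAPL 2022 buyback program", "vec": [0.9, 0.0, 0.3]},
]

print(retrieve([1.0, 0.0, 0.1], store, k=2))  # most similar first
```

The retrieved passages would then be concatenated into the query sent to gpt-3.5-turbo-1106, which is the configuration the reported 78.6% accuracy refers to.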

March 20, 2025 · 2 min · Research Team

Are Large Language Models Good In-context Learners for Financial Sentiment Analysis?

Are Large Language Models Good In-context Learners for Financial Sentiment Analysis? ArXiv ID: 2503.04873 “View on arXiv” Authors: Unknown Abstract Recently, large language models (LLMs) with hundreds of billions of parameters have demonstrated the emergent ability, surpassing traditional methods in various domains even without fine-tuning over domain-specific data. However, when it comes to financial sentiment analysis (FSA) – a fundamental task in financial AI – these models often encounter various challenges, such as complex financial terminology, subjective human emotions, and ambiguous inclination expressions. In this paper, we aim to answer the fundamental question: whether LLMs are good in-context learners for FSA? Unveiling this question can yield informative insights on whether LLMs can learn to address the challenges by generalizing in-context demonstrations of financial document-sentiment pairs to the sentiment analysis of new documents, given that finetuning these models on finance-specific data is difficult, if not impossible at all. To the best of our knowledge, this is the first paper exploring in-context learning for FSA that covers most modern LLMs (recently released DeepSeek V3 included) and multiple in-context sample selection methods. Comprehensive experiments validate the in-context learning capability of LLMs for FSA. ...
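The in-context learning setup the paper studies amounts to formatting selected document-sentiment demonstration pairs ahead of the query document. A minimal prompt builder; the instruction wording and demonstrations are illustrative, not the paper's templates:

```python
def build_icl_prompt(demos, query):
    """Format demonstration pairs followed by the unlabelled query."""
    lines = ["Classify the sentiment of each financial statement."]
    for text, label in demos:
        lines.append(f"Text: {text}\nSentiment: {label}")
    lines.append(f"Text: {query}\nSentiment:")  # model completes the label
    return "\n\n".join(lines)

demos = [
    ("Earnings beat expectations.", "positive"),
    ("Guidance was cut sharply.", "negative"),
]
print(build_icl_prompt(demos, "Margins held steady this quarter."))
```

The in-context sample selection methods the paper compares differ only in how `demos` is chosen for each query (e.g. at random versus by similarity to the query document).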

March 6, 2025 · 2 min · Research Team

Chronologically Consistent Large Language Models

Chronologically Consistent Large Language Models ArXiv ID: 2502.21206 “View on arXiv” Authors: Unknown Abstract Large language models are increasingly used in social sciences, but their training data can introduce lookahead bias and training leakage. A good chronologically consistent language model requires efficient use of training data to maintain accuracy despite time-restricted data. Here, we overcome this challenge by training a suite of chronologically consistent large language models, ChronoBERT and ChronoGPT, which incorporate only the text data that would have been available at each point in time. Despite this strict temporal constraint, our models achieve strong performance on natural language processing benchmarks, outperforming or matching widely used models (e.g., BERT), and remain competitive with larger open-weight models. Lookahead bias is model and application-specific because even if a chronologically consistent language model has poorer language comprehension, a regression or prediction model applied on top of the language model can compensate. In an asset pricing application predicting next-day stock returns from financial news, we find that ChronoBERT and ChronoGPT’s real-time outputs achieve Sharpe ratios comparable to a much larger Llama model, indicating that lookahead bias is modest. Our results demonstrate a scalable, practical framework to mitigate training leakage, ensuring more credible backtests and predictions across finance and other social science domains. ...
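The core discipline behind ChronoBERT and ChronoGPT, training only on text that existed at each point in time, reduces to a timestamp filter over the corpus. A minimal sketch with illustrative documents (the real pipeline operates at far larger scale, but the cutoff logic is the same idea):

```python
from datetime import date

def chronological_corpus(documents, as_of):
    """Keep only documents published strictly before the training cutoff,
    so a model trained on the result has no lookahead into `as_of`."""
    return [d["text"] for d in documents if d["published"] < as_of]

docs = [
    {"text": "2018 annual report", "published": date(2019, 2, 1)},
    {"text": "2021 news article",  "published": date(2021, 6, 1)},
    {"text": "2023 press release", "published": date(2023, 3, 1)},
]

print(chronological_corpus(docs, date(2022, 1, 1)))
```

Training one model per cutoff yields the suite of chronologically consistent models, so any backtest dated `as_of` uses a model that could actually have existed then.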

February 28, 2025 · 2 min · Research Team

Multimodal Stock Price Prediction

Multimodal Stock Price Prediction ArXiv ID: 2502.05186 “View on arXiv” Authors: Unknown Abstract In an era where financial markets are heavily influenced by many static and dynamic factors, it has become increasingly critical to carefully integrate diverse data sources with machine learning for accurate stock price prediction. This paper explores a multimodal machine learning approach for stock price prediction by combining data from diverse sources, including traditional financial metrics, tweets, and news articles. We capture real-time market dynamics and investor mood through sentiment analysis on these textual data using both ChatGPT-4o and FinBERT models. We look at how these integrated data streams augment predictions made with a standard Long Short-Term Memory (LSTM) model to illustrate the extent of performance gains. Our study’s results indicate that incorporating the mentioned data sources considerably increases the forecast effectiveness of the reference model by up to 5%. We also provide insights into the individual and combined predictive capacities of these modalities, highlighting the substantial impact of incorporating sentiment analysis from tweets and news articles. This research offers a systematic and effective framework for applying multimodal data analytics techniques in financial time series forecasting that provides a new view for investors to leverage data for decision-making. ...
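The fusion step such a multimodal setup implies can be sketched as aligning per-day sentiment scores with the lagged price window and concatenating them into one feature row per timestep for the LSTM. The numbers below are illustrative, and the paper's actual feature engineering may differ:

```python
def fuse_features(price_window, sentiment_scores):
    """Pair each day's price with that day's sentiment score, giving
    one [price, sentiment] feature vector per timestep of the window."""
    assert len(price_window) == len(sentiment_scores), "modalities must align by day"
    return [[p, s] for p, s in zip(price_window, sentiment_scores)]

prices = [101.2, 102.8, 101.9]      # daily closes over the lookback window
sentiments = [0.4, -0.1, 0.2]       # e.g. FinBERT polarity per day

print(fuse_features(prices, sentiments))
```

The resulting (timesteps × features) rows are exactly the shape a standard LSTM layer consumes, which is why adding a modality is as simple as widening each timestep's vector.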

January 23, 2025 · 2 min · Research Team