
Trade-R1: Bridging Verifiable Rewards to Stochastic Environments via Process-Level Reasoning Verification

Trade-R1: Bridging Verifiable Rewards to Stochastic Environments via Process-Level Reasoning Verification ArXiv ID: 2601.03948 “View on arXiv” Authors: Rui Sun, Yifan Sun, Sheng Xu, Li Zhao, Jing Li, Daxin Jiang, Cheng Hua, Zuo Bai Abstract Reinforcement Learning (RL) has enabled Large Language Models (LLMs) to achieve remarkable reasoning in domains like mathematics and coding, where verifiable rewards provide clear signals. However, extending this paradigm to financial decision-making is challenged by the market’s stochastic nature: rewards are verifiable but inherently noisy, causing standard RL to degenerate into reward hacking. To address this, we propose Trade-R1, a model training framework that bridges verifiable rewards to stochastic environments via process-level reasoning verification. Our key innovation is a verification method that transforms the problem of evaluating reasoning over lengthy financial documents into a structured Retrieval-Augmented Generation (RAG) task. We construct a triangular consistency metric, assessing pairwise alignment between retrieved evidence, reasoning chains, and decisions to serve as a validity filter for noisy market returns. We explore two reward integration strategies: Fixed-effect Semantic Reward (FSR) for stable alignment signals, and Dynamic-effect Semantic Reward (DSR) for coupled magnitude optimization. Experiments on asset selection across different country markets demonstrate that our paradigm reduces reward hacking, with DSR achieving superior cross-market generalization while maintaining the highest reasoning consistency. ...
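
The triangular consistency idea lends itself to a compact sketch. The snippet below is an illustrative, hedged reading of it: score the pairwise alignment between embeddings of the retrieved evidence, the reasoning chain, and the decision, and only let the noisy market return count as reward when that score clears a threshold. The cosine-similarity embedding, the 0.7 threshold, and the gating function are assumptions for illustration, not details taken from the paper.

```python
# Hypothetical sketch of a "triangular consistency" score: mean pairwise cosine
# similarity between embeddings of (retrieved evidence, reasoning chain, decision).
# The embedding choice and the gating threshold are illustrative assumptions.
from itertools import combinations
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def triangular_consistency(evidence_vec, reasoning_vec, decision_vec):
    """Average alignment over the three pairs of reasoning components."""
    parts = [evidence_vec, reasoning_vec, decision_vec]
    return sum(cosine(u, v) for u, v in combinations(parts, 2)) / 3

def gated_reward(market_return, consistency, threshold=0.7):
    """Illustrative gating (loosely in the spirit of a fixed-effect reward):
    the noisy return only counts when the reasoning is judged consistent."""
    return market_return if consistency >= threshold else 0.0

# Toy vectors standing in for real text embeddings.
print(gated_reward(0.012, triangular_consistency([1.0, 0.2], [0.9, 0.3], [0.8, 0.1])))
```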

January 7, 2026 · 2 min · Research Team

VERAFI: Verified Agentic Financial Intelligence through Neurosymbolic Policy Generation

VERAFI: Verified Agentic Financial Intelligence through Neurosymbolic Policy Generation ArXiv ID: 2512.14744 “View on arXiv” Authors: Adewale Akinfaderin, Shreyas Subramanian Abstract Financial AI systems suffer from a critical blind spot: while Retrieval-Augmented Generation (RAG) excels at finding relevant documents, language models still generate calculation errors and regulatory violations during reasoning, even with perfect retrieval. This paper introduces VERAFI (Verified Agentic Financial Intelligence), an agentic framework with neurosymbolic policy generation for verified financial intelligence. VERAFI combines state-of-the-art dense retrieval and cross-encoder reranking with financial tool-enabled agents and automated reasoning policies covering GAAP compliance, SEC requirements, and mathematical validation. Our comprehensive evaluation on FinanceBench demonstrates remarkable improvements: while traditional dense retrieval with reranking achieves only 52.4% factual correctness, VERAFI’s integrated approach reaches 94.7%, an 81% relative improvement. The neurosymbolic policy layer alone contributes a 4.3 percentage point gain over pure agentic processing, specifically targeting persistent mathematical and logical errors. By integrating financial domain expertise directly into the reasoning process, VERAFI offers a practical pathway toward trustworthy financial AI that meets the stringent accuracy demands of regulatory compliance, investment decisions, and risk management. ...
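
The abstract's point that calculation errors persist even with perfect retrieval suggests a deterministic check layer on top of the agent. Below is a minimal, hypothetical sketch of what one symbolic validation rule could look like: recompute a figure from the retrieved facts and flag the answer if it disagrees. The rule, field names, and tolerance are assumptions; VERAFI's actual policy language and GAAP/SEC rule set are not shown here.

```python
# Minimal sketch of a neurosymbolic-style validation pass: before an agent's
# answer is accepted, deterministic rules recompute key figures from the
# retrieved financial facts. The rule set and tolerance here are illustrative
# assumptions, not VERAFI's actual policy definitions.
def check_gross_margin(facts: dict, claimed_margin: float, tol: float = 0.005) -> bool:
    """Recompute gross margin = (revenue - cogs) / revenue and compare."""
    revenue, cogs = facts["revenue"], facts["cogs"]
    expected = (revenue - cogs) / revenue
    return abs(expected - claimed_margin) <= tol

def validate_answer(facts: dict, answer: dict) -> list[str]:
    """Return a list of violated checks; an empty list means the answer passes."""
    violations = []
    if "gross_margin" in answer and not check_gross_margin(facts, answer["gross_margin"]):
        violations.append("gross_margin does not match retrieved revenue/COGS")
    return violations

# Example: the model claims a 42% margin, but the retrieved figures imply ~40.3%.
print(validate_answer({"revenue": 5800.0, "cogs": 3460.0}, {"gross_margin": 0.42}))
```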

December 12, 2025 · 2 min · Research Team

FinReflectKG – MultiHop: Financial QA Benchmark for Reasoning with Knowledge Graph Evidence

FinReflectKG – MultiHop: Financial QA Benchmark for Reasoning with Knowledge Graph Evidence ArXiv ID: 2510.02906 “View on arXiv” Authors: Abhinav Arun, Reetu Raj Harsh, Bhaskarjit Sarmah, Stefano Pasquali Abstract Multi-hop reasoning over financial disclosures is often a retrieval problem before it becomes a reasoning or generation problem: relevant facts are dispersed across sections, filings, companies, and years, and LLMs often expend excessive tokens navigating noisy context. Without precise Knowledge Graph (KG)-guided selection of relevant context, even strong reasoning models either fail to answer or consume excessive tokens, whereas KG-linked evidence enables models to focus their reasoning on composing already retrieved facts. We present FinReflectKG - MultiHop, a benchmark built on FinReflectKG, a temporally indexed financial KG that links audited triples to source chunks from S&P 100 filings (2022-2024). Mining frequent 2-3 hop subgraph patterns across sectors (via GICS taxonomy), we generate financial analyst style questions with exact supporting evidence from the KG. A two-phase pipeline first creates QA pairs via pattern-specific prompts, followed by a multi-criteria quality control evaluation to ensure QA validity. We then evaluate three controlled retrieval scenarios: (S1) precise KG-linked paths; (S2) text-only page windows centered on relevant text spans; and (S3) relevant page windows with randomizations and distractors. Across both reasoning and non-reasoning models, KG-guided precise retrieval yields substantial gains on the FinReflectKG - MultiHop QA benchmark dataset, boosting correctness scores by approximately 24 percent while reducing token utilization by approximately 84.5 percent compared to the page window setting, which reflects the traditional vector retrieval paradigm. Spanning intra-document, inter-year, and cross-company scopes, our work underscores the pivotal role of knowledge graphs in efficiently connecting evidence for multi-hop financial QA. We also release a curated subset of the benchmark (555 QA Pairs) to catalyze further research. ...
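
As a rough illustration of why KG-linked evidence saves tokens, the sketch below walks a toy triple store along a fixed 2-hop relation path and collects only the source chunks attached to the traversed triples, rather than whole page windows. The schema and chunk-linking fields are assumptions; the real FinReflectKG adds temporal indexing, audited triples, and GICS sector structure.

```python
# Illustrative sketch of KG-guided evidence selection for a 2-hop question:
# follow a relation path through a toy triple store and keep only the source
# chunks linked to the traversed triples.
TRIPLES = [
    # (head, relation, tail, source_chunk_id) -- hypothetical layout
    ("CompanyA", "supplies", "CompanyB", "chunk_017"),
    ("CompanyB", "reports_segment_revenue", "12.4B USD (FY2023)", "chunk_342"),
]

def two_hop_evidence(start, rel1, rel2, triples):
    """Collect the triples (with chunk ids) supporting start -[rel1]-> x -[rel2]-> y."""
    evidence = []
    for h1, r1, t1, c1 in triples:
        if h1 == start and r1 == rel1:
            for h2, r2, t2, c2 in triples:
                if h2 == t1 and r2 == rel2:
                    evidence.append((h1, r1, t1, c1))
                    evidence.append((h2, r2, t2, c2))
    return evidence

# Only the two linked chunks would be handed to the model, not full page windows.
print(two_hop_evidence("CompanyA", "supplies", "reports_segment_revenue", TRIPLES))
```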

October 3, 2025 · 3 min · Research Team

Financial Analysis: Intelligent Financial Data Analysis System Based on LLM-RAG

Financial Analysis: Intelligent Financial Data Analysis System Based on LLM-RAG ArXiv ID: 2504.06279 “View on arXiv” Authors: Unknown Abstract In the modern financial sector, the exponential growth of data has made efficient and accurate financial data analysis increasingly crucial. Traditional methods, such as statistical analysis and rule-based systems, often struggle to process and derive meaningful insights from complex financial information effectively. These conventional approaches face inherent limitations in handling unstructured data, capturing intricate market patterns, and adapting to rapidly evolving financial contexts, resulting in reduced accuracy and delayed decision-making processes. To address these challenges, this paper presents an intelligent financial data analysis system that integrates Large Language Models (LLMs) with Retrieval-Augmented Generation (RAG) technology. Our system incorporates three key components: a specialized preprocessing module for financial data standardization, an efficient vector-based storage and retrieval system, and a RAG-enhanced query processing module. Using the NASDAQ financial fundamentals dataset from 2010 to 2023, we conducted comprehensive experiments to evaluate system performance. Results demonstrate significant improvements across multiple metrics: the fully optimized configuration (gpt-3.5-turbo-1106+RAG) achieved 78.6% accuracy and 89.2% recall, surpassing the baseline model by 23 percentage points in accuracy while reducing response time by 34.8%. The system also showed enhanced efficiency in handling complex financial queries, though with a moderate increase in memory utilization. Our findings validate the effectiveness of integrating RAG technology with LLMs for financial analysis tasks and provide valuable insights for future developments in intelligent financial data processing systems. ...
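
A minimal sketch of the described pipeline shape (standardize, store, retrieve, prompt) is given below. The bag-of-words scorer merely stands in for the paper's vector-based storage and retrieval system, and the prompt template, ticker, and figures are toy assumptions; the reported results use dense retrieval with gpt-3.5-turbo-1106.

```python
# Toy RAG query flow: "embed", retrieve the top-k chunks, and assemble a prompt.
# The Counter-based scoring is a stand-in for a real vector store.
from collections import Counter

def embed(text: str) -> Counter:
    return Counter(text.lower().split())

def score(query_vec: Counter, doc_vec: Counter) -> int:
    return sum(min(query_vec[t], doc_vec[t]) for t in query_vec)

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    qv = embed(query)
    return sorted(docs, key=lambda d: score(qv, embed(d)), reverse=True)[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer using only the context."

# Fictional fundamentals records, standardized into one sentence per chunk.
docs = [
    "XYZ FY2022 revenue 10.2B USD, net income 1.4B USD.",
    "ABC FY2022 revenue 6.8B USD, net income 0.9B USD.",
    "Sector-wide commodity prices rose sharply in 2022.",
]
print(build_prompt("What was XYZ revenue in FY2022?", docs))
```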

March 20, 2025 · 2 min · Research Team

SusGen-GPT: A Data-Centric LLM for Financial NLP and Sustainability Report Generation

SusGen-GPT: A Data-Centric LLM for Financial NLP and Sustainability Report Generation ArXiv ID: 2412.10906 “View on arXiv” Authors: Unknown Abstract The rapid growth of the financial sector and the rising focus on Environmental, Social, and Governance (ESG) considerations highlight the need for advanced NLP tools. However, open-source LLMs proficient in both finance and ESG domains remain scarce. To address this gap, we introduce SusGen-30K, a category-balanced dataset comprising seven financial NLP tasks and ESG report generation, and propose TCFD-Bench, a benchmark for evaluating sustainability report generation. Leveraging this dataset, we developed SusGen-GPT, a suite of models achieving state-of-the-art performance across six adapted and two off-the-shelf tasks, trailing GPT-4 by only 2% despite using 7-8B parameters compared to GPT-4’s 1,700B. Based on this, we propose the SusGen system, integrated with Retrieval-Augmented Generation (RAG), to assist in sustainability report generation. This work demonstrates the efficiency of our approach, advancing research in finance and ESG. ...

December 14, 2024 · 2 min · Research Team

RAG-IT: Retrieval-Augmented Instruction Tuning for Automated Financial Analysis – A Case Study for the Semiconductor Sector

RAG-IT: Retrieval-Augmented Instruction Tuning for Automated Financial Analysis – A Case Study for the Semiconductor Sector ArXiv ID: 2412.08179 “View on arXiv” Authors: Unknown Abstract Financial analysis relies heavily on the interpretation of earnings reports to assess company performance and guide decision-making. Traditional methods for generating such analyses require significant financial expertise and are often time-consuming. With the rapid advancement of Large Language Models (LLMs), domain-specific adaptations have emerged for financial tasks such as sentiment analysis and entity recognition. This paper introduces RAG-IT (Retrieval-Augmented Instruction Tuning), a novel framework designed to automate the generation of earnings report analysis through an LLM fine-tuned specifically for the financial domain. Our approach integrates retrieval augmentation with instruction-based fine-tuning to enhance factual accuracy, contextual relevance, and domain adaptability. We construct a sector-specific financial instruction dataset derived from semiconductor industry documents to guide the LLM’s adaptation to specialized financial reasoning. Using NVIDIA, AMD, and Broadcom as representative companies, our case study demonstrates that RAG-IT substantially improves a general-purpose open-source LLM and achieves performance comparable to commercial systems like GPT-3.5 on financial report generation tasks. This research highlights the potential of retrieval-augmented instruction tuning to streamline and elevate financial analysis automation, advancing the broader field of intelligent financial reporting. ...
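
One way to picture retrieval-augmented instruction tuning is as a data-construction step: each training record pairs an analyst-style instruction with retrieved report chunks as input and a target analysis as output. The JSONL layout and field names below are assumptions for illustration, not the paper's actual schema.

```python
# Hedged sketch of assembling one retrieval-augmented instruction-tuning record.
# The instruction/input/output fields and the example text are illustrative.
import json

def make_example(instruction: str, retrieved_chunks: list[str], target: str) -> str:
    record = {
        "instruction": instruction,
        "input": "\n\n".join(retrieved_chunks),  # retrieved earnings-report context
        "output": target,
    }
    return json.dumps(record)

example = make_example(
    "Summarize the data-center revenue trend in the latest quarter.",
    ["Q3 data-center revenue grew 41% year over year, driven by accelerator demand."],
    "Data-center revenue accelerated, up 41% YoY on strong accelerator demand.",
)
print(example)
```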

December 11, 2024 · 2 min · Research Team

Combining Financial Data and News Articles for Stock Price Movement Prediction Using Large Language Models

Combining Financial Data and News Articles for Stock Price Movement Prediction Using Large Language Models ArXiv ID: 2411.01368 “View on arXiv” Authors: Unknown Abstract Predicting financial markets and stock price movements requires analyzing a company’s performance, historic price movements, and industry-specific events, alongside the influence of human factors such as social media and press coverage. We assume that financial reports (such as income statements, balance sheets, and cash flow statements), historical price data, and recent news articles can collectively represent the aforementioned factors. We combine financial data in tabular format with textual news articles and employ pre-trained Large Language Models (LLMs) to predict market movements. Recent research in LLMs has demonstrated that they are able to perform both tabular and text classification tasks, making them our primary model to classify the multi-modal data. We utilize retrieval augmentation techniques to retrieve and attach relevant chunks of news articles to a company’s financial metrics and prompt the LLMs in zero-, two-, and four-shot settings. Our dataset contains news articles collected from different sources, historical stock prices, and financial report data for 20 companies with the highest trading volume across different industries in the stock market. We utilized recently released language models for our LLM-based classifier, including GPT-3 and 4 and LLaMA-2 and 3. We introduce an LLM-based classifier capable of performing classification tasks using a combination of tabular (structured) and textual (unstructured) data. Using this model, we predicted the movement of a given stock’s price in our dataset with weighted F1-scores of 58.5% and 59.1% and a Matthews Correlation Coefficient of 0.175 for the 3-month and 6-month periods, respectively. ...
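
The combination of tabular metrics and retrieved news can be pictured as a prompt-assembly step like the sketch below. The field names, label format, and zero-shot phrasing are illustrative assumptions; the paper's exact prompts and few-shot exemplars are not reproduced here.

```python
# Sketch of the multi-modal prompt construction: serialize the tabular financial
# metrics, append retrieved news snippets, and ask the LLM for an up/down label.
def serialize_metrics(metrics: dict) -> str:
    return "\n".join(f"{k}: {v}" for k, v in metrics.items())

def movement_prompt(ticker: str, metrics: dict, news_chunks: list[str], horizon: str) -> str:
    news = "\n".join(f"- {c}" for c in news_chunks)
    return (
        f"Company: {ticker}\n"
        f"Financial metrics:\n{serialize_metrics(metrics)}\n"
        f"Recent news:\n{news}\n"
        f"Will the stock price rise or fall over the next {horizon}? Answer 'up' or 'down'."
    )

# Fictional ticker and figures, used only to show the prompt layout.
prompt = movement_prompt(
    "XYZ",
    {"revenue_growth_yoy": "12%", "net_margin": "8.4%", "debt_to_equity": 0.6},
    ["XYZ announced a major buyback program.", "Sector-wide demand softened in Q2."],
    "3 months",
)
print(prompt)
```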

November 2, 2024 · 2 min · Research Team

MoA is All You Need: Building LLM Research Team using Mixture of Agents

MoA is All You Need: Building LLM Research Team using Mixture of Agents ArXiv ID: 2409.07487 “View on arXiv” Authors: Unknown Abstract Large Language Models (LLMs) research in the financial domain is particularly complex due to the sheer number of approaches proposed in literature. Retrieval-Augmented Generation (RAG) has emerged as one of the leading methods in the sector due to its inherent groundedness and data source variability. In this work, we introduce a RAG framework called Mixture of Agents (MoA) and demonstrate its viability as a practical, customizable, and highly effective approach for scaling RAG applications. MoA is essentially a layered network of individually customized small language models (Hoffmann et al., 2022) collaborating to answer questions and extract information. While there are many theoretical propositions for such an architecture and even a few libraries for generally applying the structure in practice, there are limited documented studies evaluating the potential of this framework considering real business constraints such as cost and speed. We find that the MoA framework, consisting of small language models (Hoffmann et al., 2022), produces higher quality and more grounded responses across various financial domains that are core to Vanguard’s business while simultaneously maintaining low costs. ...
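
The layered-agents structure can be sketched compactly: every agent in a layer sees the question plus the previous layer's answers, and a final aggregator composes the result. The stub agents below are placeholders for the individually customized small language models the abstract describes; no specific model API is implied.

```python
# Toy sketch of a layered mixture-of-agents flow. Each "agent" is a callable
# standing in for a small language model; the aggregator composes the final answer.
from typing import Callable

def run_moa(question: str,
            layers: list[list[Callable[[str], str]]],
            aggregator: Callable[[str], str]) -> str:
    context = question
    for layer in layers:
        answers = [agent(context) for agent in layer]
        # Feed this layer's answers forward as added context for the next layer.
        context = question + "\n\nPrior answers:\n" + "\n".join(answers)
    return aggregator(context)

# Stub "agents" for illustration only; in practice each would wrap a small LM call.
retriever_agent = lambda ctx: "Relevant filing: the expense ratio is disclosed in the prospectus."
summarizer_agent = lambda ctx: "The fund's expense ratio is stated as 0.04%."
aggregator = lambda ctx: "Answer: 0.04%, per the prospectus disclosure."

print(run_moa("What is the fund's expense ratio?", [[retriever_agent, summarizer_agent]], aggregator))
```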

September 4, 2024 · 2 min · Research Team

HybridRAG: Integrating Knowledge Graphs and Vector Retrieval Augmented Generation for Efficient Information Extraction

HybridRAG: Integrating Knowledge Graphs and Vector Retrieval Augmented Generation for Efficient Information Extraction ArXiv ID: 2408.04948 “View on arXiv” Authors: Unknown Abstract Extraction and interpretation of intricate information from unstructured text data arising in financial applications, such as earnings call transcripts, present substantial challenges to large language models (LLMs), even with current best-practice Retrieval Augmented Generation (RAG) (referred to as VectorRAG techniques, which use vector databases for information retrieval), due to domain-specific terminology and the complex formats of the documents. We introduce HybridRAG, a novel approach that combines Knowledge Graph (KG)-based RAG techniques (called GraphRAG) with VectorRAG techniques to enhance question-answering (Q&A) systems for information extraction from financial documents, and show that it is capable of generating accurate and contextually relevant answers. In experiments on a set of financial earnings call transcripts, which come in Q&A format and hence provide a natural set of ground-truth Q&A pairs, we show that HybridRAG, which retrieves context from both the vector database and the KG, outperforms both traditional VectorRAG and GraphRAG individually in terms of retrieval accuracy and answer generation when evaluated at both the retrieval and generation stages. The proposed technique has applications beyond the financial domain ...
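
A minimal sketch of the hybrid idea, retrieving from both a vector index and a knowledge graph and merging the contexts before generation, is shown below. Both retrievers are keyword-matching stand-ins for the paper's actual embedding model and KG construction, and the merge step is an assumption.

```python
# Minimal HybridRAG-style sketch: pull context from a "vector" retriever and a
# KG retriever, then hand the de-duplicated union to the generator.
def vector_retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    q_terms = set(query.lower().split())
    ranked = sorted(chunks, key=lambda c: len(q_terms & set(c.lower().split())), reverse=True)
    return ranked[:k]

def kg_retrieve(query: str, triples: list[tuple]) -> list[str]:
    q = query.lower()
    return [f"{h} {r} {t}" for h, r, t in triples if h.lower() in q or t.lower() in q]

def hybrid_context(query: str, chunks: list[str], triples: list[tuple]) -> str:
    merged = vector_retrieve(query, chunks) + kg_retrieve(query, triples)
    return "\n".join(dict.fromkeys(merged))  # de-duplicate while keeping order

chunks = ["Management noted margin pressure from input costs.",
          "Guidance was raised for the full year."]
triples = [("AcmeCorp", "guidance_change", "raised"), ("AcmeCorp", "ceo", "J. Doe")]
print(hybrid_context("Did AcmeCorp change its guidance?", chunks, triples))
```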

August 9, 2024 · 2 min · Research Team

ECC Analyzer: Extract Trading Signal from Earnings Conference Calls using Large Language Model for Stock Performance Prediction

ECC Analyzer: Extract Trading Signal from Earnings Conference Calls using Large Language Model for Stock Performance Prediction ArXiv ID: 2404.18470 “View on arXiv” Authors: Unknown Abstract In the realm of financial analytics, leveraging unstructured data, such as earnings conference calls (ECCs), to forecast stock volatility is a critical challenge that has attracted both academics and investors. While previous studies have used multimodal deep learning-based models to obtain a general view of ECCs for volatility prediction, they often fail to capture detailed, complex information. Our research introduces a novel framework, “ECC Analyzer”, which utilizes large language models (LLMs) to extract richer, more predictive content from ECCs to improve prediction performance. We use pre-trained large models to extract textual and audio features from ECCs and implement a hierarchical information extraction strategy to extract more fine-grained information. This strategy first extracts paragraph-level general information by summarizing the text and then extracts fine-grained focus sentences using Retrieval-Augmented Generation (RAG). These features are then fused through multimodal feature fusion to perform volatility prediction. Experimental results demonstrate that our model outperforms traditional analytical benchmarks, confirming the effectiveness of advanced LLM techniques in financial analysis. ...
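
The hierarchical extraction strategy can be approximated in two passes, as sketched below: a coarse paragraph-level summary pass followed by a query-focused pass that keeps only sentences matching a topic of interest. The keyword matching is a simple stand-in for the paper's LLM summarization and RAG retrieval steps, and the sample transcript lines are invented for illustration.

```python
# Illustrative two-stage extraction: paragraph-level summaries first, then
# topic-focused sentences. Both stages are toy stand-ins for LLM/RAG components.
def paragraph_summaries(paragraphs: list[str]) -> list[str]:
    # Coarse pass: keep the first sentence of each paragraph as its "summary".
    return [p.split(". ")[0].rstrip(".") + "." for p in paragraphs]

def focus_sentences(paragraphs: list[str], topic_terms: set[str]) -> list[str]:
    hits = []
    for p in paragraphs:
        for sent in p.split(". "):
            if topic_terms & set(sent.lower().split()):
                hits.append(sent.strip().rstrip(".") + ".")
    return hits

call = ["Revenue grew 9% this quarter. We expect volatility in input costs next year.",
        "Capital expenditure will stay flat. Guidance is unchanged."]
print(paragraph_summaries(call))
print(focus_sentences(call, {"volatility", "guidance"}))
```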

April 29, 2024 · 2 min · Research Team