
A three-step machine learning approach to predict market bubbles with financial news

ArXiv ID: 2510.16636 · View on arXiv · Authors: Abraham Atsiwo

Abstract: This study presents a three-step machine learning framework to predict bubbles in the S&P 500 stock market by combining financial news sentiment with macroeconomic indicators. Building on traditional econometric approaches, the framework predicts bubble formation by integrating textual and quantitative data sources. In the first step, bubble periods in the S&P 500 index are identified using a right-tailed unit root test, a widely recognized real-time bubble detection method. The second step extracts sentiment features from large-scale financial news articles using natural language processing (NLP) techniques, capturing investors’ expectations and behavioral patterns. In the final step, ensemble learning methods are applied to predict bubble occurrences from the sentiment-based and macroeconomic predictors. Model performance is evaluated through k-fold cross-validation and compared against benchmark machine learning algorithms. Empirical results indicate that the proposed three-step ensemble approach significantly improves predictive accuracy and robustness, providing valuable early-warning insights for investors, regulators, and policymakers in mitigating systemic financial risks. ...
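Read as a pipeline, the three steps translate directly into code. The following is a minimal sketch under stated assumptions: synthetic price, sentiment, and macro series; an expanding-window ADF statistic with an illustrative critical value standing in for the paper's full right-tailed unit root procedure; and a random forest with 5-fold cross-validation as one possible ensemble, not necessarily the authors' exact setup.

```python
# Minimal sketch of the three-step pipeline, using synthetic data throughout.
# Assumptions (not from the paper): the expanding-window ADF statistic with a
# fixed illustrative critical value stands in for the full right-tailed unit
# root procedure, and random numbers stand in for the news-sentiment and
# macroeconomic features.
import numpy as np
import pandas as pd
from statsmodels.tsa.stattools import adfuller
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import KFold, cross_val_score

rng = np.random.default_rng(0)
n = 600
ret = rng.normal(0.05, 1.0, n)
ret[300:420] += np.linspace(0.0, 2.0, 120)   # toy explosive run-up ("bubble")
price = pd.Series(100.0 + np.cumsum(ret))

# Step 1: label bubble periods with a recursive right-tailed ADF statistic
# (expanding window; a bubble is flagged when the statistic exceeds `crit`).
min_window, crit = 100, 1.0                  # `crit` is illustrative only
labels = np.zeros(n, dtype=int)
for t in range(min_window, n):
    stat = adfuller(price[:t], maxlag=1, regression="c", autolag=None)[0]
    labels[t] = int(stat > crit)

# Step 2: sentiment and macro features (random stand-ins; the paper builds
# sentiment scores from financial news with NLP).
sentiment = rng.normal(0.0, 1.0, (n, 2))
macro = rng.normal(0.0, 1.0, (n, 3))
X = np.hstack([sentiment, macro])[min_window:]
y = labels[min_window:]

# Step 3: ensemble learner evaluated with k-fold cross-validation.
clf = RandomForestClassifier(n_estimators=300, random_state=0)
cv = KFold(n_splits=5, shuffle=True, random_state=0)
print("CV accuracy:", cross_val_score(clf, X, y, cv=cv).mean())
```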

October 18, 2025 · 2 min · Research Team

AI-Powered (Finance) Scholarship

ArXiv ID: ssrn-5103553 · View on arXiv · Authors: Unknown

Abstract: This paper describes a process for automatically generating academic finance papers using large language models (LLMs). It demonstrates the process’ efficacy by ...

Keywords: Generative AI, Large Language Models (LLMs), Automated Research, Financial Modeling, NLP, Technology

Complexity vs Empirical Score
Math Complexity: 1.0/10
Empirical Rigor: 0.5/10
Quadrant: Philosophers
Why: The paper focuses on the process of using LLMs to generate academic content, lacking advanced mathematical derivations, while showing minimal evidence of backtesting or implementation-heavy data analysis.

flowchart TD
  A["Research Goal<br>Automate Finance Paper Generation"] --> B["Inputs<br>Financial Data + LLM Prompts"]
  B --> C{"Methodology<br>Multi-Step Chain-of-Thought"}
  C --> D["Computational Process<br>LLM Synthesis & Modeling"]
  D --> E{"Evaluation<br>Human Expert Review"}
  E --> F["Outcomes<br>High-Quality Finance Papers"]
  E --> G["Outcomes<br>Validation of LLM Efficacy"]
  F --> H["Final Result<br>AI-Powered Scholarship Pipeline"]
  G --> H
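The flowchart reads as a chained prompting loop: each stage conditions on the previous stage's output before the draft goes to human expert review. The sketch below is a hypothetical illustration rather than the paper's code; `call_llm` is a placeholder for whatever LLM client is actually used, and the stage prompts are invented for demonstration.

```python
# Hypothetical sketch of a multi-step chain-of-thought drafting pipeline.
# `call_llm` is a placeholder stand-in, not a real client; the stage prompts
# are invented for illustration.
from typing import Callable

def call_llm(prompt: str) -> str:
    # Stand-in for an actual model call; echoes a tag so the chaining is visible.
    return f"[model output for: {prompt[:60]}...]"

STAGES = [
    "Propose a research question from this data summary: {context}",
    "Outline the paper (sections and hypotheses) given: {context}",
    "Draft the methodology and empirical design given: {context}",
    "Write the full draft for human expert review given: {context}",
]

def generate_paper(data_summary: str, llm: Callable[[str], str] = call_llm) -> str:
    # Each stage conditions on the previous stage's output, mirroring the
    # Inputs -> Methodology -> Synthesis -> Evaluation flow in the diagram.
    context = data_summary
    for stage in STAGES:
        context = llm(stage.format(context=context))
    return context  # final draft, handed off for human expert review

if __name__ == "__main__":
    print(generate_paper("S&P 500 monthly returns and valuation ratios, 1990-2024"))
```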

January 22, 2025 · 1 min · Research Team

MoA is All You Need: Building LLM Research Team using Mixture of Agents

ArXiv ID: 2409.07487 · View on arXiv · Authors: Unknown

Abstract: Large Language Models (LLMs) research in the financial domain is particularly complex due to the sheer number of approaches proposed in the literature. Retrieval-Augmented Generation (RAG) has emerged as one of the leading methods in the sector due to its inherent groundedness and data source variability. In this work, we introduce a RAG framework called Mixture of Agents (MoA) and demonstrate its viability as a practical, customizable, and highly effective approach for scaling RAG applications. MoA is essentially a layered network of individually customized small language models (Hoffmann et al., 2022) collaborating to answer questions and extract information. While there are many theoretical propositions for such an architecture and even a few libraries for generally applying the structure in practice, there are limited documented studies evaluating the potential of this framework considering real business constraints such as cost and speed. We find that the MoA framework, consisting of small language models (Hoffmann et al., 2022), produces higher-quality and more grounded responses across various financial domains that are core to Vanguard’s business while simultaneously maintaining low costs. ...
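As a rough illustration of the layered structure described above, the sketch below wires several proposer agents into an aggregator layer. It is a hypothetical skeleton, not Vanguard's implementation: `make_agent` and `aggregator` are invented placeholders that a real system would back with small language models and a retrieval step over the document store.

```python
# Minimal sketch of a layered Mixture-of-Agents (MoA) setup with placeholder
# agents; a real deployment would back each agent with a small language model
# plus retrieval, and the aggregator with another model call.
from typing import Callable, List

Agent = Callable[[str], str]

def make_agent(name: str) -> Agent:
    # Stand-in proposer: a real agent would call a small LM on the question
    # plus its retrieved, domain-specific context.
    return lambda question: f"{name}'s grounded answer to: {question}"

def aggregator(question: str, candidates: List[str]) -> str:
    # Stand-in aggregator layer: a real one would prompt an LM to reconcile
    # and synthesize the layer-1 answers into a single grounded response.
    joined = " | ".join(candidates)
    return f"Synthesis for '{question}' from {len(candidates)} agents: {joined}"

def moa_answer(question: str, layer_one: List[Agent]) -> str:
    # Layer 1: independent proposals from individually customized agents.
    proposals = [agent(question) for agent in layer_one]
    # Layer 2: the aggregator combines the proposals into one answer.
    return aggregator(question, proposals)

if __name__ == "__main__":
    agents = [make_agent(n) for n in ("funds-agent", "retirement-agent", "markets-agent")]
    print(moa_answer("What drives expense ratios in index funds?", agents))
```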

September 4, 2024 · 2 min · Research Team