FinVision: A Multi-Agent Framework for Stock Market Prediction

ArXiv ID: 2411.08899

Authors: Unknown

Abstract

Financial trading has been a challenging task, as it requires the integration of vast amounts of data from various modalities. Traditional deep learning and reinforcement learning methods require large amounts of training data and often involve encoding various data types into numerical formats for model input, which limits the explainability of model behavior. Recently, LLM-based agents have demonstrated remarkable advancements in handling multi-modal data, enabling them to execute complex, multi-step decision-making tasks while providing insights into their thought processes. This research introduces a multi-modal multi-agent system designed specifically for financial trading tasks. Our framework employs a team of specialized LLM-based agents, each adept at processing and interpreting various forms of financial data, such as textual news reports, candlestick charts, and trading signal charts. A key feature of our approach is the integration of a reflection module, which conducts analyses of historical trading signals and their outcomes. This reflective process is instrumental in enhancing the decision-making capabilities of the system for future trading scenarios. Furthermore, the ablation studies indicate that the visual reflection module plays a crucial role in enhancing the decision-making capabilities of our framework.

Keywords: large language models, multi-agent systems, reinforcement learning, multi-modal data, algorithmic trading

Complexity vs Empirical Score

  • Math Complexity: 1.5/10
  • Empirical Rigor: 5.0/10
  • Quadrant: Street Traders
  • Why: The paper uses straightforward LLM prompting and agent workflows without advanced mathematical derivations, but it includes a specific backtest over seven months on three stocks, ablation studies, and performance metrics compared to baselines.

  flowchart TD
    A["Research Goal: Explainable Multi-Modal Stock Market Prediction"] --> B["Data Input: Text News, Candlestick & Trading Signal Charts"]
    B --> C["Framework: FinVision Multi-Agent System"]
    C --> D["Core Process: Multi-Modal LLM Agents + Reflection Module"]
    D --> E["Computational Process: Historical Signal Analysis & Visual Reflection"]
    E --> F["Key Finding: Visual Reflection significantly enhances decision-making"]
    F --> G["Outcome: Highly Explainable & Adaptive Algorithmic Trading"]
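
The flow from specialized agents through a reflection step to a final trading decision can be illustrated with a toy sketch. This is not the paper's implementation: the real agents are LLM-based and read news text and chart images, whereas the agents here are hypothetical numeric stubs. The names `ReflectionModule`, `decide`, and `EXAMPLE_AGENTS`, the averaging rule, the 0.2 threshold, and the 0.5 hit-rate cutoff are all assumptions chosen only to make the idea concrete.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical stand-in for an LLM-based agent: maps an observation to a
# view in [-1, 1] (negative = bearish, positive = bullish).
Agent = Callable[[dict], float]


@dataclass
class ReflectionModule:
    """Stores past (signal, realized return) pairs and summarizes them."""
    history: List[Tuple[str, float]] = field(default_factory=list)

    def record(self, signal: str, realized_return: float) -> None:
        self.history.append((signal, realized_return))

    def hit_rate(self, signal: str) -> Optional[float]:
        """Fraction of past `signal` calls that paid off, or None if unknown."""
        if signal == "buy":
            wins = [r > 0 for s, r in self.history if s == "buy"]
        elif signal == "sell":
            wins = [r < 0 for s, r in self.history if s == "sell"]
        else:
            return None
        return sum(wins) / len(wins) if wins else None


def decide(obs: dict, agents: Dict[str, Agent],
           reflection: ReflectionModule, threshold: float = 0.2) -> str:
    """Average the agents' views, then let the reflection step veto a signal
    whose historical hit rate is below 0.5 (an assumed rule)."""
    views = [agent(obs) for agent in agents.values()]
    score = sum(views) / len(views)
    if score > threshold:
        signal = "buy"
    elif score < -threshold:
        signal = "sell"
    else:
        return "hold"
    rate = reflection.hit_rate(signal)
    if rate is not None and rate < 0.5:
        return "hold"
    return signal


# Toy agents: one reads a precomputed sentiment score, one reads a candle.
EXAMPLE_AGENTS: Dict[str, Agent] = {
    "news": lambda obs: obs["news_sentiment"],
    "candlestick": lambda obs: 1.0 if obs["close"] > obs["open"] else -1.0,
}

if __name__ == "__main__":
    obs = {"news_sentiment": 0.6, "open": 100.0, "close": 103.0}
    reflection = ReflectionModule()
    print(decide(obs, EXAMPLE_AGENTS, reflection))
```

In this sketch the reflection step only suppresses signals that performed poorly in the recorded history. The system described in the abstract instead feeds its analysis of historical signals and outcomes back to LLM agents, so the veto rule should be read as a simplification.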