MCI-GRU: Stock Prediction Model Based on Multi-Head Cross-Attention and Improved GRU

ArXiv ID: 2410.20679

Authors: Unknown

Abstract

As financial markets grow increasingly complex in the big data era, accurate stock prediction has become more critical. Traditional time series models, such as GRUs, have been widely used but often struggle to capture the intricate nonlinear dynamics of markets, particularly in the flexible selection and effective utilization of key historical information. Recently, methods like Graph Neural Networks and Reinforcement Learning have shown promise in stock prediction, but they require high data quality and quantity and tend to become unstable when the data are sparse or noisy. Moreover, the training and inference processes for these models are typically complex and computationally expensive, limiting their broad deployment in practical applications. Existing approaches also generally struggle to capture unobservable latent market states, such as market sentiment and expectations, microstructural factors, and participant behavior patterns. This leads to an inadequate understanding of market dynamics, which in turn reduces prediction accuracy. To address these challenges, this paper proposes a stock prediction model, MCI-GRU, based on a multi-head cross-attention mechanism and an improved GRU. First, we enhance the GRU model by replacing the reset gate with an attention mechanism, thereby increasing the model’s flexibility in selecting and utilizing historical information. Second, we design a multi-head cross-attention mechanism for learning unobservable latent market state representations, which are further enriched through interactions with both temporal features and cross-sectional features. Finally, extensive experiments on four main stock markets show that the proposed method outperforms SOTA techniques across multiple metrics. Additionally, its successful application in real-world fund management operations confirms its effectiveness and practicality.
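To make the first contribution concrete, here is a minimal sketch of a GRU step in which the reset gate is replaced by attention. The abstract does not give the equations, so everything specific here is an assumption: the attention is taken to be a scaled dot-product softmax over all past hidden states, with a query projected from the current input, and the parameter names (`Wz`, `Uz`, `Wq`, `Wh`, `Uh`) and function names are hypothetical. It uses only the Python standard library and is not the authors' implementation.

```python
import math
import random

def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def init_params(in_dim, hidden_dim, seed=0):
    """Small random weights; all parameter names are hypothetical."""
    rng = random.Random(seed)
    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]
    return {
        "Wz": mat(hidden_dim, in_dim), "Uz": mat(hidden_dim, hidden_dim),
        "Wq": mat(hidden_dim, in_dim),
        "Wh": mat(hidden_dim, in_dim), "Uh": mat(hidden_dim, hidden_dim),
    }

def attention_gru_step(x, history, p):
    """One step. `history` holds past hidden states, most recent last."""
    h_prev = history[-1]
    d = len(h_prev)
    # Update gate, unchanged from a standard GRU.
    z = [sigmoid(a + b)
         for a, b in zip(matvec(p["Wz"], x), matvec(p["Uz"], h_prev))]
    # Assumed replacement for the reset gate: attend over past hidden states.
    q = matvec(p["Wq"], x)
    scores = [sum(qi * hi for qi, hi in zip(q, h)) / math.sqrt(d)
              for h in history]
    weights = softmax(scores)
    context = [sum(w * h[i] for w, h in zip(weights, history))
               for i in range(d)]
    # Candidate state uses the attention context where a standard GRU
    # would use (reset gate * previous hidden state).
    cand = [math.tanh(a + b)
            for a, b in zip(matvec(p["Wh"], x), matvec(p["Uh"], context))]
    h_new = [(1.0 - zi) * hp + zi * ci
             for zi, hp, ci in zip(z, h_prev, cand)]
    return h_new, weights

def run_sequence(xs, hidden_dim, seed=0):
    """Run the cell over a sequence of input vectors; returns all hidden states."""
    p = init_params(len(xs[0]), hidden_dim, seed)
    history = [[0.0] * hidden_dim]  # initial hidden state
    for x in xs:
        h_new, _ = attention_gru_step(x, history, p)
        history.append(h_new)
    return history[1:]
```

Under this reading, the attention weights decide how much each past state contributes to the candidate, which is one way to obtain the "flexible selection" of historical information the abstract describes. A call such as `run_sequence([[0.1, -0.2, 0.3], [0.0, 0.5, -0.1]], hidden_dim=4)` returns one hidden vector per time step.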

Keywords: MCI-GRU, multi-head cross-attention, improved GRU, latent market states, stock prediction, Equities (Stocks)

Complexity vs Empirical Score

  • Math Complexity: 6.0/10
  • Empirical Rigor: 8.0/10
  • Quadrant: Holy Grail
  • Why: The paper introduces novel architectural modifications (an attention-based GRU gate and multi-head cross-attention), evaluates them on specific datasets (CSI 300/500, NASDAQ 100, S&P 500) with multiple metrics, and reports a successful real-world fund management deployment, which together demonstrate high empirical rigor. The mathematical complexity is moderate to high: the model integrates GATs and custom attention mechanisms, but without overly dense derivations.
  flowchart TD
    A["Research Goal:<br>Improve Stock Prediction Accuracy"] --> B{"Data Input"}
    B --> C["Four Main Stock Markets<br>Time Series & Cross-Sectional Data"]
    C --> D["Step 1: Improved GRU<br>Reset Gate replaced with Attention"]
    D --> E["Step 2: Multi-Head Cross-Attention<br>Learn Latent Market States"]
    E --> F["Model Output:<br>MCI-GRU Architecture"]
    F --> G["Key Outcomes"]
    G --> H["SOTA Performance on<br>4 Stock Markets"]
    G --> I["Successful Real-World<br>Fund Management Application"]
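
Step 2 of the flowchart, multi-head cross-attention against latent market states, can be sketched as follows. This is an illustration under stated assumptions, not the paper's architecture: per-stock feature vectors are assumed to act as queries, a small set of latent state vectors (which would be learnable in a real model) act as keys and values, and the output projection usually found in multi-head attention is omitted for brevity. The function and parameter names are hypothetical, and only the Python standard library is used.

```python
import math
import random

def matmul(A, B):
    """Multiply matrices stored as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def rand_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

def multi_head_cross_attention(features, latents, Wq, Wk, Wv, num_heads):
    """features: N x d stock features (queries).
    latents: M x d latent market-state vectors (keys and values, assumed).
    Returns an N x d matrix of latent-state-enriched features."""
    d = len(Wq[0])
    assert d % num_heads == 0, "model width must divide evenly across heads"
    head_dim = d // num_heads
    Q = matmul(features, Wq)
    K = matmul(latents, Wk)
    V = matmul(latents, Wv)
    out = [[] for _ in features]
    for h in range(num_heads):
        lo, hi = h * head_dim, (h + 1) * head_dim  # this head's slice of columns
        for i, q_row in enumerate(Q):
            q = q_row[lo:hi]
            scores = [sum(a * b for a, b in zip(q, k_row[lo:hi])) / math.sqrt(head_dim)
                      for k_row in K]
            w = softmax(scores)  # weights over the M latent states
            head = [sum(wj * V[j][c] for j, wj in enumerate(w))
                    for c in range(lo, hi)]
            out[i].extend(head)  # concatenate heads along the feature axis
    return out

# Toy usage: 5 stocks, 3 latent states, width 8, 2 heads (all values arbitrary).
rng = random.Random(0)
d, num_heads = 8, 2
stock_features = rand_matrix(5, d, rng, scale=1.0)
latent_states = rand_matrix(3, d, rng, scale=1.0)
Wq, Wk, Wv = (rand_matrix(d, d, rng) for _ in range(3))
enriched = multi_head_cross_attention(
    stock_features, latent_states, Wq, Wk, Wv, num_heads)
```

Because each output row is a weighted mixture of the latent vectors, every stock ends up with a representation that reflects the shared latent states. In the setting the abstract describes, such a block would presumably be applied once to temporal features and once to cross-sectional features.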