Higher Order Transformers: Enhancing Stock Movement Prediction On Multimodal Time-Series Data

ArXiv ID: 2412.10540 (https://arxiv.org/abs/2412.10540)

Authors: Unknown

Abstract

In this paper, we tackle the challenge of predicting stock movements in financial markets by introducing Higher Order Transformers, a novel architecture designed for processing multivariate time-series data. We extend the self-attention mechanism and the transformer architecture to a higher order, effectively capturing complex market dynamics across time and variables. To manage computational complexity, we propose a low-rank approximation of the potentially large attention tensor using tensor decomposition and employ kernel attention, reducing complexity to linear with respect to the data size. Additionally, we present an encoder-decoder model that integrates technical and fundamental analysis, utilizing multimodal signals from historical prices and related tweets. Our experiments on the Stocknet dataset demonstrate the effectiveness of our method, highlighting its potential for enhancing stock movement prediction in financial markets.
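To make the abstract's linear-complexity claim concrete: kernel attention rewrites softmax attention through a feature map φ so that keys and values are summarized once and the N × N attention matrix is never formed. The sketch below is a minimal NumPy illustration of this general technique, not the paper's implementation; the elu+1 feature map (from Katharopoulos et al., 2020) is an assumption, since the summary does not specify the paper's exact kernel.

```python
import numpy as np

def feature_map(x):
    # Positive feature map phi(x) = elu(x) + 1 (Katharopoulos et al., 2020).
    # The paper's exact kernel choice is an assumption here.
    return np.where(x > 0, x + 1.0, np.exp(x))

def linear_kernel_attention(Q, K, V):
    """Kernelized attention in O(N * d * d_v) instead of O(N^2 * d).

    Q, K: (N, d) queries/keys; V: (N, d_v) values.
    Softmax attention is approximated by phi(Q) @ (phi(K).T @ V),
    normalized row-wise, so the N x N attention matrix is never built.
    """
    Qf, Kf = feature_map(Q), feature_map(K)  # (N, d)
    KV = Kf.T @ V                            # (d, d_v): one pass over all keys/values
    Z = Qf @ Kf.sum(axis=0)                  # (N,): row-wise normalization terms
    return (Qf @ KV) / Z[:, None]

# Toy usage: 1024 time steps, 16-dim heads.
rng = np.random.default_rng(0)
N, d = 1024, 16
Q, K, V = rng.normal(size=(3, N, d))
out = linear_kernel_attention(Q, K, V)
print(out.shape)  # (1024, 16)
```

The key step is computing `Kf.T @ V` once, which makes the total cost linear in the sequence length N rather than quadratic.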

Keywords: Higher Order Transformers, Tensor Decomposition, Multivariate Time Series, Attention Mechanism, Multimodal Learning, Equities
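The abstract's encoder-decoder fusion of technical and fundamental signals can be pictured as two modality encoders joined by cross-attention. The PyTorch sketch below is one plausible reading under stated assumptions, not the paper's architecture: the feature dimensions (5-dim OHLCV prices, 768-dim pretrained tweet embeddings), layer counts, and the mean-pooled binary up/down head are all illustrative.

```python
import torch
import torch.nn as nn

class MultimodalMovementClassifier(nn.Module):
    """Hedged sketch of an encoder-decoder multimodal fusion: prices and
    tweet embeddings are encoded separately, then the decoder side
    cross-attends from price tokens to tweet tokens before a binary
    up/down classification head. All sizes are illustrative assumptions."""

    def __init__(self, d=64, heads=4):
        super().__init__()
        self.price_proj = nn.Linear(5, d)    # OHLCV features per day (assumed)
        self.tweet_proj = nn.Linear(768, d)  # pretrained tweet embeddings (assumed)
        self.price_enc = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d, heads, batch_first=True), num_layers=2)
        self.tweet_enc = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d, heads, batch_first=True), num_layers=2)
        self.cross = nn.MultiheadAttention(d, heads, batch_first=True)
        self.head = nn.Linear(d, 2)

    def forward(self, prices, tweets):
        # prices: (B, T, 5); tweets: (B, M, 768)
        p = self.price_enc(self.price_proj(prices))
        t = self.tweet_enc(self.tweet_proj(tweets))
        fused, _ = self.cross(p, t, t)        # queries = prices, keys/values = tweets
        return self.head(fused.mean(dim=1))   # pooled logits for down/up

model = MultimodalMovementClassifier()
logits = model(torch.randn(8, 20, 5), torch.randn(8, 30, 768))
print(logits.shape)  # torch.Size([8, 2])
```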

Complexity vs Empirical Score

  • Math Complexity: 8.5/10
  • Empirical Rigor: 5.5/10
  • Quadrant: Holy Grail
  • Why: The paper introduces a novel, mathematically involved higher-order transformer architecture built on tensor decomposition and kernel attention, with rigorous theoretical formulation. Empirical validation is limited to a single dataset (Stocknet) with multimodal inputs (prices and tweets), and the summary gives no details on code availability, backtesting setup, or statistical significance, so the work is more research-oriented than immediately backtest-ready.

```mermaid
flowchart TD
  A["Research Goal: Enhance Stock Movement Prediction<br>on Multimodal Time-Series Data"] --> B["Data/Inputs"]
  B --> B1["Historical Prices<br>Technical Analysis"]
  B --> B2["Related Tweets<br>Fundamental Analysis"]
  B1 & B2 --> C["Methodology: Higher Order Transformers"]
  C --> D["Computational Processes"]
  D --> D1["Extend Self-Attention to Higher Order"]
  D --> D2["Low-Rank Tensor Decomposition<br>+ Kernel Attention<br>Complexity: Linear O(N)"]
  D1 & D2 --> E["Key Findings/Outcomes"]
  E --> E1["Effective Capture of Complex<br>Market Dynamics"]
  E --> E2["Enhanced Prediction Accuracy<br>on Stocknet Dataset"]
```
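The D1/D2 nodes above compress the paper's central device: a higher-order attention tensor spanning both the time and variable modes, made tractable by a low-rank decomposition. One simple instance of such a rank constraint is factorized (axial) attention, attending along each mode in turn so the full (T·V) × (T·V) matrix is never built. The NumPy sketch below illustrates that reading; the sequential rank-1 factorization and all parameter shapes are assumptions, not the paper's exact decomposition.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def axis_attention(X, Wq, Wk, Wv, axis):
    """Standard softmax attention applied along one axis of X.

    X: (T, V, d) tensor of T time steps, V variables, d channels.
    axis=0 attends across time for each variable; axis=1 attends
    across variables at each time step.
    """
    Xm = np.moveaxis(X, axis, -2)            # (..., L, d), L = size of attended axis
    Q, K, Vv = Xm @ Wq, Xm @ Wk, Xm @ Wv
    A = softmax(Q @ K.swapaxes(-1, -2) / np.sqrt(Q.shape[-1]), axis=-1)
    return np.moveaxis(A @ Vv, -2, axis)

def factorized_higher_order_attention(X, params_time, params_var):
    # Rank-1, CP-style factorization: the joint (time x variable) attention
    # tensor is replaced by sequential per-mode attentions, avoiding the
    # full T*V x T*V attention matrix. Parameter names are illustrative.
    X = axis_attention(X, *params_time, axis=0)
    return axis_attention(X, *params_var, axis=1)

# Toy usage: 32 time steps, 8 variables, 16 channels.
rng = np.random.default_rng(0)
T, V, d = 32, 8, 16
X = rng.normal(size=(T, V, d))
W_time = rng.normal(size=(3, d, d)) * d**-0.5  # Wq, Wk, Wv for the time mode
W_var = rng.normal(size=(3, d, d)) * d**-0.5   # Wq, Wk, Wv for the variable mode
out = factorized_higher_order_attention(X, W_time, W_var)
print(out.shape)  # (32, 8, 16)
```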