TCGPN: Temporal-Correlation Graph Pre-trained Network for Stock Forecasting

ArXiv ID: 2407.18519

Authors: Unknown

Abstract

Recently, incorporating both temporal features and the correlations across time series has become an effective approach to time series prediction. Spatio-Temporal Graph Neural Networks (STGNNs) perform well on many temporal-correlation forecasting problems. However, when applied to tasks lacking periodicity, such as stock data prediction, the effectiveness and robustness of STGNNs are unsatisfactory. STGNNs are also limited by memory consumption, so they cannot handle problems with a large number of nodes. In this paper, we propose a novel approach called the Temporal-Correlation Graph Pre-trained Network (TCGPN) to address these limitations. TCGPN uses a temporal-correlation fusion encoder to obtain a mixed representation, together with a pre-training method built on carefully designed temporal and correlation pre-training tasks. The entire structure is independent of the number and order of nodes, so better results can be obtained through various data enhancements. Memory consumption during training can also be significantly reduced through multiple sampling. Experiments are conducted on the real stock market datasets CSI300 and CSI500, which exhibit minimal periodicity. We fine-tune a simple MLP on downstream tasks and achieve state-of-the-art results, validating the capability to capture more robust temporal correlation patterns.
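The abstract's memory claim rests on sampling nodes instead of processing the whole graph at once. The following is a minimal sketch of that idea, not the paper's implementation: it draws a random subset of nodes and builds the pairwise correlation matrix only for that subset. The names `pearson`, `sample_subgraph`, the synthetic `panel`, and the subset size of 32 are all assumptions made for illustration.

```python
import random
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant series: correlation is undefined, use 0 here
    return cov / (var_x * var_y) ** 0.5


def sample_subgraph(series_by_node, k, rng):
    """Pick k nodes at random and build their k x k correlation matrix."""
    nodes = rng.sample(sorted(series_by_node), k)
    corr = [
        [pearson(series_by_node[a], series_by_node[b]) for b in nodes]
        for a in nodes
    ]
    return nodes, corr


rng = random.Random(0)

# Hypothetical toy panel: 300 synthetic tickers, 60 time steps each.
panel = {
    f"S{i:03d}": [rng.gauss(0.0, 1.0) for _ in range(60)]
    for i in range(300)
}

nodes, corr = sample_subgraph(panel, k=32, rng=rng)

# Pairwise storage grows with the square of the node count.
full_entries = len(panel) ** 2      # 300 * 300 = 90000
sampled_entries = len(nodes) ** 2   # 32 * 32 = 1024
```

Each training step in this sketch holds 1,024 pairwise entries instead of 90,000. Drawing a fresh subset at every step lets all nodes be seen over time, and this only works when the model does not depend on a fixed node count.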

Keywords: Spatio-Temporal Graph Neural Networks (STGNNs), Pre-training, Stock Prediction, Graph Theory, Temporal Correlation, Equities

Complexity vs Empirical Score

  • Math Complexity: 7.5/10
  • Empirical Rigor: 8.0/10
  • Quadrant: Holy Grail
  • Why: The paper introduces a novel Temporal-Correlation Fusion Encoder and pre-training tasks with mathematical formulations, indicating advanced ML theory. It is heavily backed by experiments on real stock datasets (CSI300, CSI500) with SOTA results, ablation studies, and implementation details like memory optimization, making it empirically rigorous.

```mermaid
flowchart TD
    A["Research Goal: Improve Stock Prediction<br>for non-periodic time series data"] --> B["Data Input: CSI300 & CSI500<br>(Stock market datasets)"]
    B --> C["Methodology: TCGPN Architecture<br>Temporal-Correlation Fusion Encoder"]
    C --> D["Pre-training Tasks<br>Temporal & Correlation tasks"]
    D --> E["Memory Optimization<br>Node-independent structure + Sampling"]
    E --> F["Downstream Task: MLP Fine-tuning"]
    F --> G["Outcomes:<br>State-of-the-Art Results<br>Robust Pattern Capture<br>Reduced Memory Consumption"]
```
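
The "node-independent structure" step can be illustrated with a small check. The sketch below uses a generic weighted neighbor-averaging layer as a stand-in, which is an assumption and not the TCGPN fusion encoder. It shows what independence from node order means in practice: reordering the input nodes reorders the output rows the same way and changes nothing else. The functions `aggregate` and `permute` are hypothetical names.

```python
import random


def aggregate(features, weights):
    """One round of weighted neighbor averaging.

    features: list of per-node feature vectors.
    weights[i][j]: positive strength of the edge from node j to node i.
    """
    n = len(features)
    dim = len(features[0])
    out = []
    for i in range(n):
        total = sum(weights[i])
        out.append([
            sum(weights[i][j] * features[j][d] for j in range(n)) / total
            for d in range(dim)
        ])
    return out


def permute(features, weights, order):
    """Reorder nodes consistently in both the features and the weight matrix."""
    feats_p = [features[i] for i in order]
    w_p = [[weights[i][j] for j in order] for i in order]
    return feats_p, w_p


rng = random.Random(1)
n, dim = 6, 3
features = [[rng.random() for _ in range(dim)] for _ in range(n)]
weights = [[rng.random() + 0.1 for _ in range(n)] for _ in range(n)]

order = list(range(n))
rng.shuffle(order)

out = aggregate(features, weights)
feats_p, w_p = permute(features, weights, order)
out_p = aggregate(feats_p, w_p)

# Row r of the permuted output should match row order[r] of the original output.
max_diff = max(
    abs(out_p[r][d] - out[order[r]][d])
    for r in range(n)
    for d in range(dim)
)
```

Here `max_diff` stays at floating-point noise level. The same `aggregate` function also accepts any number of nodes, so shuffling or subsampling nodes can serve as data augmentation without retraining for a new graph size.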