Forecasting Equity Correlations with Hybrid Transformer Graph Neural Network
ArXiv ID: 2601.04602
Authors: Jack Fanshawe, Rumi Masih, Alexander Cameron
Abstract
This paper studies forward-looking stock-stock correlation forecasting for S&P 500 constituents and evaluates whether learned correlation forecasts can improve graph-based clustering used in basket trading strategies. We cast 10-day ahead correlation prediction in Fisher-z space and train a Temporal-Heterogeneous Graph Neural Network (THGNN) to predict residual deviations from a rolling historical baseline. The architecture combines a Transformer-based temporal encoder, which captures complex, non-stationary temporal dependencies, with an edge-aware graph attention network that propagates cross-asset information over the equity network. Inputs span daily returns, technicals, sector structure, previous correlations, and macro signals, enabling regime-aware forecasts, while attention-based feature and neighbor importance scores provide interpretability. Out-of-sample results from 2019-2024 show that the proposed model meaningfully reduces correlation forecasting error relative to rolling-window estimates. When integrated into a graph-based clustering framework, forward-looking correlations produce adaptable and economically meaningful baskets, particularly during periods of market stress. These findings suggest that improvements in correlation forecasts translate into meaningful gains in portfolio construction tasks.
Keywords: Stock Correlation Forecasting, Graph Neural Networks (GNN), Transformer, Basket Trading, Regime Detection
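The abstract's target construction (a residual in Fisher-z space relative to a rolling historical baseline) can be illustrated with a minimal NumPy sketch. This is not the authors' code: the function names, the 30-day lookback, the clipping constant `eps`, and the synthetic returns are all assumptions made for illustration. Only the 10-day horizon and the use of the Fisher-z transform come from the abstract.

```python
import numpy as np

def fisher_z(rho, eps=1e-6):
    """Map correlations to the real line via z = arctanh(rho).

    Values are clipped away from +/-1 (eps is an assumed constant) so the
    unit diagonal does not map to infinity.
    """
    return np.arctanh(np.clip(rho, -1.0 + eps, 1.0 - eps))

def inverse_fisher_z(z):
    """Map Fisher-z values back to correlations via rho = tanh(z)."""
    return np.tanh(z)

def residual_target(returns, t, lookback=30, horizon=10):
    """Build a (baseline, residual) pair for forecast date index t.

    returns  : (T, N) array of daily returns (rows are days).
    baseline : correlation over the trailing `lookback` days, rows [t-lookback, t).
    realized : correlation over the next `horizon` days, rows [t, t+horizon).
    The residual is the difference of the two in Fisher-z space.
    """
    baseline = np.corrcoef(returns[t - lookback:t], rowvar=False)
    realized = np.corrcoef(returns[t:t + horizon], rowvar=False)
    z_base = fisher_z(baseline)
    return z_base, fisher_z(realized) - z_base

# Synthetic data purely for illustration (4 assets, 60 days).
rng = np.random.default_rng(0)
returns = rng.normal(scale=0.01, size=(60, 4))

z_base, target = residual_target(returns, t=40)

# A model would be trained to predict `target`; a correlation forecast is
# then recovered by adding the prediction to the baseline and inverting.
forecast = inverse_fisher_z(z_base + target)
```

Working in z-space keeps any predicted value mapped back to a valid correlation in (-1, 1) after `tanh`, and predicting a residual means a zero output simply reproduces the rolling baseline.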
Complexity vs Empirical Score
- Math Complexity: 8.5/10
- Empirical Rigor: 7.0/10
- Quadrant: Holy Grail
- Why: The paper employs advanced deep learning architectures (Transformer and GNN) with hybrid modeling and Fisher-z transformations, indicating high mathematical density. Empirical results are validated out of sample on recent S&P 500 data (2019-2024) using forecast error metrics and a portfolio-level application, suggesting an implementation that is close to backtest-ready.
flowchart TD
A["Research Goal: Forecast Stock Correlations<br>and Evaluate Trading Applications"] --> B["Data & Inputs<br>Historical Returns, Technicals, Sectors, Macro Signals"]
B --> C["Methodology: Hybrid THGNN<br>Transformer Temporal Encoder + Graph Attention Network"]
C --> D["Model Training: Predict Residuals<br>in Fisher-z Space (10-Day Horizon)"]
D --> E["Out-of-Sample Validation<br>Period: 2019-2024"]
E --> F{"Key Outcomes"}
F --> G["Forecasting<br>Reduced Error vs. Rolling Baseline"]
F --> H["Application: Graph Clustering<br>Economically Meaningful Baskets<br>Strong Performance During Market Stress"]
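The application step in the flowchart feeds correlations into a graph-based clustering stage to form baskets. The summary does not say which clustering algorithm the paper uses, so the sketch below substitutes a deliberately simple stand-in: threshold the correlation matrix into a graph and take connected components as baskets. The function name `threshold_clusters`, the 0.5 threshold, and the toy matrix are hypothetical.

```python
import numpy as np

def threshold_clusters(corr, threshold=0.5):
    """Label connected components of the graph whose edges are pairs
    with correlation above `threshold` (an assumed, illustrative rule)."""
    n = corr.shape[0]
    adjacency = corr > threshold
    labels = np.full(n, -1, dtype=int)  # -1 marks an unvisited node
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        # Depth-first search from each unvisited node.
        labels[start] = current
        stack = [start]
        while stack:
            node = stack.pop()
            for nbr in np.flatnonzero(adjacency[node]):
                if labels[nbr] == -1:
                    labels[nbr] = current
                    stack.append(nbr)
        current += 1
    return labels

# Toy correlation matrix with two obvious groups: {0, 1} and {2, 3}.
corr = np.array([
    [1.0, 0.8, 0.1, 0.0],
    [0.8, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.7],
    [0.0, 0.1, 0.7, 1.0],
])
baskets = threshold_clusters(corr)
print(baskets)  # [0 0 1 1]
```

Passing a forecast correlation matrix instead of a historical one changes only the input, which is the sense in which better correlation forecasts can carry through to basket construction.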