Stock Volatility Prediction Based on Transformer Model Using Mixed-Frequency Data

ArXiv ID: 2309.16196

Authors: Unknown

Abstract

With the increasing volume of high-frequency data in the information age, both challenges and opportunities arise in the prediction of stock volatility. On the one hand, predictions from traditional methods that combine stock technical indicators and macroeconomic indicators still leave room for improvement; on the other hand, macroeconomic indicators and people’s search records on search engines, which reflect the topics they are interested in, intuitively have an impact on stock volatility. To make the influence of these indicators easier to assess, macroeconomic indicators and stock technical indicators are grouped into objective factors, while Baidu search indices, which reflect people’s topics of interest, are defined as subjective factors. To align data of different frequencies, we introduce the GARCH-MIDAS model. After mixing all of the above data, we feed them into a Transformer model as part of the training data. Our experiments show that this model outperforms the baselines in terms of mean squared error. Adopting both types of data under the Transformer model significantly reduces the mean squared error from 1.00 to 0.86.
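
The abstract does not spell out how GARCH-MIDAS aligns the different frequencies, so the following is only a minimal sketch of one common formulation from the GARCH-MIDAS literature, not the authors' implementation. It assumes a one-parameter Beta lag weighting that smooths a low-frequency (say, monthly) indicator into a long-run component, which is then repeated for every trading day of the month. The function names (`beta_weights`, `long_run_component`, `broadcast_to_days`), the parameter values, and the sample numbers are all hypothetical.

```python
import math


def beta_weights(K, w):
    """One-parameter Beta lag weights: phi_k proportional to (1 - k/K)^(w - 1).

    Assumes w >= 1, so the weights decay (or stay flat) as the lag k grows.
    """
    raw = [(1.0 - k / K) ** (w - 1.0) for k in range(1, K + 1)]
    total = sum(raw)
    return [r / total for r in raw]


def long_run_component(monthly_x, K, w, m, theta):
    """Long-run component tau_t = exp(m + theta * sum_k phi_k * X_{t-k}).

    Returns one value per month that has K full lags available.
    """
    phi = beta_weights(K, w)
    taus = []
    for t in range(K, len(monthly_x)):
        smoothed = sum(phi[k - 1] * monthly_x[t - k] for k in range(1, K + 1))
        taus.append(math.exp(m + theta * smoothed))
    return taus


def broadcast_to_days(taus, days_per_month):
    """Repeat each monthly value for every trading day in that month."""
    daily = []
    for tau, n_days in zip(taus, days_per_month):
        daily.extend([tau] * n_days)
    return daily


# Hypothetical monthly macro indicator (made-up numbers, illustration only).
monthly_x = [0.5, 0.2, -0.1, 0.3, 0.4]
weights = beta_weights(K=3, w=2.0)
taus = long_run_component(monthly_x, K=3, w=2.0, m=0.0, theta=0.1)
daily_tau = broadcast_to_days(taus, days_per_month=[21, 22])
```

In this sketch `daily_tau` has one entry per trading day, so it can sit next to daily technical indicators as an extra input feature. How the paper actually combines the aligned series before the Transformer is not described in this summary.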

Keywords: Transformer, GARCH-MIDAS, Volatility Prediction, Macroeconomic Indicators, Search Volume Data, Equities

Complexity vs Empirical Score

  • Math Complexity: 5.0/10
  • Empirical Rigor: 6.0/10
  • Quadrant: Street Traders
  • Why: The paper combines advanced statistical modeling (GARCH-MIDAS) with modern deep learning (Transformer), which reflects moderate mathematical complexity. It also reports empirical results on real-world financial data with a specific error metric (MSE reduced from 1.00 to 0.86), which indicates practical implementation and testing.
  flowchart TD
    A["Research Goal:<br>Predict Stock Volatility<br>with Mixed-Frequency Data"] --> B{"Data Collection"}
    
    B --> C["Objective Factors<br>Technical & Macro Indicators"]
    B --> D["Subjective Factors<br>Baidu Search Indices"]
    
    C --> E["GARCH-MIDAS Model<br>for Data Alignment"]
    D --> E
    
    E --> F["Transformer Model<br>Input: Mixed-Frequency Data"]
    
    F --> G["Key Finding: Model Outperforms Baselines<br>MSE: 1.00 → 0.86<br>Significant reduction with hybrid data"]
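
The headline figure in the diagram is a drop in mean squared error from 1.00 to 0.86. As a small worked example, the sketch below computes MSE for a pair of series and the relative reduction implied by those two reported values. The helper names and the toy series are hypothetical, and the summary does not say whether the 1.00 baseline is a normalized value.

```python
def mean_squared_error(actual, predicted):
    """Average of squared differences between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def relative_reduction(baseline, improved):
    """Fraction by which `improved` is lower than `baseline`."""
    return (baseline - improved) / baseline


# Toy series, made up for illustration only.
toy_mse = mean_squared_error([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])

# The two MSE values reported in the abstract.
reduction = relative_reduction(1.00, 0.86)
print(f"Relative MSE reduction: {reduction:.0%}")
```

Under these numbers the reduction is about 14% relative to the baseline.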