Improving S&P 500 Volatility Forecasting through Regime-Switching Methods
ArXiv ID: 2510.03236
Authors: Ava C. Blake, Nivika A. Gandhi, Anurag R. Jakkula
Abstract
Accurate prediction of financial market volatility is critical for risk management, derivatives pricing, and investment strategy. In this study, we propose several regime-switching methods that improve the prediction of S&P 500 volatility by capturing structural changes in the market over time. We use eleven years of SPX data, from May 1, 2014 to May 27, 2025, to compute daily realized volatility (RV) from 5-minute intraday log returns, adjusted for irregular trading days. To enhance forecast accuracy, we engineer features that capture both historical dynamics and forward-looking market sentiment across regimes. The regime-switching methods are: a soft Markov switching algorithm that estimates soft-regime probabilities; a distributional spectral clustering method that uses XGBoost to assign clusters at prediction time; and a coefficient-based soft regime algorithm that extracts HAR (heterogeneous autoregressive) coefficients from time segments identified by the Mood test, clusters them with a Bayesian Gaussian mixture model (GMM) to obtain soft regime weights, and uses XGBoost to predict regime probabilities. Models were evaluated across three time periods: before, during, and after the COVID-19 pandemic. The coefficient-based clustering algorithm outperformed all other models, including the baseline autoregressive model, in every time period. Each model was also evaluated on its recursive forecasting performance at 5- and 10-day horizons within each time period. The findings demonstrate the value of regime-aware modeling frameworks and soft clustering approaches for volatility forecasting, especially during periods of heightened uncertainty and structural change.
Keywords: volatility forecasting, regime switching, XGBoost, S&P 500, time series analysis, Equities
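As a minimal sketch of the realized-volatility step described in the abstract, the snippet below computes one day's RV as the square root of the sum of squared 5-minute log returns. The rescaling for shortened sessions is an assumption made for illustration, since the abstract does not say how irregular trading days are adjusted. The names `realized_volatility` and `FULL_DAY_INTERVALS` are hypothetical, and the value 78 assumes a 6.5-hour regular session.

```python
import numpy as np

# Assumption: a regular 6.5-hour session yields 78 five-minute returns.
FULL_DAY_INTERVALS = 78


def realized_volatility(prices, full_day_intervals=FULL_DAY_INTERVALS):
    """Daily realized volatility from one day's 5-minute price series."""
    prices = np.asarray(prices, dtype=float)
    log_returns = np.diff(np.log(prices))
    if log_returns.size == 0:
        raise ValueError("need at least two prices to form a return")

    # Realized variance: sum of squared intraday log returns.
    realized_variance = np.sum(log_returns ** 2)

    # Hypothetical adjustment for irregular (e.g. half) trading days:
    # rescale the variance to a full session's worth of intervals.
    scale = full_day_intervals / log_returns.size
    return float(np.sqrt(realized_variance * scale))
```

On a full session the scale factor is 1 and the function reduces to the plain square root of realized variance; on a half day with 39 returns the variance would be doubled before taking the root.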
Complexity vs Empirical Score
- Math Complexity: 7.0/10
- Empirical Rigor: 8.0/10
- Quadrant: Holy Grail
- Why: The paper employs advanced econometric methods including Markov switching, distributional spectral clustering, Bayesian GMM, and XGBoost, indicating high mathematical density. It uses substantial real-world data (11 years of 5-minute S&P 500 data) with detailed preprocessing, out-of-sample backtesting across distinct market regimes, and recursive forecasting, demonstrating strong empirical implementation.
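The recursive forecasting mentioned above can be illustrated with a small HAR-style regression. This is a sketch under stated assumptions, not the authors' implementation: it uses the conventional HAR windows (1, 5, and 22 days), fits by ordinary least squares, and produces multi-day forecasts by feeding each prediction back into the history. The function names `har_features`, `fit_har`, and `recursive_forecast` are hypothetical.

```python
import numpy as np


def har_features(rv, t):
    """Intercept plus daily, 5-day, and 22-day average RV ending at index t."""
    return np.array([
        1.0,
        rv[t],
        rv[t - 4:t + 1].mean(),
        rv[t - 21:t + 1].mean(),
    ])


def fit_har(rv):
    """Least-squares fit of next-day RV on the HAR features."""
    rv = np.asarray(rv, dtype=float)
    # Features at day t predict RV at day t + 1; t starts at 21 so the
    # 22-day window is fully available.
    X = np.array([har_features(rv, t) for t in range(21, len(rv) - 1)])
    y = rv[22:]
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


def recursive_forecast(rv, beta, horizon):
    """Multi-step forecast: each prediction is appended to the history."""
    history = [float(v) for v in rv]
    preds = []
    for _ in range(horizon):
        arr = np.array(history)
        next_rv = float(har_features(arr, len(arr) - 1) @ beta)
        preds.append(next_rv)
        history.append(next_rv)
    return preds
```

Calling `recursive_forecast(rv, fit_har(rv), horizon=5)` or `horizon=10` mirrors the 5- and 10-day horizons in the abstract. Because later steps depend on earlier predictions rather than observed values, forecast errors can compound as the horizon grows.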
flowchart TD
A["Research Goal<br>Improve S&P 500 Volatility<br>Forecasting via Regime Switching"] --> B
B --> C
subgraph B ["Data Acquisition & Feature Engineering"]
B1["SPX Data<br>2014-2025"]
B2["Daily Realized Volatility RV"]
B3["Historical & Sentiment Features"]
end
subgraph C ["Methodology: Regime-Switching Algorithms"]
C1["Soft Markov Switching"]
C2["Spectral Clustering + XGBoost"]
C3["Coefficient-Based + Bayesian GMM + XGBoost"]
end
C --> D
subgraph D ["Model Evaluation"]
D1["Time Periods<br>Pre/During/Post COVID"]
D2["Horizons<br>1, 5, 10 Days"]
D3["Baseline: Autoregressive Model"]
end
D --> E
subgraph E ["Key Findings"]
E1["Coefficient-Based Algorithm<br>Best Overall Performance"]
E2["Outperformed Baseline &<br>All Other Regime Methods"]
E3["High Value in<br>Uncertain/Regime-Shifting Periods"]
end
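To make the idea of soft regime weights concrete, the sketch below blends regime-specific linear forecasts using regime probabilities. The summary does not state how the paper combines regime-specific models, so the probability-weighted average here is an assumption, as are the function name `soft_regime_forecast` and the feature layout (intercept, daily, weekly, monthly RV). In the paper's pipeline the probabilities would come from a Bayesian GMM and XGBoost; here they are simply passed in.

```python
import numpy as np


def soft_regime_forecast(features, regime_betas, regime_probs):
    """Probability-weighted blend of per-regime linear forecasts.

    features     : shape (p,)   e.g. [1, RV_daily, RV_weekly, RV_monthly]
    regime_betas : shape (K, p) one coefficient vector per regime
    regime_probs : shape (K,)   soft regime weights, must sum to 1
    """
    features = np.asarray(features, dtype=float)
    regime_betas = np.asarray(regime_betas, dtype=float)
    regime_probs = np.asarray(regime_probs, dtype=float)

    if not np.isclose(regime_probs.sum(), 1.0):
        raise ValueError("regime probabilities must sum to 1")

    # One forecast per regime, then weight by the regime probabilities.
    per_regime = regime_betas @ features
    return float(regime_probs @ per_regime)
```

With a one-hot probability vector this collapses to a hard regime assignment; intermediate probabilities let the forecast move smoothly between regimes instead of jumping at a regime boundary.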