
Stochastic Volatility Modelling with LSTM Networks: A Hybrid Approach for S&P 500 Index Volatility Forecasting

ArXiv ID: 2512.12250
Authors: Anna Perekhodko, Robert Ślepaczuk

Abstract: Accurate volatility forecasting is essential in banking, investment, and risk management, because expectations about future market movements directly influence current decisions. This study proposes a hybrid modelling framework that integrates a Stochastic Volatility (SV) model with a Long Short-Term Memory (LSTM) neural network. The SV model improves statistical precision and captures latent volatility dynamics, especially in response to unforeseen events, while the LSTM network enhances the model’s ability to detect complex nonlinear patterns in financial time series. The forecasting is conducted using daily data from the S&P 500 index, covering the period from January 1, 1998 to December 31, 2024. A rolling window approach is employed to train the model and generate one-step-ahead volatility forecasts. The performance of the hybrid SV-LSTM model is evaluated through both statistical testing and investment simulations. The results show that the hybrid approach outperforms both the standalone SV and LSTM models and contributes to the development of volatility modelling techniques, providing a foundation for improving risk assessment and strategic investment planning in the context of the S&P 500. ...
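
The rolling-window, one-step-ahead setup mentioned in this abstract can be sketched in a few lines. This is only an illustration of the evaluation loop: the function names, the window length, and the toy returns are assumptions, and the root-mean-square forecaster is a deliberately simple stand-in, not the paper's SV-LSTM model.

```python
import math
from typing import Callable, List, Sequence


def rolling_one_step_forecasts(
    series: Sequence[float],
    window: int,
    fit_predict: Callable[[Sequence[float]], float],
) -> List[float]:
    """For each time t >= window, fit on the previous `window` points
    and forecast the value at t (one step ahead)."""
    forecasts = []
    for end in range(window, len(series)):
        train = series[end - window:end]  # only past data, no look-ahead
        forecasts.append(fit_predict(train))
    return forecasts


def naive_volatility(train: Sequence[float]) -> float:
    """Stand-in forecaster (assumption): root mean square of past returns."""
    return math.sqrt(sum(r * r for r in train) / len(train))


# Hypothetical daily returns, used only to exercise the loop.
returns = [0.01, -0.02, 0.015, -0.005, 0.03, -0.01]
forecasts = rolling_one_step_forecasts(returns, window=3, fit_predict=naive_volatility)
print(forecasts)
```

Any model that maps a training window to a single forecast could be passed as `fit_predict`, which is what makes the loop reusable across the standalone and hybrid models being compared.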

December 13, 2025 · 2 min · Research Team

A Practical Machine Learning Approach for Dynamic Stock Recommendation

ArXiv ID: 2511.12129
Authors: Hongyang Yang, Xiao-Yang Liu, Qingwei Wu

Abstract: Stock recommendation is vital to investment companies and investors. However, no single stock selection strategy will always win, while analysts may not have enough time to check all S&P 500 stocks (the Standard & Poor’s 500). In this paper, we propose a practical scheme that recommends stocks from the S&P 500 using machine learning. Our basic idea is to buy and hold the top 20% of stocks dynamically. First, we select representative stock indicators with good explanatory power. Second, we take five frequently used machine learning methods, including linear regression, ridge regression, stepwise regression, random forest and generalized boosted regression, to model stock indicators and quarterly log-return in a rolling window. Third, we choose the model with the lowest Mean Square Error in each period to rank stocks. Finally, we test the selected stocks by conducting portfolio allocation methods such as equally weighted, mean-variance, and minimum-variance. Our empirical results show that the proposed scheme outperforms the long-only strategy on the S&P 500 index in terms of Sharpe ratio and cumulative returns. This work is fully open-sourced on GitHub (https://github.com/AI4Finance-Foundation/Dynamic-Stock-Recommendation-Machine_Learning-Published-Paper-IEEE). ...
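
The third and fourth steps of this scheme (pick the model with the lowest Mean Square Error in a period, then rank stocks and keep the top 20%) can be illustrated as below. The model names, tickers, and numbers are hypothetical, and the helper functions are a sketch under those assumptions rather than the code from the authors' repository.

```python
from typing import Dict, List, Sequence


def mse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean squared error between two equally long sequences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def pick_best_model(
    actual: Sequence[float], predictions_by_model: Dict[str, Sequence[float]]
) -> str:
    """Return the name of the model with the lowest MSE for this period."""
    return min(predictions_by_model, key=lambda m: mse(actual, predictions_by_model[m]))


def top_fraction(predicted_returns: Dict[str, float], fraction: float = 0.2) -> List[str]:
    """Rank stocks by predicted return and keep the top `fraction`."""
    k = max(1, int(len(predicted_returns) * fraction))
    ranked = sorted(predicted_returns, key=predicted_returns.get, reverse=True)
    return ranked[:k]


# Hypothetical validation data for one period.
actual = [0.02, -0.01, 0.03]
predictions = {
    "ridge": [0.01, 0.00, 0.02],
    "random_forest": [0.02, -0.02, 0.05],
}
best = pick_best_model(actual, predictions)

# Hypothetical next-quarter predictions from the selected model.
predicted = {"AAA": 0.03, "BBB": -0.01, "CCC": 0.05, "DDD": 0.01, "EEE": 0.02}
picks = top_fraction(predicted, fraction=0.2)
print(best, picks)
```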

November 15, 2025 · 2 min · Research Team

A three-step machine learning approach to predict market bubbles with financial news

ArXiv ID: 2510.16636
Authors: Abraham Atsiwo

Abstract: This study presents a three-step machine learning framework to predict bubbles in the S&P 500 stock market by combining financial news sentiment with macroeconomic indicators. Building on traditional econometric approaches, the proposed approach predicts bubble formation by integrating textual and quantitative data sources. In the first step, bubble periods in the S&P 500 index are identified using a right-tailed unit root test, a widely recognized real-time bubble detection method. The second step extracts sentiment features from large-scale financial news articles using natural language processing (NLP) techniques, which capture investors’ expectations and behavioral patterns. In the final step, ensemble learning methods are applied to predict bubble occurrences based on high sentiment-based and macroeconomic predictors. Model performance is evaluated through k-fold cross-validation and compared against benchmark machine learning algorithms. Empirical results indicate that the proposed three-step ensemble approach significantly improves predictive accuracy and robustness, providing valuable early warning insights for investors, regulators, and policymakers in mitigating systemic financial risks. ...

October 18, 2025 · 2 min · Research Team

Improving S&P 500 Volatility Forecasting through Regime-Switching Methods

ArXiv ID: 2510.03236
Authors: Ava C. Blake, Nivika A. Gandhi, Anurag R. Jakkula

Abstract: Accurate prediction of financial market volatility is critical for risk management, derivatives pricing, and investment strategy. In this study, we propose several regime-switching methods to improve the prediction of S&P 500 volatility by capturing structural changes in the market across time. We use eleven years of SPX data, from May 1st, 2014 to May 27th, 2025, to compute daily realized volatility (RV) from 5-minute intraday log returns, adjusted for irregular trading days. To enhance forecast accuracy, we engineered features to capture both historical dynamics and forward-looking market sentiment across regimes. The regime-switching methods include a soft Markov switching algorithm to estimate soft-regime probabilities, a distributional spectral clustering method that uses XGBoost to assign clusters at prediction time, and a coefficient-based soft regime algorithm that extracts HAR coefficients from time segments delimited by the Mood test, clusters them with a Bayesian GMM to obtain soft regime weights, and uses XGBoost to predict regime probabilities. Models were evaluated across three time periods: before, during, and after the COVID-19 pandemic. The coefficient-based clustering algorithm outperformed all other models, including the baseline autoregressive model, during all time periods. Additionally, each model was evaluated on its recursive forecasting performance for 5- and 10-day horizons during each time period. The findings of this study demonstrate the value of regime-aware modeling frameworks and soft clustering approaches in improving volatility forecasting, especially during periods of heightened uncertainty and structural change. ...
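
For readers unfamiliar with the inputs mentioned in this abstract, the sketch below computes a daily realized volatility from intraday prices and builds the usual HAR regressors from a history of daily values. It assumes the common textbook definitions (realized volatility as the square root of the sum of squared intraday log returns; HAR regressors as the previous day's value plus 5-day and 22-day averages). The paper's adjustment for irregular trading days and its regime logic are not reproduced, and the function names are illustrative.

```python
import math
from typing import Sequence, Tuple


def realized_volatility(intraday_prices: Sequence[float]) -> float:
    """Square root of the sum of squared intraday log returns for one day."""
    log_returns = [
        math.log(curr / prev)
        for prev, curr in zip(intraday_prices, intraday_prices[1:])
    ]
    return math.sqrt(sum(r * r for r in log_returns))


def har_features(rv: Sequence[float], t: int) -> Tuple[float, float, float]:
    """Daily, weekly (5-day mean) and monthly (22-day mean) regressors
    for forecasting rv[t], using only values before t."""
    if t < 22:
        raise ValueError("need at least 22 past observations")
    daily = rv[t - 1]
    weekly = sum(rv[t - 5:t]) / 5
    monthly = sum(rv[t - 22:t]) / 22
    return daily, weekly, monthly


# Hypothetical 5-minute prices for part of one session.
print(realized_volatility([100.0, 101.0, 100.0, 102.0]))
```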

September 21, 2025 · 2 min · Research Team

Forecasting S&P 500 Using LSTM Models

ArXiv ID: 2501.17366
Authors: Unknown

Abstract: With the volatile and complex nature of financial data influenced by external factors, forecasting the stock market is challenging. Traditional models such as ARIMA and GARCH perform well with linear data but struggle with non-linear dependencies. Machine learning and deep learning models, particularly Long Short-Term Memory (LSTM) networks, address these challenges by capturing intricate patterns and long-term dependencies. This report compares ARIMA and LSTM models in predicting the S&P 500 index, a major financial benchmark. Using historical price data and technical indicators, we evaluated these models using Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE). The ARIMA model showed reasonable performance with an MAE of 462.1, RMSE of 614, and 89.8 percent accuracy, effectively capturing short-term trends but limited by its linear assumptions. The LSTM model, leveraging sequential processing capabilities, outperformed ARIMA with an MAE of 369.32, RMSE of 412.84, and 92.46 percent accuracy, capturing both short- and long-term dependencies. Notably, the LSTM model without additional features performed best, achieving an MAE of 175.9, RMSE of 207.34, and 96.41 percent accuracy, showcasing its ability to handle market data efficiently. Accurately predicting stock movements is crucial for investment strategies, risk assessments, and market stability. Our findings confirm the potential of deep learning models in handling volatile financial data compared to traditional ones. The results highlight the effectiveness of LSTM and suggest avenues for further improvements. This study provides insights into financial forecasting, offering a comparative analysis of ARIMA and LSTM while outlining their strengths and limitations. ...
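
The two error metrics this abstract reports, MAE and RMSE, are straightforward to compute. The example below uses made-up index levels purely to show the arithmetic; it does not reproduce the paper's figures, and the abstract does not define its "accuracy" percentage, so that is left out.

```python
import math
from typing import Sequence


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def root_mean_squared_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Square root of the average squared difference; penalizes large misses more."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Hypothetical index levels.
actual = [100.0, 102.0, 101.0]
predicted = [101.0, 100.0, 101.0]
mae = mean_absolute_error(actual, predicted)       # (1 + 2 + 0) / 3
rmse = root_mean_squared_error(actual, predicted)  # sqrt((1 + 4 + 0) / 3)
print(mae, rmse)
```

RMSE is never smaller than MAE on the same data, so a large gap between the two suggests a few large errors dominate.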

January 29, 2025 · 2 min · Research Team

AI-Enhanced Factor Analysis for Predicting S&P 500 Stock Dynamics

ArXiv ID: 2412.12438
Authors: Unknown

Abstract: This project investigates the interplay of technical, market, and statistical factors in predicting stock market performance, with a primary focus on S&P 500 companies. Utilizing a comprehensive dataset spanning multiple years, the analysis constructs advanced financial metrics, such as momentum indicators, volatility measures, and liquidity adjustments. A machine learning framework is employed to identify patterns, relationships, and predictive capabilities of these factors. The integration of traditional financial analytics with machine learning enables enhanced predictive accuracy, offering valuable insights into market behavior and guiding investment strategies. This research highlights the potential of combining domain-specific financial expertise with modern computational tools to address complex market dynamics. ...

December 17, 2024 · 2 min · Research Team

Hunting Tomorrow's Leaders: Using Machine Learning to Forecast S&P 500 Additions & Removal

ArXiv ID: 2412.12539
Authors: Unknown

Abstract: This study applies machine learning to predict S&P 500 membership changes: key events that profoundly impact investor behavior and market dynamics. Quarterly data from WRDS datasets (2013 onwards) was used, incorporating features such as industry classification, financial data, market data, and corporate governance indicators. Using a Random Forest model, we achieved a test F1 score of 0.85, outperforming logistic regression and SVC models. This research not only showcases the power of machine learning for financial forecasting but also emphasizes model transparency through SHAP analysis and feature engineering. The model’s real-world applicability is demonstrated with predicted changes for Q3 2023, such as the addition of Uber (UBER) and the removal of SolarEdge Technologies (SEDG). By incorporating these predictions into a trading strategy, i.e., buying stocks announced for addition and shorting those marked for removal, we anticipate capturing alpha and enhancing investment decision-making, offering valuable insights into index dynamics. ...
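
The F1 score quoted in this abstract is the harmonic mean of precision and recall. The snippet below works through the calculation on hypothetical binary labels (1 = index membership change, 0 = no change); the labels are invented for illustration and have no connection to the paper's data.

```python
from typing import Sequence


def f1_score_binary(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 = 2 * precision * recall / (precision + recall) for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels: 2 true positives, 1 false positive, 1 false negative.
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 1]
f1 = f1_score_binary(y_true, y_pred)
print(f1)
```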

December 17, 2024 · 2 min · Research Team

Evaluating Financial Relational Graphs: Interpretation Before Prediction

ArXiv ID: 2410.07216
Authors: Unknown

Abstract: Accurate and robust stock trend forecasting has been a crucial and challenging task, as stock price changes are influenced by multiple factors. Graph neural network-based methods have recently achieved remarkable success in this domain by constructing stock relationship graphs that reflect internal factors and relationships between stocks. However, most of these methods rely on predefined factors to construct static stock relationship graphs due to the lack of suitable datasets, failing to capture the dynamic changes in stock relationships. Moreover, the evaluation of relationship graphs in these methods is often tied to the performance of neural network models on downstream tasks, leading to confusion and imprecision. To address these issues, we introduce the SPNews dataset, collected based on S&P 500 Index stocks, to facilitate the construction of dynamic relationship graphs. Furthermore, we propose a novel set of financial relationship graph evaluation methods that are independent of downstream tasks. By using the relationship graph to explain historical financial phenomena, we assess its validity before constructing a graph neural network, ensuring the graph’s effectiveness in capturing relevant financial relationships. Experimental results demonstrate that our evaluation methods can effectively differentiate between various financial relationship graphs, yielding more interpretable results compared to traditional approaches. We make our source code publicly available on GitHub to promote reproducibility and further research in this area. ...

September 28, 2024 · 2 min · Research Team

Dynamical analysis of financial stocks network: improving forecasting using network properties

ArXiv ID: 2408.11759
Authors: Unknown

Abstract: Applying a network analysis to stock return correlations, we study the dynamical properties of the network and how they correlate with the market return, finding meaningful variables that partially capture the complex dynamical processes of stock interactions and the market structure. We then use the individual properties of stocks within the network along with the global ones, to find correlations with the future returns of individual S&P 500 stocks. Applying these properties as input variables for forecasting, we find a 50% improvement on the R² score in the prediction of stock returns on long time scales (per year), and 3% on short time scales (2 days), relative to baseline models without network variables. ...
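
One simple way to turn stock return correlations into a network, and to read off a per-stock property from it, is shown below. The abstract does not say how the network is built, so the thresholding rule, the threshold value, the choice of node degree as the property, and the ticker names are all assumptions made for illustration.

```python
import math
from itertools import combinations
from typing import Dict, List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def correlation_network(
    returns: Dict[str, Sequence[float]], threshold: float
) -> Tuple[List[Tuple[str, str]], Dict[str, int]]:
    """Link two stocks when their return correlation is at least `threshold`;
    return the edge list and each stock's degree (number of links)."""
    edges = [
        (a, b)
        for a, b in combinations(sorted(returns), 2)
        if pearson(returns[a], returns[b]) >= threshold
    ]
    degree = {name: 0 for name in returns}
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return edges, degree


# Hypothetical return series for three stocks.
returns = {
    "A": [0.01, 0.02, -0.01, 0.03],
    "B": [0.02, 0.04, -0.02, 0.06],
    "C": [-0.01, 0.01, 0.02, -0.02],
}
edges, degree = correlation_network(returns, threshold=0.5)
print(edges, degree)
```

Per-stock quantities such as `degree` are the kind of network variable that could then be fed into a forecasting model alongside conventional inputs.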

August 21, 2024 · 2 min · Research Team

Less is more: AI Decision-Making using Dynamic Deep Neural Networks for Short-Term Stock Index Prediction

ArXiv ID: 2408.11740
Authors: Unknown

Abstract: In this paper we introduce a multi-agent deep-learning method which trades in the Futures markets based on the US S&P 500 index. The method (referred to as Model A) is an innovation founded on existing well-established machine-learning models which sample market prices and associated derivatives in order to decide whether the investment should be long/short or closed (zero exposure), on a day-to-day basis. We compare the predictions with some conventional machine-learning methods, namely Long Short-Term Memory, Random Forest and Gradient-Boosted Trees. Results are benchmarked against a passive model in which the Futures contracts are held (long) continuously with the same exposure (level of investment). Historical tests are based on daily daytime trading carried out over a period of 6 calendar years (2018-23). We find that Model A outperforms the passive investment in key performance metrics, placing it within the top quartile performance of US Large Cap active fund managers. Model A also outperforms the three machine-learning classification comparators over this period. We observe that Model A is extremely efficient (doing less and getting more) with an exposure to the market of only 41.95% compared to the 100% market exposure of the passive investment, and thus provides increased profitability with reduced risk. ...

August 21, 2024 · 2 min · Research Team