Forecasting S&P 500 Using LSTM Models

ArXiv ID: 2501.17366

Authors: Unknown

Abstract

With the volatile and complex nature of financial data influenced by external factors, forecasting the stock market is challenging. Traditional models such as ARIMA and GARCH perform well with linear data but struggle with non-linear dependencies. Machine learning and deep learning models, particularly Long Short-Term Memory (LSTM) networks, address these challenges by capturing intricate patterns and long-term dependencies. This report compares ARIMA and LSTM models in predicting the S&P 500 index, a major financial benchmark. Using historical price data and technical indicators, we evaluated these models using Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE). The ARIMA model showed reasonable performance with an MAE of 462.1, RMSE of 614, and 89.8 percent accuracy, effectively capturing short-term trends but limited by its linear assumptions. The LSTM model, leveraging sequential processing capabilities, outperformed ARIMA with an MAE of 369.32, RMSE of 412.84, and 92.46 percent accuracy, capturing both short- and long-term dependencies. Notably, the LSTM model without additional features performed best, achieving an MAE of 175.9, RMSE of 207.34, and 96.41 percent accuracy, showcasing its ability to handle market data efficiently. Accurately predicting stock movements is crucial for investment strategies, risk assessments, and market stability. Our findings confirm the potential of deep learning models in handling volatile financial data compared to traditional ones. The results highlight the effectiveness of LSTM and suggest avenues for further improvements. This study provides insights into financial forecasting, offering a comparative analysis of ARIMA and LSTM while outlining their strengths and limitations.

Keywords: Long Short-Term Memory (LSTM), ARIMA, Time Series Forecasting, S&P 500, Technical Indicators, Equities
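
The abstract reports MAE, RMSE, and a percentage "accuracy" for each model. As a minimal sketch of how such figures could be computed, the snippet below implements MAE and RMSE in plain Python. The abstract does not define "accuracy", so the `accuracy_pct` function assumes it means 100 × (1 − MAPE); that definition, the function names, and the sample values are illustrative assumptions, not taken from the paper.

```python
import math


def mae(actual, predicted):
    """Mean Absolute Error: average of |actual - predicted|."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root Mean Squared Error: square root of the mean squared difference."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def accuracy_pct(actual, predicted):
    """ASSUMPTION: accuracy = 100 * (1 - MAPE). The paper's abstract does
    not state which definition it uses; this is one common convention."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    mape = sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)
    return 100.0 * (1.0 - mape)


# Made-up index levels for illustration only (not data from the paper).
actual = [4000.0, 4100.0, 4200.0, 4300.0]
predicted = [3900.0, 4150.0, 4200.0, 4500.0]

print(f"MAE:      {mae(actual, predicted):.2f}")
print(f"RMSE:     {rmse(actual, predicted):.2f}")
print(f"Accuracy: {accuracy_pct(actual, predicted):.2f}%")
```

Because RMSE squares each error before averaging, it is never smaller than MAE on the same data and it penalizes large misses more heavily. This is why the two metrics are usually reported together.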

Complexity vs Empirical Score

  • Math Complexity: 4.0/10
  • Empirical Rigor: 8.0/10
  • Quadrant: Street Traders
  • Why: The paper uses standard machine learning concepts without deep mathematical derivations, fitting a low-complexity profile, but it provides specific backtest-ready metrics (MAE, RMSE) and comparative results on historical S&P 500 data, showing high empirical rigor.

  flowchart TD
    A["Research Goal: Forecast S&P 500<br>using LSTM vs. ARIMA"] --> B["Data & Inputs<br>Historical Prices & Technical Indicators"]
    B --> C{"Model Selection & Processing"}
    C --> D["ARIMA Model<br>Linear Processing"]
    C --> E["LSTM Model<br>Non-linear Processing"]
    D --> F["Evaluation: MAE, RMSE"]
    E --> F
    F --> G["Key Findings/Outcomes"]
    G --> H["ARIMA: MAE 462.1<br>RMSE 614<br>Accuracy 89.8%"]
    G --> I["LSTM: MAE 369.32<br>RMSE 412.84<br>Accuracy 92.46%"]
    G --> J["LSTM (No Features):<br>MAE 175.9, RMSE 207.34<br>Accuracy 96.41%"]
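
To put the reported figures side by side, the following sketch computes how much each LSTM variant reduces MAE and RMSE relative to the ARIMA baseline. The input numbers are the ones quoted in the abstract; the dictionary layout and the `pct_reduction` helper are hypothetical names introduced here for illustration.

```python
# Error metrics as reported in the abstract (S&P 500 index points).
reported = {
    "ARIMA": {"mae": 462.1, "rmse": 614.0},
    "LSTM": {"mae": 369.32, "rmse": 412.84},
    "LSTM (No Features)": {"mae": 175.9, "rmse": 207.34},
}


def pct_reduction(baseline, value):
    """Percentage by which `value` is lower than `baseline`."""
    return 100.0 * (baseline - value) / baseline


baseline = reported["ARIMA"]

# Relative error reduction of each LSTM variant versus ARIMA.
reductions = {
    name: {
        metric: pct_reduction(baseline[metric], scores[metric])
        for metric in ("mae", "rmse")
    }
    for name, scores in reported.items()
    if name != "ARIMA"
}

for name, result in reductions.items():
    print(f"{name}: MAE -{result['mae']:.2f}%, RMSE -{result['rmse']:.2f}%")
```

By this calculation, the LSTM lowers MAE by roughly 20% and RMSE by roughly 33% relative to ARIMA, while the variant without additional features lowers them by roughly 62% and 66%.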