Forecasting the Performance of US Stock Market Indices During COVID-19: RF vs LSTM

ArXiv ID: 2306.03620

Authors: Unknown

Abstract

The US stock market experienced instability following the 2007-2009 recession, and COVID-19 poses a further significant challenge to US stock traders and investors. To mitigate risk and improve profits, traders and investors need forecasting models that account for the effects of the pandemic. With the post-recession COVID-19 period in view, two machine learning models, Random Forest and LSTM, are used to forecast two major US stock market indices, the S&P500 and the Russell 2000. Historical price data from after the recession is used to develop the models and forecast index returns. Cross-validation is used to evaluate model performance during training. Additionally, hyperparameter optimization, regularization (such as dropout and weight decay), and preprocessing improve the performance of the machine learning techniques. Using high-accuracy machine learning techniques, traders and investors can forecast stock market behavior, stay ahead of their competition, and improve profitability. Keywords: COVID-19, LSTM, S&P500, Random Forest, Russell 2000, Forecasting, Machine Learning, Time Series. JEL Code: C6, C8, G4.

Keywords: Long Short-Term Memory (LSTM), Random Forest, Time Series Forecasting, Cross-Validation, Hyperparameter Optimization, Equities

Complexity vs Empirical Score

Math Complexity: 4.0/10
Empirical Rigor: 6.0/10
Quadrant: Street Traders
Why: The paper applies established machine learning models (Random Forest, LSTM) to financial time series forecasting with moderate mathematical complexity, but it is highly empirical, focusing on data preprocessing, hyperparameter tuning, and backtest-ready methodology for trading signals.

  flowchart TD
    A["Research Goal: Forecast US Stock Indices<br/>S&P500 & Russell 2000<br/>During COVID-19"] --> B["Data Collection<br/>Historical Prices Post-2008 Recession"]
    B --> C["Preprocessing & Feature Engineering<br/>Normalization, Technical Indicators, Split Data"]
    C --> D{"Model Training & Optimization"}
    D --> D1["Random Forest (RF)"]
    D --> D2["LSTM Neural Network<br/>Dropout & Weight Decay"]
    D1 & D2 --> E["Model Evaluation<br/>Cross-Validation & Performance Metrics"]
    E --> F["Key Findings<br/>LSTM outperforms RF<br/>Validated for Risk Mitigation & Profitability"]
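
The evaluation step in the flowchart can be illustrated with a small sketch. The paper states only that cross-validation is used during training; the expanding-window (walk-forward) scheme below is an assumption, chosen because it keeps every test block strictly after its training data. The function names, the made-up price list, and the mean-return placeholder model are all hypothetical; a Random Forest or LSTM would take the place of `mean_forecaster`.

```python
import math
from typing import Callable, Iterator, List, Sequence, Tuple


def log_returns(prices: Sequence[float]) -> List[float]:
    """Convert prices into log returns r_t = ln(P_t / P_{t-1})."""
    return [math.log(prices[i] / prices[i - 1]) for i in range(1, len(prices))]


def walk_forward_splits(
    n_samples: int, n_folds: int, min_train: int
) -> Iterator[Tuple[range, range]]:
    """Yield (train, test) index ranges; each test block follows its training data."""
    test_size = (n_samples - min_train) // n_folds
    if test_size < 1:
        raise ValueError("not enough samples for the requested number of folds")
    for k in range(n_folds):
        train_end = min_train + k * test_size
        yield range(0, train_end), range(train_end, train_end + test_size)


def mean_forecaster(train: Sequence[float]) -> float:
    """Placeholder model (assumption): predict the mean of the training returns."""
    return sum(train) / len(train)


def cross_validate(
    series: Sequence[float],
    fit_predict: Callable[[Sequence[float]], float],
    n_folds: int,
    min_train: int,
) -> List[float]:
    """Return the RMSE of the forecast on each walk-forward test block."""
    scores = []
    for train_idx, test_idx in walk_forward_splits(len(series), n_folds, min_train):
        train = [series[i] for i in train_idx]
        pred = fit_predict(train)
        sq_err = [(series[i] - pred) ** 2 for i in test_idx]
        scores.append(math.sqrt(sum(sq_err) / len(sq_err)))
    return scores


# Hypothetical closing prices for illustration only (not real index data).
prices = [100.0, 101.0, 99.5, 102.0, 103.5, 101.0, 104.0, 105.5, 104.5, 107.0, 108.0]
returns = log_returns(prices)
fold_rmse = cross_validate(returns, mean_forecaster, n_folds=3, min_train=4)
print(fold_rmse)
```

Shuffled k-fold splits would let a model train on observations that come after its test points, so an ordered split like this one is a common choice for return series; whether the paper used this exact scheme is not stated.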
