Optimizing Time Series Forecasting: A Comparative Study of Adam and Nesterov Accelerated Gradient on LSTM and GRU networks Using Stock Market data

ArXiv ID: 2410.01843

Authors: Unknown

Abstract

Several studies have discussed the impact of different optimization techniques on time series forecasting across different neural network architectures. This paper examines the effectiveness of the Adam and Nesterov Accelerated Gradient (NAG) optimization techniques on LSTM and GRU neural networks for time series prediction, specifically for stock market time series. We trained LSTM and GRU models with each of the two optimizers and compared their performance on Apple Inc.'s closing price data over the last decade. The GRU model optimized with Adam produced the lowest RMSE and outperformed the other model-optimizer combinations in both accuracy and convergence speed. The GRU models outperformed the LSTM models with both optimizers, and the Adam optimizer outperformed the NAG optimizer for both model architectures. The results suggest that GRU models optimized with Adam are well suited to time series prediction, and more specifically to stock price prediction, producing accurate and computationally efficient models. The code for the experiments in this project can be found at https://github.com/AhmadMak/Time-Series-Optimization-Research

Keywords: Time-series forecasting, neural network, LSTM, GRU, Adam optimizer, Nesterov Accelerated Gradient (NAG) optimizer, equities
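
To make the two optimizers concrete, here is a minimal sketch of their update rules in a common textbook form, applied to a toy one-dimensional loss. It is not taken from the paper or its repository. The toy loss, the function names (`grad`, `nag_minimize`, `adam_minimize`), and all hyperparameter values are assumptions chosen for illustration.

```python
import math


def grad(theta):
    # Gradient of the toy loss f(theta) = (theta - 3)^2 (an assumed example).
    return 2.0 * (theta - 3.0)


def nag_minimize(theta, lr=0.1, momentum=0.9, steps=500):
    """Nesterov Accelerated Gradient: evaluate the gradient at a look-ahead point."""
    velocity = 0.0
    for _ in range(steps):
        lookahead = theta + momentum * velocity
        velocity = momentum * velocity - lr * grad(lookahead)
        theta += velocity
    return theta


def adam_minimize(theta, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=500):
    """Adam: scale each step by bias-corrected first and second moment estimates."""
    m = 0.0  # running mean of gradients
    v = 0.0  # running mean of squared gradients
    for t in range(1, steps + 1):
        g = grad(theta)
        m = beta1 * m + (1.0 - beta1) * g
        v = beta2 * v + (1.0 - beta2) * g * g
        m_hat = m / (1.0 - beta1 ** t)  # bias correction for the zero initialisation
        v_hat = v / (1.0 - beta2 ** t)
        theta -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta


theta_nag = nag_minimize(0.0)
theta_adam = adam_minimize(0.0)
```

Both routines should approach the minimiser at 3 on this toy problem. Their relative behaviour here says nothing about the LSTM and GRU results reported in the abstract, which depend on the data and the network.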

Complexity vs Empirical Score

  • Math Complexity: 3.5/10
  • Empirical Rigor: 7.0/10
  • Quadrant: Street Traders
  • Why: The paper applies standard neural network architectures and optimization algorithms with relatively basic mathematical exposition, while the empirical component is strong due to the use of real financial data (Apple stock), specific performance metrics (RMSE), and a public code repository.
  flowchart TD
    A["Research Goal: Compare optimizers on RNNs for stock forecasting"] --> B{"Methodology"}
    B --> C["Data: Apple Inc. closing prices 2014-2024"]
    B --> D["Models: LSTM vs GRU"]
    B --> E["Optimizers: Adam vs Nesterov Accelerated Gradient"]
    
    C & D & E --> F["Training & Evaluation: RMSE & Convergence Speed"]
    
    F --> G{"Key Findings"}
    G --> H["GRU-Adam: Lowest RMSE, Best Performance"]
    G --> I["GRU > LSTM regardless of optimizer"]
    G --> J["Adam > Nesterov regardless of architecture"]
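
The "Training & Evaluation" step in the flowchart relies on turning a price series into supervised examples and scoring predictions with RMSE. The sketch below shows one plausible way to do both in plain Python. The helper names (`make_windows`, `rmse`), the look-back length, and the price values are made up for illustration and do not come from the paper; a naive persistence forecast stands in for a trained LSTM or GRU.

```python
import math


def make_windows(series, lookback):
    """Split a 1-D series into (input window, next value) pairs."""
    inputs, targets = [], []
    for i in range(len(series) - lookback):
        inputs.append(series[i:i + lookback])
        targets.append(series[i + lookback])
    return inputs, targets


def rmse(predictions, targets):
    """Root mean squared error between two equal-length sequences."""
    squared = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up closing prices for illustration only (not Apple data).
prices = [100.0, 101.0, 103.0, 102.0, 105.0, 107.0]
X, y = make_windows(prices, lookback=3)

# Stand-in model: predict that the next price equals the last price in the window.
baseline = [window[-1] for window in X]
score = rmse(baseline, y)
```

In a real comparison, the predictions of each model-optimizer pair would replace `baseline`, and the pair with the lowest RMSE on held-out data would be preferred.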