Recurrent Neural Networks with more flexible memory: better predictions than rough volatility

arXiv ID: 2308.08550 (https://arxiv.org/abs/2308.08550)

Authors: Unknown

Abstract

We extend recurrent neural networks to include several flexible timescales for each dimension of their output, which mechanically improves their ability to account for processes with long memory or with highly disparate timescales. We compare the ability of vanilla and extended long short-term memory networks (LSTMs) to predict asset price volatility, which is known to have long memory. The number of epochs needed to train extended LSTMs is typically halved, while the variation of validation and test losses among models with the same hyperparameters is much smaller. We also show that the model with the smallest validation loss systematically outperforms rough volatility predictions by about 20% when trained and tested on a dataset with multiple time series.
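The abstract does not spell out the modified cell, so the following is a minimal sketch of one way to give each output dimension several flexible timescales, assuming the extension amounts to keeping several exponentially weighted cell states with learnable decay rates that are then mixed into the hidden state. This is an illustrative reading of the abstract rather than the authors' exact equations; the class name, the sigmoid parametrization of the decays, and the softmax mixing are all our assumptions.

```python
# Hypothetical multi-timescale LSTM cell (PyTorch). Not the paper's exact
# architecture: a sketch of "several flexible timescales per output dimension".
import torch
import torch.nn as nn


class MultiTimescaleLSTMCell(nn.Module):
    """LSTM cell keeping K exponentially weighted cell states, each with its
    own learnable decay rate, mixed into a single hidden state."""

    def __init__(self, input_size: int, hidden_size: int, n_timescales: int = 4):
        super().__init__()
        self.hidden_size = hidden_size
        self.n_timescales = n_timescales
        # Joint projection for the four standard LSTM gates.
        self.gates = nn.Linear(input_size + hidden_size, 4 * hidden_size)
        # One logit per (timescale, unit); sigmoid maps it to a decay
        # alpha in (0, 1), i.e. a characteristic time of order 1 / (1 - alpha).
        init = torch.linspace(-2.0, 4.0, n_timescales).unsqueeze(1)
        self.decay_logits = nn.Parameter(init.repeat(1, hidden_size))
        # Learnable mixing weights across timescales, per unit.
        self.mix_logits = nn.Parameter(torch.zeros(n_timescales, hidden_size))

    def forward(self, x, state=None):
        batch = x.size(0)
        if state is None:
            h = x.new_zeros(batch, self.hidden_size)
            cells = x.new_zeros(batch, self.n_timescales, self.hidden_size)
        else:
            h, cells = state
        i, f, g, o = self.gates(torch.cat([x, h], dim=-1)).chunk(4, dim=-1)
        i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)
        g = torch.tanh(g)
        # Each cell state forgets at its own rate: the shared forget gate is
        # modulated by a per-timescale decay alpha_k.
        alpha = torch.sigmoid(self.decay_logits)              # (K, H)
        cells = alpha * f.unsqueeze(1) * cells + (i * g).unsqueeze(1)
        # Convex combination across timescales feeds the output gate.
        mix = torch.softmax(self.mix_logits, dim=0)           # (K, H)
        h = o * torch.tanh((mix * cells).sum(dim=1))
        return h, (h, cells)
```

With `n_timescales = 1` and the decay fixed to 1, this collapses to a standard LSTM cell; spreading the initial decay logits yields characteristic times 1 / (1 - alpha) ranging from about one step to dozens of steps, which is what lets a single unit track both fast and slow components of a long-memory process such as volatility.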

Keywords: Recurrent Neural Networks, Long Short-Term Memory (LSTM), Long Memory, Volatility Prediction, Timescales, Equities

Complexity vs Empirical Score

  • Math Complexity: 7.5/10
  • Empirical Rigor: 8.0/10
  • Quadrant: Holy Grail
  • Why: The paper presents advanced mathematical modifications to RNNs (multiple timescale approximations, GRU/LSTM architecture derivations) while rigorously evaluating performance through a systematic backtest on financial data, including hyperparameter sweeps, train/validation/test splits, and comparison against a standard benchmark (rough volatility).
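The rough-volatility benchmark mentioned above is presumably of the RFSV type. For context, below is a minimal sketch of the forecast of Gatheral, Jaisson and Rosenbaum (2018), which predicts future log-variance as a power-law-weighted average of past log-variances with a small Hurst exponent H (around 0.1 for equities). The midpoint discretization, the explicit weight renormalization, and the name `rfsv_forecast` are our choices, not taken from the paper.

```python
import numpy as np


def rfsv_forecast(log_var: np.ndarray, H: float = 0.1, horizon: int = 1) -> float:
    """Forecast log-variance `horizon` steps ahead from past log-variances
    (oldest first, most recent last).

    Discretizes the RFSV predictor of Gatheral, Jaisson & Rosenbaum (2018),
        E[log v(t + D)]  ~  integral of log v(s) / ((t - s + D) * (t - s)^(H + 1/2)) ds,
    whose kernel integrates to one over the infinite past. With a finite
    history and unit time steps, we evaluate lags at midpoints (the kernel
    is singular at lag 0) and renormalize the weights explicitly.
    """
    n = len(log_var)
    lags = np.arange(n, dtype=float)[::-1] + 0.5       # t - s; most recent = 0.5
    weights = 1.0 / ((lags + horizon) * lags ** (H + 0.5))
    weights /= weights.sum()                           # finite-history normalization
    return float(np.sum(weights * log_var))


# Toy usage: forecast tomorrow's log-variance from 500 past observations.
rng = np.random.default_rng(0)
history = -9.0 + 0.5 * rng.standard_normal(500)        # synthetic log-variances
print(rfsv_forecast(history, H=0.1, horizon=1))
```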
```mermaid
flowchart TD
    A["Research Goal: Improve RNNs for Volatility Prediction<br>with Long Memory"] --> B["Methodology: Extended LSTM with<br>Flexible Timescales"]
    B --> C["Data: Multi-Time Series Equities Dataset"]
    C --> D["Computational Process:<br>Train Vanilla vs. Extended LSTMs"]
    D --> E{"Evaluation & Comparison"}
    E --> F["Outcome 1: 2x Faster Convergence<br>& Reduced Model Variance"]
    E --> G["Outcome 2: ~20% Better Accuracy<br>than Rough Volatility Models"]
```