Global Neural Networks and The Data Scaling Effect in Financial Time Series Forecasting

ArXiv ID: 2309.02072

Authors: Unknown

Abstract

Neural networks have revolutionized many empirical fields, yet their application to financial time series forecasting remains controversial. In this study, we demonstrate that the conventional practice of estimating models locally in data-scarce environments may underlie the mixed empirical performance observed in prior work. By focusing on volatility forecasting, we employ a dataset comprising over 10,000 global stocks and implement a global estimation strategy that pools information across cross-sections. Our econometric analysis reveals that forecasting accuracy improves markedly as the training dataset becomes larger and more heterogeneous. Notably, even with as little as 12 months of data, globally trained networks deliver robust predictions for individual stocks and portfolios that are not even in the training dataset. Furthermore, our interpretation of the model dynamics shows that these networks not only capture key stylized facts of volatility but also exhibit resilience to outliers and rapid adaptation to market regime changes. These findings underscore the importance of leveraging extensive and diverse datasets in financial forecasting and advocate for a shift from traditional local training approaches to integrated global estimation methods.

Keywords: Global estimation, Volatility forecasting, Cross-section, Econometric analysis, Neural networks, Equities
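The contrast the abstract draws between local and global estimation can be illustrated with a small simulation. The sketch below is not the paper's method: it swaps the neural network for a no-intercept AR(1) regression, uses simulated standardized log-variance series rather than real stock data, and assumes every stock shares one persistence parameter. The function names (`simulate_panel`, `ols_slope`, `compare_local_global`) and all parameter values are hypothetical. It only shows why pooling many short histories can yield a better forecaster than fitting each short history separately.

```python
import numpy as np


def simulate_panel(n_stocks, n_obs, phi, sigma, rng):
    """Simulate standardized log-variance series as AR(1) processes.

    Assumption for illustration: all stocks share the same persistence phi.
    """
    x = np.zeros((n_stocks, n_obs))
    # Draw the first observation from the stationary distribution.
    x[:, 0] = rng.normal(0.0, sigma / np.sqrt(1.0 - phi**2), size=n_stocks)
    for t in range(1, n_obs):
        x[:, t] = phi * x[:, t - 1] + rng.normal(0.0, sigma, size=n_stocks)
    return x


def ols_slope(lagged, target):
    """No-intercept least-squares slope of target on lagged."""
    return float(np.sum(lagged * target) / np.sum(lagged * lagged))


def compare_local_global(n_stocks=200, n_train=20, n_test=200, seed=0):
    """Return out-of-sample MSE of per-stock (local) and pooled (global) fits."""
    rng = np.random.default_rng(seed)
    x = simulate_panel(n_stocks, n_train + n_test, phi=0.9, sigma=0.3, rng=rng)
    train = x[:, :n_train]
    test = x[:, n_train - 1:]  # keep the last training point as the first lag

    # Local estimation: one slope per stock, each from its own short history.
    local_slopes = np.array([ols_slope(s[:-1], s[1:]) for s in train])

    # Global estimation: a single slope from all stocks pooled together.
    global_slope = ols_slope(train[:, :-1].ravel(), train[:, 1:].ravel())

    lagged, target = test[:, :-1], test[:, 1:]
    mse_local = float(np.mean((target - local_slopes[:, None] * lagged) ** 2))
    mse_global = float(np.mean((target - global_slope * lagged) ** 2))
    return mse_local, mse_global


if __name__ == "__main__":
    mse_local, mse_global = compare_local_global()
    print(f"local MSE:  {mse_local:.5f}")
    print(f"global MSE: {mse_global:.5f}")
```

With only 20 training observations per series, each local slope is estimated noisily, while the pooled slope draws on every stock at once. In this toy setting the gain comes entirely from reduced estimation variance. The paper's claims about heterogeneity, unseen stocks, and regime changes concern neural networks trained on real data and are not reproduced here.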

Complexity vs Empirical Score

  • Math Complexity: 6.5/10
  • Empirical Rigor: 8.0/10
  • Quadrant: Holy Grail
  • Why: The paper employs advanced econometric and neural network formulations with formal likelihood specifications, while backing them with a massive 10,000-stock dataset, out-of-sample generalization tests, and a provided implementation notebook, bridging theoretical depth with rigorous empirical validation.

Flowchart (Mermaid source)

  flowchart TD
    A["Research Goal:<br>Global vs. Local NNs in Financial Time Series"] --> B{"Key Methodology"}
    
    B --> B1["Dataset: 10k+ Global Stocks"]
    B --> B2["Focus: Volatility Forecasting"]
    B --> B3["Strategy: Global Estimation<br>Pooling Cross-sectional Info"]
    
    B1 & B2 & B3 --> C{"Computational Process"}
    C --> C1["Train Neural Network<br>on Large Heterogeneous Data"]
    
    C1 --> D["Key Findings / Outcomes"]
    
    D --> D1["Accuracy ↑ as<br>Training Data Increases"]
    D --> D2["Robust Predictions<br>on Unseen Stocks/Portfolios"]
    D --> D3["Captures Stylized Facts<br>Adapts to Regime Changes"]