Chain-structured neural architecture search for financial time series forecasting

ArXiv ID: 2403.14695

Authors: Unknown

Abstract

Neural architecture search (NAS) emerged as a way to automatically optimize neural networks for a specific task and dataset. Despite an abundance of research on NAS for image and natural language applications, similar studies for time series data are lacking. Among NAS search spaces, chain-structured ones are the simplest and the most applicable to small datasets such as time series. We compare three popular NAS strategies on chain-structured search spaces in the context of financial time series forecasting: Bayesian optimization (specifically the Tree-structured Parzen Estimator), the hyperband method, and reinforcement learning. These strategies were employed to optimize simple, well-understood neural architectures such as the MLP, 1D CNN, and RNN, with more complex temporal fusion transformers (TFT) and their own optimizers included for comparison. We find that Bayesian optimization and the hyperband method perform best among the strategies, and that the RNN and 1D CNN perform best among the architectures, but all methods were very close to each other, with high variance due to the difficulty of working with financial datasets. We discuss our approach to overcoming the variance and provide implementation recommendations for future users and researchers.
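
To make the idea of a chain-structured search space concrete, one candidate can be pictured as a variable-length sequence of layers, each with its own hyperparameters. The following is a minimal Python sketch under stated assumptions, not the paper's implementation: the `DenseLayer` fields, the choice lists, `MAX_DEPTH`, and the helper names are all hypothetical, and plain random sampling stands in for the search strategies the paper compares.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class DenseLayer:
    """One link of the chain (hypothetical fields, for illustration only)."""
    units: int
    activation: str
    dropout: float


# Hypothetical choices; the ranges used in the paper are not given here.
UNIT_CHOICES = (8, 16, 32, 64)
ACTIVATION_CHOICES = ("relu", "tanh")
DROPOUT_CHOICES = (0.0, 0.1, 0.3)
MAX_DEPTH = 4


def sample_chain(rng):
    """Draw one architecture: pick a depth, then configure each layer independently."""
    depth = rng.randint(1, MAX_DEPTH)
    return [
        DenseLayer(
            units=rng.choice(UNIT_CHOICES),
            activation=rng.choice(ACTIVATION_CHOICES),
            dropout=rng.choice(DROPOUT_CHOICES),
        )
        for _ in range(depth)
    ]


def count_weights(chain, n_inputs, n_outputs=1):
    """Count the weights and biases of the MLP that the chain describes."""
    total, width = 0, n_inputs
    for layer in chain:
        total += width * layer.units + layer.units  # weight matrix plus bias vector
        width = layer.units
    return total + width * n_outputs + n_outputs  # final output layer


if __name__ == "__main__":
    rng = random.Random(0)
    for chain in (sample_chain(rng) for _ in range(5)):
        print(len(chain), [layer.units for layer in chain], count_weights(chain, n_inputs=10))
```

Because every layer feeds only the next one, a candidate is fully described by a flat list, which keeps the space small enough to explore on modest datasets. A search strategy then only has to decide how to propose the next list, instead of sampling it uniformly as done here.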

Keywords: Neural Architecture Search, Time Series Forecasting, Bayesian Optimization, Deep Learning, Financial Datasets

Complexity vs Empirical Score

  • Math Complexity: 4.5/10
  • Empirical Rigor: 7.0/10
  • Quadrant: Street Traders
  • Why: The paper involves advanced hyperparameter optimization techniques and neural network architecture search, but focuses on practical implementation with real-world financial datasets, backtesting, and specific performance metrics like F1 and AUC.
  flowchart TD
    A["Research Goal: Optimize Neural Networks<br>for Financial Time Series Forecasting"] --> B["Methodology: Chain-Structured NAS<br>with 3 Strategies & 5 Architectures"]
    B --> C["Data Inputs: Financial Datasets<br>(High Variance, Small Size)"]
    C --> D{"Computational Process"}
    D --> E["Bayesian Optimization<br>(Tree-structured Parzen Estimator)"]
    D --> F["Hyperband Method<br>(Resource-based Pruning)"]
    D --> G["Reinforcement Learning<br>(Policy Gradient)"]
    E & F & G --> H["Key Outcomes"]
    H --> I["Best Strategies: Bayesian & Hyperband"]
    H --> J["Best Architectures: RNN & 1D CNN"]
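
The resource-based pruning named in the flowchart can be illustrated with successive halving, the subroutine that Hyperband repeats over several brackets with different trade-offs between the number of configurations and the starting budget. This is a minimal sketch under stated assumptions, not the paper's setup: `toy_loss` is a hypothetical, noise-free stand-in for a validation loss, the configurations are plain floats, and the full Hyperband bracket loop is omitted.

```python
def successive_halving(configs, evaluate, min_budget=1, eta=3):
    """Keep the best 1/eta of the configurations each round while
    multiplying the per-configuration budget by eta.

    Returns the surviving configuration and the (n_configs, budget) schedule.
    """
    survivors = list(configs)
    budget = min_budget
    schedule = []
    while len(survivors) > 1:
        schedule.append((len(survivors), budget))
        # Lower loss is better, so sort ascending and keep the head of the list.
        ranked = sorted(survivors, key=lambda c: evaluate(c, budget))
        survivors = ranked[: max(1, len(survivors) // eta)]
        budget *= eta
    return survivors[0], schedule


def toy_loss(config, budget):
    """Hypothetical loss: best at config == 0.3, improving as the budget grows."""
    return (config - 0.3) ** 2 + 1.0 / budget


if __name__ == "__main__":
    candidates = [i / 10 for i in range(9)]  # 0.0, 0.1, ..., 0.8
    best, schedule = successive_halving(candidates, toy_loss)
    print(best, schedule)
```

In this toy run nine candidates are trained with a budget of 1, the best three continue with a budget of 3, and a single winner remains. With a noisy objective, such as the financial data described in the abstract, rankings at small budgets are less reliable, which is one reason Hyperband hedges across several brackets rather than relying on one aggressive schedule.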