Time-Causal VAE: Robust Financial Time Series Generator
ArXiv ID: 2411.02947
Authors: Unknown
Abstract
We build a time-causal variational autoencoder (TC-VAE) for robust generation of financial time series data. Our approach imposes a causality constraint on the encoder and decoder networks, ensuring a causal transport from the real market time series to the generated time series. Specifically, we prove that the TC-VAE loss provides an upper bound on the causal Wasserstein distance between market distributions and generated distributions. Consequently, the TC-VAE loss controls the discrepancy between the optimal values of various dynamic stochastic optimization problems under real and generated distributions. To further enhance the model’s ability to approximate the latent representation of the real market distribution, we integrate a RealNVP prior into the TC-VAE framework. Finally, extensive numerical experiments on both synthetic and real market data show promising results: we compare real and generated distributions under various statistical distances, demonstrate that the generated data is effective for downstream financial optimization tasks, and show that it reproduces stylized facts of real financial market data.
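To make the causality constraint concrete, here is a minimal sketch of what "time-causal" means for a map on paths: the output at time t may depend only on inputs up to time t. The functions `causal_encode`, `is_causal`, and `centered_smooth` are hypothetical illustrations, not the paper's encoder or decoder; the abstract does not specify the architecture, so a fixed one-sided linear filter is assumed here as the simplest stand-in.

```python
import random

def causal_encode(x, weights):
    """Toy time-causal map (assumed stand-in): z[t] uses x[0..t] only."""
    z = []
    for t in range(len(x)):
        acc = 0.0
        for lag, w in enumerate(weights):
            # Only past and present values are read; x[s] with s > t never is.
            if t - lag >= 0:
                acc += w * x[t - lag]
        z.append(acc)
    return z

def centered_smooth(x):
    """Non-causal counterexample: a centered average reads one step ahead."""
    n = len(x)
    return [(x[max(t - 1, 0)] + x[t] + x[min(t + 1, n - 1)]) / 3.0
            for t in range(n)]

def is_causal(f, x, t, eps=1.0):
    """Perturb x strictly after time t and check f(x)[0..t] is unchanged."""
    x_pert = list(x)
    for s in range(t + 1, len(x)):
        x_pert[s] += eps
    base, pert = f(x), f(x_pert)
    return all(abs(a - b) < 1e-12
               for a, b in zip(base[: t + 1], pert[: t + 1]))

random.seed(0)
path = [random.gauss(0.0, 1.0) for _ in range(8)]
weights = [0.5, 0.3, 0.2]  # illustrative filter coefficients

causal_ok = is_causal(lambda x: causal_encode(x, weights), path, t=3)
noncausal_ok = is_causal(centered_smooth, path, t=3)
print(causal_ok, noncausal_ok)  # prints "True False"
```

The perturbation test is only a numerical sanity check at one time index; it does not replace the paper's proof that the loss bounds the causal Wasserstein distance.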
Keywords: Variational Autoencoder (VAE), Causal Inference, Wasserstein Distance, RealNVP, Financial Time Series Generation
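The RealNVP prior named above is built from affine coupling layers, which are invertible and have a cheap log-determinant. The sketch below shows a single coupling layer in plain Python. The functions `scale_net` and `shift_net` are hypothetical fixed stand-ins for the learned networks, and nothing here reflects how the paper parameterizes or trains its prior.

```python
import math

def scale_net(x1):
    """Hypothetical stand-in for a learned scale network s(x1)."""
    return [math.tanh(0.5 * v) for v in x1]

def shift_net(x1):
    """Hypothetical stand-in for a learned translation network t(x1)."""
    return [0.3 * v - 0.1 for v in x1]

def coupling_forward(x):
    """Affine coupling: y1 = x1, y2 = x2 * exp(s(x1)) + t(x1)."""
    half = len(x) // 2  # this sketch assumes an even-length input
    x1, x2 = x[:half], x[half:]
    s, t = scale_net(x1), shift_net(x1)
    y2 = [b * math.exp(si) + ti for b, si, ti in zip(x2, s, t)]
    # The Jacobian is triangular, so log|det J| is the sum of the scales.
    log_det = sum(s)
    return x1 + y2, log_det

def coupling_inverse(y):
    """Exact inverse: x1 = y1, x2 = (y2 - t(y1)) * exp(-s(y1))."""
    half = len(y) // 2
    y1, y2 = y[:half], y[half:]
    s, t = scale_net(y1), shift_net(y1)
    x2 = [(b - ti) * math.exp(-si) for b, si, ti in zip(y2, s, t)]
    return y1 + x2

x = [0.2, -1.0, 0.7, 1.5]
y, log_det = coupling_forward(x)
x_rec = coupling_inverse(y)
max_err = max(abs(a - b) for a, b in zip(x, x_rec))
print(max_err < 1e-12)
```

Because the first half passes through unchanged, the inverse can recompute the same scale and shift from `y1`, which is why no network inversion is needed. A full RealNVP stacks several such layers and alternates which half is transformed.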
Complexity vs Empirical Score
- Math Complexity: 8.5/10
- Empirical Rigor: 7.0/10
- Quadrant: Holy Grail
- Why: The paper combines advanced mathematics (proofs on causal optimal transport and Wasserstein distances, plus the integration of a RealNVP prior) with extensive numerical experiments on synthetic and real market data that compare statistical distances and evaluate downstream financial tasks.
flowchart TD
A["Research Goal:<br>Generate Robust Financial<br>Time Series Data"] --> B["Method: Time-Causal VAE (TC-VAE)<br>Integrates Causality Constraints + RealNVP Prior"]
B --> C["Compute Loss:<br>Upper Bound on Causal<br>Wasserstein Distance"]
C --> D["Optimize via<br>TC-VAE Training<br>Encoder + Decoder"]
D --> E["Generate Synthetic<br>Financial Time Series"]
E --> F["Validation:<br>Compare Real vs. Generated<br>Distributions"]
F --> G["Key Outcomes:<br>1. Validated Statistical Metrics<br>2. Effective Downstream Optimization<br>3. Reproduces Stylized Facts"]
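The validation step in the flowchart can be illustrated with two simple checks: a statistical distance between return samples and one stylized fact. The sketch below uses the empirical 1-Wasserstein distance between marginal return distributions and the lag-1 autocorrelation of absolute returns (a common proxy for volatility clustering). These particular metrics, the helper names, and the toy GARCH-style data generator are assumptions for illustration; the source does not list which distances or stylized facts the paper evaluates, and no real market data is used here.

```python
import random

def wasserstein_1d(a, b):
    """Empirical 1-Wasserstein distance between equal-size 1D samples."""
    if len(a) != len(b):
        raise ValueError("this sketch assumes equal sample sizes")
    # In one dimension the optimal coupling matches sorted samples.
    return sum(abs(u - v) for u, v in zip(sorted(a), sorted(b))) / len(a)

def autocorr(x, lag):
    """Sample autocorrelation of x at a positive lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag))
    return cov / var

def garch_like_returns(n, rng):
    """Toy returns from a GARCH(1,1)-style recursion (illustrative only)."""
    omega, alpha, beta = 0.05, 0.15, 0.80
    var, r, out = omega / (1 - alpha - beta), 0.0, []
    for _ in range(n):
        var = omega + alpha * r * r + beta * var
        r = rng.gauss(0.0, 1.0) * var ** 0.5
        out.append(r)
    return out

# Two independent toy samples stand in for "real" and "generated" returns.
real = garch_like_returns(2000, random.Random(1))
generated = garch_like_returns(2000, random.Random(2))

w_dist = wasserstein_1d(real, generated)
acf_real = autocorr([abs(r) for r in real], 1)
acf_gen = autocorr([abs(r) for r in generated], 1)
print(f"W1 = {w_dist:.4f}, ACF|r| real = {acf_real:.3f}, gen = {acf_gen:.3f}")
```

Note that a marginal distance like this ignores temporal structure, which is the gap the causal Wasserstein distance in the paper is meant to address; the autocorrelation check covers only one of several commonly cited stylized facts.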