End-to-End Policy Learning of a Statistical Arbitrage Autoencoder Architecture
arXiv ID: 2402.08233
Authors: Unknown
Abstract
In Statistical Arbitrage (StatArb), classical mean reversion trading strategies typically hinge on asset-pricing or PCA-based models to identify the mean of a synthetic asset. Once such a (linear) model is identified, a separate mean reversion strategy is then devised to generate a trading signal. With a view to generalising such an approach and making it truly data-driven, we study the utility of Autoencoder architectures in StatArb. As a first approach, we employ a standard Autoencoder trained on US stock returns to derive trading strategies based on the Ornstein-Uhlenbeck (OU) process. To further enhance this model, we take a policy-learning approach and embed the Autoencoder network into a neural network representation of a space of portfolio trading policies. This integration outputs portfolio allocations directly and is end-to-end trainable by backpropagation of the risk-adjusted returns of the neural policy. Our findings demonstrate that this end-to-end policy learning approach not only simplifies the strategy development process, but also yields superior gross returns over its competitors, illustrating the potential of end-to-end training over classical two-stage approaches.
Keywords: Statistical Arbitrage, Autoencoder, Ornstein-Uhlenbeck (OU) process, Policy Learning, End-to-End Training, Equities
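The first approach pairs autoencoder residuals with an OU-process trading signal. The paper does not spell out its estimation procedure here, but a standard route is to fit the discretized OU dynamics as an AR(1) regression on the residual series and trade on the resulting z-score. A minimal sketch, assuming this standard AR(1) estimator (the function name `fit_ou` and the simulated path are illustrative, not from the paper):

```python
import numpy as np

def fit_ou(x, dt=1.0):
    """Fit OU parameters to a residual series via AR(1) least squares.
    Discretization assumed: x[t+1] = a + b*x[t] + eps."""
    xlag, xnow = x[:-1], x[1:]
    b, a = np.polyfit(xlag, xnow, 1)        # slope b, intercept a
    kappa = -np.log(b) / dt                 # mean-reversion speed
    mu = a / (1.0 - b)                      # long-run mean
    resid = xnow - (a + b * xlag)
    sigma_eq = resid.std(ddof=2) / np.sqrt(1.0 - b**2)  # stationary std
    return kappa, mu, sigma_eq

# usage: simulate a toy OU path (Euler scheme) and recover its parameters
rng = np.random.default_rng(0)
n, kappa_true, mu_true, sigma = 5000, 0.5, 0.1, 0.2
x = np.empty(n)
x[0] = mu_true
for t in range(n - 1):
    x[t + 1] = x[t] + kappa_true * (mu_true - x[t]) + sigma * rng.standard_normal()

kappa, mu, sigma_eq = fit_ou(x)
# z-score trading signal: short when z is large positive, long when large negative
z = (x[-1] - mu) / sigma_eq
```

The z-score threshold rule is the "separate mean reversion strategy" of the classical two-stage approach that the end-to-end method below replaces.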
Complexity vs Empirical Score
- Math Complexity: 8.0/10
- Empirical Rigor: 6.5/10
- Quadrant: Holy Grail
- Why: The paper employs advanced concepts like policy gradient, end-to-end training of neural networks, and references stochastic calculus and OU processes, indicating high math complexity. It includes backtesting on historical US equity returns and comparisons to benchmarks, demonstrating significant empirical rigor.
```mermaid
flowchart TD
A["Research Goal: <br> Generalise StatArb with Data-Driven Methods"] --> B["Input: US Stock Returns Data"]
B --> C["Phase 1: Classical Autoencoder <br> (Unsupervised Feature Extraction)"]
C --> D["Phase 2: Policy Learning <br> (End-to-End Neural Network)"]
D --> E["Computational Process: <br> Backpropagation of Risk-Adjusted Returns"]
E --> F["Outcome: Direct Portfolio Allocations"]
F --> G["Key Finding: <br> Superior Gross Returns vs. Competitors"]
```
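The end-to-end phase can be sketched under heavy simplification: the paper backpropagates risk-adjusted returns through an autoencoder-embedded neural policy, whereas the toy below uses a linear policy on a lookback window of synthetic returns, a mean-variance objective standing in for risk-adjusted returns, and finite-difference gradients standing in for backpropagation. Every name, constant, and the return panel here are illustrative assumptions, not the paper's setup:

```python
import numpy as np

rng = np.random.default_rng(1)
T, N, L = 400, 4, 5                                  # periods, assets, lookback
rets = 0.01 + 0.02 * rng.standard_normal((T, N))     # toy returns, strong drift

# lookback feature matrix F and next-period returns Y
F = np.stack([rets[t - L:t].ravel() for t in range(L, T - 1)])   # (M, L*N)
Y = rets[L + 1:T]                                                # (M, N)

GAMMA = 5.0  # risk aversion in the mean-variance objective

def utility(W):
    """Risk-adjusted return of the policy weights_t = tanh(features_t @ W)."""
    pnl = (np.tanh(F @ W) * Y).sum(axis=1)           # strategy return per period
    return pnl.mean() - 0.5 * GAMMA * pnl.var()

# "end-to-end training": ascend the risk-adjusted return directly;
# forward finite differences replace backpropagation in this sketch
W = np.zeros((L * N, N))
eps, lr = 1e-5, 0.05
for _ in range(30):
    base = utility(W)
    g = np.zeros_like(W)
    for idx in np.ndindex(*W.shape):
        Wp = W.copy()
        Wp[idx] += eps
        g[idx] = (utility(Wp) - base) / eps
    W += lr * g / (np.linalg.norm(g) + 1e-12)        # normalized gradient step

final_utility = utility(W)
```

The design point this illustrates is the paper's thesis: no intermediate OU signal is estimated; the allocation map itself is the trainable object, and the risk-adjusted return is the loss.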