Reinforcement Learning for Optimal Execution when Liquidity is Time-Varying
ArXiv ID: 2402.12049
Authors: Unknown
Abstract
Optimal execution is an important problem faced by any trader. Most solutions assume constant market impact, even though liquidity is known to be dynamic. Moreover, models with time-varying liquidity typically assume that it is observable, whereas in reality it is latent and hard to measure in real time. In this paper we show that Double Deep Q-learning, a form of Reinforcement Learning based on neural networks, can learn optimal trading policies when liquidity is time-varying. Specifically, we consider an Almgren-Chriss framework in which the temporary and permanent impact parameters follow several deterministic and stochastic dynamics. Using extensive numerical experiments, we show that the trained algorithm learns the optimal policy when an analytical solution is available, and outperforms benchmarks and approximate solutions when it is not.
Keywords: Optimal Execution, Reinforcement Learning, Market Impact, Liquidity, Almgren-Chriss Model, Equities
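To make the setting in the abstract concrete, the following is a minimal sketch of an Almgren-Chriss style liquidation with time-varying linear impact. It is an illustration under stated assumptions, not the paper's specification: the function name `implementation_shortfall`, the linear impact forms, and all parameter values are hypothetical, and risk aversion is ignored.

```python
import numpy as np

def implementation_shortfall(schedule, alpha, kappa, s0=10.0, sigma=0.0, rng=None):
    """Cost of selling schedule[t] shares at step t under linear impact.

    Assumed (hypothetical) model, not taken from the paper:
      alpha[t]: temporary impact, lowers only the execution price of step t.
      kappa[t]: permanent impact, lowers the mid price for all later steps.
    """
    schedule = np.asarray(schedule, dtype=float)
    n = schedule.size
    # Scalars are broadcast so constant impact is a special case.
    alpha = np.broadcast_to(np.asarray(alpha, dtype=float), (n,))
    kappa = np.broadcast_to(np.asarray(kappa, dtype=float), (n,))
    if rng is None:
        rng = np.random.default_rng(0)

    price, cash = s0, 0.0
    for t in range(n):
        v = schedule[t]
        cash += v * (price - alpha[t] * v)                       # revenue at impacted price
        price += sigma * rng.standard_normal() - kappa[t] * v    # mid-price update
    # Shortfall: value at the arrival price minus cash actually received.
    return schedule.sum() * s0 - cash

# Temporary impact falls over time, so liquidity improves in the second step.
alpha_t = np.array([0.3, 0.1])
twap = np.array([1.0, 1.0])
# With no permanent impact and no noise, minimising sum(alpha_t * v_t**2)
# subject to a fixed total gives v_t proportional to 1 / alpha_t.
tilted = twap.sum() * (1 / alpha_t) / (1 / alpha_t).sum()

cost_twap = implementation_shortfall(twap, alpha_t, kappa=0.0)
cost_tilted = implementation_shortfall(tilted, alpha_t, kappa=0.0)
print(cost_twap, cost_tilted)
```

In this toy case the evenly split schedule costs about 0.4 while the schedule tilted toward the more liquid step costs about 0.3, which is the kind of gap a policy that adapts to changing liquidity can exploit. The closed-form tilt used here relies on the impact path being known in advance; the abstract's point is that in practice it is latent.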
Complexity vs Empirical Score
- Math Complexity: 7.5/10
- Empirical Rigor: 4.0/10
- Quadrant: Lab Rats
- Why: The paper employs advanced mathematical formulations, including stochastic differential equations and a Double Deep Q-learning neural network architecture, reflecting high mathematical density. However, the empirical validation relies on simulated data and numerical experiments rather than live trading, backtesting on real historical data, or public code repositories, resulting in lower empirical rigor.
flowchart TD
A["Research Goal<br>Optimal Execution with<br>Time-Varying Liquidity?"] --> B["Methodology<br>Double Deep Q-Learning RL"]
B --> C["Data & Framework<br>Almgren-Chriss Model<br>Stochastic Impact Parameters"]
C --> D["Computational Process<br>Neural Network Training<br>Simulation & Learning"]
D --> E["Key Findings<br>1. Matches analytical solutions<br>2. Outperforms benchmarks<br>3. Handles latent liquidity"]
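
The Double Deep Q-learning step named in the flowchart can be illustrated by its target computation, which is the part that distinguishes it from plain deep Q-learning: the online network selects the next action and a separate target network evaluates it. The sketch below uses NumPy arrays in place of network outputs; the function name `double_dqn_targets` and the toy numbers are assumptions for illustration, and the paper's network architecture, state, and reward definitions are not reproduced here.

```python
import numpy as np

def double_dqn_targets(rewards, dones, q_online_next, q_target_next, gamma=1.0):
    """Regression targets for a batch of transitions under Double DQN.

    q_online_next, q_target_next: arrays of shape (batch, n_actions) holding
    the Q-values of the next state from the online and target networks.
    """
    rewards = np.asarray(rewards, dtype=float)
    dones = np.asarray(dones, dtype=bool)
    q_online_next = np.asarray(q_online_next, dtype=float)
    q_target_next = np.asarray(q_target_next, dtype=float)

    # Action selection uses the online network...
    best_actions = np.argmax(q_online_next, axis=1)
    # ...while action evaluation uses the target network.
    evaluated = q_target_next[np.arange(best_actions.size), best_actions]
    # Terminal transitions do not bootstrap from the next state.
    return rewards + gamma * np.where(dones, 0.0, evaluated)

# Toy batch of two transitions with two actions each (made-up values).
targets = double_dqn_targets(
    rewards=[1.0, 2.0],
    dones=[False, True],
    q_online_next=[[1.0, 3.0], [2.0, 0.0]],
    q_target_next=[[10.0, 20.0], [30.0, 40.0]],
    gamma=0.5,
)
print(targets)
```

Decoupling selection from evaluation reduces the upward bias that arises when a single network both picks and scores the maximising action. In an execution setting the final step of the horizon would be marked terminal, so its target reduces to the immediate reward.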