Optimal Execution with Reinforcement Learning
ArXiv ID: 2411.06389
Authors: Unknown
Abstract
This study investigates the development of an optimal execution strategy through reinforcement learning, aiming to determine the most effective approach for traders to buy and sell inventory within a finite time horizon. Our proposed model leverages input features derived from the current state of the limit order book and operates at a high frequency to maximize control. To simulate this environment and overcome the limitations associated with relying on historical data, we utilize the multi-agent market simulator ABIDES, which provides a diverse range of depth levels within the limit order book. We present a custom MDP formulation followed by the results of our methodology and benchmark the performance against standard execution strategies. Results show that the reinforcement learning agent outperforms standard strategies and offers a practical foundation for real-world trading applications.
Keywords: Optimal Execution, Reinforcement Learning (RL), Limit Order Book (LOB), Market Simulator (ABIDES), Markov Decision Process (MDP), Equities (High Frequency)
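The summary does not reproduce the paper's exact MDP (its state features, action grid, or reward). The snippet below is a minimal sketch of how such an execution MDP could be set up, assuming a simplified state of normalized inventory, remaining time, spread, and book imbalance, and a reward based on implementation shortfall; every name, dynamic, and parameter here is an illustrative assumption rather than the paper's formulation.

```python
# Minimal sketch of an optimal-execution MDP (all dynamics are placeholders,
# not the paper's model). A seller must liquidate `total_shares` over `horizon` steps.
import numpy as np

class ExecutionEnv:
    def __init__(self, total_shares=10_000, horizon=60, seed=0):
        self.total_shares = total_shares
        self.horizon = horizon
        self.rng = np.random.default_rng(seed)
        # Discrete actions: fraction of the remaining inventory to sell this step.
        self.actions = np.array([0.0, 0.05, 0.10, 0.25, 0.50])

    def reset(self):
        self.t = 0
        self.inventory = float(self.total_shares)
        self.mid = 100.0                 # hypothetical starting mid-price
        self.arrival_price = self.mid
        return self._obs()

    def _obs(self):
        spread = 0.01 + 0.02 * self.rng.random()   # stand-in for the LOB spread
        imbalance = self.rng.uniform(-1.0, 1.0)    # stand-in for book imbalance
        return np.array([self.inventory / self.total_shares,
                         1.0 - self.t / self.horizon,
                         spread, imbalance], dtype=np.float32)

    def step(self, action_idx):
        shares = self.actions[action_idx] * self.inventory
        impact = 1e-6 * shares                     # linear temporary impact placeholder
        exec_price = self.mid - impact
        self.inventory -= shares
        # Reward: execution price relative to the arrival price (implementation shortfall).
        reward = shares * (exec_price - self.arrival_price)
        self.mid += self.rng.normal(0.0, 0.02)     # random-walk mid-price
        self.t += 1
        done = self.t >= self.horizon or self.inventory <= 0
        if done and self.inventory > 0:
            # Forced liquidation of leftovers at a penalty, a common modelling choice.
            reward += self.inventory * (self.mid - 0.05 - self.arrival_price)
            self.inventory = 0.0
        return self._obs(), reward, done
```

In the paper's setup, the observations would be derived from ABIDES's simulated limit order book at each decision time rather than from the random placeholders used above.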
Complexity vs Empirical Score
- Math Complexity: 7.5/10
- Empirical Rigor: 8.0/10
- Quadrant: Holy Grail
- Why: The paper combines advanced mathematics (stochastic processes, transient impact models, and an MDP formulation solved with deep Q-learning) with high empirical rigor: a multi-agent market simulator (ABIDES) is used to generate synthetic data, train the agent, and benchmark it against standard execution strategies.
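Since the scoring note mentions DQN, the following is a minimal sketch of a single DQN update consistent with the toy environment above (4-dimensional state, 5 discrete actions). The network size, optimizer, and hyperparameters are assumptions, not the paper's configuration.

```python
# Sketch of one DQN gradient step for the execution MDP (assumed hyperparameters).
import copy
import torch
import torch.nn as nn

state_dim, n_actions, gamma = 4, 5, 0.99

q_net = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(),
                      nn.Linear(64, 64), nn.ReLU(),
                      nn.Linear(64, n_actions))
target_net = copy.deepcopy(q_net)                      # periodically synced copy
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)

def dqn_update(batch):
    """One gradient step on a batch of (s, a, r, s', done) transitions."""
    s, a, r, s2, done = batch
    q_sa = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        # Standard DQN target: bootstrap from the max Q-value of the target network.
        target = r + gamma * (1.0 - done) * target_net(s2).max(dim=1).values
    loss = nn.functional.mse_loss(q_sa, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Hypothetical usage with a random batch of 32 transitions:
batch = (torch.randn(32, state_dim),
         torch.randint(0, n_actions, (32,)),
         torch.randn(32),
         torch.randn(32, state_dim),
         torch.zeros(32))
print(dqn_update(batch))
```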
```mermaid
flowchart TD
A["Research Goal:<br>Optimal Execution via RL"] --> B["Data & Inputs:<br>Limit Order Book Features<br>Market Simulator: ABIDES"]
B --> C["Methodology:<br>Custom MDP Formulation"]
C --> D["Computational Process:<br>Reinforcement Learning Agent<br>High-Frequency Control"]
D --> E["Key Findings:<br>RL Outperforms Standard Strategies<br>Practical for Real-World Trading"]
```