Optimal Execution with Reinforcement Learning
ArXiv ID: 2411.06389
Authors: Unknown
Abstract
This study investigates the development of an optimal execution strategy through reinforcement learning, aiming to determine the most effective approach for traders to buy and sell inventory within a finite time horizon. Our proposed model leverages input features derived from the current state of the limit order book and operates at a high frequency to maximize control. To simulate this environment and overcome the limitations associated with relying on historical data, we utilize the multi-agent market simulator ABIDES, which provides a diverse range of depth levels within the limit order book. We present a custom MDP formulation followed by the results of our methodology and benchmark the performance against standard execution strategies. Results show that the reinforcement learning agent outperforms standard strategies and offers a practical foundation for real-world trading applications.
Keywords: Optimal Execution, Reinforcement Learning (RL), Limit Order Book (LOB), Market Simulator (ABIDES), Markov Decision Process (MDP), Equities (High Frequency)
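The summary does not reproduce the paper's exact MDP (its state features, action grid, or reward). The snippet below is a minimal sketch of how such an execution MDP could be set up, assuming a simplified state of normalized inventory, remaining time, spread, and book imbalance, and a reward based on implementation shortfall; every name, dynamic, and parameter here is an illustrative assumption rather than the paper's formulation.

```python
# Minimal sketch of an optimal-execution MDP (all dynamics are placeholders,
# not the paper's model). A seller must liquidate `total_shares` over `horizon` steps.
import numpy as np

class ExecutionEnv:
    def __init__(self, total_shares=10_000, horizon=60, seed=0):
        self.total_shares = total_shares
        self.horizon = horizon
        self.rng = np.random.default_rng(seed)
        # Discrete actions: fraction of the remaining inventory to sell this step.
        self.actions = np.array([0.0, 0.05, 0.10, 0.25, 0.50])

    def reset(self):
        self.t = 0
        self.inventory = float(self.total_shares)
        self.mid = 100.0                 # hypothetical starting mid-price
        self.arrival_price = self.mid
        return self._obs()

    def _obs(self):
        spread = 0.01 + 0.02 * self.rng.random()   # stand-in for the LOB spread
        imbalance = self.rng.uniform(-1.0, 1.0)    # stand-in for book imbalance
        return np.array([self.inventory / self.total_shares,
                         1.0 - self.t / self.horizon,
                         spread, imbalance], dtype=np.float32)

    def step(self, action_idx):
        shares = self.actions[action_idx] * self.inventory
        impact = 1e-6 * shares                     # linear temporary impact placeholder
        exec_price = self.mid - impact
        self.inventory -= shares
        # Reward: execution price relative to the arrival price (implementation shortfall).
        reward = shares * (exec_price - self.arrival_price)
        self.mid += self.rng.normal(0.0, 0.02)     # random-walk mid-price
        self.t += 1
        done = self.t >= self.horizon or self.inventory <= 0
        if done and self.inventory > 0:
            # Forced liquidation of leftovers at a penalty, a common modelling choice.
            reward += self.inventory * (self.mid - 0.05 - self.arrival_price)
            self.inventory = 0.0
        return self._obs(), reward, done
```

In the paper's setup, the observations would be derived from ABIDES's simulated limit order book at each decision time rather than from the random placeholders used above.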
Complexity vs Empirical Score
- Math Complexity: 7.5/10
- Empirical Rigor: 8.0/10
- Quadrant: Holy Grail
- Why: The paper combines advanced mathematics (stochastic processes, transient impact models, and an MDP formulation solved with deep Q-learning) with high empirical rigor: a multi-agent market simulator (ABIDES) is used to generate synthetic data, train the agent, and benchmark it against standard execution strategies.
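Since the scoring note mentions DQN, the following is a minimal sketch of a single DQN update consistent with the toy environment above (4-dimensional state, 5 discrete actions). The network size, optimizer, and hyperparameters are assumptions, not the paper's configuration.

```python
# Sketch of one DQN gradient step for the execution MDP (assumed hyperparameters).
import copy
import torch
import torch.nn as nn

state_dim, n_actions, gamma = 4, 5, 0.99

q_net = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(),
                      nn.Linear(64, 64), nn.ReLU(),
                      nn.Linear(64, n_actions))
target_net = copy.deepcopy(q_net)                      # periodically synced copy
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)

def dqn_update(batch):
    """One gradient step on a batch of (s, a, r, s', done) transitions."""
    s, a, r, s2, done = batch
    q_sa = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        # Standard DQN target: bootstrap from the max Q-value of the target network.
        target = r + gamma * (1.0 - done) * target_net(s2).max(dim=1).values
    loss = nn.functional.mse_loss(q_sa, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Hypothetical usage with a random batch of 32 transitions:
batch = (torch.randn(32, state_dim),
         torch.randint(0, n_actions, (32,)),
         torch.randn(32),
         torch.randn(32, state_dim),
         torch.zeros(32))
print(dqn_update(batch))
```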
```mermaid
flowchart TD
A["Research Goal:<br>Optimal Execution via RL"] --> B["Data & Inputs:<br>Limit Order Book Features<br>Market Simulator: ABIDES"]
B --> C["Methodology:<br>Custom MDP Formulation"]
C --> D["Computational Process:<br>Reinforcement Learning Agent<br>High-Frequency Control"]
D --> E["Key Findings:<br>RL Outperforms Standard Strategies<br>Practical for Real-World Trading"]
```