
Deep Reinforcement Learning for Optimum Order Execution: Mitigating Risk and Maximizing Returns

ArXiv ID: 2601.04896 · View on arXiv
Authors: Khabbab Zakaria, Jayapaulraj Jerinsh, Andreas Maier, Patrick Krauss, Stefano Pasquali, Dhagash Mehta

Abstract: Optimal Order Execution is a well-established problem in finance that pertains to the flawless execution of a trade (buy or sell) for a given volume within a specified time frame. This problem revolves around optimizing returns while minimizing risk, yet recent research predominantly focuses on addressing one aspect of this challenge. In this paper, we introduce an innovative approach to Optimal Order Execution within the US market, leveraging Deep Reinforcement Learning (DRL) to effectively address this optimization problem holistically. Our study assesses the performance of our model in comparison to two widely employed execution strategies: Volume Weighted Average Price (VWAP) and Time Weighted Average Price (TWAP). Our experimental findings clearly demonstrate that our DRL-based approach outperforms both VWAP and TWAP in terms of return on investment and risk management. The model’s ability to adapt dynamically to market conditions, even during periods of market stress, underscores its promise as a robust solution. ...
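For readers unfamiliar with the two baselines: TWAP and VWAP are static scheduling rules fixed before trading begins, which is exactly why an adaptive DRL policy has room to beat them. A minimal sketch of both schedules (illustrative only; the function names and the volume profile below are our own, not the paper's code):

```python
import numpy as np

def twap_schedule(total_shares: float, n_slices: int) -> np.ndarray:
    """Time-Weighted Average Price baseline: split the parent order
    into equal child orders across the trading horizon."""
    return np.full(n_slices, total_shares / n_slices)

def vwap_schedule(total_shares: float, volume_profile: np.ndarray) -> np.ndarray:
    """Volume-Weighted Average Price baseline: allocate child orders
    in proportion to the expected market volume in each interval."""
    weights = volume_profile / volume_profile.sum()
    return total_shares * weights

# Example: execute 100,000 shares over 8 half-hour buckets with a
# U-shaped intraday volume profile (heavier at the open and close).
profile = np.array([3.0, 1.5, 1.0, 0.8, 0.8, 1.0, 1.5, 3.0])
print(twap_schedule(100_000, 8))        # equal slices of 12,500
print(vwap_schedule(100_000, profile))  # front- and back-loaded slices
```

TWAP ignores liquidity entirely, while VWAP tracks expected volume but cannot react to realized conditions; the gap between a fixed schedule and the live order book is what a DRL agent can exploit.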

January 8, 2026 · 2 min · Research Team

In-Context Operator Learning for Linear Propagator Models

ArXiv ID: 2501.15106 · View on arXiv
Authors: Unknown

Abstract: We study operator learning in the context of linear propagator models for optimal order execution problems with transient price impact à la Bouchaud et al. (2004) and Gatheral (2010). Transient price impact persists and decays over time according to some propagator kernel. Specifically, we propose to use In-Context Operator Networks (ICON), a novel transformer-based neural network architecture introduced by Yang et al. (2023), which facilitates data-driven learning of operators by merging offline pre-training with an online few-shot prompting inference. First, we train ICON to learn the operator from various propagator models that maps the trading rate to the induced transient price impact. The inference step is then based on in-context prediction, where ICON is presented only with a few examples. We illustrate that ICON is capable of accurately inferring the underlying price impact model from the data prompts, even with propagator kernels not seen in the training data. In a second step, we employ the pre-trained ICON model provided with context as a surrogate operator in solving an optimal order execution problem via a neural network control policy, and demonstrate that the exact optimal execution strategies from Abi Jaber and Neuman (2022) for the models generating the context are correctly retrieved. Our introduced methodology is very general, offering a new approach to solving optimal stochastic control problems with unknown state dynamics, inferred data-efficiently from a limited number of examples by leveraging the few-shot and transfer learning capabilities of transformer networks. ...
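The operator ICON learns here maps a trading rate v to its induced transient impact, I(t) = ∫₀ᵗ G(t−s) v(s) ds, for a decaying propagator kernel G. A minimal discretization of that operator, under our own assumptions (a power-law kernel and a uniform time grid; this is a sketch, not the paper's ICON pipeline):

```python
import numpy as np

def power_law_kernel(t, gamma=0.5, eps=1e-3):
    """Decaying propagator kernel G(t) ~ (t + eps)^(-gamma), in the spirit of
    Bouchaud et al. (2004) / Gatheral (2010); eps tames the t = 0 singularity."""
    return (t + eps) ** (-gamma)

def transient_impact(trading_rate, dt, kernel=power_law_kernel):
    """Discretize the linear propagator model
        I(t) = integral_0^t G(t - s) v(s) ds
    on a uniform grid: I_k ~ sum_{j <= k} G((k - j) * dt) * v_j * dt.
    The map v -> I is the (causal, lower-triangular) operator learned here."""
    n = len(trading_rate)
    lags = np.maximum(np.arange(n)[:, None] - np.arange(n)[None, :], 0) * dt
    G = np.tril(kernel(lags))  # causal: no impact from future trades
    return (G @ trading_rate) * dt

# Toy usage: sell at a constant rate for the first half of the horizon;
# impact builds up while trading, then decays afterwards (transience).
n, dt = 100, 0.01
v = np.where(np.arange(n) < n // 2, -1.0, 0.0)
impact = transient_impact(v, dt)
print(f"impact at end of trading: {impact[n // 2 - 1]:.3f}, "
      f"at horizon end: {impact[-1]:.3f}")
```

In the paper's setup, ICON is prompted with a few (trading rate, impact) example pairs like the one generated above and infers the underlying operator in-context, without retraining; the pre-trained surrogate is then plugged into a neural control policy to recover the optimal execution strategies of Abi Jaber and Neuman (2022).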

January 25, 2025 · 2 min · Research Team