FlowOE: Imitation Learning with Flow Policy from Ensemble RL Experts for Optimal Execution under Heston Volatility and Concave Market Impacts
ArXiv ID: 2506.05755
Authors: Yang Li, Zhi Chen
Abstract
Optimal execution in financial markets refers to the process of strategically transacting a large volume of assets over a period to achieve the best possible outcome by balancing the trade-off between market impact costs and timing or volatility risks. Traditional optimal execution strategies, such as static Almgren-Chriss models, often prove suboptimal in dynamic financial markets. This paper proposes FlowOE, a novel imitation learning framework based on flow matching models, to address these limitations. FlowOE learns from a diverse set of expert traditional strategies and adaptively selects the most suitable expert behavior for prevailing market conditions. A key innovation is the incorporation of a refining loss function during the imitation process, enabling FlowOE not only to mimic but also to improve upon the learned expert actions. To the best of our knowledge, this work is the first to apply flow matching models to a stochastic optimal execution problem. Empirical evaluations across various market conditions demonstrate that FlowOE significantly outperforms both the specifically calibrated expert models and other traditional benchmarks, achieving higher profits with reduced risk. These results underscore the practical applicability and potential of FlowOE to enhance adaptive optimal execution.
Keywords: optimal execution, imitation learning, flow matching, market impact, stochastic control
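To make the setting named in the title concrete, the following is a minimal sketch of a sell program executed against a Heston-style price process with a concave (power-law) temporary impact. Everything here is an illustrative assumption rather than the paper's setup: the function name `simulate_execution`, the impact form `eta * q ** gamma`, the full-truncation Euler scheme, the zero drift, and all parameter values.

```python
import numpy as np

def simulate_execution(schedule, s0=100.0, v0=0.04, kappa=2.0, theta=0.04,
                       xi=0.3, rho=-0.7, eta=0.01, gamma=0.6,
                       horizon=1.0 / 252, seed=0):
    """Sell schedule[k] shares in step k and return the total cash received.

    All defaults are hypothetical values chosen for illustration only.
    """
    rng = np.random.default_rng(seed)
    n = len(schedule)
    dt = horizon / n
    s, v, cash = s0, v0, 0.0
    for q in schedule:
        # Concave temporary impact: per-share cost grows like q**gamma, gamma < 1.
        exec_price = s - eta * q ** gamma
        cash += q * exec_price
        # Correlated Gaussian shocks for price and variance (correlation rho).
        z1 = rng.standard_normal()
        z2 = rho * z1 + np.sqrt(1.0 - rho ** 2) * rng.standard_normal()
        # Full-truncation Euler: clip variance at zero inside sqrt and drift.
        v_pos = max(v, 0.0)
        s += s * np.sqrt(v_pos * dt) * z1  # zero-drift price update (assumption)
        v += kappa * (theta - v_pos) * dt + xi * np.sqrt(v_pos * dt) * z2
    return cash

# Compare an even (TWAP-like) schedule with a single block trade.
twap = np.full(10, 100.0)
block = np.array([1000.0])
cash_twap = simulate_execution(twap, seed=1)
cash_block = simulate_execution(block, seed=1)
```

In this sketch the impact term alone favours splitting the order, while the stochastic variance exposes a slower schedule to more price risk, which is the trade-off the abstract describes.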
Complexity vs Empirical Score
- Math Complexity: 8.0/10
- Empirical Rigor: 7.0/10
- Quadrant: Holy Grail
- Why: The paper employs advanced mathematical concepts including stochastic optimal control, flow matching models, and Heston volatility modeling, indicating high mathematical complexity. The rigorous empirical evaluation across multiple market conditions with benchmarks against traditional models and clear performance metrics shows substantial empirical rigor.
flowchart TD
A["Research Goal:<br>Develop Adaptive Optimal Execution Strategy<br>under Heston Volatility & Concave Impact"] --> B["Key Methodology:<br>FlowOE Framework"]
B --> C["Data & Inputs:<br>Simulated Market Data<br>Ensemble RL Experts<br>Traditional Strategies"]
C --> D["Computational Process:<br>1. Flow Matching Imitation<br>2. Refining Loss Function<br>3. Expert Selection"]
D --> E["Key Findings:<br>1. Outperforms Calibrated Experts<br>2. Higher Profits, Reduced Risk<br>3. First Flow Matching Application<br>in Stochastic Optimal Execution"]
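Steps 1 and 2 of the computational process in the flowchart can be sketched in a few lines of NumPy. This is a generic conditional flow matching setup on a straight-line path from noise to an expert action, not the paper's implementation. The refining term in particular is a hypothetical stand-in: the summary does not give its form, so `reward_fn`, the weight `lam`, and the one-step action estimate are assumptions made only for illustration.

```python
import numpy as np

def cfm_batch(expert_actions, rng):
    """Build one conditional flow matching batch on a straight-line path."""
    x1 = np.asarray(expert_actions, dtype=float)   # expert actions (targets)
    x0 = rng.standard_normal(x1.shape)             # noise samples
    t = rng.uniform(size=(x1.shape[0], 1))         # one time per example
    xt = (1.0 - t) * x0 + t * x1                   # point on the path at time t
    target = x1 - x0                               # velocity of that path
    return xt, t, target

def imitation_loss(pred_velocity, target, xt, t, reward_fn, lam=0.1):
    """Flow matching MSE plus a hypothetical refining term (assumption)."""
    fm = np.mean((pred_velocity - target) ** 2)
    # One-step estimate of the final action implied by the predicted velocity.
    a_hat = xt + (1.0 - t) * pred_velocity
    # Penalise low reward so the policy can move beyond pure imitation.
    refine = -np.mean(reward_fn(a_hat))
    return fm + lam * refine

def sample_action(velocity_fn, x0, steps=10):
    """Euler-integrate dx/dt = v(x, t) from t = 0 to t = 1."""
    x = np.array(x0, dtype=float)
    dt = 1.0 / steps
    for k in range(steps):
        x = x + dt * velocity_fn(x, k * dt)
    return x
```

In practice `pred_velocity` would come from a neural network conditioned on the market state, and `velocity_fn` would wrap that trained network at inference time; both are left abstract here.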