Model-Free Deep Hedging with Transaction Costs and Light Data Requirements

ArXiv ID: 2505.22836

Authors: Pierre Brugière, Gabriel Turinici

Abstract

Option pricing theory, such as the Black and Scholes (1973) model, provides an explicit solution to construct a strategy that perfectly hedges an option in a continuous-time setting. In practice, however, trading occurs in discrete time and often involves transaction costs, making the direct application of continuous-time solutions potentially suboptimal. Previous studies, such as those by Buehler et al. (2018), Buehler et al. (2019) and Cao et al. (2019), have shown that deep learning or reinforcement learning can be used to derive better hedging strategies than those based on continuous-time models. However, these approaches typically rely on a large number of trajectories (of the order of $10^5$ or $10^6$) to train the model. In this work, we show that using as few as 256 trajectories is sufficient to train a neural network that significantly outperforms, in the Geometric Brownian Motion framework, both the classical Black & Scholes formula and the Leland model, which is arguably one of the most effective explicit alternatives for incorporating transaction costs. The ability to train neural networks with such a small number of trajectories suggests the potential for more practical and simple implementation on real-time financial series.
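To make the two explicit benchmarks named in the abstract concrete, here is a minimal sketch of discrete-time delta hedging under proportional transaction costs on 256 simulated Geometric Brownian Motion paths. It is an illustration under stated assumptions, not the authors' code. The parameter values, the zero interest rate, the proportional cost model and all function names are hypothetical. The Leland hedge uses the adjusted volatility $\hat\sigma = \sigma\sqrt{1 + \sqrt{2/\pi}\,k/(\sigma\sqrt{\Delta t})}$ from Leland (1985), with $k$ the round-trip cost rate.

```python
import math
import random
import statistics

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call_price_delta(s, k, sigma, tau):
    """Black-Scholes price and delta of a European call (zero interest rate assumed)."""
    if tau <= 0.0:
        return max(s - k, 0.0), (1.0 if s > k else 0.0)
    d1 = (math.log(s / k) + 0.5 * sigma ** 2 * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return s * norm_cdf(d1) - k * norm_cdf(d2), norm_cdf(d1)

def leland_sigma(sigma, cost_rate, dt):
    """Leland adjusted volatility for a short option; round-trip cost = 2 * one-way rate."""
    round_trip = 2.0 * cost_rate
    return sigma * math.sqrt(
        1.0 + math.sqrt(2.0 / math.pi) * round_trip / (sigma * math.sqrt(dt))
    )

def simulate_gbm(n_paths, n_steps, s0, mu, sigma, dt, rng):
    """Simulate GBM paths with the exact log-normal step."""
    paths = []
    for _ in range(n_paths):
        path = [s0]
        for _ in range(n_steps):
            z = rng.gauss(0.0, 1.0)
            path.append(path[-1] * math.exp((mu - 0.5 * sigma ** 2) * dt
                                            + sigma * math.sqrt(dt) * z))
        paths.append(path)
    return paths

def hedge_pnl(path, strike, hedge_sigma, dt, cost_rate, premium):
    """Terminal P&L of a short call, re-hedged at every step with the delta at hedge_sigma."""
    n_steps = len(path) - 1
    cash, delta = premium, 0.0
    for i in range(n_steps):
        _, target = bs_call_price_delta(path[i], strike, hedge_sigma, (n_steps - i) * dt)
        trade = target - delta
        cash -= trade * path[i] + cost_rate * abs(trade) * path[i]  # buy shares, pay costs
        delta = target
    s_T = path[-1]
    cash += delta * s_T - cost_rate * abs(delta) * s_T  # unwind the hedge at maturity
    return cash - max(s_T - strike, 0.0)                # pay the option payoff

# Hypothetical setup: 30 daily steps, at-the-money call, 0.1% one-way cost.
S0, STRIKE, MU, SIGMA, N_STEPS, DT, COST = 100.0, 100.0, 0.05, 0.2, 30, 1.0 / 365.0, 0.001
rng = random.Random(0)
paths = simulate_gbm(256, N_STEPS, S0, MU, SIGMA, DT, rng)

# Same premium for both hedges, so only the hedge ratio differs.
premium, _ = bs_call_price_delta(S0, STRIKE, SIGMA, N_STEPS * DT)
sigma_leland = leland_sigma(SIGMA, COST, DT)

pnl_bs = [hedge_pnl(p, STRIKE, SIGMA, DT, COST, premium) for p in paths]
pnl_leland = [hedge_pnl(p, STRIKE, sigma_leland, DT, COST, premium) for p in paths]

for name, pnl in (("Black-Scholes", pnl_bs), ("Leland", pnl_leland)):
    print(f"{name:14s} mean P&L = {statistics.mean(pnl):+.4f}  std = {statistics.pstdev(pnl):.4f}")
```

Both hedges collect the same premium here so that the comparison isolates the hedge ratio. A learned strategy would replace the call to `bs_call_price_delta` inside the loop with the output of a trained model.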

Keywords: Option Hedging, Reinforcement Learning, Neural Networks, Transaction Costs, Black-Scholes Model, Derivatives

Complexity vs Empirical Score

  • Math Complexity: 8.5/10
  • Empirical Rigor: 7.0/10
  • Quadrant: Holy Grail
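  • Why: The paper combines advanced stochastic calculus with deep learning optimization in dense mathematical formulations, and shows that as few as 256 training trajectories suffice to outperform the Black & Scholes and Leland benchmarks in the Geometric Brownian Motion framework, which suggests potential for implementation on real-time financial series.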

  flowchart TD
    Start["Research Goal: Develop a deep hedging model<br>with minimal data and transaction costs"] --> Methodology["Methodology: Model-Free Deep Reinforcement Learning"]
    
    Inputs["Data Inputs: 256 Trajectories<br>Geometric Brownian Motion (GBM)"] --> Methodology
    
    Methodology --> Process["Computational Process:<br>Neural Network Optimization<br>Minimize Risk + Transaction Costs"]
    
    Process --> Compare["Benchmark Comparison:<br>Black & Scholes vs Leland Model"]
    
    Compare --> Outcomes["Key Findings:<br>1. Significantly outperforms B&S & Leland under GBM<br>2. Trains with as few as 256 trajectories<br>3. Suggests potential for use on real-time financial series"]
    
    style Start fill:#e1f5fe
    style Outcomes fill:#e8f5e8
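
The "Minimize Risk + Transaction Costs" step in the flowchart can be illustrated by fitting a hedging policy to 256 trajectories with an empirical objective. The sketch below is a deliberately simple stand-in: it is not a neural network and not the authors' method. It tunes a single parameter, the width of a no-trade band around the Black-Scholes delta, by grid search. The objective (standard deviation of terminal P&L plus mean costs paid), the band policy, the parameter values and all function names are assumptions made for illustration.

```python
import math
import random
import statistics

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_delta(s, strike, sigma, tau):
    """Black-Scholes call delta (zero interest rate assumed)."""
    if tau <= 0.0:
        return 1.0 if s > strike else 0.0
    d1 = (math.log(s / strike) + 0.5 * sigma ** 2 * tau) / (sigma * math.sqrt(tau))
    return norm_cdf(d1)

def simulate_gbm(n_paths, n_steps, s0, sigma, dt, rng):
    """Simulate driftless GBM paths with the exact log-normal step."""
    paths = []
    for _ in range(n_paths):
        path = [s0]
        for _ in range(n_steps):
            z = rng.gauss(0.0, 1.0)
            path.append(path[-1] * math.exp(-0.5 * sigma ** 2 * dt
                                            + sigma * math.sqrt(dt) * z))
        paths.append(path)
    return paths

def run_policy(path, strike, sigma, dt, cost_rate, band):
    """Hedge a short call, trading only when the BS delta leaves the no-trade band.

    Returns (terminal P&L, total transaction costs). The option premium is omitted:
    it is a constant shift and does not change the standard deviation of the P&L.
    """
    n_steps = len(path) - 1
    cash, delta, costs = 0.0, 0.0, 0.0
    for i in range(n_steps):
        target = bs_delta(path[i], strike, sigma, (n_steps - i) * dt)
        if abs(target - delta) > band:
            trade = target - delta
            fee = cost_rate * abs(trade) * path[i]
            cash -= trade * path[i] + fee
            costs += fee
            delta = target
    fee = cost_rate * abs(delta) * path[-1]  # cost of unwinding at maturity
    cash += delta * path[-1] - fee
    costs += fee
    return cash - max(path[-1] - strike, 0.0), costs

def loss(paths, band, strike=100.0, sigma=0.2, dt=1.0 / 365.0, cost_rate=0.001):
    """Assumed objective: risk (std of P&L) plus average transaction costs."""
    results = [run_policy(p, strike, sigma, dt, cost_rate, band) for p in paths]
    pnls = [pnl for pnl, _ in results]
    costs = [c for _, c in results]
    return statistics.pstdev(pnls) + statistics.mean(costs)

rng = random.Random(0)
train = simulate_gbm(256, 30, 100.0, 0.2, 1.0 / 365.0, rng)   # small training set
test = simulate_gbm(256, 30, 100.0, 0.2, 1.0 / 365.0, rng)    # fresh paths for evaluation

grid = [0.02 * j for j in range(11)]            # band = 0 is plain BS re-hedging
best_band = min(grid, key=lambda b: loss(train, b))

print(f"best band on training paths: {best_band:.2f}")
print(f"test loss, band = 0:         {loss(test, 0.0):.4f}")
print(f"test loss, fitted band:      {loss(test, best_band):.4f}")
```

Evaluating on fresh paths rather than the training set is the relevant check when the training set is this small. A neural network policy would replace the one-parameter band rule, and gradient descent would replace the grid search.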