ARL-Based Multi-Action Market Making with Hawkes Processes and Variable Volatility

ArXiv ID: 2508.16589

Authors: Ziyi Wang, Carmine Ventre, Maria Polukarov

Abstract

We advance market-making strategies by integrating Adversarial Reinforcement Learning (ARL), Hawkes Processes, and variable volatility levels while also expanding the action space available to market makers (MMs). To enhance the adaptability and robustness of these strategies – which can quote always, quote only on one side of the market or not quote at all – we shift from the commonly used Poisson process to the Hawkes process, which better captures real market dynamics and self-exciting behaviors. We then train and evaluate strategies under volatility levels of 2 and 200. Our findings show that the 4-action MM trained in a low-volatility environment effectively adapts to high-volatility conditions, maintaining stable performance and providing two-sided quotes at least 92% of the time. This indicates that incorporating flexible quoting mechanisms and realistic market simulations significantly enhances the effectiveness of market-making strategies.
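To make the Poisson-to-Hawkes shift concrete, here is a minimal sketch of a univariate Hawkes process simulated with Ogata's thinning algorithm. The exponential kernel, the parameter names `mu`, `alpha`, `beta`, and the function `simulate_hawkes` are assumptions made for illustration; the abstract does not state which kernel or parameters the authors use. Setting `alpha = 0` recovers a plain Poisson process with rate `mu`.

```python
import math
import random


def simulate_hawkes(mu, alpha, beta, horizon, seed=0):
    """Simulate event times of a univariate Hawkes process on [0, horizon].

    Assumed intensity (exponential kernel, illustrative only):
        lambda(t) = mu + sum over past events t_i of alpha * exp(-beta * (t - t_i))
    Uses Ogata's thinning algorithm; alpha < beta keeps the process stable.
    """
    if not (mu > 0 and alpha >= 0 and beta > 0 and alpha < beta):
        raise ValueError("need mu > 0, beta > 0 and 0 <= alpha < beta")
    rng = random.Random(seed)
    events = []
    t = 0.0
    excitation = 0.0  # self-exciting part of the intensity at the current time t
    while True:
        # Between events the intensity only decays, so its current value
        # is a valid upper bound until the next accepted event.
        lam_bar = mu + excitation
        wait = rng.expovariate(lam_bar)
        if t + wait > horizon:
            break
        t += wait
        # Decay the excitation over the waiting time.
        excitation *= math.exp(-beta * wait)
        # Accept the candidate with probability lambda(t) / lam_bar.
        if rng.random() * lam_bar <= mu + excitation:
            events.append(t)
            excitation += alpha  # each accepted event raises the intensity
    return events


if __name__ == "__main__":
    poisson_like = simulate_hawkes(mu=1.0, alpha=0.0, beta=1.0, horizon=1000.0)
    self_exciting = simulate_hawkes(mu=1.0, alpha=0.5, beta=1.0, horizon=1000.0)
    print(len(poisson_like), len(self_exciting))
```

With the same baseline rate, the self-exciting run produces clustered arrivals and, on average, about `1 / (1 - alpha / beta)` times as many events as the Poisson case, which is the kind of order-flow burstiness a constant-rate model cannot reproduce.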

Keywords: Adversarial Reinforcement Learning, Hawkes Processes, Market Making, Volatility Regimes, High-Frequency Trading, Equities

Complexity vs Empirical Score

  • Math Complexity: 7.5/10
  • Empirical Rigor: 6.5/10
  • Quadrant: Holy Grail
  • Why: The paper employs advanced stochastic processes (Hawkes) and adversarial reinforcement learning, requiring dense mathematical modeling. It reports specific performance metrics (e.g., two-sided quotes at least 92% of the time) under variable volatility, indicating robust simulation and data-driven evaluation.

flowchart TD
  A["Research Goal<br>Develop robust MM strategies<br>with expanded action space"] --> B["Methodology<br>ARL + Hawkes + Variable Volatility"]
  B --> C["Input Data<br>Simulated HFT Data<br>Low Vol = 2 | High Vol = 200"]
  C --> D["Computation<br>Train 4-Action Model<br>Test Cross-Volatility"]
  D --> E["Outcome 1<br>Low Vol: Optimal Performance"]
  D --> F["Outcome 2<br>High Vol: At Least 92% Two-Sided Quotes"]
  D --> G["Outcome 3<br>Stable & Adaptive Strategy"]
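
The two-sided quoting figure can be read as a simple share over the agent's action sequence. The sketch below assumes the four actions are "quote both sides", "bid only", "ask only", and "no quote", which is one reading of the abstract; the names `QuoteAction` and `two_sided_share` are hypothetical, and the sample sequence is made up for illustration rather than taken from the paper.

```python
from enum import Enum


class QuoteAction(Enum):
    """Hypothetical labels for the four quoting choices (assumed, not from the paper)."""
    BOTH_SIDES = "both"
    BID_ONLY = "bid"
    ASK_ONLY = "ask"
    NO_QUOTE = "none"


def two_sided_share(actions):
    """Return the fraction of time steps on which both sides were quoted."""
    actions = list(actions)
    if not actions:
        raise ValueError("actions must be non-empty")
    both = sum(1 for a in actions if a is QuoteAction.BOTH_SIDES)
    return both / len(actions)


if __name__ == "__main__":
    # Illustrative episode: 100 steps, mostly two-sided quoting.
    episode = (
        [QuoteAction.BOTH_SIDES] * 92
        + [QuoteAction.BID_ONLY] * 4
        + [QuoteAction.ASK_ONLY] * 3
        + [QuoteAction.NO_QUOTE] * 1
    )
    print(f"two-sided share: {two_sided_share(episode):.2%}")
```

Tracking this share alongside profit and inventory is one way to check that a flexible action space is not being used to withdraw liquidity when volatility rises.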