ABIDES-MARL: A Multi-Agent Reinforcement Learning Environment for Endogenous Price Formation and Execution in a Limit Order Book

ArXiv ID: 2511.02016
Authors: Patrick Cheridito, Jean-Loup Dupret, Zhexin Wu

Abstract: We present ABIDES-MARL, a framework that combines a new multi-agent reinforcement learning (MARL) methodology with a new realistic limit-order-book (LOB) simulation system to study equilibrium behavior in complex financial market games. The system extends ABIDES-Gym by decoupling state collection from kernel interruption, enabling synchronized learning and decision-making for multiple adaptive agents while maintaining compatibility with standard RL libraries. It preserves key market features such as price-time priority and discrete tick sizes. Methodologically, we use MARL to approximate equilibrium-like behavior in multi-period trading games with a finite number of heterogeneous agents (an informed trader, a liquidity trader, noise traders, and competing market makers), all with individual price impacts. This setting bridges optimal execution and market microstructure by embedding the liquidity trader’s optimization problem within a strategic trading environment. We validate the approach by solving an extended Kyle model within the simulation system, recovering the gradual price discovery phenomenon. We then extend the analysis to a liquidity trader’s problem where market liquidity arises endogenously and show that, at equilibrium, execution strategies shape market-maker behavior and price dynamics. ABIDES-MARL provides a reproducible foundation for analyzing equilibrium and strategic adaptation in realistic markets and contributes toward building economically interpretable agentic AI systems for finance. ...
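For orientation, the textbook single-period Kyle (1985) model, which the abstract's "extended Kyle model" builds on, has a closed-form linear equilibrium. The sketch below is not the paper's method or the ABIDES-MARL API; the function name `kyle_single_period` and its parameter names are hypothetical, and it assumes a Gaussian fundamental value with standard deviation `sigma_v` and Gaussian noise-trader demand with standard deviation `sigma_u`.

```python
def kyle_single_period(sigma_v: float, sigma_u: float) -> dict:
    """Closed-form linear equilibrium of the one-period Kyle (1985) model.

    Assumptions (not taken from the abstract): value v ~ N(p0, sigma_v^2),
    noise demand u ~ N(0, sigma_u^2), insider trades x = beta * (v - p0),
    market maker prices p = p0 + lam * (x + u).
    """
    if sigma_v <= 0 or sigma_u <= 0:
        raise ValueError("sigma_v and sigma_u must be positive")
    beta = sigma_u / sigma_v            # insider's trading intensity
    lam = sigma_v / (2.0 * sigma_u)     # Kyle's lambda (price impact per unit of order flow)
    return {
        "beta": beta,
        "lambda": lam,
        # Insider's ex-ante expected profit, E[(v - p) * x]
        "insider_profit": 0.5 * sigma_v * sigma_u,
        # Variance of v remaining after observing the price: half is revealed
        "posterior_var": 0.5 * sigma_v ** 2,
    }


if __name__ == "__main__":
    print(kyle_single_period(sigma_v=2.0, sigma_u=4.0))
```

In this one-shot benchmark exactly half of the insider's information is impounded into the price; multi-period versions spread that revelation over time, which is the gradual price discovery the abstract refers to.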

November 3, 2025 · 2 min · Research Team

Generating realistic metaorders from public data

ArXiv ID: 2503.18199
Authors: Unknown

Abstract: This paper introduces a novel algorithm for generating realistic metaorders from public trade data, addressing a longstanding challenge in price impact research that has traditionally relied on proprietary datasets. Our method effectively recovers all established stylized facts of metaorder impact, such as the Square Root Law, the concave profile during metaorder execution, and the post-execution decay. This algorithm not only overcomes the dependence on proprietary data, a major barrier to research reproducibility, but also enables the creation of larger and more robust datasets that may increase the quality of empirical studies. Our findings strongly suggest that average realized short-term price impact is not due to information revelation (as in the Kyle framework) but has a mechanical origin which could explain the universality of the Square Root Law. ...
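The abstract does not write out the Square Root Law, so the following is a minimal sketch of its commonly stated form, in which the average impact of a metaorder of size Q scales as Y · σ · sqrt(Q / V), with σ the daily volatility and V the daily traded volume. The function name `sqrt_law_impact` and the default prefactor `y=1.0` are assumptions for illustration; the prefactor is an empirical constant usually reported to be of order one, not a value from this paper.

```python
import math


def sqrt_law_impact(q: float, daily_volume: float, daily_vol: float, y: float = 1.0) -> float:
    """Expected peak impact of a metaorder under the Square Root Law.

    q            -- metaorder size (same units as daily_volume)
    daily_volume -- typical daily traded volume V
    daily_vol    -- daily volatility sigma (as a fraction, e.g. 0.02 for 2%)
    y            -- empirical prefactor (assumed of order one; hypothetical default)
    """
    if q < 0 or daily_volume <= 0 or daily_vol < 0:
        raise ValueError("q and daily_vol must be non-negative, daily_volume positive")
    return y * daily_vol * math.sqrt(q / daily_volume)


if __name__ == "__main__":
    # A metaorder of 1% of daily volume in a stock with 2% daily volatility
    base = sqrt_law_impact(q=1e4, daily_volume=1e6, daily_vol=0.02)
    # Quadrupling the size only doubles the impact (concavity)
    larger = sqrt_law_impact(q=4e4, daily_volume=1e6, daily_vol=0.02)
    print(base, larger)
```

The concavity in Q is the key point: impact grows far more slowly than order size, unlike the linear price impact of the Kyle framework that the abstract contrasts against.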

March 23, 2025 · 2 min · Research Team

Solvability of the Gaussian Kyle model with imperfect information and risk aversion

ArXiv ID: 2501.16488
Authors: Unknown

Abstract: We investigate a Kyle model under Gaussian assumptions where a risk-averse informed trader has imperfect information on the fundamental price of an asset. We show that an equilibrium can be constructed by considering an optimal transport problem that is solved under a measure that renders the utility of the informed trader a martingale, together with a filtering problem under the historical measure. ...

January 27, 2025 · 2 min · Research Team

Insider trading in discrete time Kyle games

ArXiv ID: 2312.00904
Authors: Unknown

Abstract: We present a new discrete time version of Kyle’s (1985) classic model of insider trading, formulated as a generalised extensive form game. The model has three kinds of traders: an insider, random noise traders, and a market maker. The insider aims to exploit her informational advantage and maximise expected profits while the market maker observes the total order flow and sets prices accordingly. First, we show how the multi-period model with finitely many pure strategies can be reduced to a (static) social system in the sense of Debreu (1952) and prove the existence of a sequential Kyle equilibrium, following Kreps and Wilson (1982). This works for any probability distribution with finite support of the noise trader’s demand and the true value, and for any finite information flow of the insider. In contrast to Kyle (1985) with normal distributions, equilibria exist in general only in mixed strategies and not in pure strategies. In the single-period model we establish bounds for the insider’s strategy in equilibrium. Finally, we prove the existence of an equilibrium for the game with a continuum of actions, by considering an approximating sequence of games with finitely many actions. Because of the lack of compactness of the set of measurable price functions, standard infinite-dimensional fixed point theorems are not applicable. ...

December 1, 2023 · 2 min · Research Team