Intraday Limit Order Price Change Transition Dynamics Across Market Capitalizations Through Markov Analysis

ArXiv ID: 2601.04959 · Authors: Salam Rabindrajit Luwang, Kundan Mukhia, Buddha Nath Sharma, Md. Nurujjaman, Anish Rai, Filippo Petroni

Abstract: Quantitative understanding of stochastic dynamics in limit order price changes is essential for execution strategy design. We analyze intraday transition dynamics of ask and bid orders across market capitalization tiers using high-frequency NASDAQ100 tick data. Employing a discrete-time Markov chain framework, we categorize consecutive price changes into nine states and estimate transition probability matrices (TPMs) for six intraday intervals across High (HMC), Medium (MMC), and Low (LMC) market cap stocks. Element-wise TPM comparison reveals systematic patterns: price inertia peaks during opening and closing hours, stabilizing midday. A capitalization gradient is observed: HMC stocks exhibit the strongest inertia, while LMC stocks show lower stability and wider spreads. Markov metrics, including spectral gap, entropy rate, and mean recurrence times, quantify these dynamics. Clustering analysis identifies three distinct temporal phases on the bid side (Opening, Midday, and Closing) and four phases on the ask side (Opening, Midday, Pre-Close, and Close). This indicates that sellers initiate end-of-day positioning earlier than buyers. Stationary distributions show limit order dynamics are dominated by neutral and mild price changes. Jensen-Shannon divergence confirms the closing hour as the most distinct phase, with capitalization modulating temporal contrasts and bid-ask asymmetry. These findings support capitalization-aware and time-adaptive execution algorithms. ...
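The Markov-chain quantities named in this abstract (TPM, stationary distribution, spectral gap, entropy rate, mean recurrence times, Jensen-Shannon divergence) can be sketched in a few lines. This is a minimal illustration, not the authors' code: the integer state labels 0..8, the uniform fallback for unvisited rows, and the synthetic input sequence are all assumptions, and how the paper bins price changes into its nine states is not shown here.

```python
import numpy as np

N_STATES = 9  # nine price-change states; the labels 0..8 are an assumption


def estimate_tpm(states, n_states=N_STATES):
    """Maximum-likelihood TPM: row-normalized counts of consecutive pairs."""
    counts = np.zeros((n_states, n_states))
    for a, b in zip(states[:-1], states[1:]):
        counts[a, b] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    safe = np.where(row_sums > 0, row_sums, 1.0)
    # Rows that were never visited fall back to uniform (an assumption).
    return np.where(row_sums > 0, counts / safe, 1.0 / n_states)


def stationary_distribution(tpm):
    """Left eigenvector of the TPM for the eigenvalue closest to 1."""
    vals, vecs = np.linalg.eig(tpm.T)
    v = np.abs(np.real(vecs[:, np.argmin(np.abs(vals - 1.0))]))
    return v / v.sum()


def spectral_gap(tpm):
    """One minus the second-largest eigenvalue modulus."""
    mods = np.sort(np.abs(np.linalg.eigvals(tpm)))[::-1]
    return float(1.0 - mods[1])


def entropy_rate(tpm, pi):
    """Entropy rate in bits: -sum_i pi_i sum_j P_ij log2 P_ij."""
    with np.errstate(divide="ignore", invalid="ignore"):
        logp = np.where(tpm > 0, np.log2(tpm), 0.0)
    return float(-(pi[:, None] * tpm * logp).sum())


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two distributions."""
    m = 0.5 * (p + q)

    def kl(a, b):
        mask = a > 0
        return float(np.sum(a[mask] * np.log2(a[mask] / b[mask])))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Synthetic placeholder sequence, not market data.
rng = np.random.default_rng(0)
states = rng.integers(0, N_STATES, size=5000)
tpm = estimate_tpm(states)
pi = stationary_distribution(tpm)
recurrence_times = 1.0 / pi  # mean recurrence time of state i is 1 / pi_i
print(spectral_gap(tpm), entropy_rate(tpm, pi), recurrence_times)
```

In practice one TPM would be estimated per intraday interval and capitalization tier, and `js_divergence` applied to pairs of distributions from different intervals; which distributions the paper compares is not stated in the abstract.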

January 8, 2026 · 2 min · Research Team

Limit Order Book Dynamics in Matching Markets: Microstructure, Spread, and Execution Slippage

ArXiv ID: 2511.20606 · Authors: Yao Wu

Abstract: Conventional models of matching markets assume that monetary transfers can clear markets by compensating for utility differentials. However, empirical patterns show that such transfers often fail to close structural preference gaps. This paper introduces a market microstructure framework that models matching decisions as a limit order book system with rigid bid-ask spreads. Individual preferences are represented by a latent preference state matrix, where the spread between an agent’s internal ask price (the unconditional maximum) and the market’s best bid (the reachable maximum) creates a structural liquidity constraint. We establish a Threshold Impossibility Theorem showing that linear compensation cannot close these spreads unless it induces a categorical identity shift. A dynamic discrete choice execution model further demonstrates that matches occur only when the market-to-book ratio crosses a time-decaying liquidity threshold, analogous to order execution under inventory pressure. Numerical experiments validate persistent slippage, regional invariance of preference orderings, and high-tier zero-spread executions. The model provides a unified microstructure explanation for matching failures, compensation inefficiency, and post-match regret in illiquid order-driven environments. ...

November 25, 2025 · 2 min · Research Team

ABIDES-MARL: A Multi-Agent Reinforcement Learning Environment for Endogenous Price Formation and Execution in a Limit Order Book

ArXiv ID: 2511.02016 · Authors: Patrick Cheridito, Jean-Loup Dupret, Zhexin Wu

Abstract: We present ABIDES-MARL, a framework that combines a new multi-agent reinforcement learning (MARL) methodology with a new realistic limit-order-book (LOB) simulation system to study equilibrium behavior in complex financial market games. The system extends ABIDES-Gym by decoupling state collection from kernel interruption, enabling synchronized learning and decision-making for multiple adaptive agents while maintaining compatibility with standard RL libraries. It preserves key market features such as price-time priority and discrete tick sizes. Methodologically, we use MARL to approximate equilibrium-like behavior in multi-period trading games with a finite number of heterogeneous agents (an informed trader, a liquidity trader, noise traders, and competing market makers), all with individual price impacts. This setting bridges optimal execution and market microstructure by embedding the liquidity trader’s optimization problem within a strategic trading environment. We validate the approach by solving an extended Kyle model within the simulation system, recovering the gradual price discovery phenomenon. We then extend the analysis to a liquidity trader’s problem where market liquidity arises endogenously and show that, at equilibrium, execution strategies shape market-maker behavior and price dynamics. ABIDES-MARL provides a reproducible foundation for analyzing equilibrium and strategic adaptation in realistic markets and contributes toward building economically interpretable agentic AI systems for finance. ...

November 3, 2025 · 2 min · Research Team

JaxMARL-HFT: GPU-Accelerated Large-Scale Multi-Agent Reinforcement Learning for High-Frequency Trading

ArXiv ID: 2511.02136 · Authors: Valentin Mohl, Sascha Frey, Reuben Leyland, Kang Li, George Nigmatulin, Mihai Cucuringu, Stefan Zohren, Jakob Foerster, Anisoara Calinescu

Abstract: Agent-based modelling (ABM) approaches for high-frequency financial markets are difficult to calibrate and validate, partly due to the large parameter space created by defining fixed agent policies. Multi-agent reinforcement learning (MARL) enables more realistic agent behaviour and reduces the number of free parameters, but the heavy computational cost has so far limited research efforts. To address this, we introduce JaxMARL-HFT (JAX-based Multi-Agent Reinforcement Learning for High-Frequency Trading), the first GPU-accelerated open-source multi-agent reinforcement learning environment for high-frequency trading (HFT) on market-by-order (MBO) data. Extending the JaxMARL framework and building on the JAX-LOB implementation, JaxMARL-HFT is designed to handle a heterogeneous set of agents, enabling diverse observation/action spaces and reward functions. It is designed flexibly, so it can also be used for single-agent RL, or extended to act as an ABM with fixed-policy agents. Leveraging JAX enables up to a 240x reduction in end-to-end training time, compared with state-of-the-art reference implementations on the same hardware. This significant speed-up makes it feasible to exploit the large, granular datasets available in high-frequency trading, and to perform the extensive hyperparameter sweeps required for robust and efficient MARL research in trading. We demonstrate the use of JaxMARL-HFT with independent Proximal Policy Optimization (IPPO) for a two-player environment, with an order execution and a market making agent, using one year of LOB data (400 million orders), and show that these agents learn to outperform standard benchmarks. The code for the JaxMARL-HFT framework is available on GitHub. ...

November 3, 2025 · 2 min · Research Team

RL-Exec: Impact-Aware Reinforcement Learning for Opportunistic Optimal Liquidation, Outperforms TWAP and a Book-Liquidity VWAP on BTC-USD Replays

ArXiv ID: 2511.07434 · Authors: Enzo Duflot, Stanislas Robineau

Abstract: We study opportunistic optimal liquidation over fixed deadlines on BTC-USD limit-order books (LOB). We present RL-Exec, a PPO agent trained on historical replays augmented with endogenous transient impact (resilience), partial fills, maker/taker fees, and latency. The policy observes depth-20 LOB features plus microstructure indicators and acts under a sell-only inventory constraint to reach a residual target. Evaluation follows a strict time split (train: Jan-2020; test: Feb-2020) and a per-day protocol: for each test day we run ten independent start times and aggregate to a single daily score, avoiding pseudo-replication. We compare the agent to (i) TWAP and (ii) a VWAP-like baseline allocating using opposite-side order-book liquidity (top-20 levels), both executed on identical timestamps and costs. Statistical inference uses one-sided Wilcoxon signed-rank tests on daily RL-baseline differences with Benjamini-Hochberg FDR correction and bootstrap confidence intervals. On the Feb-2020 test set, RL-Exec significantly outperforms both baselines and the gap increases with the execution horizon (+2-3 bps at 30 min, +7-8 bps at 60 min, +23 bps at 120 min). Code: github.com/Giafferri/RL-Exec ...
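Two pieces of the inference protocol described here, the Benjamini-Hochberg correction and a bootstrap confidence interval on daily differences, are simple enough to sketch with NumPy alone. This is an illustration under assumptions, not the RL-Exec code: the function names are hypothetical, the percentile bootstrap is one of several bootstrap variants and may not be the one the authors use, and the input numbers are made-up placeholders rather than results from the paper.

```python
import numpy as np


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    # Scale the i-th smallest p-value by m / i ...
    scaled = p[order] * m / np.arange(1, m + 1)
    # ... then enforce monotonicity from the largest rank downward.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(m)
    adjusted[order] = np.clip(scaled, 0.0, 1.0)
    return adjusted


def bootstrap_mean_ci(daily_diffs, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean daily difference (one value per day)."""
    diffs = np.asarray(daily_diffs, dtype=float)
    rng = np.random.default_rng(seed)
    idx = rng.integers(0, diffs.size, size=(n_boot, diffs.size))
    means = diffs[idx].mean(axis=1)
    lo, hi = np.quantile(means, [alpha / 2, 1 - alpha / 2])
    return float(lo), float(hi)


# Placeholder inputs for illustration only (not the paper's numbers).
example_pvals = [0.01, 0.04, 0.03, 0.5]
example_diffs_bps = [1.5, 3.0, -0.5, 2.0, 4.5, 0.0, 2.5]
print(benjamini_hochberg(example_pvals))
print(bootstrap_mean_ci(example_diffs_bps))
```

Resampling whole days, rather than individual runs within a day, matches the per-day aggregation the abstract describes as its guard against pseudo-replication.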

October 30, 2025 · 2 min · Research Team

On Bellman equation in the limit order optimization problem for high-frequency trading

ArXiv ID: 2510.15988 · Authors: M. I. Balakaeva, A. Yu. Veretennikov

Abstract: An approximation method for constructing optimal strategies in the bid and ask limit order book in high-frequency trading (HFT) is studied. The basis is the 2008 article by M. Avellaneda and S. Stoikov, in which certain seemingly serious gaps have been found; in the present paper they are carefully corrected. Somewhat surprisingly, however, our corrections do not change the main answer in the cited paper, so the gaps turn out to be unimportant. An explanation of this effect is offered. ...
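For orientation, the "main answer" of Avellaneda and Stoikov (2008) is usually quoted as a closed-form approximation for the reservation price and the total quoted spread. The sketch below encodes that commonly stated form; it is not taken from the present paper, whose corrected derivation is not reproduced in the abstract, and the function and parameter names are hypothetical.

```python
import math


def avellaneda_stoikov_quotes(mid, inventory, gamma, sigma, k, time_left):
    """Approximate bid/ask quotes in the Avellaneda-Stoikov (2008) form.

    mid       : current mid price s
    inventory : signed inventory q
    gamma     : risk-aversion parameter
    sigma     : mid-price volatility
    k         : decay rate of order arrival intensity with quote distance
    time_left : remaining horizon T - t
    """
    # Reservation price: the mid price shifted against the current inventory.
    reservation = mid - inventory * gamma * sigma**2 * time_left
    # Total spread quoted symmetrically around the reservation price.
    spread = gamma * sigma**2 * time_left + (2.0 / gamma) * math.log(1.0 + gamma / k)
    return reservation - spread / 2.0, reservation + spread / 2.0


bid, ask = avellaneda_stoikov_quotes(
    mid=100.0, inventory=0, gamma=0.1, sigma=2.0, k=1.5, time_left=1.0
)
print(bid, ask)
```

With zero inventory the quotes are centred on the mid price; a long inventory pulls both quotes down, which makes a sale more likely than a further purchase.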

October 13, 2025 · 2 min · Research Team

Mean-field theory of the Santa Fe model revisited: a systematic derivation from an exact BBGKY hierarchy for the zero-intelligence limit-order book model

ArXiv ID: 2510.01814 · Authors: Taiki Wakatsuki, Kiyoshi Kanazawa

Abstract: The Santa Fe model is an established econophysics model for describing stochastic dynamics of the limit order book from the viewpoint of the zero-intelligence approach. While its foundation was studied by combining a dimensional analysis and a mean-field theory by E. Smith et al. in Quantitative Finance 2003, their arguments are rather heuristic and lack solid mathematical foundation; indeed, their mean-field equations were derived with heuristic arguments and their solutions were not explicitly obtained. In this work, we revisit the mean-field theory of the Santa Fe model from the viewpoint of kinetic theory, a traditional mathematical program in statistical physics. We study the exact master equation for the Santa Fe model and systematically derive the Bogoliubov-Born-Green-Kirkwood-Yvon (BBGKY) hierarchical equation. By applying the mean-field approximation, we derive the mean-field equation for the order-book density profile, parallel to the Boltzmann equation in conventional statistical physics. Furthermore, we obtain explicit and closed expression of the mean-field solutions. Our solutions have several implications: (1) Our scaling formulas are available for both $\mu \to 0$ and $\mu \to \infty$ asymptotics, where $\mu$ is the market-order submission intensity. Particularly, the mean-field theory works very well for small $\mu$, while its validity is partially limited for large $\mu$. (2) The "method of image" solution, heuristically derived by Bouchaud-Mézard-Potters in Quantitative Finance 2002, is obtained for large $\mu$, serving as a mathematical foundation for their heuristic arguments. (3) Finally, we point out an error in E. Smith et al. 2003 in the scaling law for the diffusion constant due to a misspecification in their dimensional analysis. ...
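To make the zero-intelligence setting concrete, here is a small event-driven simulation in the spirit of the Santa Fe model. It is a simplified sketch under assumptions, not the paper's mean-field analysis: the finite price grid, the even split of the market-order intensity between buys and sells, the parameter names `lam` and `nu`, and the omission of waiting times (only the order of events is simulated) are all choices made for illustration.

```python
import numpy as np


def simulate_zero_intelligence_book(n_levels=50, lam=1.0, mu=10.0, nu=0.2,
                                    n_events=5000, seed=0):
    """Simulate unit-size limit, market, and cancel events on a price grid.

    lam : limit-order arrival rate per price level (assumed name)
    mu  : total market-order intensity, split evenly between sides (assumed)
    nu  : cancellation rate per resting order (assumed name)
    """
    rng = np.random.default_rng(seed)
    bids = np.zeros(n_levels, dtype=int)
    asks = np.zeros(n_levels, dtype=int)
    spreads = []
    for _ in range(n_events):
        best_bid = int(np.flatnonzero(bids).max()) if bids.any() else -1
        best_ask = int(np.flatnonzero(asks).min()) if asks.any() else n_levels
        rates = np.array([
            lam * best_ask,                          # limit buy below best ask
            lam * (n_levels - 1 - best_bid),         # limit sell above best bid
            mu / 2 if best_ask < n_levels else 0.0,  # market buy hits best ask
            mu / 2 if best_bid >= 0 else 0.0,        # market sell hits best bid
            nu * bids.sum(),                         # cancel a resting bid
            nu * asks.sum(),                         # cancel a resting ask
        ], dtype=float)
        # Pick the next event with probability proportional to its rate.
        event = rng.choice(6, p=rates / rates.sum())
        if event == 0:
            bids[rng.integers(0, best_ask)] += 1
        elif event == 1:
            asks[rng.integers(best_bid + 1, n_levels)] += 1
        elif event == 2:
            asks[best_ask] -= 1
        elif event == 3:
            bids[best_bid] -= 1
        elif event == 4:
            bids[rng.choice(n_levels, p=bids / bids.sum())] -= 1
        else:
            asks[rng.choice(n_levels, p=asks / asks.sum())] -= 1
        if bids.any() and asks.any():
            spreads.append(np.flatnonzero(asks).min() - np.flatnonzero(bids).max())
    return bids, asks, np.array(spreads)


bids, asks, spreads = simulate_zero_intelligence_book()
print(spreads.mean())
```

Because waiting times are dropped, the printed average is taken over events rather than over time; a time-weighted average would require drawing exponential holding times from the total rate at each step.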

October 2, 2025 · 3 min · Research Team

Reinforcement Learning-Based Market Making as a Stochastic Control on Non-Stationary Limit Order Book Dynamics

ArXiv ID: 2509.12456 · Authors: Rafael Zimmer, Oswaldo Luiz do Valle Costa

Abstract: Reinforcement Learning has emerged as a promising framework for developing adaptive and data-driven strategies, enabling market makers to optimize decision-making policies based on interactions with the limit order book environment. This paper explores the integration of a reinforcement learning agent in a market-making context, where the underlying market dynamics have been explicitly modeled to capture observed stylized facts of real markets, including clustered order arrival times, non-stationary spreads and return drifts, stochastic order quantities and price volatility. These mechanisms aim to enhance stability of the resulting control agent, and serve to incorporate domain-specific knowledge into the agent policy learning process. Our contributions include a practical implementation of a market making agent based on the Proximal Policy Optimization (PPO) algorithm, alongside a comparative evaluation of the agent’s performance under varying market conditions via a simulator-based environment. As evidenced by our analysis of the financial return and risk metrics when compared to a closed-form optimal solution, our results suggest that the reinforcement learning agent can effectively be used under non-stationary market conditions, and that the proposed simulator-based environment can serve as a valuable tool for training and pre-training reinforcement learning agents in market-making scenarios. ...

September 15, 2025 · 2 min · Research Team

Prospects of Imitating Trading Agents in the Stock Market

ArXiv ID: 2509.00982 · Authors: Mateusz Wilinski, Juho Kanniainen

Abstract: In this work we show how generative tools, which were successfully applied to limit order book data, can be utilized for the task of imitating trading agents. To this end, we propose a modified generative architecture based on the state-space model, and apply it to limit order book data with identified investors. The model is trained on synthetic data, generated from a heterogeneous agent-based model. Finally, we compare the model’s predicted distribution over different aspects of investors’ actions with the ground truths known from the agent-based model. ...

August 31, 2025 · 2 min · Research Team

Agent-based model of information diffusion in the limit order book trading

ArXiv ID: 2508.20672 · Authors: Mateusz Wilinski, Juho Kanniainen

Abstract: There are multiple explanations for stylized facts in high-frequency trading, including adaptive and informed agents, many of which have been studied through agent-based models. This paper investigates an alternative explanation by examining whether, and under what circumstances, interactions between traders placing limit order book messages can reproduce stylized facts, and what forms of interaction are required. While the agent-based modeling literature has introduced interconnected agents on networks, little attention has been paid to whether specific trading network topologies can generate stylized facts in limit order book markets. In our model, agents are strictly zero-intelligence, with no fundamental knowledge or chartist-like strategies, so that the role of network topology can be isolated. We find that scale-free connectivity between agents reproduces stylized facts observed in markets, whereas no-interaction does not. Our experiments show that regular lattices and Erdős-Rényi networks are not significantly different from the no-interaction baseline. Thus, we provide a completely new, potentially complementary, explanation for the emergence of stylized facts. ...

August 28, 2025 · 2 min · Research Team