Trade execution games in a Markovian environment

ArXiv ID: 2405.07184. Authors: Unknown. Abstract: This paper examines a trade execution game for two large traders in a generalized price impact model. We incorporate a stochastic and sequentially dependent factor that exogenously affects the market price into financial markets. Our model accounts for how strategic and environmental uncertainties affect the large traders’ execution strategies. We formulate an expected utility maximization problem for two large traders as a Markov game model. Applying the backward induction method of dynamic programming, we provide an explicit closed-form execution strategy at a Markov perfect equilibrium. Our theoretical results reveal that the execution strategy generally lies in a dynamic and non-randomized class; it becomes deterministic if the Markovian environment is also deterministic. In addition, our simulation-based numerical experiments suggest that the execution strategy captures various features observed in financial markets. ...

May 12, 2024 · 2 min · Research Team

Reinforcement Learning in Agent-Based Market Simulation: Unveiling Realistic Stylized Facts and Behavior

ArXiv ID: 2403.19781. Authors: Unknown. Abstract: Investors and regulators can greatly benefit from a realistic market simulator that enables them to anticipate the consequences of their decisions in real markets. However, traditional rule-based market simulators often fall short in accurately capturing the dynamic behavior of market participants, particularly in response to external market impact events or changes in the behavior of other participants. In this study, we explore an agent-based simulation framework employing reinforcement learning (RL) agents. We present the implementation details of these RL agents and demonstrate that the simulated market exhibits realistic stylized facts observed in real-world markets. Furthermore, we investigate the behavior of RL agents when confronted with external market impacts, such as a flash crash. Our findings shed light on the effectiveness and adaptability of RL-based agents within the simulation, offering insights into their response to significant market events. ...

March 28, 2024 · 2 min · Research Team

Optimal Portfolio Choice with Cross-Impact Propagators

ArXiv ID: 2403.10273. Authors: Unknown. Abstract: We consider a class of optimal portfolio choice problems in continuous time where the agent’s transactions create both transient cross-impact driven by a matrix-valued Volterra propagator and temporary price impact. We formulate this problem as the maximization of a revenue-risk functional, where the agent also exploits available information on a progressively measurable price predicting signal. We solve the maximization problem explicitly in terms of operator resolvents, by reducing the corresponding first order condition to a coupled system of stochastic Fredholm equations of the second kind and deriving its solution. We then give sufficient conditions on the matrix-valued propagator so that the model does not permit price manipulation. We also provide an implementation of the solutions to the optimal portfolio choice problem and to the associated optimal execution problem. Our solutions yield financial insights on the influence of cross-impact on the optimal strategies and its interplay with alpha decays. ...

March 15, 2024 · 2 min · Research Team

Deep Hedging with Market Impact

ArXiv ID: 2402.13326. Authors: Unknown. Abstract: Dynamic hedging is the practice of periodically transacting financial instruments to offset the risk caused by an investment or a liability. Dynamic hedging optimization can be framed as a sequential decision problem; thus, Reinforcement Learning (RL) models were recently proposed to tackle this task. However, existing RL works for hedging do not consider market impact caused by the finite liquidity of traded instruments. Integrating such a feature can be crucial to achieve optimal performance when hedging options on stocks with limited liquidity. In this paper, we propose a novel general market impact dynamic hedging model based on Deep Reinforcement Learning (DRL) that considers several realistic features such as convex market impacts and impact persistence through time. The optimal policy obtained from the DRL model is analysed using several option hedging simulations and compared to commonly used procedures such as delta hedging. Results show our DRL model behaves better in contexts of low liquidity by, among others: 1) learning the extent to which portfolio rebalancing actions should be dampened or delayed to avoid high costs, 2) factoring in the impact of features not considered by conventional approaches, such as previous hedging errors through the portfolio value, and the underlying asset’s drift (i.e. the magnitude of its expected return). ...

February 20, 2024 · 2 min · Research Team

Reinforcement Learning for Optimal Execution when Liquidity is Time-Varying

ArXiv ID: 2402.12049. Authors: Unknown. Abstract: Optimal execution is an important problem faced by any trader. Most solutions are based on the assumption of constant market impact, while liquidity is known to be dynamic. Moreover, models with time-varying liquidity typically assume that it is observable, despite the fact that, in reality, it is latent and hard to measure in real time. In this paper we show that the use of Double Deep Q-learning, a form of Reinforcement Learning based on neural networks, is able to learn optimal trading policies when liquidity is time-varying. Specifically, we consider an Almgren-Chriss framework with temporary and permanent impact parameters following several deterministic and stochastic dynamics. Using extensive numerical experiments, we show that the trained algorithm learns the optimal policy when the analytical solution is available, and overcomes benchmarks and approximated solutions when the solution is not available. ...

February 19, 2024 · 2 min · Research Team
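
The abstract above benchmarks the learned policy against the analytical Almgren-Chriss solution. As a rough illustration, and not the paper's code, the sketch below computes the textbook constant-impact liquidation schedule in its continuous-time form, x(t) = X sinh(κ(T − t)) / sinh(κT) with κ = sqrt(λσ²/η). The function name, the parameter names, and the numeric values are all assumptions chosen for illustration.

```python
import math


def almgren_chriss_holdings(total_shares, horizon, n_steps,
                            risk_aversion, volatility, temp_impact):
    """Remaining holdings at t_k = k * horizon / n_steps, k = 0..n_steps.

    Continuous-time approximation with constant (hypothetical) parameters:
    risk_aversion = lambda, volatility = sigma, temp_impact = eta.
    """
    if temp_impact <= 0 or horizon <= 0 or n_steps <= 0:
        raise ValueError("temp_impact, horizon and n_steps must be positive")
    kappa = math.sqrt(risk_aversion * volatility ** 2 / temp_impact)
    times = [k * horizon / n_steps for k in range(n_steps + 1)]
    if kappa * horizon < 1e-12:
        # Risk-neutral limit: the schedule degenerates to a straight line.
        return [total_shares * (1 - t / horizon) for t in times]
    denom = math.sinh(kappa * horizon)
    return [total_shares * math.sinh(kappa * (horizon - t)) / denom
            for t in times]


# Illustrative values only; units are not calibrated to any market.
schedule = almgren_chriss_holdings(1000.0, 1.0, 5, 2e-6, 0.95, 2.5e-6)
for k, remaining in enumerate(schedule):
    print(k, round(remaining, 2))
```

With positive risk aversion the schedule sells faster early on than the straight-line schedule; setting `risk_aversion` to zero recovers the straight line.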

Limit Order Book Dynamics and Order Size Modelling Using Compound Hawkes Process

ArXiv ID: 2312.08927. Authors: Unknown. Abstract: The Hawkes process has been used to model Limit Order Book (LOB) dynamics in several ways in the literature; however, the focus has been limited to capturing the inter-event times, while the order size is usually assumed to be constant. We propose a novel methodology of using a Compound Hawkes Process for the LOB where each event has an order size sampled from a calibrated distribution. The process is formulated in a novel way such that the spread of the process always remains positive. Further, we condition the model parameters on time of day to support empirical observations. We make use of an enhanced non-parametric method to calibrate the Hawkes kernels and allow for inhibitory cross-excitation kernels. We showcase the results and quality of fits for an equity stock’s LOB in the NASDAQ exchange and compare them against several baselines. Finally, we conduct a market impact study of the simulator and show that the empirical observation of a concave market impact function is indeed replicated. ...

December 14, 2023 · 2 min · Research Team

The two square root laws of market impact and the role of sophisticated market participants

ArXiv ID: 2311.18283. Authors: Unknown. Abstract: The goal of this paper is to disentangle the roles of volume and of participation rate in the price response of the market to a sequence of transactions. To do so, we are inspired by the methodology introduced in arXiv:1402.1288 and arXiv:1805.07134, where price dynamics are derived from order flow dynamics using no arbitrage assumptions. We extend this approach by taking into account a sophisticated market participant having superior abilities to analyse market dynamics. Our results lead to the recovery of two square root laws: (i) For a given participation rate, during the execution of a metaorder, the market impact evolves in a square root manner with respect to the cumulated traded volume. (ii) For a given executed volume $Q$, the market impact is proportional to $\sqrt{\gamma}$, where $\gamma$ denotes the participation rate, for $\gamma$ large enough. Smaller participation rates induce a more linear dependence of the market impact in the participation rate. ...

November 30, 2023 · 2 min · Research Team
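
To make law (i) from the abstract above concrete, here is a minimal sketch using the common empirical parameterization impact ≈ Y · σ · sqrt(q / V). This functional form, the coefficient `y_coeff`, and all numbers are assumptions for illustration; they are not taken from the paper, whose derivation rests on order flow dynamics and no-arbitrage arguments.

```python
import math


def sqrt_impact(traded_volume, daily_volume, daily_volatility, y_coeff=1.0):
    """Square-root impact for cumulated traded volume (hypothetical form).

    traded_volume and daily_volume share the same unit (e.g. shares);
    the result is in the same unit as daily_volatility.
    """
    if traded_volume < 0 or daily_volume <= 0:
        raise ValueError("volumes must be non-negative / positive")
    return y_coeff * daily_volatility * math.sqrt(traded_volume / daily_volume)


# Quadrupling the cumulated volume doubles the impact under this form.
small = sqrt_impact(10_000, 1_000_000, 0.02)
large = sqrt_impact(40_000, 1_000_000, 0.02)
print(large / small)
```

The concavity is the point: each additional unit of volume moves the price less than the previous one, in contrast to a linear impact model.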

Prime Match: A Privacy-Preserving Inventory Matching System

ArXiv ID: 2310.09621. Authors: Unknown. Abstract: Inventory matching is a standard mechanism/auction for trading financial stocks by which buyers and sellers can be paired. In the financial world, banks often undertake the task of finding such matches between their clients. The related stocks can be traded without adversely impacting the market price for either client. If matches between clients are found, the bank can offer the trade at advantageous rates. If no match is found, the parties have to buy or sell the stock in the public market, which introduces additional costs. A problem with the process as it is presently conducted is that the involved parties must share their order to buy or sell a particular stock, along with the intended quantity (number of shares), with the bank. Clients worry that if this information were to leak somehow, then other market participants would become aware of their intentions and thus cause the price to move adversely against them before their transaction finalizes. We provide a solution, Prime Match, that enables clients to match their orders efficiently with reduced market impact while maintaining privacy. In the case where there are no matches, no information is revealed. Our main cryptographic innovation is a two-round secure linear comparison protocol for computing the minimum between two quantities without preprocessing and with malicious security, which can be of independent interest. We report benchmarks of our Prime Match system, which runs in production and is adopted by J.P. Morgan. The system is designed utilizing a star topology network, which provides clients with a centralized node (the bank) as an alternative to the idealized assumption of point-to-point connections, which would be impractical and undesired for the clients to implement in reality. Prime Match is the first secure multiparty computation solution running live in the traditional financial world. ...

October 14, 2023 · 3 min · Research Team

Evaluation of Deep Reinforcement Learning Algorithms for Portfolio Optimisation

ArXiv ID: 2307.07694. Authors: Unknown. Abstract: We evaluate benchmark deep reinforcement learning algorithms on the task of portfolio optimisation using simulated data. The simulator to generate the data is based on correlated geometric Brownian motion with the Bertsimas-Lo market impact model. Using the Kelly criterion (log utility) as the objective, we can analytically derive the optimal policy without market impact as an upper bound to measure performance when including market impact. We find that the off-policy algorithms DDPG, TD3 and SAC are unable to learn the right $Q$-function due to the noisy rewards and therefore perform poorly. The on-policy algorithms PPO and A2C, with the use of generalised advantage estimation, are able to deal with the noise and derive a close to optimal policy. The clipping variant of PPO was found to be important in preventing the policy from deviating from the optimal once converged. In a more challenging environment where we have regime changes in the GBM parameters, we find that PPO, combined with a hidden Markov model to learn and predict the regime context, is able to learn different policies adapted to each regime. Overall, we find that the sample complexity of these algorithms is too high for applications using real data, requiring more than 2m steps to learn a good policy in the simplest setting, which is equivalent to almost 8,000 years of daily prices. ...

July 15, 2023 · 2 min · Research Team
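
The abstract above uses the analytically optimal log-utility policy without market impact as an upper bound. A minimal sketch of that kind of benchmark, assuming continuous rebalancing, two correlated GBM assets, and a risk-free rate, is the standard result w* = Σ⁻¹(μ − r). The paper's exact setup may differ; the function name and all inputs here are hypothetical.

```python
def kelly_weights_two_assets(mu, cov, risk_free=0.0):
    """Log-utility (Kelly) portfolio fractions for two GBM assets.

    mu  : (mu1, mu2) annualized drifts
    cov : ((s11, s12), (s21, s22)) annualized covariance matrix
    Solves cov @ w = mu - risk_free with an explicit 2x2 inverse.
    """
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    e1, e2 = mu[0] - risk_free, mu[1] - risk_free
    w1 = (d * e1 - b * e2) / det
    w2 = (-c * e1 + a * e2) / det
    return w1, w2


# Illustrative inputs: uncorrelated assets, so each weight is
# (excess drift) / variance for that asset.
weights = kelly_weights_two_assets((0.08, 0.05),
                                   ((0.04, 0.0), (0.0, 0.01)),
                                   risk_free=0.02)
print(weights)
```

Weights above one imply leverage; in the presence of market impact the achievable growth rate falls below this frictionless benchmark, which is why the abstract treats it as an upper bound.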

Online Learning of Order Flow and Market Impact with Bayesian Change-Point Detection Methods

ArXiv ID: 2307.02375. Authors: Unknown. Abstract: Financial order flow exhibits a remarkable level of persistence, wherein buy (sell) trades are often followed by subsequent buy (sell) trades over extended periods. This persistence can be attributed to the division and gradual execution of large orders. Consequently, distinct order flow regimes might emerge, which can be identified through suitable time series models applied to market data. In this paper, we propose the use of Bayesian online change-point detection (BOCPD) methods to identify regime shifts in real-time and enable online predictions of order flow and market impact. To enhance the effectiveness of our approach, we have developed a novel BOCPD method using a score-driven approach. This method accommodates temporal correlations and time-varying parameters within each regime. Through empirical application to NASDAQ data, we have found that: (i) Our newly proposed model demonstrates superior out-of-sample predictive performance compared to existing models that assume i.i.d. behavior within each regime; (ii) When examining the residuals, our model demonstrates good specification in terms of both distributional assumptions and temporal correlations; (iii) Within a given regime, the price dynamics exhibit a concave relationship with respect to time and volume, mirroring the characteristics of actual large orders; (iv) By incorporating regime information, our model produces more accurate online predictions of order flow and market impact compared to models that do not consider regimes. ...

July 5, 2023 · 2 min · Research Team