
Deep reinforcement learning for optimal trading with partial information

ArXiv ID: 2511.00190. Authors: Andrea Macrì, Sebastian Jaimungal, Fabrizio Lillo. Abstract: Reinforcement Learning (RL) applied to financial problems has been the subject of a lively area of research. The use of RL for optimal trading strategies that exploit latent information in the market is, to the best of our knowledge, not widely tackled. In this paper we study an optimal trading problem, where a trading signal follows an Ornstein-Uhlenbeck process with regime-switching dynamics. We employ a blend of RL and Recurrent Neural Networks (RNN) in order to make the most at extracting underlying information from the trading signal with latent parameters. The latent parameters driving mean reversion, speed, and volatility are filtered from observations of the signal, and trading strategies are derived via RL. To address this problem, we propose three Deep Deterministic Policy Gradient (DDPG)-based algorithms that integrate Gated Recurrent Unit (GRU) networks to capture temporal dependencies in the signal. The first, a one-step approach (hid-DDPG), directly encodes hidden states from the GRU into the RL trader. The second and third are two-step methods: one (prob-DDPG) makes use of posterior regime probability estimates, while the other (reg-DDPG) relies on forecasts of the next signal value. Through extensive simulations with increasingly complex Markovian regime dynamics for the trading signal’s parameters, as well as an empirical application to equity pair trading, we find that prob-DDPG achieves superior cumulative rewards and exhibits more interpretable strategies. By contrast, reg-DDPG provides limited benefits, while hid-DDPG offers intermediate performance with less interpretable strategies. Our results show that the quality and structure of the information supplied to the agent are crucial: embedding probabilistic insights into latent regimes substantially improves both profitability and robustness of reinforcement learning-based trading strategies. ...

October 31, 2025 · 3 min · Research Team
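To make the setting of the entry above concrete, here is a minimal sketch of an Ornstein-Uhlenbeck signal whose long-run mean switches between regimes according to a Markov chain. It is not the authors' simulator: the function name, the parameter values, and the simplification that only the mean level switches (the abstract also mentions latent speed and volatility) are all assumptions made for illustration.

```python
import math
import random


def simulate_regime_switching_ou(n_steps, dt, kappa, sigma, thetas, p_stay,
                                 x0=0.0, seed=0):
    """Simulate an OU path whose mean level follows a hidden Markov chain.

    thetas : mean-reversion levels, one per regime (illustrative values).
    p_stay : probability of staying in the current regime at each step;
             otherwise the chain jumps uniformly to another regime
             (an assumed transition rule; needs at least two regimes).
    """
    rng = random.Random(seed)
    # Exact one-step transition of an OU process over a step of length dt.
    decay = math.exp(-kappa * dt)
    noise_sd = sigma * math.sqrt((1.0 - decay ** 2) / (2.0 * kappa))

    regime, x = 0, x0
    path, regimes = [x], [regime]
    for _ in range(n_steps):
        if rng.random() > p_stay:
            others = [r for r in range(len(thetas)) if r != regime]
            regime = rng.choice(others)
        theta = thetas[regime]
        x = theta + (x - theta) * decay + noise_sd * rng.gauss(0.0, 1.0)
        path.append(x)
        regimes.append(regime)
    return path, regimes


if __name__ == "__main__":
    path, regimes = simulate_regime_switching_ou(
        n_steps=1000, dt=0.01, kappa=5.0, sigma=0.3,
        thetas=[1.0, -1.0], p_stay=0.995)
    print(path[-1], regimes[-1])
```

A trading agent would observe only `path`; `regimes` is the latent state that a filter, such as the GRU-based classifiers the abstract describes, would have to infer.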

Exponential Hedging for the Ornstein-Uhlenbeck Process in the Presence of Linear Price Impact

ArXiv ID: 2509.25472. Authors: Yan Dolinsky. Abstract: In this work we study a continuous time exponential utility maximization problem in the presence of a linear temporary price impact. More precisely, for the case where the risky asset is given by the Ornstein-Uhlenbeck diffusion process we compute the optimal portfolio strategy and the corresponding value. Our method of solution relies on duality, and it is purely probabilistic. ...

September 29, 2025 · 1 min · Research Team

A Stochastic Model for Illiquid Stock Prices and its Conclusion about Correlation Measurement

ArXiv ID: 2509.10553. Authors: Erina Nanyonga, Juma Kasozi, Fred Mayambala, Hassan W. Kayondo, Matt Davison. Abstract: This study explores the behavioral dynamics of illiquid stock prices in a listed stock market. Illiquidity, characterized by wide bid and ask spreads, affects price formation by decoupling prices from standard risk and return relationships and increasing sensitivity to market sentiment. We model the prices at the Uganda Securities Exchange (USE), which is illiquid in that the prices remain constant much of the time, thus complicating price modelling. We circumvent this challenge by combining the Markov model (MM) with two models: the exponential Ornstein Uhlenbeck model (XOU) and geometric Brownian motion (gBm). In the combined models, the MM was used to capture the constant prices in the stock prices while the XOU and gBm captured the stochastic price dynamics. We modelled stock prices using the combined models, as well as XOU and gBm alone. We found that USE stocks appeared to have low correlation with one another. Using theoretical analysis, simulation study and empirical analysis, we conclude that this apparent low correlation is due to illiquidity. In particular, data simulated from combined MM-gBm, in which the gBm portions were highly correlated, resulted in a low measured correlation when the Markov chain had a higher transition from zero state to zero state. ...

September 9, 2025 · 3 min · Research Team

Stochastic Price Dynamics in Response to Order Flow Imbalance: Evidence from CSI 300 Index Futures

ArXiv ID: 2505.17388. Authors: Chen Hu, Kouxiao Zhang. Abstract: We conduct modeling of the price dynamics following order flow imbalance in market microstructure and apply the model to the analysis of Chinese CSI 300 Index Futures. There are three findings. The first is that the order flow imbalance is analogous to a shock to the market. Unlike the common practice of using Hawkes processes, we model the impact of order flow imbalance as an Ornstein-Uhlenbeck process with memory and mean-reverting characteristics driven by a jump-type Lévy process. Motivated by the empirically stable correlation between order flow imbalance and contemporaneous price changes, we propose a modified asset price model where the drift term of canonical geometric Brownian motion is replaced by an Ornstein-Uhlenbeck process. We establish stochastic differential equations and derive the logarithmic return process along with its mean and variance processes under initial boundary conditions, and evolution of cost-effectiveness ratio with order flow imbalance as the trading trigger point, termed as the quasi-Sharpe ratio or response ratio. Secondly, our results demonstrate horizon-dependent heterogeneity in how conventional metrics interact with order flow imbalance. This underscores the critical role of forecast horizon selection for strategies. Thirdly, we identify regime-dependent dynamics in the memory and forecasting power of order flow imbalance. This taxonomy provides both a screening protocol for existing indicators and an ex-ante evaluation paradigm for novel metrics. ...

May 23, 2025 · 2 min · Research Team

Ornstein-Uhlenbeck Process for Horse Race Betting: A Micro-Macro Analysis of Herding and Informed Bettors

ArXiv ID: 2503.16470. Authors: Unknown. Abstract: We model the time evolution of single win odds in Japanese horse racing as a stochastic process, deriving an Ornstein–Uhlenbeck process by analyzing the probability dynamics of vote shares and the empirical time series of odds movements. Our framework incorporates two types of bettors: herders, who adjust their bets based on current odds, and fundamentalists, who wager based on a horse’s true winning probability. Using data from 3450 Japan Racing Association races in 2008, we identify a microscopic probability rule governing individual bets and a mean-reverting macroscopic pattern in odds convergence. This structure parallels financial markets, where traders’ decisions are influenced by market fluctuations, and the interplay between herding and fundamentalist strategies shapes price dynamics. These results highlight the broader applicability of our approach to non-equilibrium financial and betting markets, where mean-reverting dynamics emerge from simple behavioral interactions. ...

March 1, 2025 · 2 min · Research Team

An Application of the Ornstein-Uhlenbeck Process to Pairs Trading

ArXiv ID: 2412.12458. Authors: Unknown. Abstract: We conduct a preliminary analysis of a pairs trading strategy using the Ornstein-Uhlenbeck (OU) process to model stock price spreads. We compare this approach to a naive pairs trading strategy that uses a rolling window to calculate mean and standard deviation parameters. Our findings suggest that the OU model captures signals and trends effectively but underperforms the naive model on a risk-return basis, likely due to non-stationary pairs and parameter tuning limitations. ...

December 17, 2024 · 2 min · Research Team
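The naive baseline mentioned in the entry above, a rolling window for the mean and standard deviation of the spread, can be sketched roughly as follows. The window length, the entry threshold of 2.0, and both function names are hypothetical choices for illustration; the paper's actual rules are not given in the excerpt.

```python
import statistics


def rolling_zscore(spread, window):
    """Z-score of each spread value against the preceding `window` values.

    Entries before a full window is available are None.
    """
    z = [None] * len(spread)
    for t in range(window, len(spread)):
        hist = spread[t - window:t]
        mu = statistics.fmean(hist)
        sd = statistics.pstdev(hist)
        z[t] = (spread[t] - mu) / sd if sd > 0 else 0.0
    return z


def positions(zscores, entry=2.0):
    """Map z-scores to positions: -1 short the spread, +1 long, 0 flat.

    `entry` is an assumed threshold, not a value taken from the paper.
    """
    out = []
    for z in zscores:
        if z is None:
            out.append(0)
        elif z > entry:
            out.append(-1)  # spread unusually wide: bet on reversion down
        elif z < -entry:
            out.append(1)   # spread unusually narrow: bet on reversion up
        else:
            out.append(0)
    return out


if __name__ == "__main__":
    spread = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 5.0]
    print(positions(rolling_zscore(spread, window=6)))
```

An OU-based variant would replace the rolling mean and standard deviation with the fitted long-run mean and stationary standard deviation of the spread.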

Estimation of bid-ask spreads in the presence of serial dependence

ArXiv ID: 2407.17401. Authors: Unknown. Abstract: Starting from a basic model in which the dynamic of the transaction prices is a geometric Brownian motion disrupted by a microstructure white noise, corresponding to the random alternation of bids and asks, we propose moment-based estimators along with their statistical properties. We then make the model more realistic by considering serial dependence: we assume a geometric fractional Brownian motion for the price, then an Ornstein-Uhlenbeck process for the microstructure noise. In these two cases of serial dependence, we propose again consistent and asymptotically normal estimators. All our estimators are compared on simulated data with existing approaches, such as Roll, Corwin-Schultz, Abdi-Ranaldo, or Ardia-Guidotti-Kroencke estimators. ...

July 24, 2024 · 2 min · Research Team
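Of the existing approaches named in the entry above, the Roll estimator is the simplest to write down: the spread is estimated as 2·sqrt(−cov(Δp_t, Δp_{t−1})). The sketch below implements that classical benchmark only, not the paper's new estimators. The simulator is an assumed toy setup (constant mid price, trades hitting bid or ask with equal probability), included just to exercise the function.

```python
import math
import random


def roll_spread(prices):
    """Roll estimator of the bid-ask spread from transaction prices.

    Returns None when the first-order autocovariance of price changes is
    non-negative, in which case the estimator is undefined.
    """
    dp = [b - a for a, b in zip(prices, prices[1:])]
    x, y = dp[1:], dp[:-1]          # consecutive pairs of price changes
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    if cov >= 0:
        return None
    return 2.0 * math.sqrt(-cov)


def simulate_roll_prices(n, mid, spread, seed=0):
    """Toy data: constant mid, each trade at bid or ask with probability 1/2."""
    rng = random.Random(seed)
    return [mid + 0.5 * spread * rng.choice((-1, 1)) for _ in range(n)]


if __name__ == "__main__":
    prices = simulate_roll_prices(20000, mid=100.0, spread=0.2, seed=1)
    print(roll_spread(prices))
```

Serial dependence in either the efficient price or the noise, which is the case the paper addresses, violates the assumptions behind this formula.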

Market Making in Spot Precious Metals

ArXiv ID: 2404.15478. Authors: Unknown. Abstract: The primary challenge of market making in spot precious metals is navigating the liquidity that is mainly provided by futures contracts. The Exchange for Physical (EFP) spread, which is the price difference between futures and spot, plays a pivotal role and exhibits multiple modes of relaxation corresponding to the diverse trading horizons of market participants. In this paper, we model the EFP spread using a nested Ornstein-Uhlenbeck process, in the spirit of the two-factor Hull-White model for interest rates. We demonstrate the suitability of the framework for maximizing the expected P&L of a market maker while minimizing inventory risk across both spot and futures. Using a computationally efficient technique to approximate the solution of the Hamilton-Jacobi-Bellman equation associated with the corresponding stochastic optimal control problem, our methodology facilitates strategy optimization on demand in near real-time, paving the way for advanced algorithmic market making that capitalizes on the co-integration properties intrinsic to the precious metals sector. ...

April 23, 2024 · 2 min · Research Team

A Comparison of Traditional and Deep Learning Methods for Parameter Estimation of the Ornstein-Uhlenbeck Process

ArXiv ID: 2404.11526. Authors: Unknown. Abstract: We consider the Ornstein-Uhlenbeck (OU) process, a stochastic process widely used in finance, physics, and biology. Parameter estimation of the OU process is a challenging problem. Thus, we review traditional tracking methods and compare them with novel applications of deep learning to estimate the parameters of the OU process. We use a multi-layer perceptron to estimate the parameters of the OU process and compare its performance with traditional parameter estimation methods, such as the Kalman filter and maximum likelihood estimation. We find that the multi-layer perceptron can accurately estimate the parameters of the OU process given a large dataset of observed trajectories and, on average, outperforms traditional parameter estimation methods. ...

April 17, 2024 · 2 min · Research Team
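As a point of reference for the traditional estimators mentioned in the entry above, an OU path sampled at a fixed step is an AR(1) process, so its parameters can be recovered from a least-squares regression of each observation on the previous one. Under Gaussian noise this matches conditional maximum likelihood up to the variance normalisation. The function name and return convention below are assumptions; this sketch does not reproduce the paper's Kalman-filter or neural-network estimators.

```python
import math


def fit_ou(path, dt):
    """Estimate (kappa, theta, sigma) of dX = kappa*(theta - X) dt + sigma dW.

    Uses the exact discretisation X[t+1] = a + b*X[t] + eps with
    b = exp(-kappa*dt), a = theta*(1 - b),
    Var(eps) = sigma^2 * (1 - b^2) / (2*kappa).
    """
    x, y = path[:-1], path[1:]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    if not 0.0 < b < 1.0:
        raise ValueError("slope outside (0, 1): path is not mean-reverting")
    kappa = -math.log(b) / dt
    theta = a / (1.0 - b)
    resid_var = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y)) / n
    sigma = math.sqrt(resid_var * 2.0 * kappa / (1.0 - b * b))
    return kappa, theta, sigma


if __name__ == "__main__":
    # Noise-free path relaxing from 0 toward 1 with kappa = 2, dt = 0.1.
    b_true = math.exp(-0.2)
    path = [0.0]
    for _ in range(20):
        path.append(1.0 + (path[-1] - 1.0) * b_true)
    print(fit_ou(path, dt=0.1))
```

On short or noisy paths the estimate of `kappa` is typically the least reliable of the three, which is one reason parameter estimation for the OU process is considered hard.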

Transformer for Times Series: an Application to the S&P500

ArXiv ID: 2403.02523. Authors: Unknown. Abstract: The transformer models have been extensively used with good results in a wide area of machine learning applications including Large Language Models and image generation. Here, we inquire on the applicability of this approach to financial time series. We first describe the dataset construction for two prototypical situations: a mean reverting synthetic Ornstein-Uhlenbeck process on one hand and real S&P500 data on the other hand. Then, we present in detail the proposed Transformer architecture and finally we discuss some encouraging results. For the synthetic data we predict rather accurately the next move, and for the S&P500 we get some interesting results related to quadratic variation and volatility prediction. ...

March 4, 2024 · 2 min · Research Team