Exploratory Mean-Variance with Jumps: An Equilibrium Approach

Exploratory Mean-Variance with Jumps: An Equilibrium Approach ArXiv ID: 2512.09224 “View on arXiv” Authors: Yuling Max Chen, Bin Li, David Saunders Abstract Revisiting the continuous-time Mean-Variance (MV) Portfolio Optimization problem, we model the market dynamics with a jump-diffusion process and apply Reinforcement Learning (RL) techniques to facilitate informed exploration within the control space. We recognize the time-inconsistency of the MV problem and adopt the time-inconsistent control (TIC) approach to analytically solve for an exploratory equilibrium investment policy, which is a Gaussian distribution centered on the equilibrium control of the classical MV problem. Our approach accounts for time-inconsistent preferences and actions, and our equilibrium policy is the best option an investor can take at any given time during the investment period. Moreover, we leverage the martingale properties of the equilibrium policy, design an RL model, and propose an Actor-Critic RL algorithm. All of our RL model parameters converge to the corresponding true values in a simulation study. Our numerical study on 24 years of real market data shows that the proposed RL model is profitable in 13 out of 14 tests, demonstrating its practical applicability in real-world investment. ...

December 10, 2025 · 2 min · Research Team

Equilibrium Portfolio Selection under Utility-Variance Analysis of Log Returns in Incomplete Markets

Equilibrium Portfolio Selection under Utility-Variance Analysis of Log Returns in Incomplete Markets ArXiv ID: 2511.05861 “View on arXiv” Authors: Yue Cao, Zongxia Liang, Sheng Wang, Xiang Yu Abstract This paper investigates a time-inconsistent portfolio selection problem in the incomplete market model, integrating expected utility maximization with risk control. The objective functional balances the expected utility and variance on log returns, giving rise to time inconsistency and motivating the search for a time-consistent equilibrium strategy. We characterize the equilibrium via a coupled quadratic backward stochastic differential equation (BSDE) system and establish the existence theory in two special cases: (i) the two Brownian motions driving the price dynamics and the factor process are independent with $ρ = 0$; (ii) the trading strategy is constrained to be bounded. For the general case with correlation coefficient $ρ \neq 0$, we introduce the notion of an approximate time-consistent equilibrium. Employing the solution structure from the equilibrium in the case $ρ = 0$, we can construct an approximate time-consistent equilibrium in the general case with an error of order $O(ρ^2)$. Numerical examples and financial insights are also presented based on deep learning algorithms. ...

November 8, 2025 · 2 min · Research Team

Time-consistent portfolio selection with strictly monotone mean-variance preference

Time-consistent portfolio selection with strictly monotone mean-variance preference ArXiv ID: 2502.11052 “View on arXiv” Authors: Unknown Abstract This paper is devoted to time-consistent control problems of portfolio selection with strictly monotone mean-variance preferences. These preferences are variational modifications of the conventional mean-variance preferences, and remain time-inconsistent as in mean-variance optimization problems. To tackle the time-inconsistency, we study the Nash equilibrium controls of both the open-loop type and the closed-loop type, and characterize them within a random parameter setting. The problem is reduced to solving a flow of forward-backward stochastic differential equations for open-loop equilibria, and to solving extended Hamilton-Jacobi-Bellman equations for closed-loop equilibria. In particular, we derive semi-closed-form solutions for these two types of equilibria under a deterministic parameter setting. Both solutions are represented by the same function, which is independent of wealth state and random path. This function can be expressed as the conventional time-consistent mean-variance portfolio strategy multiplied by a factor greater than one. Furthermore, we find that the state-independent closed-loop Nash equilibrium control is a strong equilibrium strategy in a constant parameter setting only when the interest rate is sufficiently large. ...

February 16, 2025 · 2 min · Research Team

Periodic portfolio selection with quasi-hyperbolic discounting

Periodic portfolio selection with quasi-hyperbolic discounting ArXiv ID: 2410.18240 “View on arXiv” Authors: Unknown Abstract We introduce an infinite-horizon, continuous-time portfolio selection problem faced by an agent with periodic S-shaped preference and present bias. The inclusion of a quasi-hyperbolic discount function leads to time-inconsistency and we characterize the optimal portfolio for a pre-committing, naive and sophisticated agent respectively. In the more theoretically challenging problem with a sophisticated agent, the time-consistent planning strategy can be formulated as an equilibrium to a static mean field game. Interestingly, present bias and naivety do not necessarily result in less desirable risk-taking behaviors, while agent’s sophistication may lead to excessive leverage (underinvestment) in the bad (good) states of the world. ...

October 23, 2024 · 2 min · Research Team

Inference of Utilities and Time Preference in Sequential Decision-Making

Inference of Utilities and Time Preference in Sequential Decision-Making ArXiv ID: 2405.15975 “View on arXiv” Authors: Unknown Abstract This paper introduces a novel stochastic control framework to enhance the capabilities of automated investment managers, or robo-advisors, by accurately inferring clients’ investment preferences from past activities. Our approach leverages a continuous-time model that incorporates utility functions and a generic discounting scheme of a time-varying rate, tailored to each client’s risk tolerance, valuation of daily consumption, and significant life goals. We address the resulting time inconsistency issue through state augmentation and the establishment of the dynamic programming principle and the verification theorem. Additionally, we provide sufficient conditions for the identifiability of client investment preferences. To complement our theoretical developments, we propose a learning algorithm based on maximum likelihood estimation within a discrete-time Markov Decision Process framework, augmented with entropy regularization. We prove that the log-likelihood function is locally concave, facilitating the fast convergence of our proposed algorithm. Practical effectiveness and efficiency are showcased through two numerical examples, including Merton’s problem and an investment problem with unhedgeable risks. Our proposed framework not only advances financial technology by improving personalized investment advice but also contributes broadly to other fields such as healthcare, economics, and artificial intelligence, where understanding individual preferences is crucial. ...

May 24, 2024 · 2 min · Research Team

Dynamic portfolio selection under generalized disappointment aversion

Dynamic portfolio selection under generalized disappointment aversion ArXiv ID: 2401.08323 “View on arXiv” Authors: Unknown Abstract This paper addresses the continuous-time portfolio selection problem under generalized disappointment aversion (GDA). The implicit definition of the certainty equivalent within GDA preferences introduces time inconsistency to this problem. We provide the sufficient and necessary condition for a strategy to be an equilibrium by a fully nonlinear integral equation. Investigating the existence and uniqueness of the solution to the integral equation, we establish the existence and uniqueness of the equilibrium. Our findings indicate that under disappointment aversion preferences, non-participation in the stock market is the unique equilibrium. The semi-analytical equilibrium strategies obtained under the constant relative risk aversion utility functions reveal that, under GDA preferences, the investment proportion in the stock market consistently remains smaller than the investment proportion under classical expected utility theory. The numerical analysis shows that the equilibrium strategy’s monotonicity concerning the two parameters of GDA preference aligns with the monotonicity of the degree of risk aversion. ...

January 16, 2024 · 2 min · Research Team

Portfolio Time Consistency and Utility Weighted Discount Rates

Portfolio Time Consistency and Utility Weighted Discount Rates ArXiv ID: 2402.05113 “View on arXiv” Authors: Unknown Abstract The Merton portfolio management problem is studied in this paper within a stochastic volatility, non-constant time discount rate, and power utility framework. This problem is time inconsistent and the way out of this predicament is to consider the subgame perfect strategies. The latter are characterized through an extended Hamilton-Jacobi-Bellman (HJB) equation. A fixed point iteration is employed to solve the extended HJB equation. This is done in a two-stage approach: in a first step the utility weighted discount rate is introduced and characterized as the fixed point of a certain operator; in the second step the value function is determined through a linear parabolic partial differential equation. Numerical experiments explore the effect of the time discount rate on the subgame perfect and precommitment strategies. ...

November 27, 2023 · 2 min · Research Team

Dynamic portfolio selection for nonlinear law-dependent preferences

Dynamic portfolio selection for nonlinear law-dependent preferences ArXiv ID: 2311.06745 “View on arXiv” Authors: Unknown Abstract This paper addresses the portfolio selection problem for nonlinear law-dependent preferences in continuous time, which inherently exhibit time inconsistency. Employing the method of stochastic maximum principle, we establish verification theorems for equilibrium strategies, accommodating both random market coefficients and incomplete markets. We derive the first-order condition (FOC) for the equilibrium strategies, using a notion of functional derivatives with respect to probability distributions. Then, with the help of the FOC we obtain the equilibrium strategies in closed form for two classes of implicitly defined preferences: CRRA and CARA betweenness preferences, with deterministic market coefficients. Finally, to show applications of our theoretical results to problems with random market coefficients, we examine the weighted utility. We reveal that the equilibrium strategy can be described by a coupled system of Quadratic Backward Stochastic Differential Equations (QBSDEs). The well-posedness of this system is generally open but is established under the special structures of our problem. ...

November 12, 2023 · 2 min · Research Team

D-TIPO: Deep time-inconsistent portfolio optimization with stocks and options

D-TIPO: Deep time-inconsistent portfolio optimization with stocks and options ArXiv ID: 2308.10556 “View on arXiv” Authors: Unknown Abstract In this paper, we propose a machine learning algorithm for time-inconsistent portfolio optimization. The proposed algorithm builds upon neural-network-based trading schemes, in which the asset allocation at each time point is determined by a neural network. The loss function is given by an empirical version of the objective function of the portfolio optimization problem. Moreover, various trading constraints are naturally fulfilled by choosing appropriate activation functions in the output layers of the neural networks. Besides this, our main contribution is to add options to the portfolio of risky assets and a risk-free bond and to use additional neural networks to determine the amount allocated into the options as well as their strike prices. We consider objective functions more in line with the rational preference of an investor than the classical mean-variance, apply realistic trading constraints and model the assets with a correlated jump-diffusion SDE. With an incomplete market and a more involved objective function, we show that it is beneficial to add options to the portfolio. Moreover, it is shown that adding options leads to a more constant stock allocation with less demand for drastic re-allocations. ...

August 21, 2023 · 2 min · Research Team