
Unified continuous-time q-learning for mean-field game and mean-field control problems

ArXiv ID: 2407.04521 · Authors: Unknown · Abstract: This paper studies continuous-time q-learning in mean-field jump-diffusion models when the population distribution is not directly observable. We propose the integrated q-function in decoupled form (decoupled Iq-function) from the representative agent’s perspective and establish its martingale characterization, which provides a unified policy evaluation rule for both mean-field game (MFG) and mean-field control (MFC) problems. Moreover, we consider the learning procedure where the representative agent updates the population distribution based on his own state values. Depending on whether the task is to solve the MFG or the MFC problem, we can employ the decoupled Iq-function differently to characterize the mean-field equilibrium policy or the mean-field optimal policy, respectively. Based on these theoretical findings, we devise a unified q-learning algorithm for both MFG and MFC problems by utilizing test policies and the averaged martingale orthogonality condition. For several financial applications in the jump-diffusion setting, we obtain the exact parameterization of the decoupled Iq-functions and the value functions, and illustrate our q-learning algorithm with satisfactory performance. ...

July 5, 2024 · 2 min · Research Team

Unwinding Toxic Flow with Partial Information

ArXiv ID: 2407.04510 · Authors: Unknown · Abstract: We consider a central trading desk which aggregates the inflow of clients’ orders with unobserved toxicity, i.e. persistent adverse directionality. The desk chooses either to internalise the inflow or externalise it to the market in a cost-effective manner. In this model, externalising the order flow creates both price impact costs and an additional market feedback reaction for the inflow of trades. The desk’s objective is to maximise the daily trading P&L subject to end-of-day inventory penalization. We formulate this setting as a partially observable stochastic control problem and solve it in two steps. First, we derive the filtered dynamics of the inventory and toxicity, projected to the observed filtration, which turns the stochastic control problem into a fully observed problem. Then we use a variational approach in order to derive the unique optimal trading strategy. We illustrate our results for various scenarios in which the desk is facing momentum and mean-reverting toxicity. Our implementation shows that the P&L performance gap between the partially observable problem and the full information case is of order $0.01\%$ in all tested scenarios. ...

July 5, 2024 · 2 min · Research Team

Constrained mean-variance investment-reinsurance under the Cramér-Lundberg model with random coefficients

ArXiv ID: 2406.10465 · Authors: Unknown · Abstract: In this paper, we study an optimal mean-variance investment-reinsurance problem for an insurer (she) under a Cramér-Lundberg model with random coefficients. At any time, the insurer can purchase reinsurance or acquire new business and invest her surplus in a security market consisting of a risk-free asset and multiple risky assets, subject to a general convex cone investment constraint. We reduce the problem to a constrained stochastic linear-quadratic control problem with jumps whose solution is related to a system of partially coupled stochastic Riccati equations (SREs). Then we devote ourselves to establishing the existence and uniqueness of solutions to the SREs by pure backward stochastic differential equation (BSDE) techniques. We achieve this with the help of an approximation procedure, comparison theorems for BSDEs with jumps, log transformation and BMO martingales. The efficient investment-reinsurance strategy and efficient mean-variance frontier are explicitly given through the solutions of the SREs, which are shown to be a linear feedback form of the wealth process and a half-line, respectively. ...

June 15, 2024 · 2 min · Research Team

Constrained monotone mean–variance investment-reinsurance under the Cramér–Lundberg model with random coefficients

ArXiv ID: 2405.17841 · Authors: Unknown · Abstract: This paper studies an optimal investment-reinsurance problem for an insurer (she) under the Cramér–Lundberg model with the monotone mean–variance (MMV) criterion. At any time, the insurer can purchase reinsurance (or acquire new business) and invest in a security market consisting of a risk-free asset and multiple risky assets whose excess return rate and volatility rate are allowed to be random. The trading strategy is subject to a general convex cone constraint, encompassing the no-shorting constraint as a special case. The optimal investment-reinsurance strategy and optimal value for the MMV problem are deduced by solving certain backward stochastic differential equations with jumps. In the literature, it is known that models with the MMV criterion and the mean–variance criterion lead to the same optimal strategy and optimal value when the wealth process is continuous. Our result shows that the conclusion remains true even if the wealth process has compensated Poisson jumps and the market coefficients are random. ...

May 28, 2024 · 2 min · Research Team

Inference of Utilities and Time Preference in Sequential Decision-Making

ArXiv ID: 2405.15975 · Authors: Unknown · Abstract: This paper introduces a novel stochastic control framework to enhance the capabilities of automated investment managers, or robo-advisors, by accurately inferring clients’ investment preferences from past activities. Our approach leverages a continuous-time model that incorporates utility functions and a generic discounting scheme of a time-varying rate, tailored to each client’s risk tolerance, valuation of daily consumption, and significant life goals. We address the resulting time inconsistency issue through state augmentation and the establishment of the dynamic programming principle and the verification theorem. Additionally, we provide sufficient conditions for the identifiability of client investment preferences. To complement our theoretical developments, we propose a learning algorithm based on maximum likelihood estimation within a discrete-time Markov Decision Process framework, augmented with entropy regularization. We prove that the log-likelihood function is locally concave, facilitating the fast convergence of our proposed algorithm. Practical effectiveness and efficiency are showcased through two numerical examples, including Merton’s problem and an investment problem with unhedgeable risks. Our proposed framework not only advances financial technology by improving personalized investment advice but also contributes broadly to other fields such as healthcare, economics, and artificial intelligence, where understanding individual preferences is crucial. ...

May 24, 2024 · 2 min · Research Team

Trade execution games in a Markovian environment

ArXiv ID: 2405.07184 · Authors: Unknown · Abstract: This paper examines a trade execution game for two large traders in a generalized price impact model. We incorporate a stochastic and sequentially dependent factor that exogenously affects the market price into financial markets. Our model accounts for how strategic and environmental uncertainties affect the large traders’ execution strategies. We formulate an expected utility maximization problem for two large traders as a Markov game model. Applying the backward induction method of dynamic programming, we provide an explicit closed-form execution strategy at a Markov perfect equilibrium. Our theoretical results reveal that the execution strategy generally lies in a dynamic and non-randomized class; it becomes deterministic if the Markovian environment is also deterministic. In addition, our simulation-based numerical experiments suggest that the execution strategy captures various features observed in financial markets. ...

May 12, 2024 · 2 min · Research Team

On Risk-Sensitive Decision Making Under Uncertainty

ArXiv ID: 2404.13371 · Authors: Unknown · Abstract: This paper studies a risk-sensitive decision-making problem under uncertainty. It considers a decision-making process that unfolds over a fixed number of stages, in which a decision-maker chooses among multiple alternatives, some of which are deterministic and others are stochastic. The decision-maker’s cumulative value is updated at each stage, reflecting the outcomes of the chosen alternatives. After formulating this as a stochastic control problem, we delineate the necessary optimality conditions for it. Two illustrative examples from optimal betting and inventory management are provided to support our theory. ...

April 20, 2024 · 1 min · Research Team

The Life Care Annuity: enhancing product features and refining pricing methods

ArXiv ID: 2404.02858 · Authors: Unknown · Abstract: The state of the art proposes Life Care Annuities, which have recently been designed as variable annuity contracts with Long-Term Care payouts and Guaranteed Lifelong Withdrawal Benefits. In this paper, we propose more general features for these insurance products and refine their pricing methods. We name our proposed product “GLWB-LTC”. In particular, as to the product features, we allow dynamic withdrawal strategies, including the surrender option. Furthermore, we consider stochastic interest rates, described by a Cox-Ingersoll-Ross process. As to the numerical methods, we solve the stochastic control problem involved in the selection of the optimal withdrawal strategy through a robust tree method, which outperforms the Monte Carlo approach. We name this method “Tree-LTC”, and we use it to estimate the fair price of the product as some relevant parameters vary, such as the entry age of the policyholder. Furthermore, our numerical results show how the optimal withdrawal strategy varies over time with the health status of the policyholder. Our findings stress the important advantage of flexible withdrawal strategies in relation to insurance policies offering protection from health risks. Indeed, the policyholder is given more choice about how much to save for protection from the possible disability states at future times. ...

April 3, 2024 · 2 min · Research Team

Optimal Portfolio Choice with Cross-Impact Propagators

ArXiv ID: 2403.10273 · Authors: Unknown · Abstract: We consider a class of optimal portfolio choice problems in continuous time where the agent’s transactions create both transient cross-impact, driven by a matrix-valued Volterra propagator, and temporary price impact. We formulate this problem as the maximization of a revenue-risk functional, where the agent also exploits available information on a progressively measurable price predicting signal. We solve the maximization problem explicitly in terms of operator resolvents, by reducing the corresponding first order condition to a coupled system of stochastic Fredholm equations of the second kind and deriving its solution. We then give sufficient conditions on the matrix-valued propagator so that the model does not permit price manipulation. We also provide an implementation of the solutions to the optimal portfolio choice problem and to the associated optimal execution problem. Our solutions yield financial insights on the influence of cross-impact on the optimal strategies and its interplay with alpha decays. ...

March 15, 2024 · 2 min · Research Team

A Note on Optimal Liquidation with Linear Price Impact

ArXiv ID: 2402.14100 · Authors: Unknown · Abstract: In this note we consider the maximization of the expected terminal wealth in a setup with quadratic transaction costs. First, we provide a very simple probabilistic solution to the problem. Although the problem has been widely studied, to the best of our knowledge this simple probabilistic form of the solution has not appeared in the literature. Next, we apply the general result to a numerical study of the case where the risky asset is given by a fractional Brownian motion and the information flow of the investor can be diversified. ...

February 21, 2024 · 2 min · Research Team