
Exploratory Randomization for Discrete-Time Linear Exponential Quadratic Gaussian (LEQG) Problem

ArXiv ID: 2501.06275
Authors: Unknown

Abstract: We investigate exploratory randomization for an extended linear-exponential-quadratic-Gaussian (LEQG) control problem in discrete time. This extended control problem is related to the structure of risk-sensitive investment management applications. We introduce exploration through a randomization of the control. Next, we apply the duality between free energy and relative entropy to reduce the LEQG problem to an equivalent risk-neutral LQG control problem with an entropy regularization term, see, e.g., Dai Pra et al. (1996), for which we present a solution approach based on Dynamic Programming. Our approach, based on the energy-entropy duality, may also be seen as justifying the use, in the literature, of an entropy regularization when applying a randomized control. ...
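
The reduction described in the abstract rests on the classical duality between free energy and relative entropy (the Donsker–Varadhan variational formula). A minimal statement of that identity is sketched below; the risk-sensitivity parameter θ and the notation are our own illustration and are not taken from the paper.

```latex
% Free energy--relative entropy duality (Donsker--Varadhan variational formula).
% Applied to the exponential-of-quadratic LEQG criterion, it turns the
% risk-sensitive cost into an ordinary (risk-neutral) quadratic cost plus a
% KL-divergence (entropy) regularization term over randomized controls.
\[
  \frac{1}{\theta}\,\log \mathbb{E}_{P}\!\left[e^{\theta X}\right]
  \;=\;
  \sup_{Q \ll P}\left\{ \mathbb{E}_{Q}[X] \;-\; \frac{1}{\theta}\, D_{\mathrm{KL}}\!\left(Q \,\|\, P\right) \right\},
  \qquad \theta > 0 .
\]
```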

January 10, 2025 · 2 min · Research Team

Continuous-time Risk-sensitive Reinforcement Learning via Quadratic Variation Penalty

ArXiv ID: 2404.12598
Authors: Unknown

Abstract: This paper studies continuous-time risk-sensitive reinforcement learning (RL) under the entropy-regularized, exploratory diffusion process formulation with the exponential-form objective. The risk-sensitive objective arises either from the agent’s risk attitude or from a distributionally robust approach against model uncertainty. Owing to the martingale perspective in Jia and Zhou (2023), the risk-sensitive RL problem is shown to be equivalent to ensuring the martingale property of a process involving both the value function and the q-function, augmented by an additional penalty term: the quadratic variation of the value process, which captures the variability of the value-to-go along the trajectory. This characterization allows existing RL algorithms developed for non-risk-sensitive settings to be adapted directly to incorporate risk sensitivity by adding the realized variance of the value process. Additionally, I highlight that the conventional policy gradient representation is inadequate for risk-sensitive problems due to the nonlinear nature of quadratic variation; however, q-learning offers a solution and extends to infinite-horizon settings. Finally, I prove the convergence of the proposed algorithm for Merton’s investment problem and quantify the impact of the temperature parameter on the behavior of the learning procedure. I also conduct simulation experiments to demonstrate how risk-sensitive RL improves finite-sample performance in the linear-quadratic control problem. ...
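
As a rough illustration of "adding the realized variance of the value process" to an otherwise standard martingale/TD-style criterion, the sketch below penalizes the discrete quadratic variation of the value process along a sampled trajectory. All names, the time discretization, and the weighting are assumptions for illustration, not the paper's algorithm.

```python
# Hypothetical sketch (not the paper's algorithm): a martingale/TD-style
# trajectory loss augmented with the realized quadratic variation of the
# value process, which is how risk sensitivity enters per the abstract.
import numpy as np

def risk_sensitive_trajectory_loss(values, rewards, entropies, dt,
                                   gamma_risk=1.0, temperature=0.1):
    """values: V(t_k, X_k) along one trajectory, shape (N+1,)
    rewards: running rewards r_k, shape (N,)
    entropies: policy entropy at each step (exploration bonus), shape (N,)
    gamma_risk: risk-sensitivity coefficient; temperature: entropy weight."""
    dV = np.diff(values)                                  # value increments
    # Martingale residuals of the entropy-regularized value process.
    residuals = dV + (rewards + temperature * entropies) * dt
    # Realized quadratic variation of the value process (the risk penalty).
    qv_penalty = 0.5 * gamma_risk * np.sum(dV ** 2)
    return np.sum(residuals ** 2) + qv_penalty
```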

April 19, 2024 · 2 min · Research Team

A Backtesting Protocol in the Era of Machine Learning

SSRN ID: ssrn-3275654
Authors: Unknown

Abstract: Machine learning offers a set of powerful tools that holds considerable promise for investment management. As with most quantitative applications in finance, th ...

Keywords: Machine Learning, Investment Management, Quantitative Finance, Asset Pricing, Algorithmic Trading

Complexity vs Empirical Score
Math Complexity: 3.0/10
Empirical Rigor: 7.0/10
Quadrant: Street Traders
Why: The paper focuses on a research protocol for backtesting and data mining, with moderate empirical rigor involving practical concerns such as overfitting and data scarcity, but it lacks advanced mathematical derivations, centering instead on statistical concepts and real-world data challenges.

```mermaid
flowchart TD
    A["Research Goal: Develop robust backtesting protocol for ML in finance"] --> B["Data: Cross-sectional stock data & fundamental features"]
    B --> C["Methodology: ML pipelines with walk-forward validation"]
    C --> D["Computation: Model training, hyperparameter tuning, & signal generation"]
    D --> E["Risk Controls: Transaction costs, liquidity constraints, & overfitting tests"]
    E --> F["Key Outcomes: Generalizable signals & realistic performance metrics"]
    F --> G["Implication: ML requires rigorous validation to avoid false discoveries"]
```
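
The "Methodology" step in the flowchart above centers on walk-forward validation. The snippet below is a minimal sketch of an expanding-window walk-forward evaluation on synthetic data; the Ridge model, feature counts, and scoring are illustrative assumptions, not the paper's protocol.

```python
# Minimal walk-forward (expanding-window) validation sketch for a return-
# prediction signal; placeholder data and model, not the paper's protocol.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import TimeSeriesSplit

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))            # e.g., fundamental features per period
y = X @ rng.normal(size=10) * 0.01 + rng.normal(scale=0.05, size=500)  # returns

oos_scores = []
for train_idx, test_idx in TimeSeriesSplit(n_splits=5).split(X):
    model = Ridge(alpha=1.0).fit(X[train_idx], y[train_idx])
    # Evaluate only on data strictly after the training window (no look-ahead).
    oos_scores.append(model.score(X[test_idx], y[test_idx]))

print("out-of-sample R^2 per fold:", np.round(oos_scores, 3))
```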

November 13, 2018 · 1 min · Research Team