Breaking the Dimensional Barrier: Dynamic Portfolio Choice with Parameter Uncertainty via Pontryagin Projection

ArXiv ID: 2601.03175 · View on arXiv
Authors: Jeonggyu Huh, Hyeng Keun Koo

Abstract: We study continuous-time portfolio choice in diffusion markets with parameter $\theta \in \Theta$ and uncertainty law $q(d\theta)$. Nature draws the latent $\theta \sim q$ at time 0; the investor cannot observe it and must deploy a single $\theta$-blind feedback policy that maximizes an ex-ante CRRA objective averaged over diffusion noise and $\theta$. Our methods access $q$ only through sampling and assume no parametric form. We extend Pontryagin-Guided Direct Policy Optimization (PG-DPO) by sampling $\theta$ inside the simulator and computing discrete-time gradients via backpropagation through time (BPTT), and we propose Projected PG-DPO (P-PGDPO), which projects costate estimates to satisfy the $q$-aggregated Pontryagin first-order condition, yielding a deployable rule. We prove a BPTT-PMP correspondence uniform on compacts and a residual-based $\theta$-blind policy-gap bound under local stability, with explicit discretization and Monte Carlo errors; experiments show projection-driven stability and accurate decision-time benchmark recovery in high dimensions. ...
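
The training loop the abstract describes is easy to sketch. Below is a minimal, illustrative PyTorch version of the $\theta$-sampled BPTT step: one risky asset with geometric Brownian wealth dynamics, $\theta = (\mu, \sigma)$ drawn from a toy uniform $q$, and CRRA utility. All names and hyperparameters here are assumptions for illustration, not the paper's reference implementation.

```python
# Sketch of theta-blind PG-DPO training under parameter uncertainty.
# Assumptions: one risky asset, GBM wealth, theta = (mu, sigma) ~ q (toy
# uniform law), CRRA utility. Illustrative only.
import torch
import torch.nn as nn

gamma = 3.0            # CRRA risk aversion (assumed)
T, n_steps = 1.0, 50   # horizon and time grid (assumed)
dt = T / n_steps

# Theta-blind feedback policy pi(t, log W): same rule for every sampled theta.
policy = nn.Sequential(nn.Linear(2, 64), nn.Tanh(), nn.Linear(64, 1))
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

def crra(w):
    return w.pow(1.0 - gamma) / (1.0 - gamma)

for it in range(2000):
    batch = 512
    # Nature draws the latent theta ~ q inside the simulator.
    mu = 0.05 + 0.05 * torch.rand(batch, 1)
    sigma = 0.15 + 0.10 * torch.rand(batch, 1)
    r = 0.01
    logw = torch.zeros(batch, 1)
    for k in range(n_steps):
        t = torch.full((batch, 1), k * dt)
        pi = policy(torch.cat([t, logw], dim=1))   # risky-asset weight
        z = torch.randn(batch, 1)
        # Euler step for log-wealth under the sampled theta.
        drift = r + pi * (mu - r) - 0.5 * pi.pow(2) * sigma.pow(2)
        logw = logw + drift * dt + pi * sigma * z * dt ** 0.5
    # Ex-ante objective: utility averaged over diffusion noise and theta.
    loss = -crra(logw.exp()).mean()
    opt.zero_grad()
    loss.backward()   # BPTT through the simulated paths
    opt.step()
```

Because $\theta$ is resampled every batch, the gradient estimates the $q$-averaged objective directly; no posterior over $\theta$ is ever formed, matching the abstract's sampling-only access to $q$.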

January 6, 2026 · 2 min · Research Team

Breaking the Dimensional Barrier: A Pontryagin-Guided Direct Policy Optimization for Continuous-Time Multi-Asset Portfolio Choice

ArXiv ID: 2504.11116 · View on arXiv
Authors: Unknown

Abstract: We introduce the Pontryagin-Guided Direct Policy Optimization (PG-DPO) framework for high-dimensional continuous-time portfolio choice. Our approach combines Pontryagin’s Maximum Principle (PMP) with backpropagation through time (BPTT) to directly inform neural network-based policy learning, enabling accurate recovery of both myopic and intertemporal hedging demands, an aspect often missed by existing methods. Building on this, we develop the Projected PG-DPO (P-PGDPO) variant, which achieves near-optimal policies with substantially improved efficiency. P-PGDPO leverages rapidly stabilizing costate estimates from BPTT and analytically projects them onto PMP’s first-order conditions, reducing training overhead while improving precision. Numerical experiments show that PG-DPO matches or exceeds the accuracy of Deep BSDE, while P-PGDPO delivers significantly higher precision and scalability. By explicitly incorporating time-to-maturity, our framework naturally applies to finite-horizon problems and captures horizon-dependent effects, with the long-horizon case emerging as a stationary special case. ...
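
The projection step admits a compact sketch. In the simplest multi-asset case with no extra state variables, maximizing the Hamiltonian over the portfolio weight gives the myopic demand $\pi^* = -\frac{\lambda}{W\,\partial_W \lambda}\,\Sigma^{-1}(\mu - r)$, where $\lambda$ is the wealth costate; with additional state variables, a hedging term driven by the cross-costates would be added. The function and variable names below are illustrative assumptions, not the paper's API.

```python
# Hedged sketch of the P-PGDPO projection: map BPTT costate estimates onto
# the PMP first-order condition (no extra states, so no hedging term).
import torch

def project_policy(costate, dcostate_dW, W, mu, r, Sigma):
    """Closed-form Hamiltonian maximizer given costate estimates.

    pi* = -(lambda / (W * dlambda/dW)) * Sigma^{-1} (mu - r)
    """
    risk_tol = -costate / (W * dcostate_dW)        # effective risk tolerance
    excess = mu - r                                # excess returns, shape (d,)
    return risk_tol * torch.linalg.solve(Sigma, excess)

# Consistency check with Merton-style inputs: for CRRA, lambda = W^{-gamma},
# so -lambda / (W dlambda/dW) = 1/gamma and pi* = Sigma^{-1}(mu - r) / gamma.
gamma = 3.0
W = torch.tensor(1.0)
lam = W ** (-gamma)
dlam = -gamma * W ** (-gamma - 1.0)
mu = torch.tensor([0.06, 0.08])
r = 0.02
Sigma = torch.tensor([[0.04, 0.01], [0.01, 0.09]])
pi_star = project_policy(lam, dlam, W, mu, r, Sigma)
print(pi_star)  # matches the Merton weights Sigma^{-1}(mu - r) / gamma
```

In P-PGDPO, the costate inputs would come from BPTT adjoints rather than a closed form; the point of the sketch is that once those estimates stabilize, the control itself is recovered analytically instead of being learned.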

April 15, 2025 · 2 min · Research Team