Breaking the Dimensional Barrier: Dynamic Portfolio Choice with Parameter Uncertainty via Pontryagin Projection

arXiv ID: 2601.03175

Authors: Jeonggyu Huh, Hyeng Keun Koo

Abstract

We study continuous-time portfolio choice in diffusion markets with parameter $\theta \in \Theta$ and uncertainty law $q(d\theta)$. Nature draws a latent $\theta \sim q$ at time 0; the investor cannot observe it and must deploy a single $\theta$-blind feedback policy that maximizes an ex-ante CRRA objective averaged over both diffusion noise and $\theta$. Our methods access $q$ only by sampling and assume no parametric form. We extend Pontryagin-Guided Direct Policy Optimization (PG-DPO) by sampling $\theta$ inside the simulator and computing discrete-time gradients via backpropagation through time (BPTT), and we propose projected PG-DPO (P-PGDPO), which projects costate estimates to satisfy the $q$-aggregated Pontryagin first-order condition, yielding a deployable rule. We prove a BPTT-PMP correspondence that is uniform on compacts and a residual-based $\theta$-blind policy-gap bound under local stability, with explicit discretization and Monte Carlo errors; experiments show projection-driven stability and accurate recovery of the decision-time benchmark in high dimensions.
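The simulation-based objective in the abstract can be made concrete in a few lines. The ex-ante criterion is $\max_\pi \, \mathbb{E}_{\theta\sim q}\,\mathbb{E}\big[(X_T^{\pi,\theta})^{1-\gamma}/(1-\gamma)\big]$: nature draws $\theta$, the simulator rolls the wealth dynamics forward under a $\theta$-blind feedback policy, and reverse-mode autodiff through the discretized path supplies the BPTT gradient. The following is a minimal JAX sketch of this idea, not the paper's implementation: the Gaussian uncertainty law in `sample_theta`, the toy linear-in-time policy, and the single-asset geometric dynamics are all illustrative assumptions.

```python
# Hedged sketch of theta-blind, PG-DPO-style training: sample theta ~ q inside
# the simulator, roll out wealth under a theta-blind policy, and backpropagate
# through time with JAX autodiff. All names here are illustrative assumptions.
import jax
import jax.numpy as jnp

GAMMA, R, T, N = 2.0, 0.02, 1.0, 64           # CRRA gamma, rate, horizon, steps
DT = T / N

def crra(x, gamma=GAMMA):                      # CRRA utility x^(1-gamma)/(1-gamma)
    return x ** (1.0 - gamma) / (1.0 - gamma)

def sample_theta(key):                         # q(dtheta): hypothetical Gaussian law
    k1, k2 = jax.random.split(key)
    mu = 0.06 + 0.02 * jax.random.normal(k1)
    sigma = 0.20 + 0.05 * jax.random.normal(k2)
    return mu, jnp.abs(sigma) + 1e-3

def policy(params, t, x):                      # theta-blind feedback pi(t, x)
    w, b = params
    return w * t + b                           # toy linear-in-time allocation

def rollout(params, key):                      # one theta-blind simulated path
    k_theta, k_noise = jax.random.split(key)
    mu, sigma = sample_theta(k_theta)          # nature draws latent theta ~ q
    dW = jnp.sqrt(DT) * jax.random.normal(k_noise, (N,))
    def step(x, inp):
        i, dw = inp
        pi = policy(params, i * DT, x)
        # Euler step for dX = X[(r + pi(mu - r)) dt + pi sigma dW]
        x_new = x * (1.0 + (R + pi * (mu - R)) * DT + pi * sigma * dw)
        return jnp.maximum(x_new, 1e-8), None
    x_T, _ = jax.lax.scan(step, 1.0, (jnp.arange(N), dW))
    return crra(x_T)

def objective(params, key, batch=256):         # ex-ante average over theta and noise
    keys = jax.random.split(key, batch)
    return -jnp.mean(jax.vmap(lambda k: rollout(params, k))(keys))

grad_fn = jax.jit(jax.grad(objective))         # BPTT gradient via reverse-mode AD
params = (jnp.array(0.0), jnp.array(0.5))
g = grad_fn(params, jax.random.PRNGKey(0))
```

Because $\theta$ is resampled inside every rollout, the gradient targets the $q$-averaged objective directly, consistent with the abstract's statement that $q$ is accessed only by sampling; no posterior over $\theta$ is maintained.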

Keywords: portfolio choice, Pontryagin principle, backpropagation through time (BPTT), diffusion markets, latent parameter estimation, multi-asset

Complexity vs Empirical Score

  • Math Complexity: 9.0/10
  • Empirical Rigor: 4.0/10
  • Quadrant: Lab Rats
  • Why: The paper is heavily theoretical, combining advanced continuous-time stochastic control (Pontryagin's Maximum Principle) with deep-learning policy optimization (PG-DPO trained via BPTT), making it mathematically dense. While it mentions high-dimensional experiments, the summary and excerpt focus on theoretical proofs and algorithmic development without detailed backtesting results, datasets, or empirical metrics, indicating lower empirical rigor.

```mermaid
flowchart TD
  A["Research Goal<br>Optimal Portfolio Choice<br>with Unknown Parameter θ"] --> B["Methodology<br>Pontryagin-Guided Direct Policy Optimization<br>Extended with Latent θ Sampling"]
  B --> C{"Compute Discrete Gradients<br>via Backpropagation Through Time"}
  C --> D["Proposed: Projected PG-DPO<br>Projects costates to satisfy<br>q-aggregated Pontryagin First-Order Condition"]
  D --> E["Key Findings<br>1. BPTT-PMP Correspondence Proof<br>2. Policy-Gap Bound<br>3. Stable Decision-Time Recovery"]
  style A fill:#e1f5e1
  style E fill:#fff2cc
```
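
Node D of the diagram is the distinctive step: rather than deploying raw network outputs, P-PGDPO projects costate estimates onto the $q$-aggregated Pontryagin first-order condition. As intuition only: in the classical Merton special case with known $\theta = (\mu, \Sigma)$, the first-order condition gives $\pi^*(\theta) = \frac{1}{\gamma}\Sigma^{-1}(\mu - r\mathbf{1})$, and a $\theta$-blind analogue aggregates this across Monte Carlo draws from $q$. The sketch below illustrates that aggregation under these toy assumptions; the paper's actual projection operates on BPTT costate estimates rather than closed-form policies, so every name here is hypothetical.

```python
# Hypothetical sketch of a q-aggregated first-order condition at decision time:
# average the known-theta Merton allocation across sampled theta. The paper's
# P-PGDPO instead projects BPTT costate estimates; this toy version only
# illustrates the "aggregate the FOC over q" idea.
import jax
import jax.numpy as jnp

GAMMA, R = 2.0, 0.02

def merton_pi(mu, Sigma, r=R, gamma=GAMMA):
    # Known-theta FOC: pi* = (1/gamma) Sigma^{-1} (mu - r 1)
    return jnp.linalg.solve(Sigma, mu - r) / gamma

def projected_pi(mus, Sigmas):
    # theta-blind aggregation over Monte Carlo draws from q
    return jnp.mean(jax.vmap(merton_pi)(mus, Sigmas), axis=0)

key = jax.random.PRNGKey(1)
k1, k2 = jax.random.split(key)
d, m = 5, 1024                                  # assets, theta samples
mus = 0.06 + 0.02 * jax.random.normal(k1, (m, d))
A = 0.2 * jax.random.normal(k2, (m, d, d))
Sigmas = A @ jnp.swapaxes(A, -1, -2) + 0.05 * jnp.eye(d)   # SPD covariances
print(projected_pi(mus, Sigmas))
```

Averaging in this way produces a single deployable, $\theta$-blind allocation, which matches the abstract's emphasis on recovering a decision-time benchmark without ever observing the latent $\theta$.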