In-Context Operator Learning for Linear Propagator Models

ArXiv ID: 2501.15106

Authors: Unknown

Abstract

We study operator learning in the context of linear propagator models for optimal order execution problems with transient price impact à la Bouchaud et al. (2004) and Gatheral (2010). Transient price impact persists and decays over time according to some propagator kernel. Specifically, we propose to use In-Context Operator Networks (ICON), a novel transformer-based neural network architecture introduced by Yang et al. (2023), which facilitates data-driven learning of operators by combining offline pre-training with online few-shot prompting at inference. First, we train ICON to learn the operator that maps the trading rate to the induced transient price impact across various propagator models. The inference step is then based on in-context prediction, where ICON is presented with only a few examples. We illustrate that ICON accurately infers the underlying price impact model from the data prompts, even for propagator kernels not seen in the training data. In a second step, we employ the pre-trained ICON model, provided with context, as a surrogate operator in solving an optimal order execution problem via a neural network control policy, and demonstrate that the exact optimal execution strategies from Abi Jaber and Neuman (2022) for the models generating the context are correctly retrieved. The methodology we introduce is very general, offering a new approach to solving optimal stochastic control problems with unknown state dynamics, where the dynamics are inferred data-efficiently from a limited number of examples by leveraging the few-shot and transfer learning capabilities of transformer networks.
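The operator at the heart of the abstract maps a trading rate to the transient price impact it induces, \(P_t = \int_0^t G(t-s)\,v_s\,ds\) for a decaying propagator kernel \(G\). A minimal numerical sketch, assuming an exponential kernel \(G(t) = e^{-\rho t}\) and a uniform time grid (the function names and the choice of kernel are ours, not the paper's):

```python
import numpy as np

def transient_impact(rates, dt, kernel):
    """Discretize P_t = int_0^t G(t - s) v_s ds on a uniform grid via
    a truncated discrete convolution of kernel values with the rates."""
    n = len(rates)
    times = np.arange(n) * dt
    G = kernel(times)                      # G(0), G(dt), G(2*dt), ...
    return np.convolve(G, rates)[:n] * dt  # left-Riemann approximation

# exponential decay kernel G(t) = exp(-rho * t); rho is an assumed value
rho = 2.0
exp_kernel = lambda t: np.exp(-rho * t)

dt = 0.01
rates = np.ones(100)  # constant trading rate v = 1 on [0, 1)
impact = transient_impact(rates, dt, exp_kernel)
```

For a constant rate the impact rises toward the steady-state level \(1/\rho\), illustrating the "persists and decays" behaviour: if trading stopped, the accumulated impact would relax back toward zero at rate \(\rho\).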

Keywords: Operator learning, In-Context Operator Networks (ICON), Transient price impact, Optimal order execution, Propagator models
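The optimal order execution problem mentioned above can be pictured with a simplified discrete cost functional: trade at rate \(v\) against the transient impact you create, and pay a penalty on any unliquidated inventory. This is a hedged sketch of the problem class, not the exact objective of Abi Jaber and Neuman (2022); the kernel, penalty weight, and function names are our illustrative assumptions.

```python
import numpy as np

# exponential propagator G(t) = exp(-rho * t) on a uniform grid (assumed)
rho, dt, T = 2.0, 0.01, 1.0
n = int(T / dt)
G = np.exp(-rho * dt * np.arange(n))

def execution_cost(rates, q0=1.0, penalty=10.0):
    """Impact cost of a rate profile plus a terminal inventory penalty."""
    impact = np.convolve(G, rates)[:n] * dt  # P_t induced by the trading
    cost = np.sum(rates * impact) * dt       # pay the impact you cause
    leftover = q0 - np.sum(rates) * dt       # inventory not yet executed
    return cost + penalty * leftover**2

# constant-rate (TWAP-style) liquidation of one unit over [0, T]
twap = np.full(n, 1.0 / T)
c = execution_cost(twap)
```

In the paper's setup, the impact map inside such a cost functional is replaced by the pre-trained ICON surrogate, and the rate profile by a neural network control policy trained to minimize the cost.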

Complexity vs Empirical Score

  • Math Complexity: 8.5/10
  • Empirical Rigor: 4.0/10
  • Quadrant: Lab Rats
  • Why: The paper is dense with advanced stochastic control theory, PDE/ODE formulations, and transformer architecture derivations, making it highly mathematically complex. However, it relies on synthetic data and lacks real-market backtests, code, or statistical metrics, resulting in low empirical rigor.
```mermaid
flowchart TD
    A["Research Goal:<br/>Learn & Control Linear Propagator Models"] --> B{"Data & Inputs"}
    B --> C["Offline Training Data:<br/>Various Propagator Kernels"]
    B --> D["Prompting Examples:<br/>Few Context Trajectories"]

    C --> E["Step 1: ICON Training"]
    E --> F["Pre-trained ICON<br/>(Operator Approximation)"]

    D --> G["Step 2: In-Context Inference"]
    F --> G

    G --> H["Learned Dynamics:<br/>Identified Propagator Model"]
    G --> I["Step 3: Neural Control"]

    H --> I
    I --> J

    subgraph J ["Key Outcomes"]
        J1["Operator Learning from Data"]
        J2["Model-Agnostic<br/>Few-Shot Identification"]
        J3["Optimal Control Recovery"]
    end
```
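The in-context inference step in the diagram amounts to assembling a prompt from a few demonstration (input, output) function pairs plus a query input, all fed to the transformer at once. The array layout below is a hypothetical sketch of such prompt assembly; ICON's actual tokenization and architecture differ.

```python
import numpy as np

def build_icon_prompt(demos, query_input):
    """Stack few-shot demos (input fn, output fn) and one query input,
    each sampled on a shared grid, into a single (n_demos + 1, 3, grid)
    array. Channel 0: input function; channel 1: output function (zeros
    for the query); channel 2: role flag (0 = demo, 1 = query)."""
    rows = [np.stack([u, v, np.zeros_like(u)]) for u, v in demos]
    rows.append(np.stack([query_input,
                          np.zeros_like(query_input),
                          np.ones_like(query_input)]))
    return np.stack(rows)

grid = 50
rng = np.random.default_rng(0)
demos = [(rng.random(grid), rng.random(grid)) for _ in range(5)]
prompt = build_icon_prompt(demos, rng.random(grid))
```

The key point the paper exploits: the same pre-trained network, given prompts from different propagator models, adapts its predictions to the model that generated the context, with no weight updates at inference time.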