Inference of Utilities and Time Preference in Sequential Decision-Making

ArXiv ID: 2405.15975

Authors: Unknown

Abstract

This paper introduces a novel stochastic control framework to enhance the capabilities of automated investment managers, or robo-advisors, by accurately inferring clients’ investment preferences from past activities. Our approach leverages a continuous-time model that incorporates utility functions and a generic discounting scheme of a time-varying rate, tailored to each client’s risk tolerance, valuation of daily consumption, and significant life goals. We address the resulting time inconsistency issue through state augmentation and the establishment of the dynamic programming principle and the verification theorem. Additionally, we provide sufficient conditions for the identifiability of client investment preferences. To complement our theoretical developments, we propose a learning algorithm based on maximum likelihood estimation within a discrete-time Markov Decision Process framework, augmented with entropy regularization. We prove that the log-likelihood function is locally concave, facilitating the fast convergence of our proposed algorithm. Practical effectiveness and efficiency are showcased through two numerical examples, including Merton’s problem and an investment problem with unhedgeable risks. Our proposed framework not only advances financial technology by improving personalized investment advice but also contributes broadly to other fields such as healthcare, economics, and artificial intelligence, where understanding individual preferences is crucial.
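
The learning step described in the abstract (maximum likelihood estimation in a discrete-time, entropy-regularized Markov Decision Process) can be illustrated with a minimal sketch. This is not the paper's algorithm. It assumes a hypothetical three-state, two-action toy MDP, a scalar preference parameter `theta` that scales a made-up reward table `features`, a constant discount factor `gamma` (the paper allows a time-varying rate), and a plain grid search in place of whatever optimizer the authors use. All names below are illustrative.

```python
import numpy as np

def soft_policy(theta, P, features, gamma=0.9, temp=1.0, n_iter=200):
    """Entropy-regularized (soft) value iteration on a tabular MDP.

    P has shape (A, S, S) with P[a, s, t] = Prob(next state t | state s, action a).
    The reward is assumed linear in theta: r(s, a) = theta * features[s, a].
    Returns an (S, A) array of action probabilities (a Boltzmann policy).
    """
    r = theta * features
    V = np.zeros(P.shape[1])
    for _ in range(n_iter):
        Q = r + gamma * np.einsum("ast,t->sa", P, V)
        # The entropy bonus turns the hard max into a log-sum-exp (soft max).
        m = Q.max(axis=1)
        V = m + temp * np.log(np.exp((Q - m[:, None]) / temp).sum(axis=1))
    Q = r + gamma * np.einsum("ast,t->sa", P, V)
    pi = np.exp((Q - Q.max(axis=1, keepdims=True)) / temp)
    return pi / pi.sum(axis=1, keepdims=True)

def log_likelihood(theta, states, actions, P, features):
    """Log-likelihood of observed (state, action) pairs under the soft policy."""
    pi = soft_policy(theta, P, features)
    return float(np.log(pi[states, actions]).sum())

def fit_theta(states, actions, P, features, grid):
    """Grid-search MLE (an illustrative stand-in for a gradient-based method)."""
    lls = [log_likelihood(th, states, actions, P, features) for th in grid]
    return float(grid[int(np.argmax(lls))])

# Hypothetical toy problem: action 0 tends to stay, action 1 tends to move on.
S = 3
U = np.full((S, S), 1.0 / S)
stay = np.eye(S)
move = np.roll(np.eye(S), 1, axis=1)          # s -> (s + 1) % S
P = np.stack([0.9 * U + 0.1 * stay, 0.9 * U + 0.1 * move])
features = np.array([[0.0, 1.0], [1.0, 0.0], [0.5, 2.0]])

# Simulate a client whose true parameter is theta_true, then recover it.
theta_true = 1.0
rng = np.random.default_rng(0)
pi_true = soft_policy(theta_true, P, features)
states = rng.integers(0, S, size=5000)
actions = (rng.random(5000) < pi_true[states, 1]).astype(int)

theta_hat = fit_theta(states, actions, P, features, np.linspace(0.0, 3.0, 61))
print("estimated theta:", theta_hat)
```

Because entropy regularization makes the optimal policy stochastic, every observed action has positive probability and the log-likelihood is well defined. That is what makes a likelihood-based fit possible in this toy setting.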

Keywords: Robo-Advisory, Stochastic Control, Utility Functions, Maximum Likelihood Estimation, Time Inconsistency, Wealth Management

Complexity vs Empirical Score

  • Math Complexity: 9.0/10
  • Empirical Rigor: 3.0/10
  • Quadrant: Lab Rats
  • Why: The paper employs advanced stochastic control, dynamic programming principles, and verification theorems in continuous time, resulting in high mathematical complexity. However, it reports no real-world backtesting, datasets, or performance metrics, relying instead on theoretical proofs and two numerical examples (Merton’s problem and an investment problem with unhedgeable risks), indicating low empirical rigor.
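
For intuition about the first numerical example, the textbook version of Merton’s problem (CRRA utility, one risky asset following geometric Brownian motion, constant risk-free rate) has the closed-form optimal risky fraction pi* = (mu - r) / (gamma * sigma^2). The sketch below inverts that rule to recover the risk-aversion coefficient from observed allocations. It is a simplified illustration, not the paper's experiment: the market parameters, the Gaussian observation noise, and the function names are all assumptions made here.

```python
import numpy as np

def merton_fraction(mu, r, sigma, gamma):
    """Optimal constant fraction of wealth in the risky asset under CRRA utility."""
    return (mu - r) / (gamma * sigma ** 2)

def infer_risk_aversion(observed_fractions, mu, r, sigma):
    """Invert the Merton rule at the mean observed allocation.

    Assumes (hypothetically) that each observation is the optimal fraction plus
    i.i.d. zero-mean Gaussian noise, so the sample mean is the MLE of the
    fraction and its image under the inverse rule is the MLE of gamma.
    """
    pi_hat = float(np.mean(observed_fractions))
    return (mu - r) / (pi_hat * sigma ** 2)

# Hypothetical market and client parameters.
mu, r, sigma, gamma_true = 0.08, 0.02, 0.2, 3.0
pi_star = merton_fraction(mu, r, sigma, gamma_true)   # 0.06 / (3 * 0.04) = 0.5

# Simulated noisy allocations around the optimal fraction.
rng = np.random.default_rng(1)
observed = pi_star + 0.05 * rng.standard_normal(2000)

gamma_hat = infer_risk_aversion(observed, mu, r, sigma)
print("inferred risk aversion:", gamma_hat)
```

In this simplified setting a single preference parameter is identified by the allocation alone. The paper's framework also covers consumption and time preference, which this sketch does not attempt to recover.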
  flowchart TD
    A["Research Goal: Infer Investment Preferences<br>of Robo-Advisor Clients"] --> B["Methodology: Stochastic Control Framework"]
    
    B --> C["Inputs: Past Client Activity<br>Utility Functions & Discounting Schemes"]
    
    C --> D["Core Process: State Augmentation<br>Dynamic Programming Principle"]
    D --> E["Computational Process: MLE with Entropy Regularization"]
    
    E --> F{"Outcomes"}
    F --> G["Identifiability of Preferences"]
    F --> H["Concave Log-Likelihood & Fast Convergence"]
    F --> I["Personalized Investment Advice"]