Causal Inference on Investment Constraints and Non-stationarity in Dynamic Portfolio Optimization through Reinforcement Learning

ArXiv ID: 2311.04946

Authors: Unknown

Abstract

In this study, we develop a dynamic asset allocation strategy using reinforcement learning. We first address how to incorporate the non-stationarity of financial time series into reinforcement learning algorithms, a significant implementation issue when applying reinforcement learning to investment strategies. Our findings show that introducing variables such as regime changes into the environment setting improves prediction accuracy. Reinforcement learning also allows the optimization problem to be set flexibly, so the practical constraints faced by investors can be integrated into the algorithm and optimized efficiently. We group the conditions for formulating an investment strategy into three categories: performance measurement indicators, portfolio management rules, and other constraints. We evaluate the impact of incorporating these conditions into the environment and rewards of a reinforcement learning framework and examine how they influence investment behavior.

Keywords: Reinforcement learning, Dynamic asset allocation, Non-stationarity, Investment strategy, Optimization

Complexity vs Empirical Score

  • Math Complexity: 7.5/10
  • Empirical Rigor: 8.0/10
  • Quadrant: Holy Grail
  • Why: The paper employs advanced reinforcement learning algorithms (SARSA, Q-learning) with non-trivial state spaces and reward engineering, scoring high on mathematical complexity. It features a rigorous out-of-sample backtest over 22 years, specific data handling, and statistical significance testing, indicating high empirical rigor.
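
For reference, the two algorithms named above differ only in the bootstrap term of their tabular update. The block below gives the standard textbook forms; the notation (state $s_t$, action $a_t$, reward $r_{t+1}$, learning rate $\alpha$, discount factor $\gamma$) is assumed here and is not taken from the paper, whose exact formulation may differ.

```latex
% Q-learning (off-policy): bootstraps from the greedy action in the next state
Q(s_t, a_t) \leftarrow Q(s_t, a_t)
  + \alpha \left[ r_{t+1} + \gamma \max_{a'} Q(s_{t+1}, a') - Q(s_t, a_t) \right]

% SARSA (on-policy): bootstraps from the action a_{t+1} actually taken next
Q(s_t, a_t) \leftarrow Q(s_t, a_t)
  + \alpha \left[ r_{t+1} + \gamma \, Q(s_{t+1}, a_{t+1}) - Q(s_t, a_t) \right]
```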

```mermaid
flowchart TD
    A["Research Goal: Develop RL-based Dynamic Asset Allocation Strategy"] --> B{"Methodology & Data"}

    B --> C["Data: Financial Time Series<br>Non-stationary Trends"]
    B --> D["Core Methodology: Reinforcement Learning"]

    C & D --> E["Computational Process: Environment Setup"]
    E --> F["Process: Incorporate Investment Constraints<br>Rules & Regime Changes"]
    F --> G["Process: Optimize Policy via<br>RL Algorithm"]

    G --> H["Key Findings / Outcomes"]

    H --> I["Strategy: Flexible Optimization<br>Integrates Practical Constraints"]
    H --> J["Performance: Enhanced Accuracy<br>via Non-stationarity Handling"]
```
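
The pipeline in the flowchart (environment setup, constraints and regime changes, policy optimization) can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes a synthetic two-regime return series, a directly observed regime label as part of the state, a three-point weight grid, and a proportional turnover penalty as one example of a portfolio management rule built into the reward. All names (`WEIGHTS`, `simulate_market`, `reward`, `train`) and all parameter values are hypothetical.

```python
import numpy as np

WEIGHTS = np.array([0.0, 0.5, 1.0])  # hypothetical grid of risky-asset weights


def simulate_market(n_steps, seed=0, p_switch=0.02):
    """Synthetic returns from two persistent regimes (0 = bull, 1 = bear)."""
    rng = np.random.default_rng(seed)
    mu = np.array([0.01, -0.01])     # assumed mean return per regime
    sigma = np.array([0.005, 0.01])  # assumed volatility per regime
    switches = rng.random(n_steps) < p_switch
    regimes = np.cumsum(switches) % 2  # flip the regime at every switch
    returns = rng.normal(mu[regimes], sigma[regimes])
    return regimes, returns


def reward(prev_w, new_w, asset_return, cost=0.002):
    """Portfolio return minus a proportional turnover penalty (assumed rule)."""
    return new_w * asset_return - cost * abs(new_w - prev_w)


def train(regimes, returns, alpha=0.02, gamma=0.9, epsilon=0.1, epochs=20, seed=1):
    """Tabular Q-learning with state = (regime, index of current weight)."""
    rng = np.random.default_rng(seed)
    n_w = len(WEIGHTS)
    q = np.zeros((2, n_w, n_w))  # q[regime, current weight, next weight]
    for _ in range(epochs):
        pos = 0  # start each pass fully in cash
        for t in range(len(returns) - 1):
            reg = regimes[t]
            # Epsilon-greedy choice of the weight held over the next period.
            if rng.random() < epsilon:
                act = int(rng.integers(n_w))
            else:
                act = int(np.argmax(q[reg, pos]))
            # The weight chosen at t earns the return realised at t + 1.
            r = reward(WEIGHTS[pos], WEIGHTS[act], returns[t + 1])
            target = r + gamma * q[regimes[t + 1], act].max()
            q[reg, pos, act] += alpha * (target - q[reg, pos, act])
            pos = act
    return q


regimes, returns = simulate_market(5000)
q = train(regimes, returns)
for reg, name in enumerate(["bull", "bear"]):
    for pos, w in enumerate(WEIGHTS):
        best = WEIGHTS[int(np.argmax(q[reg, pos]))]
        print(f"regime={name:4s} current_weight={w:.1f} -> target_weight={best:.1f}")
```

Including the regime in the state lets the greedy policy differ between regimes, which a regime-blind state cannot express. Raising `cost` makes the learned policy more reluctant to change weights, which shows how a constraint placed in the reward shapes investment behavior.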