Reinforcement Learning for Portfolio Optimization with a Financial Goal and Defined Time Horizons

ArXiv ID: 2511.18076

Authors: Fermat Leukam, Rock Stephane Koffi, Prudence Djagba

Abstract

This research proposes an enhancement of the portfolio optimization approach based on the G-Learning algorithm, combined with parametric optimization via the GIRL algorithm (G-Learning in the setting of Inverse Reinforcement Learning), as presented in prior work. The goal is to maximize portfolio value by a target date while minimizing the investor's periodic contributions. Our model operates in a highly volatile market with a well-diversified portfolio, ensuring a low risk level for the investor, and leverages reinforcement learning to dynamically adjust portfolio positions over time. Results show that we improved the Sharpe Ratio from 0.42, as reported in recent studies using the same approach, to 0.483, a notable achievement in highly volatile markets with diversified portfolios. The comparison between G-Learning and GIRL reveals that while GIRL optimizes the reward function parameters (e.g., λ = 0.0012 versus 0.002), its impact on portfolio performance remains marginal. This suggests that reinforcement learning methods such as G-Learning already enable robust optimization. This research contributes to the growing development of reinforcement learning applications in financial decision-making, demonstrating that probabilistic learning algorithms can effectively align portfolio management strategies with investor needs.
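
The paper itself does not provide code. As a rough, self-contained illustration of the entropy-regularized update behind G-Learning (Fox, Pakman & Tishby, 2016), the sketch below runs tabular G-Learning on a synthetic environment. The state/action discretization, the toy reward, and the hyperparameters (gamma, beta, alpha) are all hypothetical placeholders, not the authors' setup.

```python
import numpy as np

# Minimal tabular G-Learning sketch. Everything below is illustrative:
# the discretized state/action grid, the synthetic reward, and the
# hyperparameters are assumptions, not the paper's configuration.

n_states, n_actions = 50, 5            # e.g. binned (portfolio value, deposit) grid
gamma, beta, alpha = 0.97, 5.0, 0.1    # discount, inverse temperature, step size
rng = np.random.default_rng(0)

prior = np.full((n_states, n_actions), 1.0 / n_actions)  # reference policy pi_0
G = np.zeros((n_states, n_actions))                      # state-action G-values

def soft_value(G_row, prior_row):
    """Free-energy value F(s) = (1/beta) * log sum_a pi_0(a|s) exp(beta * G(s,a))."""
    z = beta * G_row
    m = z.max()  # log-sum-exp stabilization
    return (m + np.log(np.dot(prior_row, np.exp(z - m)))) / beta

for _ in range(20_000):
    s = rng.integers(n_states)
    # Sample from the Boltzmann policy pi(a|s) proportional to pi_0 * exp(beta * G)
    logits = beta * G[s] + np.log(prior[s])
    p = np.exp(logits - logits.max())
    p /= p.sum()
    a = rng.choice(n_actions, p=p)
    # Synthetic environment step (placeholder for a market simulator)
    s_next = rng.integers(n_states)
    r = rng.normal(0.01 * a, 0.05)  # toy reward, not the paper's reward function
    # G-Learning TD update toward r + gamma * F(s')
    target = r + gamma * soft_value(G[s_next], prior[s_next])
    G[s, a] += alpha * (target - G[s, a])
```

The learned policy is the Boltzmann tilt of the uniform prior π0 by the G-values; GIRL, which the paper layers on top, tunes the reward function's parameters (such as λ) rather than changing this update.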

Keywords: G-Learning, GIRL, Reinforcement Learning, Portfolio Optimization, Sharpe Ratio

Complexity vs Empirical Score

  • Math Complexity: 8.0/10
  • Empirical Rigor: 3.0/10
  • Quadrant: Lab Rats
  • Why: The paper employs advanced reinforcement learning algorithms (G-Learning, GIRL) and Markov Decision Processes, involving sophisticated probabilistic modeling and mathematical formulations. However, the empirical validation relies on simulated data with no mention of code, specific datasets, or detailed backtesting methodology, resulting in low implementation readiness.

```mermaid
flowchart TD
  A["Research Goal<br>Maximize portfolio value & minimize contributions<br>within defined time horizon"] --> B["Data & Inputs<br>Historical market data & volatility"]
  B --> C["Methodology: Reinforcement Learning<br>G-Learning + GIRL algorithms"]
  C --> D{"Computational Process<br>Dynamically adjust portfolio positions<br>based on market volatility"}
  D -- G-Learning Core --> E["Model Optimization<br>Robust RL optimization"]
  D -- GIRL Refinement --> F["Parametric Tuning<br>Reward function optimization"]
  E --> G["Key Findings<br>Sharpe Ratio: 0.42 → 0.483"]
  F --> G
  G --> H["Outcome<br>Effective alignment of RL strategies<br>with investor financial goals"]
```
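
The headline result in the flowchart is the Sharpe Ratio improvement from 0.42 to 0.483. For context, here is a minimal sketch of the standard annualized Sharpe computation; the return frequency, risk-free rate, and annualization convention are assumptions, since the paper does not specify them.

```python
import numpy as np

def sharpe_ratio(returns, risk_free=0.0, periods_per_year=252):
    """Annualized Sharpe ratio of a series of periodic portfolio returns.

    Illustrative only: daily frequency, zero risk-free rate, and a
    sqrt(252) annualization are assumed, not taken from the paper.
    """
    excess = np.asarray(returns) - risk_free / periods_per_year
    return np.sqrt(periods_per_year) * excess.mean() / excess.std(ddof=1)

# Toy usage on synthetic daily returns (not the paper's data):
rng = np.random.default_rng(1)
daily = rng.normal(0.0005, 0.01, size=1_000)
print(f"Sharpe: {sharpe_ratio(daily):.3f}")
```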