A Deep Reinforcement Learning Framework for Dynamic Portfolio Optimization: Evidence from China’s Stock Market
ArXiv ID: 2412.18563
Authors: Unknown
Abstract
Artificial intelligence is transforming financial investment decision-making, and deep reinforcement learning shows substantial potential in robo-advisory applications. This paper addresses the limitations of traditional portfolio optimization methods in dynamically adjusting asset weights by developing a deep reinforcement learning-based dynamic optimization model grounded in practical trading processes. The research makes two contributions. First, it introduces a Sharpe ratio reward function designed for Actor-Critic deep reinforcement learning algorithms, which ensures stable convergence during training while consistently achieving positive average Sharpe ratios. Second, it develops a comprehensive approach to portfolio optimization that significantly enhances the model’s optimization capability by combining four components: random sampling strategies during training, image-based deep neural network architectures for processing multi-dimensional financial time series, average Sharpe ratio reward functions, and deep reinforcement learning algorithms. The empirical analysis validates the model on randomly selected constituent stocks of the CSI 300 Index, benchmarked against established financial econometric optimization models. Backtesting results show that the model optimizes portfolio allocation and mitigates investment risk effectively, yielding superior comprehensive performance metrics.
Keywords: Deep reinforcement learning, Portfolio optimization, Actor-Critic algorithms, Sharpe ratio reward function, Robo-advisory
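The abstract does not give the exact form of the Sharpe ratio reward, so the following is only a minimal sketch of how such a reward could be computed, not the paper's formulation. It assumes the reward for one training episode is the Sharpe ratio of that episode's per-period portfolio returns, and that the "average" is a plain mean over several episodes. The function names, the zero risk-free rate, and the `eps` stabiliser are all hypothetical choices made for illustration.

```python
import numpy as np

def portfolio_returns(weights, asset_returns):
    """Per-period portfolio returns.

    weights:       array (T, N), portfolio weights held in each period
    asset_returns: array (T, N), simple returns of each asset in each period
    """
    w = np.asarray(weights, dtype=float)
    r = np.asarray(asset_returns, dtype=float)
    return np.sum(w * r, axis=1)

def sharpe_ratio(returns, risk_free=0.0, eps=1e-8):
    """Sharpe ratio of a 1-D series of per-period returns.

    eps is an assumed stabiliser that avoids division by zero when the
    returns are constant; in that case the value becomes very large.
    """
    excess = np.asarray(returns, dtype=float) - risk_free
    return excess.mean() / (excess.std(ddof=1) + eps)

def average_sharpe_reward(episodes):
    """Mean Sharpe ratio over a list of (weights, asset_returns) episodes."""
    scores = [sharpe_ratio(portfolio_returns(w, r)) for w, r in episodes]
    return float(np.mean(scores))

# Toy episode: two assets, four periods, equal weights throughout.
toy_weights = np.full((4, 2), 0.5)
toy_returns = np.array([[0.02, 0.00],
                        [0.00, 0.02],
                        [0.04, 0.00],
                        [-0.02, 0.00]])
toy_reward = average_sharpe_reward([(toy_weights, toy_returns)])
```

Because the reward depends on the ratio of mean to volatility rather than on raw profit, an agent trained on it is pushed toward steadier return streams, which is consistent with the risk-mitigation emphasis in the abstract.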
Complexity vs Empirical Score
- Math Complexity: 7.0/10
- Empirical Rigor: 8.0/10
- Quadrant: Holy Grail
- Why: The paper employs advanced mathematical concepts typical of deep reinforcement learning and financial econometrics (like Sharpe ratio derivations and DRL algorithms), while providing a backtest-ready empirical validation using real market data (CSI 300 stocks) with benchmarks against established models.
flowchart TD
A["Research Goal<br>Dynamic Portfolio Optimization<br>vs Traditional Methods"] --> B["Data & Inputs<br>CSI 300 Constituent Stocks<br>Financial Time Series"]
B --> C["Methodology: Deep RL Framework<br>Actor-Critic Algorithms<br>with Novel Sharpe Ratio Reward"]
C --> D["Computational Process<br>Image-based DNN for Data Processing<br>Random Sampling for Training"]
D --> E["Backtesting & Optimization<br>Dynamic Asset Weight Adjustment"]
E --> F["Key Outcomes<br>Superior Performance & Risk Mitigation<br>Positive Sharpe Ratios Achieved"]
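The "Computational Process" and "Dynamic Asset Weight Adjustment" steps in the flowchart can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's pipeline: the data layout `(T, N, F)` (time steps, assets, features), the normalisation by the last value in each window, and the softmax mapping from network scores to long-only weights are all assumed here, and every function name is hypothetical.

```python
import numpy as np

def sample_window(data, window, rng):
    """Draw one random contiguous training window.

    data: array (T, N, F) of positive values, e.g. price features
    Returns the start index and a slice of shape (window, N, F).
    """
    total_steps = data.shape[0]
    start = int(rng.integers(0, total_steps - window + 1))
    return start, data[start:start + window]

def to_image(window_data):
    """Rearrange (window, N, F) into a channels-first (F, N, window) tensor.

    Each series is divided by its last value so that inputs are scale-free
    (an assumed normalisation; it requires strictly positive data).
    """
    x = np.transpose(np.asarray(window_data, dtype=float), (2, 1, 0))
    return x / x[:, :, -1:]

def softmax_weights(scores):
    """Map raw per-asset scores to long-only weights that sum to 1."""
    z = np.asarray(scores, dtype=float)
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()

rng = np.random.default_rng(0)
toy_data = rng.uniform(1.0, 2.0, size=(50, 4, 3))  # synthetic, not market data
start, win = sample_window(toy_data, window=10, rng=rng)
image = to_image(win)  # shape (3, 4, 10), ready for a convolutional network
weights = softmax_weights(np.array([0.1, 0.5, -0.2, 0.0]))
```

Sampling windows at random start points, rather than stepping through the history in order, exposes the agent to a wider mix of market regimes within each training batch.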