Reinforcement Learning with Maskable Stock Representation for Portfolio Management in Customizable Stock Pools
ArXiv ID: 2311.10801
Authors: Unknown
Abstract
Portfolio management (PM) is a fundamental financial trading task that explores the optimal periodic reallocation of capital across different stocks to pursue long-term profits. Reinforcement learning (RL) has recently shown its potential to train profitable agents for PM through interaction with financial markets. However, existing work mostly focuses on fixed stock pools, which is inconsistent with investors' practical demands. Specifically, the target stock pools of different investors vary dramatically because of their differing views on market states, and individual investors may temporarily adjust the stocks they wish to trade (e.g., adding a popular stock), which leads to customizable stock pools (CSPs). Existing RL methods require retraining the RL agent after even a tiny change to the stock pool, which leads to high computational cost and unstable performance. To tackle this challenge, we propose EarnMore, a rEinforcement leARNing framework with Maskable stOck REpresentation to handle PM with CSPs through one-shot training in a global stock pool (GSP). Specifically, we first introduce a mechanism to mask out the representations of stocks outside the target pool. Second, we learn meaningful stock representations through a self-supervised masking and reconstruction process. Third, a re-weighting mechanism is designed to make the portfolio concentrate on favorable stocks and neglect the stocks outside the target pool. Through extensive experiments on 8 subset stock pools of the US stock market, we demonstrate that EarnMore significantly outperforms 14 state-of-the-art baselines in terms of 6 popular financial metrics, with over 40% improvement in profit.
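To make the idea of neglecting stocks outside the target pool concrete, the sketch below turns per-stock scores into portfolio weights with a masked softmax. This is an illustrative stand-in, not the paper's re-weighting mechanism, whose details are not given in this summary. The function name `masked_portfolio_weights`, the `scores` input, and the boolean `in_pool` mask are all hypothetical.

```python
import math
from typing import List, Sequence


def masked_portfolio_weights(
    scores: Sequence[float], in_pool: Sequence[bool]
) -> List[float]:
    """Softmax over scores, restricted to stocks inside the target pool.

    Assumption: `scores` holds one preference score per stock in the
    global pool, and `in_pool[i]` is True when stock i belongs to the
    investor's customized pool. Stocks outside the pool get weight 0.
    """
    if len(scores) != len(in_pool):
        raise ValueError("scores and in_pool must have the same length")
    if not any(in_pool):
        raise ValueError("the target pool must contain at least one stock")

    # Subtract the largest in-pool score for numerical stability.
    top = max(s for s, keep in zip(scores, in_pool) if keep)
    exps = [
        math.exp(s - top) if keep else 0.0
        for s, keep in zip(scores, in_pool)
    ]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical global pool of four stocks; the second is outside the target pool.
weights = masked_portfolio_weights([1.0, 2.0, 3.0, 0.5], [True, False, True, True])
```

Because the mask is applied before normalization, the remaining weights still sum to one, so changing the pool only changes the mask and does not require a different scoring model.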
Keywords: Reinforcement Learning, Portfolio Management, Stock Representation, Maskable Networks, Algorithmic Trading, Equities
Complexity vs Empirical Score
- Math Complexity: 7.5/10
- Empirical Rigor: 8.0/10
- Quadrant: Holy Grail
- Why: The paper employs advanced RL theory, deep learning architectures, and representation learning with self-supervised objectives, indicating high mathematical complexity. It also features extensive empirical validation with multiple baselines, specific financial metrics, and a public code repository, demonstrating strong empirical rigor.
flowchart TD
A["Research Goal: Handle Portfolio Management with Customizable Stock Pools<br/>using one-shot training"] --> B["Data Input: Global Stock Pool & Financial Markets"]
B --> C["Methodology: EarnMore Framework"]
C --> D["Representation Learning<br/>Self-supervised Masking & Reconstruction"]
C --> E["Masking Mechanism<br/>Mask representations of stocks outside target pool"]
C --> F["Re-weighting Mechanism<br/>Concentrate on favorable stocks"]
D & E & F --> G["Computational Process<br/>RL Agent Interaction with CSPs"]
G --> H["Key Findings<br/>Outperforms 14 SOTA baselines<br/>>40% profit improvement"]
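The self-supervised masking and reconstruction step in the flowchart can be sketched as follows, assuming each stock is described by a plain feature vector. The sketch covers only the data corruption and the loss; the encoder that would produce the reconstruction is omitted. The helper names `mask_stocks` and `reconstruction_loss`, the zero-vector mask token, and the mean-squared-error objective are assumptions for illustration, not details taken from the paper.

```python
import random
from typing import List, Sequence, Tuple

Vector = List[float]


def mask_stocks(
    features: Sequence[Vector], mask_ratio: float, rng: random.Random
) -> Tuple[List[Vector], List[bool]]:
    """Replace a random subset of stock feature vectors with zeros.

    Returns the corrupted features and a flag per stock that is True
    where the stock was masked.
    """
    if not features:
        raise ValueError("features must not be empty")
    if not 0.0 < mask_ratio <= 1.0:
        raise ValueError("mask_ratio must be in (0, 1]")

    n = len(features)
    n_masked = min(n, max(1, round(n * mask_ratio)))
    masked_idx = set(rng.sample(range(n), n_masked))
    is_masked = [i in masked_idx for i in range(n)]
    corrupted = [
        [0.0] * len(row) if hidden else list(row)
        for row, hidden in zip(features, is_masked)
    ]
    return corrupted, is_masked


def reconstruction_loss(
    predicted: Sequence[Vector],
    target: Sequence[Vector],
    is_masked: Sequence[bool],
) -> float:
    """Mean squared error computed over the masked stocks only."""
    errors = [
        (p - t) ** 2
        for p_row, t_row, hidden in zip(predicted, target, is_masked)
        if hidden
        for p, t in zip(p_row, t_row)
    ]
    if not errors:
        raise ValueError("at least one stock must be masked")
    return sum(errors) / len(errors)
```

In a training loop, a model would receive the corrupted features and be optimized so that its output minimizes `reconstruction_loss` against the original features. Scoring only the masked positions prevents the model from lowering the loss by copying its visible inputs.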