Generating Synergistic Formulaic Alpha Collections via Reinforcement Learning
ArXiv ID: 2306.12964
Authors: Unknown
Abstract
In the field of quantitative trading, it is common practice to transform raw historical stock data into indicative signals for the market trend. Such signals are called alpha factors. Alphas in formula form are more interpretable and thus favored by practitioners concerned with risk. In practice, a set of formulaic alphas is often used together for better modeling precision, so we need to find synergistic formulaic alpha sets that work well together. However, most traditional alpha generators mine alphas one by one separately, overlooking the fact that the alphas will later be combined. In this paper, we propose a new alpha-mining framework that prioritizes mining a synergistic set of alphas, i.e., it directly uses the performance of the downstream combination model to optimize the alpha generator. Our framework also leverages the strong exploratory capabilities of reinforcement learning (RL) to better explore the vast search space of formulaic alphas. Each alpha's contribution to the combination model's performance is assigned as the return in the RL process, driving the alpha generator to find better alphas that improve upon the current set. Experimental evaluations on real-world stock market data demonstrate both the effectiveness and the efficiency of our framework for stock trend forecasting. The investment simulation results show that our framework achieves higher returns compared to previous approaches.
Keywords: Alpha Mining, Reinforcement Learning, Quantitative Trading, Formulaic Alphas, Stock Trend Forecasting, Equities
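The core idea in the abstract — rewarding the generator by each candidate alpha's marginal contribution to the combined set's performance — can be sketched in a minimal toy loop. Everything here is illustrative, not the paper's actual implementation: the "generator" is random sampling rather than an RL policy, the combination model is a plain least-squares fit, and the performance metric is the Information Coefficient (Pearson correlation between the combined signal and future returns).

```python
import numpy as np

rng = np.random.default_rng(0)

def ic(signal, future_returns):
    """Information Coefficient: Pearson correlation with future returns."""
    return float(np.corrcoef(signal, future_returns)[0, 1])

def combine(alphas, future_returns):
    """Least-squares linear combination (a stand-in for the paper's
    learned combination model); returns the combined signal's IC."""
    X = np.column_stack(alphas)
    w, *_ = np.linalg.lstsq(X, future_returns, rcond=None)
    return ic(X @ w, future_returns)

# Toy market data: cross-sectional features and next-period returns.
n = 500
features = rng.normal(size=(n, 4))
future_returns = (0.5 * features[:, 0] - 0.3 * features[:, 2]
                  + rng.normal(scale=0.5, size=n))

def sample_alpha():
    """Toy 'generator': a random linear formula over the features.
    The paper's generator is an RL policy emitting expression trees."""
    return features @ rng.normal(size=features.shape[1])

alpha_set = [sample_alpha()]
best = combine(alpha_set, future_returns)
for _ in range(200):
    candidate = sample_alpha()
    new_score = combine(alpha_set + [candidate], future_returns)
    reward = new_score - best   # RL return: marginal contribution to the set
    if reward > 0:              # greedy acceptance as a stand-in for a policy update
        alpha_set.append(candidate)
        best = new_score

print(len(alpha_set), round(best, 3))
```

The key design choice mirrored here is that the reward is not the candidate alpha's standalone quality but the improvement it brings to the existing set, which is what makes the mined collection synergistic rather than a pile of individually strong but redundant signals.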
Complexity vs Empirical Score
- Math Complexity: 7.0/10
- Empirical Rigor: 8.0/10
- Quadrant: Holy Grail
- Why: The paper employs advanced reinforcement learning and policy gradient algorithms for a complex search space (high math), and evaluates the framework on real-world stock data with investment simulations and comparative benchmarks (high rigor).
```mermaid
flowchart TD
A["<b>Research Goal:</b><br>Develop a synergistic alpha mining framework<br>using RL for quantitative trading"] --> B["<b>Data Input:</b><br>Historical Stock Market Data"]
B --> C["<b>Methodology:</b><br>Reinforcement Learning Agent<br>explores vast alpha formula space"]
C --> D["<b>Computational Process:</b><br>Generate & combine alpha sets,<br>evaluate via downstream model performance"]
D --> E["<b>Key Finding 1:</b><br>RL framework finds synergistic<br>alpha sets effectively"]
D --> F["<b>Key Finding 2:</b><br>Higher investment returns vs.<br>traditional single-alpha mining"]
```