Alpha Mining and Enhancing via Warm Start Genetic Programming for Quantitative Investment
ArXiv ID: 2412.00896
Authors: Unknown
Abstract
Traditional genetic programming (GP) often struggles in stock alpha factor discovery because of its vast search space, heavy computational burden, and the sparsity of effective alphas. We find that GP performs better when it focuses on promising regions of the search space rather than searching at random. This paper proposes a new GP framework with carefully chosen initialization and structural constraints to enhance search performance and improve the interpretability of the alpha factors. The approach is motivated by, and mimics, the way practitioners search for alphas, and it aims to make that process more efficient. Analysis of 2020-2024 Chinese stock market data shows that our method yields superior out-of-sample prediction results and higher portfolio returns than the benchmark.
Keywords: Genetic Programming (GP), Alpha Factor Discovery, Structural Constraints, Portfolio Optimization, Interpretability, Equities
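To make the warm-start idea in the abstract concrete, here is a minimal sketch of a GP population initialized from known seed alphas under a depth constraint, rather than from purely random trees. Everything named here is an assumption for illustration only: the operator and feature names (`rank`, `ts_mean_5`, `vwap`, and so on), the two seed expressions, the depth cap of 4, and the mutation rule are hypothetical and are not taken from the paper.

```python
import random

# Hypothetical operator and feature sets (not the paper's actual sets).
UNARY = ["rank", "ts_mean_5", "ts_std_20"]
BINARY = ["add", "sub", "mul", "div"]
FEATURES = ["close", "open", "volume", "vwap"]
MAX_DEPTH = 4  # assumed structural constraint: cap on expression-tree depth

# Hypothetical seed alphas, stored as nested tuples: (operator, child, ...).
SEED_ALPHAS = [
    ("sub", ("ts_mean_5", "close"), "close"),
    ("rank", ("div", "volume", ("ts_mean_5", "volume"))),
]


def depth(tree):
    """Depth of an expression tree; a bare feature name has depth 1."""
    if isinstance(tree, str):
        return 1
    return 1 + max(depth(child) for child in tree[1:])


def random_tree(rng, max_depth):
    """Grow a random tree that is never deeper than max_depth."""
    if max_depth <= 1 or rng.random() < 0.3:
        return rng.choice(FEATURES)
    if rng.random() < 0.5:
        return (rng.choice(UNARY), random_tree(rng, max_depth - 1))
    return (
        rng.choice(BINARY),
        random_tree(rng, max_depth - 1),
        random_tree(rng, max_depth - 1),
    )


def mutate(tree, rng, max_depth):
    """Replace one randomly chosen subtree, staying within max_depth."""
    if isinstance(tree, str) or rng.random() < 0.3:
        return random_tree(rng, max_depth)
    children = list(tree[1:])
    idx = rng.randrange(len(children))
    children[idx] = mutate(children[idx], rng, max_depth - 1)
    return (tree[0], *children)


def warm_start_population(seeds, size, rng, max_depth=MAX_DEPTH):
    """Keep the valid seeds, then fill the population with mutated seeds."""
    valid = [s for s in seeds if depth(s) <= max_depth]
    if not valid:
        raise ValueError("no seed satisfies the depth constraint")
    population = list(valid[:size])
    while len(population) < size:
        population.append(mutate(rng.choice(valid), rng, max_depth))
    return population
```

Calling `warm_start_population(SEED_ALPHAS, 20, random.Random(0))` returns 20 trees that begin with the two seeds and all respect the depth cap. A cold-start baseline would instead fill the whole population with `random_tree` calls, which is the "searching at random" case the abstract contrasts against.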
Complexity vs Empirical Score
- Math Complexity: 4.0/10
- Empirical Rigor: 8.0/10
- Quadrant: Street Traders
- Why: The paper is primarily methodological, focusing on algorithmic improvements (tree structures, initialization) with minimal advanced mathematics beyond basic statistics. It is highly empirical, as evidenced by backtests on real Chinese stock market data from 2020 to 2024, specific performance metrics such as IC and portfolio returns, and a clear comparison to benchmarks.
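For readers unfamiliar with the IC metric mentioned above, the information coefficient is the cross-sectional correlation between a factor's values and subsequent returns. The sketch below computes the rank (Spearman) variant for a single date using only the standard library. The choice of the rank variant, the function names, and the toy numbers are assumptions for illustration; the summary does not state which IC definition the paper uses.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; NaN when either input has zero variance."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / (var_x * var_y) ** 0.5


def rank_ic(factor_values, forward_returns):
    """Spearman rank IC across stocks for one date."""
    if len(factor_values) != len(forward_returns):
        raise ValueError("inputs must have one entry per stock")
    return pearson(average_ranks(factor_values), average_ranks(forward_returns))


# Toy cross-section of four stocks (made-up numbers).
factor = [0.1, 0.4, 0.2, 0.9]
next_day_returns = [-0.01, 0.02, 0.00, 0.03]
ic = rank_ic(factor, next_day_returns)
```

In this toy cross-section the factor orders the four stocks exactly as their next-day returns do, so the rank IC is 1. In practice the per-date IC is usually averaged over the out-of-sample period to summarize a factor's predictive power.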
flowchart TD
A["Research Goal:<br>Efficient Alpha Factor Discovery<br>via Genetic Programming"] --> B{"Data & Inputs"}
B --> B1["Stock Market Data<br>(2020-2024 China)"]
B --> B2["Genetic Programming<br>Framework"]
B --> B3["Structural Constraints<br>& Warm Start"]
B --> C["Methodology: GP Initialization"]
C --> D{"Optimization Process"}
D --> D1["Focused Search<br>Promising Regions"]
D --> D2["Structural Constraints<br>Reduce Search Space"]
D --> D3["Interpretability<br>Enhancement"]
D --> E["Outcomes & Findings"]
E --> E1["Superior Out-of-Sample<br>Prediction Results"]
E --> E2["Higher Portfolio Returns<br>vs Benchmark"]
style A fill:#e1f5e1
style E fill:#fff2cc