Alpha Mining and Enhancing via Warm Start Genetic Programming for Quantitative Investment

ArXiv ID: 2412.00896

Authors: Unknown

Abstract

Traditional genetic programming (GP) often struggles in stock alpha factor discovery because of its vast search space, heavy computational burden, and the sparsity of effective alphas. We find that GP performs better when it focuses on promising regions of the search space rather than searching at random. This paper proposes a new GP framework with carefully chosen initialization and structural constraints to enhance search performance and improve the interpretability of the alpha factors. The approach is motivated by, and mimics, how practitioners search for alphas, and it aims to make that process more efficient. Analysis of 2020-2024 Chinese stock market data shows that our method yields superior out-of-sample prediction results and higher portfolio returns than the benchmark.
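To make the idea of "carefully chosen initialization" concrete, the following is a minimal sketch of a warm-started GP population, not the paper's implementation. Everything in it is an assumption made for illustration: the operator and feature names, the tuple-based tree encoding, the `seed_fraction` parameter, and the choice to perturb only the leaves of a seed so that its operator structure is preserved.

```python
import random

# Hypothetical operator and feature sets, chosen only for illustration.
UNARY_OPS = ["rank", "ts_mean_5", "ts_std_20"]
BINARY_OPS = ["add", "sub", "mul", "div"]
FEATURES = ["open", "high", "low", "close", "volume"]


def random_tree(rng, depth):
    """Grow a random expression tree (nested tuples) with at most `depth` operator levels."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(FEATURES)
    if rng.random() < 0.5:
        return (rng.choice(UNARY_OPS), random_tree(rng, depth - 1))
    return (
        rng.choice(BINARY_OPS),
        random_tree(rng, depth - 1),
        random_tree(rng, depth - 1),
    )


def mutate_leaf(rng, tree):
    """Copy `tree`, re-drawing one randomly chosen leaf feature.

    Operators and tree shape are left untouched, which is one simple
    (assumed) way to keep a seed's structure fixed while varying inputs.
    """
    if isinstance(tree, str):
        return rng.choice(FEATURES)
    op, *children = tree
    i = rng.randrange(len(children))
    children[i] = mutate_leaf(rng, children[i])
    return (op, *children)


def warm_start_population(seeds, size, seed_fraction=0.5, max_depth=3, rng=None):
    """Build an initial population: part perturbed seeds, part random trees."""
    rng = rng or random.Random(0)
    n_seeded = int(size * seed_fraction)
    # Warm-started individuals: small variations around known-promising formulas.
    population = [mutate_leaf(rng, rng.choice(seeds)) for _ in range(n_seeded)]
    # Remaining individuals: ordinary random initialization for diversity.
    population += [random_tree(rng, max_depth) for _ in range(size - n_seeded)]
    return population
```

For example, `warm_start_population([("sub", "close", ("ts_mean_5", "close"))], size=10)` would return ten trees, the first five of which share the seed's `sub`/`ts_mean_5` structure. A real system would still need fitness evaluation, crossover, and selection, none of which are sketched here.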

Keywords: Genetic Programming (GP), Alpha Factor Discovery, Structural Constraints, Portfolio Optimization, Interpretability, Equities

Complexity vs Empirical Score

Math Complexity: 4.0/10
Empirical Rigor: 8.0/10
Quadrant: Street Traders
Why: The paper is primarily methodological, focusing on algorithmic improvements (tree structures, initialization) with little advanced mathematics beyond basic statistics. It is highly empirical: it backtests on real Chinese stock market data from 2020-2024, reports specific performance metrics such as the information coefficient (IC) and portfolio returns, and compares its results directly against benchmarks.
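For readers unfamiliar with IC, here is a minimal sketch of one common way to compute it: the Spearman rank correlation between a factor's values and the subsequent returns across stocks on a given day, averaged over days. This summary does not state which IC definition the paper uses, so the rank-based variant and the function names below are assumptions.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences.

    Raises ZeroDivisionError if either sequence is constant.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def rank_ic(factor_values, forward_returns):
    """Cross-sectional rank IC for one date: Spearman correlation."""
    return pearson(average_ranks(factor_values), average_ranks(forward_returns))


def mean_rank_ic(daily_cross_sections):
    """Average the per-date rank IC over (factor_values, forward_returns) pairs."""
    ics = [rank_ic(f, r) for f, r in daily_cross_sections]
    return sum(ics) / len(ics)
```

A factor that orders stocks exactly as their next-period returns do has a rank IC of 1, and one that orders them in exactly the reverse way has a rank IC of -1.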

  flowchart TD
    A["Research Goal:<br>Efficient Alpha Factor Discovery<br>via Genetic Programming"] --> B{"Data & Inputs"}
    B --> B1["Stock Market Data<br>(2020-2024 China)"]
    B --> B2["Genetic Programming<br>Framework"]
    B --> B3["Structural Constraints<br>& Warm Start"]
    
    B --> C["Methodology: GP Initialization"]
    C --> D{"Optimization Process"}
    D --> D1["Focused Search<br>Promising Regions"]
    D --> D2["Structural Constraints<br>Reduce Search Space"]
    D --> D3["Interpretability<br>Enhancement"]
    
    D --> E["Outcomes & Findings"]
    E --> E1["Superior Out-of-Sample<br>Prediction Results"]
    E --> E2["Higher Portfolio Returns<br>vs Benchmark"]
    
    style A fill:#e1f5e1
    style E fill:#fff2cc
