FDR-Controlled Portfolio Optimization for Sparse Financial Index Tracking
ArXiv ID: 2401.15139
Authors: Unknown
Abstract
In high-dimensional data analysis, such as financial index tracking or biomedical applications, it is crucial to select the few relevant variables while maintaining control over the false discovery rate (FDR). In these applications, strong dependencies often exist among the variables (e.g., stock returns), which can undermine the FDR control property of existing methods like the model-X knockoff method or the T-Rex selector. To address this issue, we have expanded the T-Rex framework to accommodate overlapping groups of highly correlated variables. This is achieved by integrating a nearest neighbors penalization mechanism into the framework, which provably controls the FDR at the user-defined target level. A real-world example of sparse index tracking demonstrates the proposed method’s ability to accurately track the S&P 500 index over the past 20 years based on a small number of stocks. An open-source implementation is provided within the R package TRexSelector on CRAN.
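To make the FDR terminology concrete: the FDR is the expected value of the false discovery proportion (FDP), the share of selected variables that are in fact irrelevant. The sketch below computes the FDP and the true positive proportion (TPP) for a single selection. The ticker names and the "relevant" set are hypothetical and purely illustrative; in real data the relevant set is unknown, which is why a selector with a provable guarantee is needed.

```python
def false_discovery_proportion(selected, relevant):
    """Share of selected variables that are not truly relevant (0 if nothing is selected)."""
    selected, relevant = set(selected), set(relevant)
    false_discoveries = len(selected - relevant)
    return false_discoveries / max(len(selected), 1)


def true_positive_proportion(selected, relevant):
    """Share of truly relevant variables that were selected (0 if none are relevant)."""
    selected, relevant = set(selected), set(relevant)
    return len(selected & relevant) / max(len(relevant), 1)


# Hypothetical example: four selected stocks, five truly relevant ones.
selected = {"AAPL", "MSFT", "XOM", "JPM"}
relevant = {"AAPL", "MSFT", "JPM", "GOOG", "PG"}

fdp = false_discovery_proportion(selected, relevant)  # one false discovery out of four
tpp = true_positive_proportion(selected, relevant)    # three of five relevant stocks found
print(f"FDP = {fdp:.2f}, TPP = {tpp:.2f}")
```

Controlling the FDR at a target level such as 10% means that the FDP, averaged over repeated data sets, stays at or below 0.10.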
Keywords: variable selection, false discovery rate (FDR) control, knockoff filters, sparse index tracking, high-dimensional statistics, Equities
Complexity vs Empirical Score
- Math Complexity: 8.5/10
- Empirical Rigor: 7.0/10
- Quadrant: Holy Grail
- Why: The paper presents sophisticated mathematical derivations including FDR control proofs and a novel nearest neighbors penalization mechanism (high math). It includes a real-world 20-year backtest on S&P 500 data, an open-source R package implementation, and mentions high-performance computing for empirical validation (high rigor).
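As an illustration of what "overlapping groups of highly correlated variables" can look like, the sketch below gives each variable a neighborhood of all variables whose absolute correlation with it reaches a threshold. This is not the paper's algorithm or the TRexSelector API: the function name `neighbor_groups`, the threshold `rho`, and the correlation values are assumptions made for the example, and the summary above does not describe how the penalization itself uses such groups.

```python
def neighbor_groups(names, corr, rho):
    """Map each variable to the set of variables with |correlation| >= rho (itself included).

    corr holds one entry per unordered pair, keyed by a (name_a, name_b) tuple.
    """
    groups = {name: {name} for name in names}
    for (a, b), value in corr.items():
        if abs(value) >= rho:
            groups[a].add(b)
            groups[b].add(a)
    return groups


# Hypothetical pairwise correlations between four return series.
names = ["A", "B", "C", "D"]
corr = {
    ("A", "B"): 0.90,
    ("B", "C"): 0.85,
    ("A", "C"): 0.60,
    ("A", "D"): 0.10,
    ("B", "D"): 0.05,
    ("C", "D"): -0.20,
}

groups = neighbor_groups(names, corr, rho=0.8)
for name in names:
    print(name, sorted(groups[name]))
```

Here B belongs to the neighborhoods of both A and C, while A and C are not neighbors of each other, so the groups overlap rather than forming a partition.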
flowchart TD
A["Research Goal: FDR-controlled sparse index tracking with dependent variables"] --> B["Propose Method: <br>Extended T-Rex with nearest neighbors penalization"]
B --> C["Data: 20-year S&P 500 stock returns"]
C --> D["Computation: <br>FDR-controlled variable selection via overlapping groups"]
D --> E["Findings: <br>Accurate tracking with minimal stocks"]
E --> F["Outcome: R package TRexSelector on CRAN"]
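One simple way to quantify how closely a small portfolio follows the index is the root mean squared difference between portfolio and index returns. The sketch below assumes this definition of tracking error, which may differ from the measure used in the paper, and uses made-up weights and returns.

```python
from math import sqrt


def portfolio_returns(weights, returns):
    """Per-period portfolio return as the weighted sum of the selected stocks' returns."""
    periods = len(next(iter(returns.values())))
    return [
        sum(w * returns[name][t] for name, w in weights.items())
        for t in range(periods)
    ]


def tracking_error(weights, returns, index_returns):
    """Root mean squared difference between portfolio and index returns."""
    port = portfolio_returns(weights, returns)
    squared = [(p - i) ** 2 for p, i in zip(port, index_returns)]
    return sqrt(sum(squared) / len(squared))


# Hypothetical data: two selected stocks, three periods.
weights = {"A": 0.5, "B": 0.5}
returns = {"A": [0.02, -0.01, 0.03], "B": [0.00, 0.01, 0.01]}
index_returns = [0.012, -0.001, 0.018]

te = tracking_error(weights, returns, index_returns)
print(f"tracking error = {te:.6f}")
```

A sparse tracking portfolio aims to keep this number small while holding only a few of the index constituents.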