Sparse Asymptotic PCA: Identifying Sparse Latent Factors Across Time Horizon in High-Dimensional Time Series
ArXiv ID: 2407.09738
Authors: Unknown
Abstract
This paper introduces a novel sparse latent factor modeling framework using sparse asymptotic Principal Component Analysis (APCA) to analyze the co-movements of high-dimensional panel data over time. Unlike existing methods based on sparse PCA, which assume sparsity in the loading matrices, our approach posits sparsity in the factor processes while allowing non-sparse loadings. This is motivated by the fact that financial returns typically exhibit universal and non-sparse exposure to market factors. Unlike the commonly used $\ell_1$-relaxation in sparse PCA, the proposed sparse APCA employs a truncated power method to estimate the leading sparse factor and a sequential deflation method for multi-factor cases under $\ell_0$-constraints. Furthermore, we develop a data-driven approach to identify the sparsity of risk factors over the time horizon using a novel cross-sectional cross-validation method. We establish the consistency of our estimators under mild conditions as both the dimension $N$ and the sample size $T$ grow. Monte Carlo simulations demonstrate that the proposed method performs well in finite samples. Empirically, we apply our method to daily S&P 500 stock returns (2004–2016) and identify nine risk factors influencing the stock market.
Keywords: Sparse Asymptotic PCA, Factor Modeling, Co-movements, Cross-Sectional Validation, High-Dimensional Panel Data, Equities
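To make the estimation idea in the abstract concrete, here is a minimal sketch of a truncated power iteration followed by sequential deflation. It is an illustration under stated assumptions, not the authors' implementation. It assumes the panel is stored as an $N \times T$ array `X`, that each sparse factor approximately maximizes $f^\top (X^\top X / N) f$ subject to $\|f\|_2 = 1$ and $\|f\|_0 \le k$, that projection deflation is used between factors, and that the sparsity levels are supplied by the user rather than chosen by the paper's cross-sectional cross-validation. The function names `truncate`, `truncated_power`, and `sparse_apca_sketch` are hypothetical.

```python
import numpy as np


def truncate(v, k):
    """Keep the k largest-magnitude entries of v, zero the rest, rescale to unit norm."""
    out = np.zeros_like(v, dtype=float)
    idx = np.argpartition(np.abs(v), -k)[-k:]  # indices of the k largest |v_i|
    out[idx] = v[idx]
    norm = np.linalg.norm(out)
    return out / norm if norm > 0 else out


def truncated_power(A, k, n_iter=200, tol=1e-10):
    """Approximate the leading k-sparse eigenvector of a symmetric PSD matrix A."""
    # Assumed initialization: truncate the ordinary leading eigenvector.
    _, vecs = np.linalg.eigh(A)  # eigenvalues ascending, so the last column leads
    f = truncate(vecs[:, -1], k)
    for _ in range(n_iter):
        f_new = truncate(A @ f, k)  # power step, then hard thresholding
        converged = np.linalg.norm(f_new - f) < tol
        f = f_new
        if converged:
            break
    return f


def sparse_apca_sketch(X, k_list):
    """X is N x T. Returns (F, L): T x r sparse factors and N x r dense loadings."""
    N, T = X.shape
    A = X.T @ X / N  # T x T matrix, as in asymptotic PCA
    factors = []
    for k in k_list:
        f = truncated_power(A, k)
        factors.append(f)
        # Assumed deflation scheme: project out the direction just found.
        P = np.eye(T) - np.outer(f, f)
        A = P @ A @ P
    F = np.column_stack(factors)
    # Loadings by least squares of X on the estimated factors (not sparse).
    L = X @ F @ np.linalg.pinv(F.T @ F)
    return F, L
```

Calling `sparse_apca_sketch(X, [k1, k2])` returns one factor per entry of the list, each with at most the requested number of nonzero time periods, while the loadings stay dense, which mirrors the abstract's contrast with sparse-loading PCA.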
Complexity vs Empirical Score
- Math Complexity: 8.5/10
- Empirical Rigor: 6.5/10
- Quadrant: Holy Grail
- Why: The paper relies on advanced mathematics, including ℓ₀-constraints, the truncated power method, and asymptotic theory for dependent data, and it supports the theory with an empirical application to specific datasets and event analysis.
flowchart TD
A["Research Goal<br>Sparse Latent Factors in<br>High-Dim Time Series"] --> B["Input Data<br>Panel Data / Time Horizon"]
B --> C["Methodology<br>Sparse Asymptotic PCA via<br>Truncated Power & Deflation"]
C --> D{"Cross-Sectional<br>Cross-Validation"}
D -->|Optimal| E["Computational Process<br>Estimate Sparse Factors &<br>Non-Sparse Loadings"]
E --> F["Outcomes<br>Consistent Estimators &<br>9 Identified Risk Factors"]