Machine learning approach to stock price crash risk
ArXiv ID: 2505.16287
Authors: Abdullah Karasan, Ozge Sezgin Alp, Gerhard-Wilhelm Weber
Abstract
In this study, we propose a novel machine-learning-based measure for stock price crash risk, utilizing the minimum covariance determinant methodology. Employing this newly introduced dependent variable, we predict stock price crash risk through cross-sectional regression analysis. The findings confirm that the proposed method effectively captures stock price crash risk, with the model demonstrating strong performance in terms of both statistical significance and economic relevance. Furthermore, leveraging a newly developed firm-specific investor sentiment index, the analysis identifies a positive correlation between stock price crash risk and firm-specific investor sentiment. Specifically, higher levels of sentiment are associated with an increased likelihood of stock price crash risk. This relationship remains robust across different firm sizes and when using the detoned version of the firm-specific investor sentiment index, further validating the reliability of the proposed approach.
Keywords: Minimum Covariance Determinant, Crash Risk Prediction, Cross-sectional Regression, Investor Sentiment Analysis, Anomaly Detection, Equities
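To make the minimum covariance determinant (MCD) idea concrete, here is a minimal sketch under simplifying assumptions, not the authors' construction. It treats one firm's weekly returns as univariate data, where the MCD estimate is the contiguous subset of the sorted sample with the smallest variance, and flags weeks whose return falls far below that robust center. The function names, the `support_fraction` of 0.75, the `cutoff` of 3.0, and the sample returns are all hypothetical, and the consistency correction usually applied to the raw MCD scale is omitted.

```python
import math
import statistics


def univariate_mcd(values, support_fraction=0.75):
    """Raw MCD location and scale for 1-D data.

    In one dimension the h-subset with the smallest variance is always
    contiguous in the sorted sample, so an exhaustive scan over windows
    of length h is exact.
    """
    xs = sorted(values)
    n = len(xs)
    h = max(2, math.ceil(support_fraction * n))  # size of the "clean" subset
    best_var, best_loc = None, None
    for start in range(n - h + 1):
        window = xs[start:start + h]
        var = statistics.pvariance(window)
        if best_var is None or var < best_var:
            best_var, best_loc = var, statistics.fmean(window)
    return best_loc, math.sqrt(best_var)


def flag_crash_weeks(returns, cutoff=3.0, support_fraction=0.75):
    """Indices of weeks whose robust z-score is below -cutoff (assumed rule)."""
    loc, scale = univariate_mcd(returns, support_fraction)
    if scale == 0:
        return []  # degenerate sample: no spread to measure outliers against
    return [i for i, r in enumerate(returns) if (r - loc) / scale < -cutoff]


# Made-up weekly returns with one extreme negative week at index 6.
weekly_returns = [0.01, -0.02, 0.015, 0.0, -0.01, 0.02, -0.25, 0.005, -0.005, 0.01]
print(flag_crash_weeks(weekly_returns))
```

Because the location and scale come from the tightest 75% of observations, the extreme week does not inflate the scale it is measured against, which is the main reason to prefer a robust estimator over the plain sample mean and standard deviation for this kind of flagging.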
Complexity vs Empirical Score
- Math Complexity: 5.0/10
- Empirical Rigor: 6.0/10
- Quadrant: Holy Grail
- Why: The paper applies a machine learning technique (Minimum Covariance Determinant) to anomaly detection and uses cross-sectional regression with statistical significance testing, which reflects moderate mathematical sophistication. It is data-heavy and backtest-ready, with robustness checks, a firm-specific sentiment index, and cross-sectional analysis, which indicates substantial empirical work.
flowchart TD
A["Research Goal: Predict Stock Price Crash Risk"] --> B["Data Input: Stock Returns & Financial Data"]
B --> C["Compute New Measure: Minimum Covariance Determinant"]
C --> D["Methodology: Cross-Sectional Regression Analysis"]
D --> E["Compute Firm-Specific Investor Sentiment Index"]
D --> F["Identify Correlation: Sentiment vs. Crash Risk"]
E --> F
F --> G["Key Finding: Positive Correlation Confirmed"]
F --> H["Key Finding: Robust Across Firm Sizes & Detoned Index"]
G --> I["Outcome: Validated ML Approach & Economic Relevance"]
H --> I
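The regression step in the flowchart can be illustrated with a minimal sketch: a single-regressor ordinary least squares fit of a crash-risk measure on a sentiment score across firms, returning the slope and its t-statistic. This is an assumption-laden simplification, since the paper's actual specification, control variables, and standard errors are not given in this summary. The function name `simple_ols` and the sample numbers are hypothetical.

```python
import math


def simple_ols(x, y):
    """OLS of y on a constant and x; returns (intercept, slope, slope_se, t_stat)."""
    n = len(x)
    if n < 3 or n != len(y):
        raise ValueError("need at least 3 paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    sigma2 = sum(r * r for r in residuals) / (n - 2)  # residual variance
    slope_se = math.sqrt(sigma2 / sxx)
    t_stat = slope / slope_se if slope_se > 0 else math.inf
    return intercept, slope, slope_se, t_stat


# Made-up cross-section: one sentiment score and one crash-risk value per firm.
sentiment = [0.1, 0.4, 0.2, 0.8, 0.5, 0.9]
crash_risk = [0.0, 0.2, 0.1, 0.5, 0.2, 0.6]
intercept, slope, slope_se, t_stat = simple_ols(sentiment, crash_risk)
print(f"slope={slope:.3f}, t={t_stat:.2f}")
```

A positive slope with a large t-statistic on real data would be the single-regressor analogue of the positive sentiment and crash-risk relationship reported in the abstract; the illustrative numbers above carry no empirical meaning.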