Identifying Extreme Events in the Stock Market: A Topological Data Analysis
ArXiv ID: 2405.16052
Authors: Unknown
Abstract
This paper employs Topological Data Analysis (TDA) to detect extreme events (EEs) in the stock market at a continental level. Previous approaches, which analyzed stock indices separately, could not detect EEs across multiple time series at once. TDA provides a robust framework for such analysis and identifies the EEs during crashes in different indices. The analysis shows that the $L^1$ and $L^2$ norms and the Wasserstein distance ($W_D$) of the world's leading indices rise abruptly during crashes, surpassing a threshold of $\mu + 4\sigma$, where $\mu$ and $\sigma$ are the mean and the standard deviation of the norm or $W_D$, respectively. Our study identified the stock index crashes of the 2008 financial crisis and the COVID-19 pandemic as EEs across continents. Because different sectors in an index behave differently, a sector-wise analysis of the Indian stock market was conducted for the COVID-19 pandemic. The sector-wise results show that, after the EE, the banking sector experienced strong crashes surpassing $\mu + 2\sigma$ for an extended period, whereas the pharmaceutical sector showed no significant spikes. This indicates that the banking sector continued to face stress and remained volatile even after the crash. Hence, TDA also proves successful in identifying the duration of shocks after an EE. The study demonstrates the applicability of TDA as a powerful analytical tool for studying EEs in various fields.
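The thresholding rule described in the abstract can be illustrated with a minimal sketch. It assumes the norm or $W_D$ series has already been computed, and that $\mu$ and $\sigma$ are taken over the whole series using the population standard deviation; the abstract does not specify either choice. The function name `flag_extreme_events` and the parameter `k` are hypothetical.

```python
from statistics import mean, pstdev


def flag_extreme_events(series, k=4.0):
    """Return (threshold, indices) where a value exceeds mean + k * std.

    Assumption: mean and std are computed over the full series with the
    population standard deviation. The paper may use a different convention.
    """
    mu = mean(series)
    sigma = pstdev(series)
    threshold = mu + k * sigma
    events = [i for i, value in enumerate(series) if value > threshold]
    return threshold, events


if __name__ == "__main__":
    # Synthetic series: a calm baseline with one abrupt spike at index 60.
    series = [1.0 + 0.1 * ((i % 5) - 2) for i in range(100)]
    series[60] = 10.0
    threshold, events = flag_extreme_events(series, k=4.0)
    print(f"threshold = {threshold:.3f}, flagged indices = {events}")
```

Calling the same function with `k=2.0` gives the lower $\mu + 2\sigma$ level that the abstract uses for the sector-wise analysis.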
Keywords: Topological Data Analysis (TDA), Extreme Events Detection, Financial Crashes, Multivariate Time Series, Wasserstein Distance, Equities (Stock Indices)
Complexity vs Empirical Score
- Math Complexity: 8.5/10
- Empirical Rigor: 3.0/10
- Quadrant: Lab Rats
- Why: The paper employs advanced mathematical concepts from topological data analysis, including persistent homology, Vietoris–Rips complexes, and norms of persistence landscapes, indicating high math complexity. However, it lacks detailed backtesting, specific implementation code, or out-of-sample performance metrics, focusing instead on theoretical framework and descriptive results, resulting in low empirical rigor.
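As a rough illustration of what a norm of a persistence landscape measures, the sketch below evaluates the landscape of a small persistence diagram on a grid and integrates it numerically. It assumes the common convention in which the $L^p$ norm sums over all landscape layers, $\|\lambda\|_p = \left(\sum_k \int |\lambda_k(t)|^p \, dt\right)^{1/p}$; the summary does not confirm which convention the paper uses. The function names and the toy diagram are hypothetical.

```python
def landscape_layers(diagram, grid):
    """Evaluate landscape layers on a grid.

    layers[k][j] is the (k+1)-th largest tent value at grid[j], where each
    (birth, death) pair contributes the tent max(0, min(t - birth, death - t)).
    """
    layers = [[0.0] * len(grid) for _ in diagram]
    for j, t in enumerate(grid):
        tents = sorted(
            (max(0.0, min(t - b, d - t)) for b, d in diagram), reverse=True
        )
        for k, value in enumerate(tents):
            layers[k][j] = value
    return layers


def landscape_norm(diagram, p=1, n_grid=2001):
    """Approximate the L^p norm of a persistence landscape (trapezoid rule)."""
    if not diagram:
        return 0.0
    lo = min(b for b, _ in diagram)
    hi = max(d for _, d in diagram)
    step = (hi - lo) / (n_grid - 1)
    grid = [lo + j * step for j in range(n_grid)]
    total = 0.0
    for layer in landscape_layers(diagram, grid):
        vals = [v ** p for v in layer]
        total += step * (sum(vals) - 0.5 * (vals[0] + vals[-1]))
    return total ** (1.0 / p)


if __name__ == "__main__":
    toy_diagram = [(0.0, 2.0), (1.0, 4.0)]  # hypothetical (birth, death) pairs
    print(landscape_norm(toy_diagram, p=1), landscape_norm(toy_diagram, p=2))
```

Under this convention the norms also have closed forms, $\sum_i (d_i - b_i)^2 / 4$ for $L^1$ and $\sqrt{\sum_i (d_i - b_i)^3 / 12}$ for $L^2$, which are useful as a sanity check on the numerical result.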
flowchart TD
A["Research Goal:
Detect Extreme Events in
Multivariate Stock Indices"] --> B["Data Input:
Continental Stock Indices
(e.g., 2008 Crisis, COVID-19)"]
B --> C["Methodology:
Topological Data Analysis
(TDA) via Persistence Diagrams"]
C --> D["Computational Process:
Calculate L1/L2 Norms & Wasserstein Distance (W_D)"]
D --> E{"Threshold Check:
Value > μ + 4σ?"}
E -- Yes --> F["Outcome 1: Identifies Extreme Events
(e.g., Global Market Crashes)"]
E -- No --> G
F --> H["Outcome 2: Sector Analysis
(e.g., Banking vs. Pharma)"]
H --> G["Final Outcome:
TDA successfully detects shock duration;
Banking sector shows prolonged stress"]
style A fill:#f9f,stroke:#333,stroke-width:2px
style G fill:#bbf,stroke:#333,stroke-width:2px
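The Wasserstein distance step in the flowchart compares two persistence diagrams by optimally matching their points, with unmatched points sent to the diagonal. The following is a brute-force sketch for very small diagrams only. It assumes the $L^\infty$ ground metric between points and the exponent `p` as a free parameter; the summary does not state which variant the paper uses, and the function name is hypothetical.

```python
from itertools import permutations


def wasserstein_distance(diag_a, diag_b, p=1):
    """Brute-force p-Wasserstein distance between two tiny persistence diagrams.

    Each diagram is a list of (birth, death) pairs. Every point may be matched
    to a point of the other diagram or to the diagonal. The search is
    factorial in the total number of points, so this is for illustration only.
    """
    n, m = len(diag_a), len(diag_b)
    size = n + m
    if size == 0:
        return 0.0

    def to_diagonal(point):
        birth, death = point
        return (death - birth) / 2.0  # L-infinity distance to the diagonal

    def cost(i, j):
        # Rows: n points of diag_a, then m diagonal slots.
        # Columns: m points of diag_b, then n diagonal slots.
        if i < n and j < m:
            (b1, d1), (b2, d2) = diag_a[i], diag_b[j]
            return max(abs(b1 - b2), abs(d1 - d2)) ** p
        if i < n:
            return to_diagonal(diag_a[i]) ** p
        if j < m:
            return to_diagonal(diag_b[j]) ** p
        return 0.0  # diagonal matched to diagonal

    best = min(
        sum(cost(i, j) for i, j in enumerate(perm))
        for perm in permutations(range(size))
    )
    return best ** (1.0 / p)


if __name__ == "__main__":
    print(wasserstein_distance([(0.0, 2.0)], [(0.0, 3.0)], p=1))
```

For diagrams of realistic size, an optimal-assignment solver from a dedicated TDA library would replace the permutation search; the brute-force version is only meant to make the matching cost explicit.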