Demystifying the trend of the healthcare index: Is historical price a key driver?
ArXiv ID: 2601.14062
Authors: Payel Sadhukhan, Samrat Gupta, Subhasis Ghosh, Tanujit Chakraborty
Abstract
Healthcare sector indices consolidate the economic health of pharmaceutical, biotechnology, and healthcare service firms. The short-term movements in these indices are closely intertwined with capital allocation decisions affecting research and development investment, drug availability, and long-term health outcomes. This research investigates whether historical open-high-low-close (OHLC) index data contain sufficient information for predicting the directional movement of the opening index on the subsequent trading day. The problem is formulated as a supervised classification task involving a one-step-ahead rolling window. A diverse feature set is constructed, comprising original prices, volatility-based technical indicators, and a novel class of nowcasting features derived from mutual OHLC ratios. The framework is evaluated on data from healthcare indices in the U.S. and Indian markets over a five-year period spanning multiple economic phases, including the COVID-19 pandemic. The results demonstrate robust predictive performance, with accuracy exceeding 0.8 and Matthews correlation coefficients above 0.6. Notably, the proposed nowcasting features have emerged as a key determinant of the market movement. We have employed the Shapley-based explainability paradigm to further elucidate the contribution of the features: outcomes reveal the dominant role of the nowcasting features, followed by a more moderate contribution of original prices. This research offers a societal utility: the proposed features and model for short-term forecasting of healthcare indices can reduce information asymmetry and support a more stable and equitable health economy.
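The abstract does not spell out the exact ratio definitions, so the following is only a minimal sketch of how the prediction target and ratio-style features might be built from daily OHLC bars. The function names `next_open_direction` and `ohlc_ratio_features`, the choice of all ordered pairwise ratios, and the toy prices are assumptions made for illustration, not the authors' specification.

```python
from itertools import permutations

FIELDS = ("open", "high", "low", "close")

def next_open_direction(opens):
    """Label day t with 1 if the open on day t+1 is above the open on day t, else 0.

    The last day has no successor, so the result is one element shorter
    than the input. Comparing open to open is an assumption; the paper
    may define direction relative to a different reference price.
    """
    return [1 if nxt > cur else 0 for cur, nxt in zip(opens, opens[1:])]

def ohlc_ratio_features(bar):
    """Return every ordered pairwise ratio of one day's OHLC values.

    Which mutual ratios the paper actually uses is not stated in the
    abstract; taking all 12 ordered pairs is an assumption.
    """
    return {f"{a}/{b}": bar[a] / bar[b] for a, b in permutations(FIELDS, 2)}

# Toy bars, invented for illustration only.
bars = [
    {"open": 100.0, "high": 104.0, "low": 99.0, "close": 102.0},
    {"open": 102.5, "high": 103.0, "low": 100.0, "close": 101.0},
    {"open": 100.5, "high": 105.0, "low": 100.0, "close": 104.0},
]

labels = next_open_direction([b["open"] for b in bars])
# Features for day t are paired with the label for day t+1,
# so the final bar contributes no training row.
features = [ohlc_ratio_features(b) for b in bars[:-1]]

print(labels)                     # [1, 0]
print(sorted(features[0])[:3])    # first few feature names
```

Pairing the features of day t with the direction of the open on day t+1 keeps the setup strictly one-step-ahead: nothing from the target day leaks into its own feature row.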
Keywords: OHLC data, Shapley explainability, Rolling window classification, Nowcasting features, Volatility-based technical indicators, Healthcare Sector Indices
Complexity vs Empirical Score
- Math Complexity: 5.0/10
- Empirical Rigor: 5.0/10
- Quadrant: Holy Grail
- Why: The paper employs standard ML techniques and Shapley explainability (moderate math), and conducts empirical evaluation on multi-year datasets across two markets with backtest-ready classification metrics.
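For readers unfamiliar with the two classification metrics quoted in the abstract, here is a minimal sketch of how accuracy and the Matthews correlation coefficient (MCC) are computed from binary predictions. The label vectors are invented toy data, not results from the paper; the helper names are likewise illustrative.

```python
import math

def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = up, 0 = down)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)

def mcc(y_true, y_pred):
    """Matthews correlation coefficient; returns 0.0 when the denominator vanishes."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

# Toy example: 10 days, 4 TP, 4 TN, 1 FP, 1 FN.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1, 1, 0]

print(accuracy(y_true, y_pred))  # 0.8
print(mcc(y_true, y_pred))       # 0.6
```

MCC uses all four cells of the confusion matrix, so unlike accuracy it cannot be inflated by always predicting the majority direction; this is why the two numbers are typically reported together.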
flowchart TD
A["Research Goal:<br>Can historical OHLC data predict<br>healthcare index movement?"] --> B["Methodology:<br>Supervised Classification<br>Rolling Window Approach"]
B --> C["Input Data:<br>US & India Healthcare Indices<br>5 Years (2018-2023)"]
C --> D["Feature Engineering:<br>Original Prices, Volatility Indicators<br>Nowcasting OHLC Ratios"]
D --> E["Modeling & Explainability:<br>Machine Learning Model<br>Shapley Value Analysis"]
E --> F["Outcome 1:<br>High Predictive Power<br>Accuracy > 0.8, MCC > 0.6"]
E --> G["Outcome 2:<br>Feature Dominance<br>Nowcasting ratios are top drivers"]
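The rolling-window, one-step-ahead evaluation named in the flowchart can be sketched as a walk-forward loop. This is a hedged illustration only: the fixed-size window, the `rolling_one_step_ahead` and `fit_majority` names, and the majority-class stand-in model are all assumptions, since the summary does not specify the window length or the classifier used.

```python
def fit_majority(X_train, y_train):
    """Stand-in 'model' for illustration: always predict the most frequent
    label seen in the training window (ties resolve to 1).
    A real pipeline would fit an actual classifier on X_train here."""
    label = 1 if 2 * sum(y_train) >= len(y_train) else 0
    return lambda x: label

def rolling_one_step_ahead(X, y, window, fit):
    """Walk forward through time: for each day t >= window, fit on the
    previous `window` days only, then predict day t.

    Returns the predictions and the matching true labels.
    """
    preds = []
    for t in range(window, len(y)):
        model = fit(X[t - window:t], y[t - window:t])  # past data only
        preds.append(model(X[t]))
    return preds, y[window:]

# Toy data: one dummy feature per day and an invented direction label.
X = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6], [0.7], [0.8]]
y = [1, 1, 0, 1, 0, 0, 1, 1]

preds, actual = rolling_one_step_ahead(X, y, window=3, fit=fit_majority)
print(preds)   # [1, 1, 0, 0, 0]
print(actual)  # [1, 0, 0, 1, 1]
```

Refitting inside the loop on a window that ends strictly before day t is what prevents look-ahead bias; swapping `fit_majority` for a real classifier leaves the evaluation loop unchanged.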