Demystifying the trend of the healthcare index: Is historical price a key driver?

ArXiv ID: 2601.14062

Authors: Payel Sadhukhan, Samrat Gupta, Subhasis Ghosh, Tanujit Chakraborty

Abstract

Healthcare sector indices consolidate the economic health of pharmaceutical, biotechnology, and healthcare service firms. The short-term movements in these indices are closely intertwined with capital allocation decisions affecting research and development investment, drug availability, and long-term health outcomes. This research investigates whether historical open-high-low-close (OHLC) index data contain sufficient information for predicting the directional movement of the opening index on the subsequent trading day. The problem is formulated as a supervised classification task involving a one-step-ahead rolling window. A diverse feature set is constructed, comprising original prices, volatility-based technical indicators, and a novel class of nowcasting features derived from mutual OHLC ratios. The framework is evaluated on data from healthcare indices in the U.S. and Indian markets over a five-year period spanning multiple economic phases, including the COVID-19 pandemic. The results demonstrate robust predictive performance, with accuracy exceeding 0.8 and Matthews correlation coefficients above 0.6. Notably, the proposed nowcasting features emerge as a key determinant of market movement. A Shapley-based explainability analysis further elucidates the contribution of the features: the nowcasting features play the dominant role, followed by a more moderate contribution from the original prices. The research also has societal utility: the proposed features and model for short-term forecasting of healthcare indices can reduce information asymmetry and support a more stable and equitable health economy.
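
To make the problem setup in the abstract concrete, the following is a minimal sketch of how ratio features and a next-day direction label could be built from daily OHLC bars. The abstract does not define the "mutual OHLC ratios" or the exact reference point for the direction label, so everything here is an assumption for illustration: the features are taken to be the six pairwise ratios among open, high, low, and close, and the label is 1 when the next day's open is above the current day's open. The function names `ohlc_ratio_features` and `build_dataset` and the sample bars are hypothetical.

```python
from itertools import combinations

FIELDS = ("open", "high", "low", "close")


def ohlc_ratio_features(bar):
    """Return the six pairwise ratios among the four prices of one daily bar.

    Assumption: this is one plausible reading of "mutual OHLC ratios";
    the paper's actual feature definitions may differ.
    """
    return {f"{a}/{b}": bar[a] / bar[b] for a, b in combinations(FIELDS, 2)}


def build_dataset(bars):
    """Pair each day's features with the direction of the next day's open.

    Assumption: label 1 means open[t+1] > open[t], otherwise 0.
    The last bar has no following day, so it yields no training row.
    """
    rows, labels = [], []
    for today, tomorrow in zip(bars, bars[1:]):
        rows.append(ohlc_ratio_features(today))
        labels.append(1 if tomorrow["open"] > today["open"] else 0)
    return rows, labels


# Made-up sample bars, not data from the paper.
bars = [
    {"open": 100.0, "high": 104.0, "low": 99.0, "close": 103.0},
    {"open": 102.0, "high": 103.0, "low": 98.0, "close": 99.0},
    {"open": 99.5, "high": 101.0, "low": 97.0, "close": 100.0},
]
rows, labels = build_dataset(bars)
```

Ratios such as `high/low` or `open/close` are scale-free, which is one reason features of this kind can be compared across indices quoted at very different price levels, such as the U.S. and Indian indices mentioned above.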

Keywords: OHLC data, Shapley explainability, Rolling window classification, Nowcasting features, Volatility-based technical indicators, Healthcare Sector Indices

Complexity vs Empirical Score

Math Complexity: 5.0/10
Empirical Rigor: 5.0/10
Quadrant: Holy Grail
Why: The paper employs standard ML techniques and Shapley explainability (moderate math), and conducts empirical evaluation on multi-year datasets across two markets with backtest-ready classification metrics.

  flowchart TD
    A["Research Goal:<br>Can historical OHLC data predict<br>healthcare index movement?"] --> B["Methodology:<br>Supervised Classification<br>Rolling Window Approach"]
    B --> C["Input Data:<br>US & India Healthcare Indices<br>5 Years (2018-2023)"]
    C --> D["Feature Engineering:<br>Original Prices, Volatility Indicators<br>Nowcasting OHLC Ratios"]
    D --> E["Modeling & Explainability:<br>Machine Learning Model<br>Shapley Value Analysis"]
    E --> F["Outcome 1:<br>High Predictive Power<br>Accuracy > 0.8, MCC > 0.6"]
    E --> G["Outcome 2:<br>Feature Dominance<br>Nowcasting ratios are top drivers"]
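
The rolling-window evaluation and the two reported metrics in the flowchart can be sketched as follows. This is an illustrative outline under stated assumptions, not the authors' pipeline: the window length, the `rolling_one_step` helper, and the `majority_vote` placeholder model are all hypothetical, and the paper's actual classifier is not specified here. Only the accuracy and Matthews correlation coefficient (MCC) formulas are standard.

```python
from math import sqrt


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels in {0, 1}."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def rolling_one_step(features, labels, window, fit_predict):
    """Fit on the previous `window` rows, predict the next row, slide by one."""
    y_true, y_pred = [], []
    for t in range(window, len(labels)):
        x_train, y_train = features[t - window:t], labels[t - window:t]
        y_pred.append(fit_predict(x_train, y_train, features[t]))
        y_true.append(labels[t])
    return y_true, y_pred


def majority_vote(x_train, y_train, x_next):
    """Placeholder model (assumption): ignores features, predicts the
    most frequent label in the training window, with ties going to 1."""
    return 1 if 2 * sum(y_train) >= len(y_train) else 0


# Made-up labels; features are unused by the placeholder model.
labels = [1, 1, 0, 1, 0, 0, 1, 1]
features = [None] * len(labels)
y_true, y_pred = rolling_one_step(features, labels, window=3,
                                  fit_predict=majority_vote)
print(accuracy(y_true, y_pred), mcc(y_true, y_pred))
```

Any model that exposes the same `fit_predict(x_train, y_train, x_next)` shape could be swapped in for the placeholder. MCC is worth reporting alongside accuracy because it stays near zero for a model that always predicts the majority direction, whereas accuracy can look high on imbalanced up/down days.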
