From On-chain to Macro: Assessing the Importance of Data Source Diversity in Cryptocurrency Market Forecasting

ArXiv ID: 2506.21246

Authors: Giorgos Demosthenous, Chryssis Georgiou, Eliada Polydorou

Abstract

This study investigates the impact of data source diversity on the performance of cryptocurrency forecasting models by integrating various data categories, including technical indicators, on-chain metrics, sentiment and interest metrics, traditional market indices, and macroeconomic indicators. We introduce the Crypto100 index, representing the top 100 cryptocurrencies by market capitalization, and propose a novel feature reduction algorithm to identify the most impactful and resilient features from diverse data sources. Our comprehensive experiments demonstrate that data source diversity significantly enhances the predictive performance of forecasting models across different time horizons. Key findings include the paramount importance of on-chain metrics for both short-term and long-term predictions, the growing relevance of traditional market indices and macroeconomic indicators for longer-term forecasts, and substantial improvements in model accuracy when diverse data sources are utilized. These insights help demystify the short-term and long-term driving factors of the cryptocurrency market and lay the groundwork for developing more accurate and resilient forecasting models.

Keywords: Cryptocurrency Forecasting, Feature Selection, On-chain Metrics, Crypto100 Index, Macro-financial Modelling
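The abstract describes the Crypto100 index only as representing the top 100 cryptocurrencies by market capitalization and does not give its formula. As a rough illustration of what a top-N capitalization index can look like, the sketch below assumes the level is proportional to the summed market cap of the N largest assets, scaled to 100 on a base date. The function name `top_n_cap_index`, its parameters, the asset tickers, and all numbers are hypothetical and are not taken from the paper.

```python
from typing import Dict


def top_n_cap_index(
    market_caps: Dict[str, float],
    base_total_cap: float,
    n: int = 100,
    base_level: float = 100.0,
) -> float:
    """Level of a hypothetical top-n index, scaled so the base date equals base_level.

    Assumption: the level is proportional to the summed market cap of the n
    largest assets. The paper's actual Crypto100 construction may differ.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if base_total_cap <= 0:
        raise ValueError("base_total_cap must be positive")
    # Membership is re-selected from whatever caps are passed in.
    top_caps = sorted(market_caps.values(), reverse=True)[:n]
    return base_level * sum(top_caps) / base_total_cap


# Made-up market caps for four placeholder assets (not real data).
day0 = {"AAA": 600.0, "BBB": 300.0, "CCC": 100.0, "DDD": 50.0}
day1 = {"AAA": 660.0, "BBB": 270.0, "CCC": 90.0, "DDD": 120.0}

# Use the top-3 total on day 0 as the base, so the day-0 level is 100.
base_total = sum(sorted(day0.values(), reverse=True)[:3])
level_day0 = top_n_cap_index(day0, base_total, n=3)
level_day1 = top_n_cap_index(day1, base_total, n=3)
print(level_day0, level_day1)
```

In this toy data, DDD displaces CCC on day 1, so the change in membership feeds directly into the level. The sketch makes no adjustment for such changes, which is one of the places where a real index methodology could differ.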

Complexity vs Empirical Score

  • Math Complexity: 5.0/10
  • Empirical Rigor: 7.5/10
  • Quadrant: Street Traders
  • Why: The paper employs standard statistical and machine learning methods (like feature reduction algorithms and regression models) without heavy mathematical derivations, but demonstrates strong empirical rigor through a comprehensive dataset, backtesting on the Crypto100 index, and public artifact availability.

  flowchart TD
    A["Research Goal: Assessing Importance of Data Source Diversity"] --> B
    subgraph B ["Methodology"]
        B1["Crypto100 Index: Top 100 Cryptos by MCap"]
        B2["Feature Reduction Algorithm"]
        B3["Multi-source Integration"]
    end
    B --> C
    subgraph C ["Data Sources"]
        C1["Technical Indicators"]
        C2["On-chain Metrics"]
        C3["Sentiment & Interest"]
        C4["Traditional Market Indices"]
        C5["Macroeconomic Indicators"]
    end
    C --> D["Computational Analysis across Time Horizons"]
    D --> E
    subgraph E ["Key Findings"]
        E1["On-chain metrics are paramount for both short & long-term forecasts"]
        E2["Traditional & Macro indicators gain relevance for long-term forecasts"]
        E3["Data diversity significantly improves model accuracy & resilience"]
    end
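
One generic way to probe how much each data source category in the flowchart contributes is a leave-one-group-out comparison: score a model on all features, then again with one category removed, and record the difference. The sketch below shows only that bookkeeping. It is not the paper's feature reduction algorithm, which the summary above does not specify. The feature names, the `toy_score` function, and its weights are arbitrary placeholders; in practice `score` would train and evaluate a forecasting model on the given features.

```python
from typing import Callable, Dict, List


def leave_one_group_out(
    groups: Dict[str, List[str]],
    score: Callable[[List[str]], float],
) -> Dict[str, float]:
    """Return, per group, the drop in score when that group's features are removed.

    `score` is assumed to return a higher-is-better number for a feature list.
    """
    all_features = [f for feats in groups.values() for f in feats]
    full_score = score(all_features)
    drops: Dict[str, float] = {}
    for name, feats in groups.items():
        excluded = set(feats)
        kept = [f for f in all_features if f not in excluded]
        drops[name] = full_score - score(kept)
    return drops


# Hypothetical feature names grouped by data source category.
groups = {
    "technical": ["rsi_14"],
    "on_chain": ["active_addresses", "tx_volume"],
    "traditional_markets": ["sp500_close"],
    "macro": ["cpi_yoy"],
}

# Arbitrary placeholder weights, NOT results from the paper.
toy_weights = {
    "rsi_14": 1.0,
    "active_addresses": 2.0,
    "tx_volume": 3.0,
    "sp500_close": 4.0,
    "cpi_yoy": 5.0,
}


def toy_score(features: List[str]) -> float:
    """Stand-in for model evaluation: sums the placeholder weights."""
    return sum(toy_weights[f] for f in features)


drops = leave_one_group_out(groups, toy_score)
for name, drop in drops.items():
    print(f"{name}: {drop}")
```

Because the toy score is additive, each group's drop equals the sum of its placeholder weights. With a real model, features interact, so drops from removing groups one at a time need not add up to the full score.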