CTBench: Cryptocurrency Time Series Generation Benchmark

ArXiv ID: 2508.02758

Authors: Yihao Ang, Qiang Wang, Qiang Huang, Yifan Bao, Xinyu Xi, Anthony K. H. Tung, Chen Jin, Zhiyong Huang

Abstract

Synthetic time series are essential tools for data augmentation, stress testing, and algorithmic prototyping in quantitative finance. However, in cryptocurrency markets, characterized by 24/7 trading, extreme volatility, and rapid regime shifts, existing Time Series Generation (TSG) methods and benchmarks often fall short, jeopardizing practical utility. Most prior work (1) targets non-financial or traditional financial domains, (2) focuses narrowly on classification and forecasting while neglecting crypto-specific complexities, and (3) lacks critical financial evaluations, particularly for trading applications. To address these gaps, we introduce CTBench, the first comprehensive TSG benchmark tailored for the cryptocurrency domain. CTBench curates an open-source dataset from 452 tokens and evaluates TSG models across 13 metrics spanning 5 key dimensions: forecasting accuracy, rank fidelity, trading performance, risk assessment, and computational efficiency. A key innovation is a dual-task evaluation framework: (1) the Predictive Utility task measures how well synthetic data preserves temporal and cross-sectional patterns for forecasting, while (2) the Statistical Arbitrage task assesses whether reconstructed series support mean-reverting signals for trading. We benchmark eight representative models from five methodological families over four distinct market regimes, uncovering trade-offs between statistical fidelity and real-world profitability. Notably, CTBench offers model ranking analysis and actionable guidance for selecting and deploying TSG models in crypto analytics and strategy development.
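
To make the Statistical Arbitrage idea concrete, here is a minimal Python sketch of a mean-reverting signal. It is an illustration only, not the paper's trading rule: the residual definition (for instance, a real log-price minus its reconstructed counterpart), the function names, the rolling window of 20, and the entry threshold of 1.5 are all assumptions introduced here.

```python
import numpy as np


def rolling_zscore(x, window):
    """Z-score of each point against its trailing `window` observations (inclusive)."""
    x = np.asarray(x, dtype=float)
    z = np.full(x.shape, np.nan)  # undefined until a full window is available
    for t in range(window - 1, len(x)):
        w = x[t - window + 1 : t + 1]
        sd = w.std(ddof=0)
        z[t] = (x[t] - w.mean()) / sd if sd > 0 else 0.0
    return z


def mean_reversion_positions(residual, window=20, entry=1.5):
    """Hypothetical rule: short (-1) when the residual sits far above its
    rolling mean, long (+1) when far below, flat (0) otherwise."""
    z = rolling_zscore(residual, window)
    pos = np.zeros(len(z))
    pos[z > entry] = -1.0   # expect the residual to fall back
    pos[z < -entry] = 1.0   # expect the residual to rise back
    return pos
```

If a generated or reconstructed series tracks the real one well, such a residual should oscillate around zero, which is the property a rule of this kind relies on.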

Keywords: Time Series Generation, Data Augmentation, Statistical Arbitrage, Synthetic Data, Market Regimes

Complexity vs Empirical Score

  • Math Complexity: 4.5/10
  • Empirical Rigor: 8.5/10
  • Quadrant: Street Traders
  • Why: The paper presents a comprehensive benchmark with extensive empirical evaluation on real cryptocurrency data (452 tokens), multiple market regimes, and financial metrics (Sharpe, MDD, VaR, etc.), indicating high data and implementation rigor. The mathematics (definitions of log-returns, rolling windows, and model descriptions) serves the evaluation framework and practical insights rather than dense theoretical derivations.
  flowchart TD
    A["Research Goal: <br/>Benchmark TSG models<br/>for Crypto Time Series"] --> B["Data Curation<br/>(452 tokens, 4 regimes)"]
    B --> C["Methodology<br/>(Dual-Task Framework)"]
    C --> D["Evaluation Metrics<br/>(13 metrics across 5 dimensions)"]
    D --> E["Computational Process<br/>(Benchmark 8 Models)"]
    E --> F["Key Findings<br/>(Trade-offs & Guidance)"]
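
The scorecard above names log-returns, Sharpe, MDD, and VaR. As a rough illustration of how such quantities are commonly computed, the following Python sketch uses textbook definitions. It is not taken from the CTBench code: the function names, the 365-period annualization (assuming daily bars in a 24/7 market), and the 5% historical VaR level are assumptions made here.

```python
import numpy as np


def log_returns(prices):
    """Log-returns r_t = ln(p_t / p_{t-1})."""
    p = np.asarray(prices, dtype=float)
    return np.diff(np.log(p))


def sharpe_ratio(returns, periods_per_year=365):
    """Annualized Sharpe ratio with a zero risk-free rate (assumption)."""
    r = np.asarray(returns, dtype=float)
    sd = r.std(ddof=1)
    return float(np.sqrt(periods_per_year) * r.mean() / sd) if sd > 0 else 0.0


def max_drawdown(returns):
    """Largest peak-to-trough decline of the equity curve, as a positive fraction.
    Expects log-returns; the curve starts at 1.0."""
    equity = np.concatenate(([1.0], np.exp(np.cumsum(returns))))
    peak = np.maximum.accumulate(equity)
    return float(np.max(1.0 - equity / peak))


def historical_var(returns, alpha=0.05):
    """Historical Value-at-Risk: the loss at the `alpha` quantile, reported as a positive number."""
    return float(-np.quantile(np.asarray(returns, dtype=float), alpha))


# Usage on a toy price path (made-up numbers, not benchmark data).
prices = [100.0, 110.0, 99.0, 120.0]
r = log_returns(prices)
print(sharpe_ratio(r), max_drawdown(r), historical_var(r))
```

Running the same functions on strategy returns from real and from synthetic data gives one simple way to compare trading performance and risk side by side.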