
FinSurvival: A Suite of Large Scale Survival Modeling Tasks from Finance

ArXiv ID: 2507.14160

Authors: Aaron Green, Zihan Nie, Hanzhen Qin, Oshani Seneviratne, Kristin P. Bennett

Abstract: Survival modeling predicts the time until an event occurs and is widely used in risk analysis; for example, it’s used in medicine to predict the survival of a patient based on censored data. There is a need for large-scale, realistic, and freely available datasets for benchmarking artificial intelligence (AI) survival models. In this paper, we derive a suite of 16 survival modeling tasks from publicly available transaction data generated by lending of cryptocurrencies in Decentralized Finance (DeFi). Each task was constructed using an automated pipeline based on choices of index and outcome events. For example, the model predicts the time from when a user borrows cryptocurrency coins (index event) until their first repayment (outcome event). We formulate a survival benchmark consisting of a suite of 16 survival-time prediction tasks (FinSurvival). We also automatically create 16 corresponding classification problems for each task by thresholding the survival time using the restricted mean survival time. With over 7.5 million records, FinSurvival provides a suite of realistic financial modeling tasks that will spur future AI survival modeling research. Our evaluation indicated that these are challenging tasks that are not well addressed by existing methods. FinSurvival enables the evaluation of AI survival models applicable to traditional finance, industry, medicine, and commerce, which is currently hindered by the lack of large public datasets. Our benchmark demonstrates how AI models could assess opportunities and risks in DeFi. In the future, the FinSurvival benchmark pipeline can be used to create new benchmarks by incorporating more DeFi transactions and protocols as the use of cryptocurrency grows. ...
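To make the thresholding step concrete, here is a minimal sketch of how a restricted mean survival time (RMST) could be computed from a Kaplan-Meier curve and used as a classification cutoff. The toy data, the horizon `tau`, the function names, and the choice to leave records censored before the cutoff unlabeled are all assumptions for illustration; the abstract does not specify how the benchmark handles these details.

```python
def kaplan_meier(durations, events):
    """Return [(t, S(t))] at each distinct time with at least one event."""
    data = sorted(zip(durations, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all records that share the same time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            surv *= 1.0 - n_events / at_risk
            curve.append((t, surv))
        at_risk -= n_leaving
    return curve


def rmst(curve, tau):
    """Area under the step function S(t) between 0 and tau."""
    area = 0.0
    prev_t, prev_s = 0.0, 1.0
    for t, s in curve:
        if t >= tau:
            break
        area += prev_s * (t - prev_t)
        prev_t, prev_s = t, s
    return area + prev_s * (tau - prev_t)


def label(duration, event, threshold):
    """Binary label relative to the threshold (assumed labeling rule)."""
    if duration > threshold:
        return 0      # still event-free at the threshold
    if event:
        return 1      # event observed at or before the threshold
    return None       # censored before the threshold: outcome unknown


# Hypothetical toy data: days until repayment, 1 = observed, 0 = censored.
durations = [2, 3, 3, 5, 8, 10]
events = [1, 1, 0, 1, 0, 1]
threshold = rmst(kaplan_meier(durations, events), tau=10)
labels = [label(d, e, threshold) for d, e in zip(durations, events)]
```

On this toy data the cutoff works out to 115/18, about 6.39 days, so the three early repayments are labeled 1, the two long-running records 0, and the record censored at day 3 is left unlabeled.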

July 7, 2025 · 2 min · Research Team

Multifractality in Bitcoin Realised Volatility: Implications for Rough Volatility Modelling

ArXiv ID: 2507.00575

Authors: Milan Pontiggia

Abstract: We assess the applicability of rough volatility models to Bitcoin realized volatility using the normalised p-variation framework of Cont and Das (2024). Applying this model-free estimator to high-frequency Bitcoin data from 2017 to 2024 across multiple sampling resolutions, we find that the normalised statistic remains strictly negative, precluding the estimation of a valid roughness index. Stationarity tests and robustness checks reveal no significant evidence of non-stationarity or structural breaks as explanatory factors. Instead, convergent evidence from three complementary diagnostics, namely Multifractal Detrended Fluctuation Analysis, log-log moment scaling, and wavelet leaders, reveals a multifractal structure in Bitcoin volatility. This behaviour violates the homogeneity assumptions underlying rough volatility estimation and accounts for the estimator’s systematic failure. These findings suggest that while rough volatility models perform well in traditional markets, they are structurally misaligned with the empirical features of Bitcoin volatility. ...

July 1, 2025 · 2 min · Research Team

Comparing Bitcoin and Ethereum tail behavior via Q-Q analysis of cryptocurrency returns

ArXiv ID: 2507.01983

Authors: A. H. Nzokem

Abstract: The cryptocurrency market presents both significant investment opportunities and higher risks relative to traditional financial assets. This study examines the tail behavior of daily returns for two leading cryptocurrencies, Bitcoin and Ethereum, using seven-parameter estimates from prior research, which applied the Generalized Tempered Stable (GTS) distribution. Quantile-quantile (Q-Q) plots against the Normal distribution reveal that both assets exhibit heavy-tailed return distributions. However, Ethereum consistently shows a greater frequency of extreme values than would be expected under its Bitcoin-modeled counterpart, indicating more pronounced tail risk. ...
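As a rough illustration of the Q-Q idea against the Normal distribution, the sketch below pairs standardized sample quantiles with theoretical Normal quantiles. The function name, the plotting positions `(i + 0.5) / n`, and the toy returns are assumptions made here; the paper's GTS-based comparison is not reproduced.

```python
from statistics import NormalDist, fmean, pstdev


def normal_qq_points(returns):
    """Pair theoretical N(0, 1) quantiles with standardized sample quantiles."""
    n = len(returns)
    mu, sigma = fmean(returns), pstdev(returns)
    sample = sorted((r - mu) / sigma for r in returns)
    # Plotting positions (i + 0.5) / n keep probabilities strictly inside (0, 1).
    theoretical = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, sample))


# Hypothetical daily returns (in percent) with two extreme observations.
toy_returns = [-10.0, -1.0, -0.5, 0.0, 0.5, 1.0, 10.0]
points = normal_qq_points(toy_returns)
```

With heavy tails, the extreme sample quantiles lie further from zero than the corresponding Normal quantiles, so the ends of the plotted points bend away from the 45-degree line.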

June 26, 2025 · 2 min · Research Team

Dynamic Grid Trading Strategy: From Zero Expectation to Market Outperformance

ArXiv ID: 2506.11921

Authors: Kai-Yuan Chen, Kai-Hsin Chen, Jyh-Shing Roger Jang

Abstract: We propose a profitable trading strategy for the cryptocurrency market based on grid trading. Starting with an analysis of the expected value of the traditional grid strategy, we show that under simple assumptions, its expected return is essentially zero. We then introduce a novel Dynamic Grid-based Trading (DGT) strategy that adapts to market conditions by dynamically resetting grid positions. Our backtesting results using minute-level data from Bitcoin and Ethereum between January 2021 and July 2024 demonstrate that the DGT strategy significantly outperforms both the traditional grid and buy-and-hold strategies in terms of internal rate of return and risk control. ...
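The zero-expectation claim can be illustrated with a toy model. The sketch below assumes a symmetric arithmetic random walk on grid levels, no fees, and a long-only grid that holds one unit for every level the price sits below its starting point. These assumptions are chosen here for illustration and may differ from the paper's; the sketch does not implement the DGT strategy.

```python
from itertools import product


def grid_pnl(steps, grid=1.0):
    """Mark-to-market P&L of a long-only grid along one path of +1/-1 steps."""
    level = 0   # price offset from the start, in grid units
    pnl = 0.0
    for step in steps:
        position = max(0, -level)   # one unit held per level below the start
        pnl += position * step * grid
        level += step
    return pnl


def expected_grid_pnl(n_steps):
    """Exact expectation: average over all equally likely up/down paths."""
    paths = list(product((-1, 1), repeat=n_steps))
    return sum(grid_pnl(path) for path in paths) / len(paths)


round_trip = grid_pnl((-1, -1, 1, 1))   # two units bought low, both sold one level up
average = expected_grid_pnl(8)
```

A single down-then-up path earns two grid steps of profit, but averaging over every path gives exactly zero, because the position held before each step is independent of that step's direction.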

June 13, 2025 · 2 min · Research Team

Price Discovery in Cryptocurrency Markets

ArXiv ID: 2506.08718

Authors: Juan Plazuelo Pascual, Carlos Tardon Rubio, Juan Toro Cebada, Angel Hernando Veciana

Abstract: This document analyzes price discovery in cryptocurrency markets by comparing centralized and decentralized exchanges, as well as spot and futures markets. The study focuses first on Ethereum (ETH) and then applies a similar approach to Bitcoin (BTC). Chapter 1 outlines the theoretical framework, emphasizing the structural differences between centralized exchanges and decentralized finance mechanisms, especially Automated Market Makers (AMMs). It also explains how to construct an order book from a liquidity pool in a decentralized setting for comparison with centralized exchanges. Chapter 2 describes the methodological tools used: Hasbrouck’s Information Share, Gonzalo and Granger’s Permanent-Transitory decomposition, and the Hayashi-Yoshida estimator. These are applied to explore lead-lag dynamics, cointegration, and price discovery across market types. Chapter 3 presents the empirical analysis. For ETH, it compares price dynamics on Binance and Uniswap v2 over a one-year period, focusing on five key events in 2024. For BTC, it analyzes the relationship between spot and futures prices on the CME. The study estimates lead-lag effects and cointegration in both cases. Results show that centralized markets typically lead in ETH price discovery. In futures markets, while they tend to lead overall, high-volatility periods produce mixed outcomes. The findings have key implications for traders and institutions regarding liquidity, arbitrage, and market efficiency. Various metrics are used to benchmark the performance of modified AMMs and to understand the interaction between decentralized and centralized structures. ...
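Of the tools named above, the Hayashi-Yoshida estimator is the simplest to sketch: it sums the products of increments from two asynchronously observed series whenever their observation intervals overlap. The function name and toy inputs below are illustrative assumptions, and the quadratic loop favors clarity over speed; this is not the study's implementation.

```python
def hayashi_yoshida(times_x, x, times_y, y):
    """Hayashi-Yoshida covariation of two asynchronously sampled series.

    x and y are (log) prices observed at the increasing times times_x and
    times_y. Increment products are summed only when the half-open intervals
    (s[i-1], s[i]] and (t[j-1], t[j]] overlap.
    """
    total = 0.0
    for i in range(1, len(x)):
        for j in range(1, len(y)):
            start = max(times_x[i - 1], times_y[j - 1])
            end = min(times_x[i], times_y[j])
            if start < end:
                total += (x[i] - x[i - 1]) * (y[j] - y[j - 1])
    return total


# Synchronous sampling: reduces to the usual sum of increment products.
sync = hayashi_yoshida([0, 1, 2, 3], [0.0, 1.0, 3.0, 2.0],
                       [0, 1, 2, 3], [0.0, 2.0, 1.0, 4.0])

# Asynchronous sampling: one X interval overlaps both Y intervals.
mixed = hayashi_yoshida([0, 2], [0.0, 3.0], [0, 1, 2], [0.0, 1.0, 4.0])
```

Because no interpolation or resampling is needed, the estimator avoids the bias that synchronizing two differently timed price feeds can introduce.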

June 10, 2025 · 2 min · Research Team

Exploring Microstructural Dynamics in Cryptocurrency Limit Order Books: Better Inputs Matter More Than Stacking Another Hidden Layer

ArXiv ID: 2506.05764

Authors: Haochuan Wang

Abstract: Cryptocurrency price dynamics are driven largely by microstructural supply-demand imbalances in the limit order book (LOB), yet the highly noisy nature of LOB data complicates the signal extraction process. Prior research has demonstrated that deep-learning architectures can yield promising predictive performance on pre-processed equity and futures LOB data, but they often treat model complexity as an unqualified virtue. In this paper, we aim to examine whether adding extra hidden layers or parameters to “blackbox-ish” neural networks genuinely enhances short-term price forecasting, or if gains are primarily attributable to data preprocessing and feature engineering. We benchmark a spectrum of models, from interpretable baselines (logistic regression, XGBoost) to deep architectures (DeepLOB, Conv1D+LSTM), on BTC/USDT LOB snapshots sampled at 100 ms to multi-second intervals using publicly available Bybit data. We introduce two data filtering pipelines (Kalman, Savitzky-Golay) and evaluate both binary (up/down) and ternary (up/flat/down) labeling schemes. Our analysis compares models on out-of-sample accuracy, latency, and robustness to noise. Results reveal that, with data preprocessing and hyperparameter tuning, simpler models can match and even exceed the performance of more complex networks, offering faster inference and greater interpretability. ...

June 6, 2025 · 2 min · Research Team

Enhancing Meme Token Market Transparency: A Multi-Dimensional Entity-Linked Address Analysis for Liquidity Risk Evaluation

ArXiv ID: 2506.05359

Authors: Qiangqiang Liu, Qian Huang, Frank Fan, Haishan Wu, Xueyan Tang

Abstract: Meme tokens represent a distinctive asset class within the cryptocurrency ecosystem, characterized by high community engagement, significant market volatility, and heightened vulnerability to market manipulation. This paper introduces an innovative approach to assessing liquidity risk in meme token markets using entity-linked address identification techniques. We propose a multi-dimensional method integrating fund flow analysis, behavioral similarity, and anomalous transaction detection to identify related addresses. We develop a comprehensive set of liquidity risk indicators tailored for meme tokens, covering token distribution, trading activity, and liquidity metrics. Empirical analysis of tokens like BabyBonk, NMT, and BonkFork validates our approach, revealing significant disparities between apparent and actual liquidity in meme token markets. The findings of this study provide significant empirical evidence for market participants and regulatory authorities, laying a theoretical foundation for building a more transparent and robust meme token ecosystem. ...

May 22, 2025 · 2 min · Research Team

Cryptocurrencies in the Balance Sheet: Insights from (Micro)Strategy -- Bitcoin Interactions

ArXiv ID: 2505.14655

Authors: Sabrina Aufiero, Antonio Briola, Tesfaye Salarin, Fabio Caccioli, Silvia Bartolucci, Tomaso Aste

Abstract: This paper investigates the evolving link between cryptocurrency and equity markets in the context of the recent wave of corporate Bitcoin (BTC) treasury strategies. We assemble a dataset of 39 publicly listed firms holding BTC, from their first acquisition through April 2025. Using daily logarithmic returns, we first document significant positive co-movements via Pearson correlations and single factor model regressions, discovering an average BTC beta of 0.62, and isolating 12 companies, including Strategy (formerly MicroStrategy, MSTR), exhibiting a beta exceeding 1. We then classify firms into three groups reflecting their exposure to BTC, liquidity, and return co-movements. We use transfer entropy (TE) to capture the direction of information flow over time. Transfer entropy analysis consistently identifies BTC as the dominant information driver, with brief, announcement-driven feedback from stocks to BTC during major financial events. Our results highlight the critical need for dynamic hedging ratios that adapt to shifting information flows. These findings provide important insights for investors and managers regarding risk management and portfolio diversification in a period of growing integration of digital assets into corporate treasuries. ...
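For readers unfamiliar with the term, a BTC beta from a single-factor regression is the ordinary least squares slope of a stock's returns on BTC returns. The sketch below computes it from daily log returns. The price series are invented, with the stock constructed to move exactly 1.5 times as much as BTC in log terms, and the function names are assumptions rather than the paper's code.

```python
import math


def log_returns(prices):
    """Daily logarithmic returns from a price series."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]


def ols_beta(asset_returns, factor_returns):
    """Slope of the regression asset = alpha + beta * factor + noise."""
    n = len(factor_returns)
    mean_a = sum(asset_returns) / n
    mean_f = sum(factor_returns) / n
    cov = sum((a - mean_a) * (f - mean_f)
              for a, f in zip(asset_returns, factor_returns))
    var = sum((f - mean_f) ** 2 for f in factor_returns)
    return cov / var


# Hypothetical prices: the stock is built so its log returns are 1.5x BTC's.
btc_prices = [100.0, 110.0, 99.0, 120.0, 115.0]
stock_prices = [p ** 1.5 for p in btc_prices]
beta = ols_beta(log_returns(stock_prices), log_returns(btc_prices))
```

A beta above 1, as in this toy case, means the stock amplifies BTC moves; the 0.62 average reported in the abstract indicates that the typical firm in the sample moves less than one-for-one with BTC.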

May 20, 2025 · 2 min · Research Team

Hierarchical Representations for Evolving Acyclic Vector Autoregressions (HEAVe)

ArXiv ID: 2505.12806

Authors: Cameron Cornell, Lewis Mitchell, Matthew Roughan

Abstract: Causal networks offer an intuitive framework to understand influence structures within time series systems. However, the presence of cycles can obscure dynamic relationships and hinder hierarchical analysis. These networks are typically identified through multivariate predictive modelling, but enforcing acyclic constraints significantly increases computational and analytical complexity. Despite recent advances, there remains a lack of simple, flexible approaches that are easily tailorable to specific problem instances. We propose an evolutionary approach to fitting acyclic vector autoregressive processes and introduce a novel hierarchical representation that directly models structural elements within a time series system. On simulated datasets, our model retains most of the predictive accuracy of unconstrained models and outperforms permutation-based alternatives. When applied to a dataset of 100 cryptocurrency return series, our method generates acyclic causal networks capturing key structural properties of the unconstrained model. The acyclic networks are approximately sub-graphs of the unconstrained networks, and most of the removed links originate from low-influence nodes. Given the high levels of feature preservation, we conclude that this cryptocurrency price system functions largely hierarchically. Our findings demonstrate a flexible, intuitive approach for identifying hierarchical causal networks in time series systems, with broad applications to fields like econometrics and social network analysis. ...

May 19, 2025 · 2 min · Research Team

Microstructure and Manipulation: Quantifying Pump-and-Dump Dynamics in Cryptocurrency Markets

ArXiv ID: 2504.15790

Authors: Unknown

Abstract: Building on our prior threshold-based analysis of six months of Poloniex trading data, we have extended both the temporal span and granularity of our study by incorporating minute-level OHLCV records for 1021 tokens around each confirmed pump-and-dump event. First, we algorithmically identify the accumulation phase, marking the initial and final insider volume spikes, and observe that 70% of pre-event volume transacts within one hour of the pump announcement. Second, we compute conservative lower bounds on insider profits under both a single-point liquidation at 70% of peak and a tranche-based strategy (selling 20% at 50%, 30% at 60%, and 50% at 80% of peak), yielding median returns above 100% and upper-quartile returns exceeding 2000%. Third, by unfolding the full pump structure and integrating social-media verification (e.g., Telegram announcements), we confirm numerous additional events that eluded our initial model. We also categorize schemes into “pre-accumulation” versus “on-the-spot” archetypes, insights that sharpen detection algorithms, inform risk assessments, and underpin actionable strategies for real-time market-integrity enforcement. ...
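The two liquidation assumptions can be compared with a little arithmetic. The sketch below reads the tranche schedule as (share of holdings sold, sale price as a fraction of the peak) pairs, which is an interpretation of the abstract rather than a confirmed detail, and the entry and peak prices are hypothetical.

```python
def blended_exit_fraction(tranches):
    """Average sale price as a fraction of peak, weighted by share sold."""
    total_share = sum(share for share, _ in tranches)
    if abs(total_share - 1.0) > 1e-9:
        raise ValueError("tranche shares must sum to 1")
    return sum(share * fraction for share, fraction in tranches)


def lower_bound_return(entry_price, peak_price, exit_fraction):
    """Return on the position if it is sold at exit_fraction * peak_price."""
    return exit_fraction * peak_price / entry_price - 1.0


# (share of holdings sold, price as a fraction of peak), as read from the abstract.
tranche_schedule = [(0.2, 0.5), (0.3, 0.6), (0.5, 0.8)]
single_point = [(1.0, 0.7)]

tranche_fraction = blended_exit_fraction(tranche_schedule)   # about 0.68
single_fraction = blended_exit_fraction(single_point)        # 0.70

# Hypothetical prices: accumulated at 1.0, pump peaks at 4.0.
tranche_return = lower_bound_return(1.0, 4.0, tranche_fraction)
single_return = lower_bound_return(1.0, 4.0, single_fraction)
```

Under this reading, the tranche schedule exits at a blended 68% of the peak, slightly below the 70% single-point assumption, so the two bounds sit close together (about 172% versus 180% in the hypothetical case).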

April 22, 2025 · 2 min · Research Team