Emergence of Randomness in Temporally Aggregated Financial Tick Sequences
arXiv ID: 2511.17479
Authors: Silvia Onofri, Andrey Shternshis, Stefano Marmi
Abstract
Market efficiency implies that stock returns are intrinsically unpredictable, a property that makes markets comparable to random number generators. We present a novel methodology for investigating ultra-high-frequency financial data and evaluating the extent to which tick-by-tick returns resemble random sequences. We extend the analysis of ultra-high-frequency stock market data by applying comprehensive batteries of statistical randomness tests that probe different aspects of randomness, going beyond the usual reliance on serial correlation or entropy measures. We illustrate how time aggregation transforms highly correlated high-frequency trade data into random streams. More specifically, we apply many of the tests in the NIST Statistical Test Suite and in the TestU01 battery (in particular the Rabbit and Alphabit sub-batteries) to show that the degree of randomness of financial tick data increases with the aggregation level in transaction time. The comprehensive nature of our tests also uncovers novel patterns, such as non-monotonic behavior of predictability for certain assets. This study demonstrates a model-free approach both for assessing randomness in financial time series and for generating pseudo-random sequences from them, with potential relevance in several applications.
Keywords: Ultra-high frequency data, NIST Statistical Test Suite, Randomness tests, Time aggregation, Serial correlation, Equities (Stocks)
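The pipeline the abstract describes, turning tick returns into a bit stream and testing it for randomness, can be sketched minimally. This is an illustration under assumed conventions, not the paper's implementation: mapping positive returns to 1 and negative returns to 0 (dropping zero returns) is one common binarization, and only the NIST monobit (frequency) test is shown here, the simplest test in the suite.

```python
import math
import random

def returns_to_bits(returns):
    """Binarize returns by sign: positive -> 1, negative -> 0.
    Dropping zero returns is one common convention (an assumption here)."""
    return [1 if r > 0 else 0 for r in returns if r != 0]

def monobit_pvalue(bits):
    """NIST monobit (frequency) test: p-value for the null hypothesis
    that the bits are i.i.d. uniform. Small p-values reject randomness."""
    n = len(bits)
    s = sum(1 if b else -1 for b in bits)          # +1/-1 partial sum
    s_obs = abs(s) / math.sqrt(n)                  # normalized statistic
    return math.erfc(s_obs / math.sqrt(2))

# Toy check on i.i.d. Gaussian "returns": the test should rarely reject.
random.seed(0)
fair_returns = [random.gauss(0.0, 1.0) for _ in range(10_000)]
p = monobit_pvalue(returns_to_bits(fair_returns))
print(p)
```

A strongly biased stream (e.g., all-up ticks) drives the p-value toward zero, while a balanced stream passes; the full NIST suite applies many such tests, each targeting a different departure from randomness.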
Complexity vs Empirical Score
- Math Complexity: 8.0/10
- Empirical Rigor: 3.0/10
- Quadrant: Lab Rats
- Why: The paper involves high mathematical complexity through the application of advanced statistical randomness test suites (NIST, TestU01) and probabilistic hypothesis testing, but the empirical component is primarily methodological and descriptive, lacking specific backtesting protocols, dataset details, or implementation-heavy validation for trading strategies.
Flowchart
```mermaid
flowchart TD
A["Research Goal<br>Assess randomness of tick-by-tick returns<br>and effect of time aggregation"] --> B
subgraph B ["Methodology"]
direction LR
B1["NIST Statistical Test Suite"] --> B2["TestU01 Battery<br>Rabbit & Alphabit sub-batteries"] --> B3["Ultra-HF Financial Data"]
end
B --> C
subgraph C ["Data & Processing"]
C1["Stock Transaction Data"] --> C2["Time Aggregation<br>Transforming high-frequency returns<br>to various aggregation levels"]
end
C --> D
subgraph D ["Computational Process"]
D1["Apply Tests<br>on aggregated streams"] --> D2["Compare Results<br>across aggregation levels"]
end
D --> E
subgraph E ["Key Findings & Outcomes"]
E1["Increased randomness<br>with aggregation"]
E2["Novel patterns<br>e.g., non-monotonic predictability"]
E3["Model-free approach<br>for randomness assessment"]
end
```