
Universal Patterns in the Blockchain: Analysis of EOAs and Smart Contracts in ERC20 Token Networks

Universal Patterns in the Blockchain: Analysis of EOAs and Smart Contracts in ERC20 Token Networks ArXiv ID: 2508.04671 View on arXiv Authors: Kundan Mukhia, SR Luwang, Md. Nurujjaman, Tanujit Chakraborty, Suman Saha, Chittaranjan Hens Abstract Scaling laws offer a powerful lens to understand complex transactional behaviors in decentralized systems. This study reveals distinctive statistical signatures in the transactional dynamics of ERC20 tokens on the Ethereum blockchain by examining over 44 million token transfers over a nine-month period between July 2017 and March 2018. Transactions are categorized into four types, EOA–EOA, EOA–SC, SC–EOA, and SC–SC, based on whether the interacting addresses are Externally Owned Accounts (EOAs) or Smart Contracts (SCs), and analyzed across three equal three-month periods. To identify universal statistical patterns, we investigate the presence of two canonical scaling laws: power-law distributions and the temporal Taylor's law (TL). EOA-driven transactions exhibit consistent statistical behavior, including a near-linear relationship between trade volume and unique partners with stable power-law exponents ($\gamma \approx 2.3$) and adherence to TL with scaling coefficients ($\beta \approx 2.3$). In contrast, interactions involving SCs, especially SC–SC, exhibit sublinear scaling, unstable power-law exponents, and significantly fluctuating Taylor coefficients (a variation of $\Delta\beta = 0.51$). Moreover, SC-driven activity displays heavier-tailed distributions ($\gamma < 2$), indicating bursty, algorithm-driven activity. These findings reveal the characteristic differences between human-controlled and automated transaction behaviors in blockchain ecosystems. By uncovering universal scaling behaviors through the integration of complex systems theory and blockchain data analytics, this work provides a principled framework for understanding the underlying mechanisms of decentralized financial systems. ...
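The two fits described above reduce to log-log estimation problems. Here is a minimal numpy sketch of a Hill-style tail fit for $\gamma$ and the Taylor's-law regression for $\beta$; the synthetic stand-in data, tail cutoff, and window count are assumptions, not the paper's pipeline (real inputs would be per-address transfer counts per time window).

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in: counts[address, window] of transfers per time window.
counts = rng.pareto(1.3, size=(5000, 12)) + 1

# Power-law tail exponent gamma via the Hill/Clauset MLE on aggregate volume.
volume = counts.sum(axis=1)
tail = np.sort(volume)[-500:]                     # top-k order statistics
gamma = 1 + len(tail) / np.sum(np.log(tail / tail.min()))

# Temporal Taylor's law: variance ~ mean^beta across addresses (log-log slope).
mean, var = counts.mean(axis=1), counts.var(axis=1)
mask = (mean > 0) & (var > 0)
beta, _ = np.polyfit(np.log(mean[mask]), np.log(var[mask]), 1)

print(f"gamma ~ {gamma:.2f}, Taylor beta ~ {beta:.2f}")
```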

August 6, 2025 · 2 min · Research Team

Asymptotic universal moment matching properties of normal distributions

Asymptotic universal moment matching properties of normal distributions ArXiv ID: 2508.03790 View on arXiv Authors: Xuan Liu Abstract Moment matching is an easy-to-implement and usually effective method to reduce the variance of Monte Carlo simulation estimates. On the other hand, there is no guarantee that moment matching will always reduce simulation variance for general integration problems, at least asymptotically, i.e., when the number of samples is large. We characterize the conditions on a given underlying distribution $X$ under which asymptotic variance reduction is guaranteed for a general integration problem $\mathbb{E}[f(X)]$ when moment matching techniques are applied. We show that a necessary and sufficient condition for such asymptotic variance reduction is that $X$ is a normal distribution. Moreover, when $X$ is a normal distribution, we obtain formulae for efficient estimation of the simulation variance for (first- and second-order) moment matching Monte Carlo. These formulae allow the simulation variance to be estimated as a by-product of the simulation process, much as for plain Monte Carlo. Finally, we propose non-linear moment matching schemes for any given continuous distribution such that asymptotic variance reduction is guaranteed. ...
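The normal case is easy to simulate. Below is a minimal sketch of first- and second-order moment matching under a normal $X$, the setting in which the paper shows asymptotic variance reduction is guaranteed; the integrand f and sample size are hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(1)
f = lambda x: np.exp(0.5 * x)              # hypothetical integrand for E[f(X)]
x = rng.standard_normal(100_000)

# First- and second-order moment matching: force sample mean 0 and variance 1.
z = (x - x.mean()) / x.std()

# Both estimate E[f(X)]; with X normal, matching reduces asymptotic variance.
print(f(x).mean(), f(z).mean())
```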

August 5, 2025 · 2 min · Research Team

Comparing Normalization Methods for Portfolio Optimization with Reinforcement Learning

Comparing Normalization Methods for Portfolio Optimization with Reinforcement Learning ArXiv ID: 2508.03910 View on arXiv Authors: Caio de Souza Barbosa Costa, Anna Helena Reali Costa Abstract Recently, reinforcement learning has achieved remarkable results in various domains, including robotics, games, natural language processing, and finance. In the financial domain, this approach has been applied to tasks such as portfolio optimization, where an agent continuously adjusts the allocation of assets within a financial portfolio to maximize profit. Numerous studies have introduced new simulation environments, neural network architectures, and training algorithms for this purpose. Among these, a domain-specific policy gradient algorithm has gained significant attention in the research community for being lightweight, fast, and for outperforming other approaches. However, recent studies have shown that this algorithm can yield inconsistent results and underperform, especially when the portfolio does not consist of cryptocurrencies. One possible explanation for this issue is that the commonly used state normalization method may cause the agent to lose critical information about the true value of the assets being traded. This paper explores this hypothesis by evaluating two of the most widely used normalization methods across three different markets (IBOVESPA, NYSE, and cryptocurrencies) and comparing them with the standard practice of normalizing data before training. The results indicate that, in this specific domain, state normalization can indeed degrade the agent's performance. ...
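For concreteness, here is a sketch of two in-window state normalizations commonly used with this family of portfolio agents; these are illustrative variants, not necessarily the exact pair the paper evaluates.

```python
import numpy as np

window = np.array([10.0, 10.2, 9.9, 10.5, 10.4])   # hypothetical close prices

def norm_by_last(prices: np.ndarray) -> np.ndarray:
    """Divide by the latest close: keeps the window's shape, discards its scale."""
    return prices / prices[-1]

def norm_by_first(prices: np.ndarray) -> np.ndarray:
    """Divide by the first close: the other widely used in-window variant."""
    return prices / prices[0]

# Either way, the absolute price level (the asset's "true value") is lost,
# which is the information-loss hypothesis the paper tests.
print(norm_by_last(window), norm_by_first(window))
```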

August 5, 2025 · 2 min · Research Team

Measuring DEX Efficiency and The Effect of an Enhanced Routing Method on Both DEX Efficiency and Stakeholders' Benefits

Measuring DEX Efficiency and The Effect of an Enhanced Routing Method on Both DEX Efficiency and Stakeholders' Benefits ArXiv ID: 2508.03217 View on arXiv Authors: Yu Zhang, Claudio J. Tessone Abstract The efficiency of decentralized exchanges (DEXs) and the influence of token routing algorithms on market performance and stakeholder outcomes remain underexplored. This paper introduces the concept of Standardized Total Arbitrage Profit (STAP), computed via convex optimization, as a systematic measure of DEX efficiency. We prove that executing the trade order maximizing STAP and reintegrating the resulting transaction fees eliminates all arbitrage opportunities, both cyclic arbitrage within DEXs and arbitrage between DEXs and centralized exchanges (CEXs). In a fully efficient DEX (i.e., STAP = 0), the monetary value of target tokens received must not exceed that of the source tokens, regardless of the routing algorithm. Any violation indicates arbitrage potential, making STAP a reliable metric for arbitrage detection. Using a token graph comprising 11 tokens and 18 liquidity pools based on Uniswap V2 data, we observe a decline in DEX efficiency between June 21 and November 8, 2024. Simulations comparing two routing algorithms, Yu Zhang et al.'s line-graph-based method and the depth-first search (DFS) algorithm, show that employing more profitable routing improves DEX efficiency and trader returns over time. Moreover, while total value locked (TVL) remains stable with the line-graph method, it increases under the DFS algorithm, indicating greater aggregate benefits for liquidity providers. ...
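STAP itself is a convex program over the whole token graph; the toy sketch below only checks its simplest ingredient, whether a single cycle of constant-product (Uniswap V2-style) pools returns more than it takes in after fees. The reserves and fee are hypothetical.

```python
def swap_out(amount_in: float, reserve_in: float, reserve_out: float,
             fee: float = 0.003) -> float:
    """Output of an x*y=k pool for a given input, after the 0.3% LP fee."""
    x = amount_in * (1 - fee)
    return reserve_out * x / (reserve_in + x)

# Hypothetical reserves for pools A->B, B->C, C->A forming a cycle.
pools = [(100.0, 205.0), (80.0, 42.0), (120.0, 230.0)]

amount = 1.0                                 # start with 1 unit of token A
for r_in, r_out in pools:
    amount = swap_out(amount, r_in, r_out)

# Ending with more token A than we started with signals cyclic arbitrage,
# i.e. a strictly positive contribution to STAP.
print("cyclic arbitrage" if amount > 1.0 else "no profit", round(amount, 4))
```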

August 5, 2025 · 2 min · Research Team

Modeling Loss-Versus-Rebalancing in Automated Market Makers via Continuous-Installment Options

Modeling Loss-Versus-Rebalancing in Automated Market Makers via Continuous-Installment Options ArXiv ID: 2508.02971 View on arXiv Authors: Srisht Fateh Singh, Reina Ke Xin Li, Samuel Gaskin, Yuntao Wu, Jeffrey Klinck, Panagiotis Michalopoulos, Zissis Poulos, Andreas Veneris Abstract This paper mathematically models a constant-function automated market maker (CFAMM) position as a portfolio of exotic options known as perpetual American continuous-installment (CI) options. The model replicates an AMM position's delta at each point in time over an infinite time horizon, thus accounting for both the perpetual nature of liquidity provision and the provider's option to withdraw. This framework yields two key theoretical results: (a) it proves that the AMM's adverse-selection cost, loss-versus-rebalancing (LVR), is analytically identical to the continuous funding fees (the time-value decay, or theta) earned by the at-the-money CI option embedded in the replicating portfolio; (b) a special case of the model derives an AMM liquidity position's delta profile and boundaries that suffer approximately constant LVR, up to a bounded residual error, over an arbitrarily long forward window. Finally, the paper describes how the constant volatility parameter required by the perpetual option can be calibrated from the term structure of implied volatilities, and estimates the errors for both the implied volatility calibration and the LVR residual. This work thus provides a practical framework enabling liquidity providers to choose an AMM liquidity profile and price boundaries for an arbitrarily long, forward-looking time window over which they can expect an approximately constant, price-independent LVR. The results establish a rigorous option-theoretic interpretation of AMMs and their LVR, and provide actionable guidance for liquidity providers in estimating future adverse-selection costs and optimizing position parameters. ...
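Result (a) equates LVR with the funding-fee theta of the at-the-money CI option. For a sense of magnitudes, the sketch below uses the standard sigma^2/8 rule for the instantaneous LVR of a constant-product pool, a well-known benchmark rather than this paper's CI-option formula; the pool value and volatility are hypothetical.

```python
def lvr_rate_cpmm(pool_value: float, sigma: float) -> float:
    """Instantaneous adverse-selection cost per year: (sigma^2 / 8) * pool value."""
    return sigma ** 2 / 8.0 * pool_value

value = 1_000_000.0      # hypothetical pool value in USD
sigma = 0.8              # hypothetical annualized volatility of the risky asset
print(f"annual LVR ~ {lvr_rate_cpmm(value, sigma):,.0f} USD")
```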

August 5, 2025 · 3 min · Research Team

Momentum-integrated Multi-task Stock Recommendation with Converge-based Optimization

Momentum-integrated Multi-task Stock Recommendation with Converge-based Optimization ArXiv ID: 2509.10461 View on arXiv Authors: Hao Wang, Jingshu Peng, Yanyan Shen, Xujia Li, Lei Chen Abstract Stock recommendation is critical in Fintech applications, which use price series and alternative information to estimate future stock performance. Although deep learning models are prevalent in stock recommendation systems, traditional time-series forecasting training often fails to capture stock trends and rankings simultaneously, which are essential consideration factors for investors. To tackle this issue, we introduce a Multi-Task Learning (MTL) framework for stock recommendation: **M**omentum-**i**ntegrated **M**ulti-task **Stoc**k **R**ecommendation with Converge-based Optimization (**MiM-StocR**). To improve the model's ability to capture short-term trends, we incorporate a momentum line indicator into model training. To prioritize top-performing stocks and optimize investment allocation, we propose a list-wise ranking loss function called Adaptive-k ApproxNDCG. Moreover, because of the volatility and uncertainty of the stock market, existing MTL frameworks face overfitting issues when applied to stock time series. To mitigate this, we introduce the Converge-based Quad-Balancing (CQB) method. We conducted extensive experiments on three stock benchmarks: SEE50, CSI 100, and CSI 300. MiM-StocR outperforms state-of-the-art MTL baselines on both ranking and profitability evaluations. ...
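The ranking loss builds on ApproxNDCG, which replaces hard ranks with sigmoid-smoothed ranks so NDCG becomes differentiable. A minimal numpy sketch of that base loss follows; the paper's Adaptive-k variant adds a top-k focusing mechanism not reproduced here, and the temperature is an assumption.

```python
import numpy as np

def approx_ndcg_loss(scores: np.ndarray, labels: np.ndarray,
                     temp: float = 0.1) -> float:
    """Negative smoothed NDCG: soft ranks make the metric differentiable."""
    diff = (scores[None, :] - scores[:, None]) / temp
    soft_rank = 0.5 + (1.0 / (1.0 + np.exp(-diff))).sum(axis=1)  # ~1 for top item
    gains = 2.0 ** labels - 1.0
    dcg = (gains / np.log2(1.0 + soft_rank)).sum()
    ideal = (np.sort(gains)[::-1] / np.log2(2.0 + np.arange(len(gains)))).sum()
    return -dcg / ideal

# Hypothetical scores and graded relevance labels for three stocks.
print(approx_ndcg_loss(np.array([0.2, 1.5, 0.7]), np.array([0.0, 2.0, 1.0])))
```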

August 5, 2025 · 2 min · Research Team

To Bubble or Not to Bubble: Asset Price Dynamics and Optimality in OLG Economies

To Bubble or Not to Bubble: Asset Price Dynamics and Optimality in OLG Economies ArXiv ID: 2508.03230 View on arXiv Authors: Stefano Bosi, Cuong Le Van, Ngoc-Sang Pham Abstract We study an overlapping generations (OLG) exchange economy with an asset that yields dividends. First, we derive general conditions, based on exogenous parameters, that give rise to three distinct scenarios: (1) only bubbleless equilibria exist, (2) a bubbleless equilibrium coexists with a continuum of bubbly equilibria, and (3) all equilibria are bubbly. Under stationary endowments and standard assumptions, we provide a complete characterization of the equilibrium set and the associated asset price dynamics. In this setting, a bubbly equilibrium exists if and only if the interest rate in the economy without the asset is strictly lower than the population growth rate and the sum of per capita dividends is finite. Second, we establish necessary and sufficient conditions for Pareto optimality. Finally, we investigate the relationship between asset price behaviors and the optimality of equilibria. ...
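The headline existence result is a clean predicate on primitives. A toy encoding of the stated if-and-only-if condition, with hypothetical parameter values; summability of the dividend stream is left as a boolean input.

```python
def bubbly_equilibrium_exists(r_no_asset: float, pop_growth: float,
                              dividend_sum_finite: bool) -> bool:
    """Paper's iff condition: r below population growth, dividends summable."""
    return r_no_asset < pop_growth and dividend_sum_finite

# Hypothetical primitives: low interest rate in the asset-free economy.
print(bubbly_equilibrium_exists(r_no_asset=0.01, pop_growth=0.02,
                                dividend_sum_finite=True))
```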

August 5, 2025 · 2 min · Research Team

Unravelling the Probabilistic Forest: Arbitrage in Prediction Markets

Unravelling the Probabilistic Forest: Arbitrage in Prediction Markets ArXiv ID: 2508.03474 View on arXiv Authors: Oriol Saguillo, Vahid Ghafouri, Lucianna Kiffer, Guillermo Suarez-Tangil Abstract Polymarket is a prediction market platform where users can speculate on future events by trading shares tied to specific outcomes, known as conditions. Each market is associated with a set of one or more such conditions. To ensure proper market resolution, the condition set must be exhaustive, collectively accounting for all possible outcomes, and mutually exclusive, so that only one condition may resolve as true. Thus, the collective prices of all related outcomes should sum to $1, representing a combined probability of 1 across outcomes. Despite this design, Polymarket exhibits cases where dependent assets are mispriced, allowing a certain outcome to be purchased (or sold) for less than (or more than) $1, guaranteeing profit. This phenomenon, known as arbitrage, could enable sophisticated participants to exploit such inconsistencies. In this paper, we conduct an empirical arbitrage analysis on Polymarket data to answer three key questions: (Q1) What conditions give rise to arbitrage? (Q2) Does arbitrage actually occur on Polymarket? (Q3) Has anyone exploited these opportunities? A major challenge in analyzing arbitrage between related markets lies in the scalability of comparisons across a large number of markets and conditions, with a naive analysis requiring $O(2^{n+m})$ comparisons. To overcome this, we employ a heuristic-driven reduction strategy based on timeliness, topical similarity, and combinatorial relationships, further validated by expert input. Our study reveals two distinct forms of arbitrage on Polymarket: Market Rebalancing Arbitrage, which occurs within a single market or condition, and Combinatorial Arbitrage, which spans multiple markets. We use on-chain historical order book data to analyze when these types of arbitrage opportunities have existed and when they have been executed by users. We estimate that roughly 40 million USD of realized profit has been extracted. ...
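Market Rebalancing Arbitrage reduces to checking whether an exhaustive, mutually exclusive set of condition prices sums to $1. Below is a toy detector under that definition; the tolerance (a stand-in for fees and slippage) is an assumption.

```python
def rebalancing_arbitrage(prices: list[float], tol: float = 0.005):
    """Guaranteed profit per full outcome set when prices drift away from $1."""
    total = sum(prices)
    if total < 1.0 - tol:
        return ("buy all outcomes", 1.0 - total)    # pay < $1, one outcome pays $1
    if total > 1.0 + tol:
        return ("sell all outcomes", total - 1.0)   # collect > $1, owe at most $1
    return ("no arbitrage", 0.0)

# Hypothetical two-outcome market quoted at 55c and 38c: buying both locks in 7c.
print(rebalancing_arbitrage([0.55, 0.38]))
```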

August 5, 2025 · 2 min · Research Team

An Enhanced Focal Loss Function to Mitigate Class Imbalance in Auto Insurance Fraud Detection with Explainable AI

An Enhanced Focal Loss Function to Mitigate Class Imbalance in Auto Insurance Fraud Detection with Explainable AI ArXiv ID: 2508.02283 View on arXiv Authors: Francis Boabang, Samuel Asante Gyamerah Abstract In insurance fraud prediction, handling class imbalance remains a critical challenge. This paper presents a novel multistage focal loss function designed to enhance the performance of machine learning models in such imbalanced settings by helping the optimizer escape local minima and converge to a good solution. Building on the standard focal loss, our proposed approach introduces a dynamic, multi-stage convex and nonconvex mechanism that progressively adjusts the focus on hard-to-classify samples across training epochs. This strategic refinement facilitates more stable learning and improved discrimination between fraudulent and legitimate cases. In extensive experiments on a real-world auto insurance dataset, our method outperformed the traditional focal loss as measured by accuracy, precision, recall, F1-score, and Area Under the Curve (AUC). These results demonstrate the efficacy of the multistage focal loss in boosting model robustness and predictive accuracy in highly skewed classification tasks, with significant implications for fraud detection systems in the insurance industry. An explainable model is included to interpret the results. ...
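The exact multistage schedule is the paper's contribution; as a hedged sketch, here is the standard binary focal loss with a stage-wise focusing parameter that grows across training epochs. The breakpoints and gamma values are hypothetical.

```python
import numpy as np

def focal_loss(p: np.ndarray, y: np.ndarray, gamma: float,
               alpha: float = 0.25, eps: float = 1e-7) -> float:
    """Binary focal loss: down-weights easy examples by (1 - p_t)^gamma."""
    p = np.clip(p, eps, 1 - eps)
    p_t = np.where(y == 1, p, 1 - p)
    a_t = np.where(y == 1, alpha, 1 - alpha)
    return float(-(a_t * (1 - p_t) ** gamma * np.log(p_t)).mean())

def staged_gamma(epoch: int, schedule=((10, 0.5), (30, 2.0), (60, 4.0))) -> float:
    """Pick the focusing strength for the current training stage (hypothetical)."""
    for end_epoch, g in schedule:
        if epoch < end_epoch:
            return g
    return schedule[-1][1]

probs = np.array([0.9, 0.2, 0.6])   # toy predicted fraud probabilities
labels = np.array([1, 0, 1])
print(focal_loss(probs, labels, gamma=staged_gamma(epoch=25)))
```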

August 4, 2025 · 2 min · Research Team

ByteGen: A Tokenizer-Free Generative Model for Orderbook Events in Byte Space

ByteGen: A Tokenizer-Free Generative Model for Orderbook Events in Byte Space ArXiv ID: 2508.02247 View on arXiv Authors: Yang Li, Zhi Chen Abstract Generative modeling of high-frequency limit order book (LOB) dynamics is a critical yet unsolved challenge in quantitative finance, essential for robust market simulation and strategy backtesting. Existing approaches are often constrained by simplifying stochastic assumptions or, in the case of modern deep learning models like Transformers, rely on tokenization schemes that degrade the high-precision, numerical nature of financial data through discretization and binning. To address these limitations, we introduce ByteGen, a novel generative model that operates directly on the raw byte streams of LOB events. Our approach treats the problem as an autoregressive next-byte prediction task, for which we design a compact and efficient 32-byte packed binary format that represents market messages without information loss. The core novelty of our work is the complete elimination of feature engineering and tokenization, enabling the model to learn market dynamics from its most fundamental representation. We achieve this by adapting the H-Net architecture, a hybrid Mamba-Transformer model that uses a dynamic chunking mechanism to discover the inherent structure of market messages without predefined rules. Our primary contributions are: 1) the first end-to-end, byte-level framework for LOB modeling; 2) an efficient packed data representation; and 3) a comprehensive evaluation on high-frequency data. Trained on over 34 million events from CME Bitcoin futures, ByteGen successfully reproduces key stylized facts of financial markets, generating realistic price distributions, heavy-tailed returns, and bursty event timing. Our findings demonstrate that learning directly from byte space is a promising and highly flexible paradigm for modeling complex financial systems, achieving competitive performance on standard market quality metrics without the biases of tokenization. ...
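The 32-byte record is the key design choice: fixed width, no tokenizer, no information loss. Here is a sketch of one plausible field layout using Python's struct module; the specific fields, widths, and padding are assumptions, not the paper's published format.

```python
import struct

# timestamp_ns (8) + price_ticks (8) + size (8) + side (1) + action (1)
# + level (1) + padding (5) = 32 bytes per event, fixed width.
EVENT = struct.Struct("<qqqBBB5x")
assert EVENT.size == 32

packed = EVENT.pack(
    1_700_000_000_000_000_000,   # nanosecond timestamp (hypothetical)
    6_543_210,                   # price in integer ticks, avoiding floats
    3,                           # order size
    1,                           # side: 1 = bid
    0,                           # action: 0 = add
    2,                           # book level
)

# The model would consume this raw byte stream autoregressively, byte by byte.
print(len(packed), list(packed[:8]))
```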

August 4, 2025 · 2 min · Research Team