Can AI Detect Wash Trading? Evidence from NFTs

ArXiv ID: 2311.18717

Authors: Unknown

Abstract

Existing studies on crypto wash trading often use indirect statistical methods or leaked private data, both with inherent limitations. This paper leverages public on-chain NFT data for a more direct and granular estimation. Analyzing three major exchanges, we find that ~38% (30-40%) of trades and ~60% (25-95%) of traded value likely involve manipulation, with significant variation across exchanges. This direct evidence enables a critical reassessment of existing indirect methods, identifying roundedness-based regressions à la Cong et al. (2023) as most promising, though still error-prone in the NFT setting. To address this, we develop an AI-based estimator that integrates these regressions in a machine learning framework, significantly reducing both exchange- and trade-level estimation errors in NFT markets (and beyond).
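The roundedness idea in the abstract can be illustrated with a minimal sketch. The intuition, following Cong et al. (2023), is that human traders tend to anchor on round numbers, so an unusually low share of round values can hint at automated or manipulative activity. The price list, the 0.01 step, and the helper names below (`is_round`, `round_share`) are illustrative assumptions, not the paper's actual definition of roundedness or its regression specification.

```python
from decimal import Decimal


def is_round(price: str, step: str = "0.01") -> bool:
    """Return True if price is an exact multiple of step.

    Prices are passed as strings and parsed with Decimal so that
    values like 0.1 are not distorted by binary floating point.
    The step of 0.01 is an assumed notion of "round".
    """
    return Decimal(price) % Decimal(step) == 0


def round_share(prices, step: str = "0.01") -> float:
    """Fraction of prices that are round under the assumed step."""
    if not prices:
        raise ValueError("prices must be non-empty")
    return sum(is_round(p, step) for p in prices) / len(prices)


# Hypothetical trade prices (e.g. in ETH); not data from the paper.
prices = ["1.5", "0.25", "2", "0.137", "0.4999"]
share = round_share(prices)
print(f"share of round prices: {share:.2f}")
```

In a roundedness-based regression, a measure like this would be one input among others, computed per exchange or per collection; the sketch only covers the measurement step.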

Keywords: Wash Trading, Market Manipulation, NFTs (Non-Fungible Tokens), Machine Learning Estimation, On-chain Analysis, Crypto / NFTs

Complexity vs Empirical Score

  • Math Complexity: 6.5/10
  • Empirical Rigor: 8.0/10
  • Quadrant: Holy Grail
  • Why: The paper employs advanced statistical methods and machine learning (boosting, deep learning) with hyperparameter optimization, indicating moderate-to-high math complexity. It is highly empirical, using massive public on-chain NFT data from three exchanges, implementing direct filters, and reporting specific performance metrics (AUC scores, error rates), making it data- and implementation-heavy.
  flowchart TD
    A["Research Goal<br>Can AI directly detect NFT wash trading using<br>public on-chain data?"] --> B["Methodology<br>Empirical analysis of NFT trades across<br>3 major exchanges (NFT marketplace data)"]
    
    B --> C["Computational Process<br>1. Baseline: Statistical models (e.g., roundedness regressions)<br>2. AI Estimator: Machine learning framework integrating<br>regressions to correct exchange & trade-level errors"]
    
    C --> D["Key Findings/Outcomes<br>• Direct Evidence: ~38% of trades / ~60% of value manipulated<br>• Exchange variation: significant differences detected<br>• ML estimator significantly reduces estimation errors vs.<br>traditional statistical methods<br>• Public data enables granular, repeatable analysis"]
    
    style A fill:#e1f5e1
    style D fill:#fff2cc
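
The summary does not spell out the direct filters applied to the on-chain data. As a rough sketch of what such a filter could look like, the code below flags closed-loop trades, where the same token moves from wallet A to wallet B and later back from B to A, and then computes the flagged share by trade count and by traded value. The `Trade` record, the round-trip rule, and the sample trades are all hypothetical assumptions for illustration, not the paper's method or data.

```python
from collections import defaultdict
from typing import NamedTuple


class Trade(NamedTuple):
    token_id: str   # hypothetical NFT identifier
    seller: str     # seller wallet address
    buyer: str      # buyer wallet address
    value: float    # traded value, e.g. in ETH


def flag_round_trips(trades):
    """Return indices of trades that form an A->B, B->A loop on one token.

    Self-trades (seller == buyer) are flagged as well. This is an
    assumed heuristic, not the filter set used in the paper.
    """
    seen = defaultdict(list)  # (token, seller, buyer) -> earlier trade indices
    flagged = set()
    for i, t in enumerate(trades):
        if t.seller == t.buyer:
            flagged.add(i)
        reverse = (t.token_id, t.buyer, t.seller)
        if reverse in seen:
            # An earlier trade moved this token the opposite way.
            flagged.add(i)
            flagged.update(seen[reverse])
        seen[(t.token_id, t.seller, t.buyer)].append(i)
    return flagged


def wash_shares(trades, flagged):
    """Flagged share of trades by count and by traded value."""
    count_share = len(flagged) / len(trades)
    total_value = sum(t.value for t in trades)
    value_share = sum(trades[i].value for i in flagged) / total_value
    return count_share, value_share


# Hypothetical sample: one high-value round trip, two ordinary sales.
trades = [
    Trade("nft1", "A", "B", 10.0),
    Trade("nft1", "B", "A", 10.0),
    Trade("nft2", "C", "D", 1.0),
    Trade("nft2", "D", "E", 1.0),
]
flagged = flag_round_trips(trades)
count_share, value_share = wash_shares(trades, flagged)
print(f"flagged trades: {count_share:.0%}, flagged value: {value_share:.0%}")
```

The sample also shows why the two headline figures can diverge: when flagged trades carry larger prices than ordinary ones, the value share exceeds the trade share.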