Representation learning with a transformer by contrastive learning for money laundering detection

ArXiv ID: 2507.08835

Authors: Harold Guéneau, Alain Celisse, Pascal Delange

Abstract

The present work tackles the money laundering detection problem. A new procedure is introduced which exploits structured time series of both qualitative and quantitative data by means of a transformer neural network. The first step of this procedure aims at learning representations of time series through contrastive learning (without any labels). The second step leverages these representations to generate a money laundering score for every observation. A two-threshold approach is then introduced, which ensures a controlled false-positive rate by means of the Benjamini-Hochberg (BH) procedure. Experiments confirm that the transformer is able to produce general representations that succeed in exploiting money laundering patterns with minimal supervision from domain experts. They also illustrate the greater ability of the new procedure to detect nonfraudsters as well as fraudsters, while keeping the false-positive rate under control. This contrasts sharply with rule-based procedures and with those based on LSTM architectures.
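
To make the thresholding step concrete, here is a minimal sketch of the standard Benjamini-Hochberg step-up procedure in Python. It assumes each observation already comes with a p-value; how the paper turns its money laundering scores into p-values, and how its two thresholds are combined, is not described in this summary. The function name `benjamini_hochberg` and the level `alpha` are illustrative choices, not taken from the paper.

```python
import numpy as np

def benjamini_hochberg(p_values, alpha=0.05):
    """Return a boolean mask of rejected hypotheses (standard BH step-up).

    Assumption: p_values are valid p-values, one per observation.
    """
    p = np.asarray(p_values, dtype=float)
    m = p.size
    rejected = np.zeros(m, dtype=bool)
    if m == 0:
        return rejected
    order = np.argsort(p)                          # indices sorting p ascending
    thresholds = alpha * np.arange(1, m + 1) / m   # BH line: alpha * k / m
    below = p[order] <= thresholds
    if below.any():
        k = np.nonzero(below)[0].max()             # largest rank under the BH line
        rejected[order[: k + 1]] = True            # reject all hypotheses up to that rank
    return rejected

# Hypothetical p-values, for illustration only.
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
flags = benjamini_hochberg(example_p, alpha=0.05)
print(int(flags.sum()), "observations flagged")
```

Note that the standard BH procedure rejects every hypothesis up to the largest rank that falls under the line `alpha * k / m`, even if some smaller p-values individually exceed their own threshold.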

Keywords: Money laundering detection, Transformer neural network, Contrastive learning, Benjamini-Hochberg procedure, Time series scoring, Cash/Transactions

Complexity vs Empirical Score

Math Complexity: 6.5/10
Empirical Rigor: 4.0/10
Quadrant: Lab Rats
Why: The paper introduces advanced ML techniques (transformers with contrastive learning, Benjamini-Hochberg procedure) requiring significant mathematical infrastructure, but it lacks concrete backtesting details, specific datasets, or implementation code, focusing instead on a conceptual/procedural framework.

  flowchart TD
    A["Research Goal"] --> B["Input Data: Cash/Transaction Time Series"]
    B --> C{"Step 1: Contrastive Learning"}
    C --> D["Unsupervised Representation Learning via Transformer"]
    D --> E{"Step 2: Supervised Scoring"}
    E --> F["Money Laundering Score Generation"]
    F --> G["Benjamini-Hochberg Thresholding"]
    G --> H["Key Outcomes"]
    
    subgraph H["Key Outcomes"]
        H1["High Detection Accuracy"]
        H2["Controlled False-Positive Rate"]
        H3["Superior to LSTM/Baselines"]
    end
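
Step 1 of the flowchart (label-free contrastive learning) can be illustrated with a generic InfoNCE-style loss. This is only a sketch under assumptions: the summary does not state which contrastive objective the paper uses, nor how positive pairs of time series are built, so the loss, the function name `info_nce_loss`, and the `temperature` parameter are all hypothetical stand-ins. The embeddings would come from the transformer encoder; here they are random arrays.

```python
import numpy as np

def info_nce_loss(anchors, positives, temperature=0.1):
    """InfoNCE-style loss: row i of `anchors` should match row i of `positives`.

    Assumption: both inputs are (n, d) arrays of embeddings, e.g. two views
    of the same n time series. All other rows act as negatives.
    """
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature                  # cosine similarities, scaled
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))       # cross-entropy on the diagonal

rng = np.random.default_rng(0)
z = rng.normal(size=(8, 16))                        # stand-in for encoder outputs
aligned = info_nce_loss(z, z + 0.01 * rng.normal(size=z.shape))
shuffled = info_nce_loss(z, np.roll(z, 1, axis=0))  # positives deliberately mismatched
print(aligned < shuffled)
```

The loss is low when each series is closest to its own second view and high when the pairing is broken, which is the signal that drives representation learning without labels.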
