Attention Factors for Statistical Arbitrage

ArXiv ID: 2510.11616

Authors: Elliot L. Epstein, Rose Wang, Jaewon Choi, Markus Pelger

Abstract

Statistical arbitrage exploits temporal price differences between similar assets. We develop a framework to jointly identify similar assets through factors, identify mispricing and form a trading policy that maximizes risk-adjusted performance after trading costs. Our Attention Factors are conditional latent factors that are the most useful for arbitrage trading. They are learned from firm characteristic embeddings that allow for complex interactions. We identify time-series signals from the residual portfolios of our factors with a general sequence model. Estimating factors and the arbitrage trading strategy jointly is crucial to maximize profitability after trading costs. In a comprehensive empirical study we show that our Attention Factor model achieves an out-of-sample Sharpe ratio above 4 on the largest U.S. equities over a 24-year period. Our one-step solution yields an unprecedented Sharpe ratio of 2.3 net of transaction costs. We show that weak factors are important for arbitrage trading.
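
The abstract reports Sharpe ratios both before and after trading costs. The following is a minimal sketch of how such a gap can arise, not the paper's cost model: it assumes a hypothetical proportional cost `cost_rate` charged on the L1 turnover of the portfolio weights, daily data, and 252 trading periods per year. The function names `net_returns` and `annualized_sharpe` are illustrative.

```python
import math
import statistics


def net_returns(weights, returns, cost_rate):
    """Per-period portfolio returns after a proportional turnover cost.

    weights[t] is the weight vector held during period t and returns[t] is the
    asset return vector realized over that period. The portfolio is assumed to
    start from zero holdings, so the first period pays for the full position.
    """
    prev = [0.0] * len(weights[0])
    out = []
    for w, r in zip(weights, returns):
        gross = sum(wi * ri for wi, ri in zip(w, r))
        turnover = sum(abs(wi - pi) for wi, pi in zip(w, prev))
        out.append(gross - cost_rate * turnover)
        prev = w
    return out


def annualized_sharpe(rets, periods_per_year=252):
    """Mean over sample standard deviation, scaled by sqrt(periods per year)."""
    return statistics.fmean(rets) / statistics.stdev(rets) * math.sqrt(periods_per_year)


# Toy long-short example with two assets and three periods (made-up numbers).
toy_weights = [[0.5, -0.5], [0.5, -0.5], [-0.5, 0.5]]
toy_returns = [[0.01, -0.01], [0.02, 0.0], [0.0, 0.01]]

gross = net_returns(toy_weights, toy_returns, cost_rate=0.0)
net = net_returns(toy_weights, toy_returns, cost_rate=0.001)
print("gross Sharpe:", annualized_sharpe(gross))
print("net Sharpe:  ", annualized_sharpe(net))
```

In this toy, the position flip in the last period generates the most turnover, so that period loses the most to costs. This is one way to see why a policy trained without costs can look much better gross than net.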

Keywords: Statistical Arbitrage, Attention Factors, Latent Factor Models, Risk-Adjusted Performance, Trading Policy, Equities

Complexity vs Empirical Score

  • Math Complexity: 7.0/10
  • Empirical Rigor: 8.5/10
  • Quadrant: Holy Grail
  • Why: The paper employs advanced mathematical techniques including deep learning with attention mechanisms, latent factor models, and sequence models (e.g., transformers, S4), indicating high mathematical complexity. It presents a comprehensive empirical study with a 24-year backtest on U.S. equities, reporting specific Sharpe ratios (gross and net), transaction costs, and out-of-sample results, demonstrating strong empirical rigor.
  flowchart TD
    A["Research Goal: Develop a Joint Framework for Statistical Arbitrage"] --> B["Methodology: Attention Factor Model"]
    B --> C["Data: Firm Characteristic Embeddings"]
    C --> D{"Computational Process"}
    D --> E["Learn Conditional Latent Attention Factors"]
    E --> F["Model Residual Portfolios with General Sequence Model"]
    F --> G["Joint Estimation of Factors & Trading Policy"]
    G --> H["Key Outcomes"]
    H --> I["Out-of-sample Sharpe Ratio > 4 (Gross)"]
    H --> J["Sharpe Ratio 2.3 (Net of Transaction Costs)"]
    H --> K["Weak Factors are Important for Arbitrage"]
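
The first stages of the flowchart (embeddings, factors, residual portfolios) can be sketched in a few lines. This is a toy under stated assumptions, not the paper's architecture: it assumes a single factor whose portfolio weights are a softmax attention of one hypothetical query vector over static per-asset embeddings, and it estimates each asset's loading by a no-intercept OLS regression on the factor return. All names (`attention_weights`, `factor_series`, `residual_series`) and all numbers are illustrative.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(query, embeddings):
    """Scaled dot-product attention of one query over per-asset embeddings.

    Returns one non-negative weight per asset; the weights sum to 1.
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * e for q, e in zip(query, emb)) / scale for emb in embeddings]
    return softmax(scores)


def factor_series(weights, returns_by_period):
    """Factor return per period: the weighted sum of asset returns."""
    return [sum(w * r for w, r in zip(weights, rets)) for rets in returns_by_period]


def residual_series(asset_returns, factor_returns):
    """Asset returns minus the no-intercept OLS projection on the factor."""
    beta = sum(f * r for f, r in zip(factor_returns, asset_returns)) / sum(
        f * f for f in factor_returns
    )
    return [r - beta * f for r, f in zip(asset_returns, factor_returns)]


# Toy inputs: three assets with 2-dimensional embeddings, three periods.
toy_embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_query = [0.5, -0.5]
toy_returns = [[0.01, -0.02, 0.00], [0.03, 0.01, -0.01], [-0.02, 0.00, 0.02]]

w = attention_weights(toy_query, toy_embeddings)
f = factor_series(w, toy_returns)
asset0 = [period[0] for period in toy_returns]
print("factor weights:", w)
print("asset 0 residuals:", residual_series(asset0, f))
```

In the paper's framework the residual series would then feed a sequence model, with factors and trading policy estimated jointly. The sketch stops at the residuals and fits nothing jointly.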