Online Learning of Order Flow and Market Impact with Bayesian Change-Point Detection Methods

ArXiv ID: 2307.02375

Authors: Unknown

Abstract

Financial order flow is remarkably persistent: buy (sell) trades are often followed by further buy (sell) trades over extended periods. This persistence can be attributed to the splitting and gradual execution of large orders. Consequently, distinct order flow regimes may emerge, which can be identified by fitting suitable time series models to market data. In this paper, we propose using Bayesian online change-point detection (BOCPD) methods to identify regime shifts in real time and to enable online predictions of order flow and market impact. To make the approach more effective, we develop a novel BOCPD method based on a score-driven approach, which accommodates temporal correlations and time-varying parameters within each regime. In an empirical application to NASDAQ data, we find that: (i) our proposed model achieves better out-of-sample predictive performance than existing models that assume i.i.d. behavior within each regime; (ii) residual analysis indicates that our model is well specified in terms of both distributional assumptions and temporal correlations; (iii) within a given regime, the price dynamics are concave in both time and volume, mirroring the characteristics of actual large orders; (iv) by incorporating regime information, our model produces more accurate online predictions of order flow and market impact than models that do not consider regimes.
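To make the BOCPD idea concrete, the following is a minimal sketch of the standard run-length recursion with a constant hazard rate and i.i.d. Gaussian observations within each regime. This corresponds to the kind of i.i.d. baseline the abstract compares against, not to the score-driven method proposed in the paper. The function name `bocpd_gaussian`, its parameters, the conjugate Normal prior with known observation variance, and the synthetic series standing in for an order flow signal are all assumptions made for illustration.

```python
import numpy as np


def bocpd_gaussian(x, hazard=0.01, mu0=0.0, var0=1.0, obs_var=1.0):
    """Run-length posterior for regimes with i.i.d. Gaussian observations.

    Assumes a known observation variance `obs_var`, a Normal(mu0, var0)
    prior on each regime's mean, and a constant change-point probability
    `hazard` per step. Returns R with R[t, r] = P(run length r | x[:t]).
    """
    T = len(x)
    R = np.zeros((T + 1, T + 1))
    R[0, 0] = 1.0
    mu = np.array([mu0])    # posterior mean of the regime mean, per run length
    var = np.array([var0])  # posterior variance of the regime mean, per run length

    for t, xt in enumerate(x):
        # Predictive density of the new point under each candidate run length.
        pred_var = var + obs_var
        pred = np.exp(-0.5 * (xt - mu) ** 2 / pred_var) / np.sqrt(2 * np.pi * pred_var)

        # Either the current run grows by one step, or a change point resets it.
        R[t + 1, 1 : t + 2] = R[t, : t + 1] * pred * (1.0 - hazard)
        R[t + 1, 0] = np.sum(R[t, : t + 1] * pred * hazard)
        R[t + 1] /= R[t + 1].sum()

        # Conjugate Normal update of the regime mean for every run length,
        # with the prior prepended for the new run of length zero.
        post_var = 1.0 / (1.0 / var + 1.0 / obs_var)
        post_mu = post_var * (mu / var + xt / obs_var)
        mu = np.concatenate(([mu0], post_mu))
        var = np.concatenate(([var0], post_var))

    return R


# Synthetic stand-in for a signed order flow series: a sell-dominated
# regime followed by a buy-dominated regime (hypothetical data).
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2.0, 1.0, 100), rng.normal(2.0, 1.0, 100)])
R = bocpd_gaussian(x, hazard=0.01, var0=10.0)
print("Most likely current run length:", int(R[-1].argmax()))
```

The most probable run length at each step indicates how long the current regime is believed to have lasted; a sudden collapse of that value toward zero flags a regime shift. Relaxing the i.i.d. assumption inside each regime would require replacing the predictive density and the conjugate update with a model that carries temporal dependence.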

Keywords: Order Flow Persistence, Bayesian Online Change-Point Detection, Market Impact, Regime Switching, High-Frequency Trading

Complexity vs Empirical Score

  • Math Complexity: 8.5/10
  • Empirical Rigor: 7.0/10
  • Quadrant: Holy Grail
  • Why: The paper employs advanced Bayesian statistics (BOCPD) and score-driven models with heavy mathematical derivations, but it is validated on real NASDAQ data with out-of-sample forecasting and residual diagnostics.

  flowchart TD
    A["Research Goal: Online Real-Time Detection of Regime Shifts<br>in Order Flow & Market Impact"] --> B{"Methodology: Bayesian Online Change-Point Detection<br>BOCPD Score-Driven Model"}
    B --> C["Input: NASDAQ High-Frequency Data"]
    C --> D{"Computational Process"}
    D --> E["1. Detect Regime Shifts<br>via BOCPD"]
    D --> F["2. Estimate Time-Varying<br>Model Parameters per Regime"]
    D --> G["3. Generate Online Predictions<br>for Flow & Impact"]
    E & F & G --> H{"Key Findings/Outcomes"}
    H --> I["Superior Predictive Performance<br>vs i.i.d. & Non-Regime Models"]
    H --> J["Good Model Specification<br>Residual Analysis Confirms Validity"]
    H --> K["Captures Concave Price Dynamics<br>Mirroring Large Orders"]
    H --> L["Regime Information Significantly<br>Improves Forecast Accuracy"]
    style A fill:#f9f,stroke:#333,stroke-width:2px
    style H fill:#ccf,stroke:#333,stroke-width:2px
    style B fill:#cff,stroke:#333,stroke-width:1px
    style D fill:#fff,stroke:#333,stroke-width:1px
    style C fill:#cfc,stroke:#333,stroke-width:1px