Forecasting High Frequency Order Flow Imbalance
ArXiv ID: 2408.03594
Authors: Unknown
Abstract
Market information events are generated intermittently and disseminated at high speed in real time. Market participants consume this high-frequency data to build limit order books, which represent the current bids and offers for a given asset. The arrival processes of bid and offer events (the order flow) are asymmetric and possibly dependent on each other. The magnitude and direction of this asymmetry are often associated with the direction of the traded price movement. The Order Flow Imbalance (OFI) is an indicator commonly used to estimate this asymmetry. This paper makes three contributions. First, it uses Hawkes processes to estimate the OFI while accounting for the lagged dependence in the order flow between bids and offers. Second, it develops a method to forecast the near-term distribution of the OFI, which can then be used to compare models for forecasting OFI. Third, it proposes a method to compare the OFI forecasts of an arbitrarily large number of models. Applying this approach to tick data from the National Stock Exchange, the authors observe that the Hawkes process with a sum-of-exponentials kernel gives the best forecast among all competing models.
Keywords: Hawkes Processes, Order Flow Imbalance (OFI), Limit Order Books, High-Frequency Data, Kernel Estimation, Equities
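To make the abstract's central objects concrete, here is a minimal sketch of a bivariate Hawkes intensity with a sum-of-exponentials kernel, together with a simple count-based imbalance. Everything in it is an illustrative assumption rather than the paper's specification: the function names (`soe_kernel`, `hawkes_intensity`, `count_ofi`), the parameter values, the shared decay rates across stream pairs, and in particular the normalized count difference used as the OFI, since this summary does not state the paper's exact definition.

```python
import math

def soe_kernel(dt, alphas, betas):
    """Sum-of-exponentials kernel: sum_k alphas[k] * exp(-betas[k] * dt).
    Returns 0 for dt <= 0, so only strictly past events excite the process."""
    if dt <= 0:
        return 0.0
    return sum(a * math.exp(-b * dt) for a, b in zip(alphas, betas))

def hawkes_intensity(t, events, mu, alphas, betas):
    """Conditional intensity of each stream at time t.

    events: dict mapping stream name -> list of event timestamps
    mu:     dict mapping stream name -> baseline intensity
    alphas: dict mapping (target, source) -> list of kernel weights,
            i.e. how strongly `source` events excite the `target` stream
    betas:  list of decay rates (assumed shared by all stream pairs)
    """
    lam = dict(mu)  # start from the baseline intensities
    for target in mu:
        for source, times in events.items():
            weights = alphas[(target, source)]
            lam[target] += sum(soe_kernel(t - s, weights, betas) for s in times)
    return lam

def count_ofi(bid_times, ask_times, start, end):
    """Hypothetical OFI: normalized difference of bid and ask event counts
    in the half-open window [start, end). Returns 0.0 for an empty window."""
    n_bid = sum(1 for s in bid_times if start <= s < end)
    n_ask = sum(1 for s in ask_times if start <= s < end)
    total = n_bid + n_ask
    return (n_bid - n_ask) / total if total else 0.0

# Illustrative parameters (assumed, not estimated from any data).
mu = {"bid": 0.5, "ask": 0.5}
betas = [1.0, 5.0]
alphas = {
    ("bid", "bid"): [0.3, 0.1],   # self-excitation of bids
    ("bid", "ask"): [0.1, 0.05],  # asks exciting bids (lagged dependence)
    ("ask", "bid"): [0.1, 0.05],  # bids exciting asks
    ("ask", "ask"): [0.3, 0.1],   # self-excitation of asks
}
events = {"bid": [0.1, 0.4, 0.9], "ask": [0.5]}

lam = hawkes_intensity(1.0, events, mu, alphas, betas)
ofi = count_ofi(events["bid"], events["ask"], 0.0, 1.0)
print(lam, ofi)
```

The cross terms `("bid", "ask")` and `("ask", "bid")` are what let one side of the book respond to recent activity on the other, which is the lagged dependence the abstract refers to. Using several decay rates lets the kernel mix fast and slow memory while keeping each term a simple exponential.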
Complexity vs Empirical Score
- Math Complexity: 7.5/10
- Empirical Rigor: 8.0/10
- Quadrant: Holy Grail
- Why: The paper employs advanced mathematical modeling with Hawkes processes and detailed kernel specifications, while demonstrating high empirical rigor through application to high-frequency tick data, model comparison, and specific forecasting results on the National Stock Exchange.
flowchart TD
A["Research Goal:<br>Forecast High-Frequency OFI<br>using Hawkes Processes"] --> B["Data Input:<br>Tick Data from<br>National Stock Exchange"]
B --> C["Methodology 1:<br>Estimate OFI using<br>Hawkes Process with<br>Sum of Exponentials Kernel"]
C --> D["Methodology 2:<br>Forecast Near-Term<br>OFI Distribution"]
D --> E["Methodology 3:<br>Compare Forecasts<br>across Multiple Models"]
E --> F["Key Findings:<br>Hawkes Process with<br>Sum of Exponentials Kernel<br>provides best forecast"]