Binary Tree Option Pricing Under Market Microstructure Effects: A Random Forest Approach
ArXiv ID: 2507.16701
Authors: Akash Deep, Chris Monico, W. Brent Lindquist, Svetlozar T. Rachev, Frank J. Fabozzi
Abstract
We propose a machine learning-based extension of the classical binomial option pricing model that incorporates key market microstructure effects. Traditional models assume frictionless markets, overlooking empirical features such as bid-ask spreads, discrete price movements, and serial return correlations. Our framework augments the binomial tree with path-dependent transition probabilities estimated via Random Forest classifiers trained on high-frequency market data. This approach preserves no-arbitrage conditions while embedding real-world trading dynamics into the pricing model. Using 46,655 minute-level observations of SPY from January to June 2025, we achieve an AUC of 88.25% in forecasting one-step price movements. Order flow imbalance is identified as the most influential predictor, contributing 43.2% to feature importance. After resolving time-scaling inconsistencies in tree construction, our model yields option prices that deviate by 13.79% from Black-Scholes benchmarks, highlighting the impact of microstructure on fair value estimation. While computational limitations restrict the model to short-term derivatives, our results offer a robust, data-driven alternative to classical pricing methods grounded in empirical market behavior.
Keywords: Option Pricing, Binomial Model, Market Microstructure, Random Forest, High-Frequency Data
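To make the idea of path-dependent transition probabilities concrete, here is a minimal sketch of a binary tree whose up-probability is looked up per path rather than fixed. It is not the paper's implementation. The `up_prob` callable is a hypothetical stand-in for a trained Random Forest classifier, the up/down factors follow the standard Cox-Ross-Rubinstein choice, and all numeric inputs are illustrative. The paper's no-arbitrage construction (it refers to the Minimal Martingale Measure) is not reproduced here, so prices from an arbitrary `up_prob` are discounted expectations under the supplied probabilities only.

```python
import math
from typing import Callable, Tuple

Path = Tuple[int, ...]  # sequence of moves so far: 1 = up, 0 = down


def crr_risk_neutral_prob(rate: float, sigma: float, dt: float) -> float:
    """Classical CRR risk-neutral up-probability for one step of length dt."""
    u = math.exp(sigma * math.sqrt(dt))
    d = 1.0 / u
    return (math.exp(rate * dt) - d) / (u - d)


def price_path_dependent_call(
    s0: float,
    strike: float,
    rate: float,
    sigma: float,
    maturity: float,
    steps: int,
    up_prob: Callable[[Path], float],
) -> float:
    """European call on a binary tree with a path-dependent up-probability.

    Because the probability depends on the whole path, the tree does not
    recombine and all 2**steps paths are visited.
    """
    dt = maturity / steps
    u = math.exp(sigma * math.sqrt(dt))  # up factor (CRR choice, an assumption)
    d = 1.0 / u                          # down factor
    disc = math.exp(-rate * dt)          # one-step discount factor

    def value(path: Path, price: float) -> float:
        if len(path) == steps:
            return max(price - strike, 0.0)  # call payoff at expiry
        p = up_prob(path)  # a trained classifier would be queried here
        up_val = value(path + (1,), price * u)
        down_val = value(path + (0,), price * d)
        return disc * (p * up_val + (1.0 - p) * down_val)

    return value((), s0)


# Illustrative inputs only; none of these numbers come from the paper.
S0, K, R, SIGMA, T, N = 100.0, 100.0, 0.05, 0.2, 0.25, 8
q = crr_risk_neutral_prob(R, SIGMA, T / N)


def momentum_prob(path: Path) -> float:
    """Hypothetical rule: nudge the up-probability toward the previous move."""
    if not path:
        return q
    return q + (0.05 if path[-1] == 1 else -0.05)


flat_price = price_path_dependent_call(S0, K, R, SIGMA, T, N, lambda path: q)
momentum_price = price_path_dependent_call(S0, K, R, SIGMA, T, N, momentum_prob)
print(f"constant q: {flat_price:.4f}, momentum rule: {momentum_price:.4f}")
```

With a constant `up_prob` the function reduces to the classical binomial price. The exponential growth in visited paths is one way to see why the abstract restricts the approach to short-term derivatives.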
Complexity vs Empirical Score
- Math Complexity: 6.5/10
- Empirical Rigor: 6.0/10
- Quadrant: Holy Grail
- Why: The paper integrates advanced theoretical concepts like the Minimal Martingale Measure and binary tree extensions with machine learning, indicating significant mathematical density. It also presents concrete empirical results from backtesting with 46,655 high-frequency SPY observations, including AUC metrics and price deviation percentages.
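
For readers less familiar with the AUC metric quoted in the summary, the sketch below computes it from its pairwise definition: the probability that a randomly chosen up-move receives a higher predicted score than a randomly chosen down-move, with ties counted as one half. The labels and scores are made-up toy values, not the paper's data, and the quadratic loop is for clarity only. A dataset of 46,655 observations would normally use a rank-based or library implementation.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the share of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Toy example: 1 = next price move is up, 0 = down (hypothetical values).
toy_labels = [1, 0, 1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.65, 0.3, 0.2, 0.7]
toy_auc = roc_auc(toy_labels, toy_scores)
print(f"toy AUC: {toy_auc:.4f}")
```

In the toy data, 6 of the 9 positive-negative pairs are ordered correctly, so the result is 6/9. A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking.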
flowchart TD
Start["Research Goal<br/>Develop ML extension of binomial option pricing<br/>incorporating market microstructure effects"] --> Input["Input Data<br/>46,655 minute-level SPY observations<br/>Jan - June 2025"]
Input --> Method["Methodology<br/>Binomial Tree augmented with<br/>Random Forest Classifiers"]
Method --> Process["Computational Process<br/>Estimate path-dependent transition<br/>probabilities using high-frequency data"]
Process --> Outcome1["Outcome 1: Prediction Accuracy<br/>AUC 88.25% in one-step price movements"]
Process --> Outcome2["Outcome 2: Feature Importance<br/>Order flow imbalance: 43.2% contribution"]
Process --> Outcome3["Outcome 3: Option Pricing Impact<br/>13.79% deviation from Black-Scholes<br/>after resolving time-scaling issues"]
Outcome1 --> Final["Key Finding<br/>Robust, data-driven alternative to<br/>classical pricing grounded in empirical behavior"]
Outcome2 --> Final
Outcome3 --> Final
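
The Black-Scholes comparison behind Outcome 3 can be illustrated with a short sketch of the benchmark formula for a European call and a percentage deviation from it. The inputs are illustrative, the `model_price` value is a hypothetical placeholder rather than a result from the paper, and the signed relative difference used here is an assumption, since this summary does not state how the 13.79% figure is defined.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes_call(
    s0: float, strike: float, rate: float, sigma: float, maturity: float
) -> float:
    """Black-Scholes price of a European call on a non-dividend-paying asset."""
    vol_sqrt_t = sigma * math.sqrt(maturity)
    d1 = (math.log(s0 / strike) + (rate + 0.5 * sigma**2) * maturity) / vol_sqrt_t
    d2 = d1 - vol_sqrt_t
    return s0 * norm_cdf(d1) - strike * math.exp(-rate * maturity) * norm_cdf(d2)


def relative_deviation_pct(model_price: float, benchmark_price: float) -> float:
    """Signed deviation of a model price from a benchmark, in percent (assumed definition)."""
    return 100.0 * (model_price - benchmark_price) / benchmark_price


# Illustrative inputs only; none of these numbers come from the paper.
bs_price = black_scholes_call(s0=100.0, strike=100.0, rate=0.05, sigma=0.2, maturity=1.0)
model_price = 11.0  # hypothetical output of a microstructure-aware tree
deviation = relative_deviation_pct(model_price, bs_price)
print(f"Black-Scholes: {bs_price:.4f}, deviation: {deviation:.2f}%")
```

For these textbook inputs the Black-Scholes call price is about 10.45, which makes a convenient sanity check before comparing against any tree-based price.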