MarketGPT: Developing a Pre-trained Transformer (GPT) for Modeling Financial Time Series
ArXiv ID: 2411.16585
Authors: Unknown
Abstract
This work presents a generative pre-trained transformer (GPT) designed for modeling financial time series. The GPT functions as an order generation engine within a discrete event simulator, enabling realistic replication of limit order book dynamics. Our model leverages recent advancements in large language models to produce long sequences of order messages in a streaming manner. Our results demonstrate that the model successfully reproduces key features of order flow data, even when the initial order flow prompt is no longer present within the model’s context window. Moreover, evaluations reveal that the model captures several statistical properties, or ‘stylized facts’, characteristic of real financial markets and broader macro-scale data distributions. Collectively, this work marks a significant step toward creating high-fidelity, interactive market simulations.
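The streaming generation described in the abstract can be pictured as a sliding context window over the model's token stream: once the window is full, each new token displaces the oldest one, so generation continues after the original prompt has scrolled out. The sketch below is a minimal illustration of that mechanism only; the model stub, vocabulary, and window size are stand-ins, not the paper's actual components.

```python
from collections import deque
import random

CONTEXT_LEN = 8   # stand-in for the model's context window
VOCAB_SIZE = 16   # stand-in for the message-token vocabulary

def next_token(context: list[int]) -> int:
    """Stand-in for a GPT forward pass plus sampling; a real model would
    sample from a learned distribution over message tokens."""
    rng = random.Random(sum(context))  # toy dependence on the context only
    return rng.randrange(VOCAB_SIZE)

def stream_orders(prompt: list[int], n_tokens: int):
    """Generate tokens with a sliding window, so generation keeps going
    after the prompt has scrolled out of the context."""
    ctx = deque(prompt, maxlen=CONTEXT_LEN)
    for _ in range(n_tokens):
        tok = next_token(list(ctx))
        ctx.append(tok)  # oldest token (eventually the prompt) falls out
        yield tok

if __name__ == "__main__":
    # After CONTEXT_LEN steps, none of the prompt remains in context.
    print(list(stream_orders(prompt=[1, 2, 3], n_tokens=20)))
```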
Keywords: Limit Order Book (LOB), Generative Pre-trained Transformer (GPT), Market Microstructure, Discrete Event Simulation, Order Flow
Complexity vs Empirical Score
- Math Complexity: 7.5/10
- Empirical Rigor: 7.0/10
- Quadrant: Holy Grail
- Why: The paper employs advanced transformer architectures with sophisticated tokenization and training strategies, indicating high mathematical complexity. It also demonstrates empirical rigor through the use of real Nasdaq ITCH data, explicit train/validation/test splits, and evaluation against stylized facts of market microstructure.
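The stylized-fact evaluation mentioned above typically checks properties such as heavy-tailed return distributions, near-zero linear autocorrelation of returns, and volatility clustering. The following is a hedged sketch of such checks, not the paper's evaluation code; the synthetic stochastic-volatility series stands in for returns derived from model-generated order flow.

```python
import numpy as np

def excess_kurtosis(r: np.ndarray) -> float:
    """Positive values indicate tails heavier than a Gaussian."""
    r = r - r.mean()
    return float((r**4).mean() / (r**2).mean() ** 2 - 3.0)

def autocorr(x: np.ndarray, lag: int) -> float:
    x = x - x.mean()
    return float(x[:-lag] @ x[lag:] / (x @ x))

# Synthetic stand-in for simulator output: AR(1) log-volatility makes
# volatility persistent, and Student-t shocks make returns heavy-tailed.
rng = np.random.default_rng(0)
n = 20_000
log_vol = np.zeros(n)
z = rng.standard_normal(n)
for t in range(1, n):
    log_vol[t] = 0.97 * log_vol[t - 1] + 0.25 * z[t]
returns = np.exp(log_vol) * rng.standard_t(df=5, size=n) * 1e-4

print(f"excess kurtosis: {excess_kurtosis(returns):.2f}")      # >> 0: heavy tails
print(f"ACF(r, lag=1):   {autocorr(returns, 1):.3f}")          # ~ 0: no linear memory
print(f"ACF(|r|, lag=1): {autocorr(np.abs(returns), 1):.3f}")  # > 0: volatility clustering
```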
```mermaid
flowchart TD
    A["Research Goal:<br/>Create GPT for financial time series modeling"] --> B["Methodology:<br/>Order generation engine in discrete event simulator"]
    B --> C["Data:<br/>Limit Order Book<br/>historical data"]
    B --> D["Computational Process:<br/>Streaming Transformer<br/>Generative Pre-training"]
    C --> D
    D --> E["Key Findings:<br/>1. Reproduces LOB dynamics<br/>2. Captures stylized facts<br/>3. Works beyond context window"]
```
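The flowchart's edge from "Data" to "Computational Process" implies converting raw order-book messages into model tokens. The paper's exact tokenization is not reproduced here; the field-wise scheme below (side, action, price offset, size) is an illustrative assumption of how an ITCH-style message might map to a small vocabulary.

```python
from dataclasses import dataclass

@dataclass
class OrderMessage:
    side: str         # "B" (buy) or "S" (sell)
    action: str       # "add", "cancel", or "execute"
    price_ticks: int  # price in integer ticks
    size: int         # shares

SIDE = {"B": 0, "S": 1}
ACTION = {"add": 0, "cancel": 1, "execute": 2}
PRICE_OFFSET_BIAS = 128  # shifts signed price offsets into a non-negative range

def tokenize(msg: OrderMessage, ref_price_ticks: int) -> list[int]:
    """Encode each field as one token, with price expressed as an offset
    from a reference price so the price vocabulary stays small."""
    return [
        SIDE[msg.side],
        ACTION[msg.action],
        msg.price_ticks - ref_price_ticks + PRICE_OFFSET_BIAS,
        min(msg.size, 999),  # cap the size vocabulary
    ]

print(tokenize(OrderMessage("B", "add", 10_005, 100), ref_price_ticks=10_000))
# -> [0, 0, 133, 100]
```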