High-Frequency Trading

Combining Deep Learning on Order Books with Reinforcement Learning for Profitable Trading

ArXiv ID: 2311.02088
Authors: Unknown

Abstract: High-frequency trading is prevalent, where automated decisions must be made quickly to take advantage of price imbalances and patterns in price action that forecast near-future movements. While many algorithms have been explored and tested, analytical methods fail to harness the whole nature of the market environment by focusing on a limited domain. With the ever-growing machine learning field, many large-scale end-to-end studies on raw data have been successfully employed to increase the domain scope for profitable trading but are very difficult to replicate. Combining deep learning on the order books with reinforcement learning is one way of breaking down large-scale end-to-end learning into more manageable and lightweight components for reproducibility, suitable for retail trading. The following work focuses on forecasting returns across multiple horizons using order flow imbalance and training three temporal-difference learning models for five financial instruments to provide trading signals. The instruments used are two foreign exchange pairs (GBPUSD and EURUSD), two indices (DE40 and FTSE100), and one commodity (XAUUSD). The performances of these 15 agents are evaluated through backtesting simulation, and successful models proceed through to forward testing on a retail trading platform. The results prove potential but require further minimal modifications for consistently profitable trading to fully handle retail trading costs, slippage, and spread fluctuation. ...
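The abstract names order flow imbalance (OFI) as the forecasting input without defining it. As a rough illustration only, the sketch below computes best-level OFI in the commonly used Cont-Kukanov-Stoikov form; whether the paper uses exactly this definition is an assumption, and the function name and sample quotes are hypothetical.

```python
def order_flow_imbalance(bid_px, bid_qty, ask_px, ask_qty):
    """Per-update best-level OFI contributions.

    Assumed definition (Cont-Kukanov-Stoikov form), not confirmed by the abstract:
    bid-side size added at an equal or better price counts as buying pressure,
    ask-side size added at an equal or better price counts as selling pressure.
    """
    ofi = []
    for n in range(1, len(bid_px)):
        e = 0.0
        if bid_px[n] >= bid_px[n - 1]:
            e += bid_qty[n]          # bid held or improved: current size adds demand
        if bid_px[n] <= bid_px[n - 1]:
            e -= bid_qty[n - 1]      # bid held or dropped: previous size is removed
        if ask_px[n] <= ask_px[n - 1]:
            e -= ask_qty[n]          # ask held or improved: current size adds supply
        if ask_px[n] >= ask_px[n - 1]:
            e += ask_qty[n - 1]      # ask held or retreated: previous size is removed
        ofi.append(e)
    return ofi


# Hypothetical best-quote snapshots (three consecutive book updates).
bid_px, bid_qty = [100, 100, 101], [5, 8, 3]
ask_px, ask_qty = [102, 102, 102], [4, 4, 6]
contributions = order_flow_imbalance(bid_px, bid_qty, ask_px, ask_qty)
print(contributions, sum(contributions))
```

Summing the contributions over a window gives a single imbalance feature per interval, which could then feed a return-forecasting model of the kind the abstract describes.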

C++ Design Patterns for Low-latency Applications Including High-frequency Trading

ArXiv ID: 2309.04259
Authors: Unknown

Abstract: This work aims to bridge the existing knowledge gap in the optimisation of latency-critical code, specifically focusing on high-frequency trading (HFT) systems. The research culminates in three main contributions: the creation of a Low-Latency Programming Repository, the optimisation of a market-neutral statistical arbitrage pairs trading strategy, and the implementation of the Disruptor pattern in C++. The repository serves as a practical guide and is enriched with rigorous statistical benchmarking, while the trading strategy optimisation led to substantial improvements in speed and profitability. The Disruptor pattern showcased significant performance enhancement over traditional queuing methods. Evaluation metrics include speed, cache utilisation, and statistical significance, among others. Techniques like Cache Warming and Constexpr showed the most significant gains in latency reduction. Future directions involve expanding the repository, testing the optimised trading algorithm in a live trading environment, and integrating the Disruptor pattern with the trading algorithm for comprehensive system benchmarking. The work is oriented towards academics and industry practitioners seeking to improve performance in latency-sensitive applications. ...

Exploiting Unfair Advantages: Investigating Opportunistic Trading in the NFT Market

ArXiv ID: 2310.06844
Authors: Unknown

Abstract: As cryptocurrency evolved, new financial instruments, such as lending and borrowing protocols, currency exchanges, fungible and non-fungible tokens (NFT), staking and mining protocols have emerged. A financial ecosystem built on top of a blockchain is supposed to be fair and transparent for each participating actor. Yet, there are sophisticated actors who turn their domain knowledge and market inefficiencies to their strategic advantage; thus extracting value from trades not accessible to others. This situation is further exacerbated by the fact that blockchain-based markets and decentralized finance (DeFi) instruments are mostly unregulated. Though a large body of work has already studied the unfairness of different aspects of DeFi and cryptocurrency trading, the economic intricacies of non-fungible token (NFT) trades necessitate further analysis and academic scrutiny. The trading volume of NFTs has skyrocketed in recent years. A single NFT trade worth over a million US dollars, or marketplaces making billions in revenue is not uncommon nowadays. While previous research indicated the presence of wrongdoings in the NFT market, to our knowledge, we are the first to study predatory trading practices, what we call opportunistic trading, in depth. Opportunistic traders are sophisticated actors who employ automated, high-frequency NFT trading strategies, which, oftentimes, are malicious, deceptive, or, at the very least, unfair. Such attackers weaponize their advanced technical knowledge and superior understanding of DeFi protocols to disrupt trades of unsuspecting users, and collect profits from economic situations that are inaccessible to ordinary users, in a “supposedly” fair market.
In this paper, we explore three such broad classes of opportunistic strategies aiming to realize three distinct trading objectives, viz., acquire, instant profit generation, and loss minimization. ...

JAX-LOB: A GPU-Accelerated limit order book simulator to unlock large scale reinforcement learning for trading

ArXiv ID: 2308.13289
Authors: Unknown

Abstract: Financial exchanges across the world use limit order books (LOBs) to process orders and match trades. For research purposes it is important to have large scale efficient simulators of LOB dynamics. LOB simulators have previously been implemented in the context of agent-based models (ABMs), reinforcement learning (RL) environments, and generative models, processing order flows from historical data sets and hand-crafted agents alike. For many applications, there is a requirement for processing multiple books, either for the calibration of ABMs or for the training of RL agents. We showcase the first GPU-enabled LOB simulator designed to process thousands of books in parallel, with a notably reduced per-message processing time. The implementation of our simulator - JAX-LOB - is based on design choices that aim to best exploit the powers of JAX without compromising on the realism of LOB-related mechanisms. We integrate JAX-LOB with other JAX packages, to provide an example of how one may address an optimal execution problem with reinforcement learning, and to share some preliminary results from end-to-end RL training on GPUs. ...

Quantitative statistical analysis of order-splitting behaviour of individual trading accounts in the Japanese stock market over nine years

ArXiv ID: 2308.01112
Authors: Unknown

Abstract: In this research, we focus on order-splitting behavior. Order splitting is a trading strategy in which traders execute a large potential metaorder in small pieces to reduce transaction costs. This strategic behavior is believed to be important because it is a promising candidate for the microscopic origin of the long-range correlation (LRC) in the persistent order flow. Indeed, in 2005, Lillo, Mike, and Farmer (LMF) introduced a microscopic model of the order-splitting traders to predict the asymptotic behavior of the LRC from the microscopic dynamics, even quantitatively. The plausibility of this scenario has been qualitatively investigated by Toth et al. (2015). However, no solid support has been presented yet on the quantitative prediction by the LMF model, for lack of large microscopic datasets. In this report, we have provided the first quantitative statistical analysis of the order-splitting behavior at the level of each trading account. We analyse a large dataset of the Tokyo stock exchange (TSE) market over nine years, including the account data of traders (called virtual servers). The virtual server is a unit of trading accounts in the TSE market, and we can effectively define the trader IDs by an appropriate preprocessing. We apply a strategy clustering to individual traders to identify the order-splitting traders and the random traders. For most of the stocks, we find that the metaorder length distribution obeys power laws with exponent $\alpha$, such that $P(L)\propto L^{-\alpha-1}$ with the metaorder length $L$. By analysing the sign correlation $C(\tau)\propto \tau^{-\gamma}$, we directly confirmed the LMF prediction $\gamma\approx \alpha-1$.
Furthermore, we discuss how to estimate the total number of the splitting traders only from public data via the ACF prefactor formula in the LMF model. Our work provides the first quantitative evidence of the LMF model. ...
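The sign correlation referred to in this abstract is the autocorrelation of the buy/sell order-sign series at a given lag. A minimal sketch of how such a sample autocorrelation could be computed is shown below; the estimator, the function name, and the toy sign series are illustrative assumptions and not the paper's procedure.

```python
def sign_autocorrelation(signs, lag):
    """Sample autocorrelation of an order-sign series (+1 buy, -1 sell) at one lag.

    A plain normalised estimator chosen for illustration; the paper's own
    estimator is not given in the abstract.
    """
    if lag < 0 or lag >= len(signs):
        raise ValueError("lag must satisfy 0 <= lag < len(signs)")
    mean = sum(signs) / len(signs)
    var = sum((s - mean) ** 2 for s in signs) / len(signs)
    if var == 0:
        raise ValueError("sign series is constant; autocorrelation is undefined")
    n = len(signs) - lag
    cov = sum((signs[i] - mean) * (signs[i + lag] - mean) for i in range(n)) / n
    return cov / var


# Hypothetical series: a run of buys followed by a run of sells, mimicking
# the persistence that split metaorders would produce.
persistent = [1, 1, 1, 1, -1, -1, -1, -1]
print(sign_autocorrelation(persistent, 1))
```

On real data, one would evaluate this over a range of lags and fit the decay exponent on a log-log scale before comparing it with the metaorder length exponent.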

Estimation of an Order Book Dependent Hawkes Process for Large Datasets

ArXiv ID: 2307.09077
Authors: Unknown

Abstract: A point process for event arrivals in high frequency trading is presented. The intensity is the product of a Hawkes process and high dimensional functions of covariates derived from the order book. Conditions for stationarity of the process are stated. An algorithm is presented to estimate the model even in the presence of billions of data points, possibly mapping covariates into a high dimensional space. The large sample size can be common for high frequency data applications using multiple liquid instruments. Convergence of the algorithm is shown, consistency results under weak conditions are established, and a test statistic to assess out of sample performance of different model specifications is suggested. The methodology is applied to the study of four stocks that trade on the New York Stock Exchange (NYSE). The out of sample testing procedure suggests that capturing the nonlinearity of the order book information adds value to the self exciting nature of high frequency trading events. ...
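To make the "product of a Hawkes process and functions of covariates" structure concrete, here is a small sketch under stated assumptions: an exponential excitation kernel and a log-linear covariate factor. Both choices, along with all names and parameter values, are hypothetical stand-ins and are not taken from the paper.

```python
import math


def hawkes_intensity(t, event_times, mu, alpha, beta):
    """Self-exciting intensity at time t with an exponential kernel (assumed form)."""
    excitation = sum(alpha * math.exp(-beta * (t - s)) for s in event_times if s < t)
    return mu + excitation


def book_dependent_intensity(t, event_times, covariates, weights, mu, alpha, beta):
    """Hawkes intensity multiplied by a positive function of order book covariates.

    exp(w . x) is a hypothetical placeholder for the paper's high dimensional
    covariate functions; it is used here only because it is always positive.
    """
    scale = math.exp(sum(w * x for w, x in zip(weights, covariates)))
    return scale * hawkes_intensity(t, event_times, mu, alpha, beta)


# Hypothetical inputs: three past events and two book-derived covariates.
events = [0.0, 0.4, 0.9]
lam = book_dependent_intensity(
    1.0, events, covariates=[0.2, -0.1], weights=[1.0, 0.5],
    mu=0.5, alpha=0.8, beta=1.2,
)
print(lam)
```

With all weights set to zero the scale factor equals one and the model reduces to the plain Hawkes intensity, which is a convenient sanity check when experimenting with covariate specifications.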

Interpretable ML for High-Frequency Execution

ArXiv ID: 2307.04863
Authors: Unknown

Abstract: Order placement tactics play a crucial role in high-frequency trading algorithms and their design is based on understanding the dynamics of the order book. Using high quality high-frequency data and a set of microstructural features, we exhibit strong state dependence properties of the fill probability function. We train a neural network to infer the fill probability function for a fixed horizon. Since we aim at providing a high-frequency execution framework, we use a simple architecture. A weighting method is applied to the loss function such that the model learns from censored data. By comparing numerical results obtained on both digital asset centralized exchanges (CEXs) and stock markets, we are able to analyze dissimilarities between feature importances of the fill probability of small tick crypto pairs and Euronext equities. The practical use of this model is illustrated with a fixed time horizon execution problem in which both the decision to post a limit order or to immediately execute and the optimal distance of placement are characterized. We discuss the importance of accurately estimating the clean-up cost that occurs in the case of a non-execution and we show it can be well approximated by a smooth function of market features. We finally assess the performance of our model with a backtesting approach that avoids the insertion of hypothetical orders and makes it possible to test the order placement algorithm with orders that realistically impact the price formation process. ...

Online Learning of Order Flow and Market Impact with Bayesian Change-Point Detection Methods

ArXiv ID: 2307.02375
Authors: Unknown

Abstract: Financial order flow exhibits a remarkable level of persistence, wherein buy (sell) trades are often followed by subsequent buy (sell) trades over extended periods. This persistence can be attributed to the division and gradual execution of large orders. Consequently, distinct order flow regimes might emerge, which can be identified through suitable time series models applied to market data. In this paper, we propose the use of Bayesian online change-point detection (BOCPD) methods to identify regime shifts in real-time and enable online predictions of order flow and market impact. To enhance the effectiveness of our approach, we have developed a novel BOCPD method using a score-driven approach. This method accommodates temporal correlations and time-varying parameters within each regime. Through empirical application to NASDAQ data, we have found that: (i) Our newly proposed model demonstrates superior out-of-sample predictive performance compared to existing models that assume i.i.d. behavior within each regime; (ii) When examining the residuals, our model demonstrates good specification in terms of both distributional assumptions and temporal correlations; (iii) Within a given regime, the price dynamics exhibit a concave relationship with respect to time and volume, mirroring the characteristics of actual large orders; (iv) By incorporating regime information, our model produces more accurate online predictions of order flow and market impact compared to models that do not consider regimes. ...

Integrating Tick-level Data and Periodical Signal for High-frequency Market Making

ArXiv ID: 2306.17179
Authors: Unknown

Abstract: We focus on the problem of market making in high-frequency trading. Market making is a critical function in financial markets that involves providing liquidity by buying and selling assets. However, the increasing complexity of financial markets and the high volume of data generated by tick-level trading make it challenging to develop effective market making strategies. To address this challenge, we propose a deep reinforcement learning approach that fuses tick-level data with periodic prediction signals to develop a more accurate and robust market making strategy. Our results of market making strategies based on different deep reinforcement learning algorithms under the simulation scenarios and real data experiments in the cryptocurrency markets show that the proposed framework outperforms existing methods in terms of profitability and risk management. ...

Statistical Modeling of High Frequency Financial Data: Facts, Models and Challenges

ArXiv ID: ssrn-1748022
Authors: Unknown

Abstract: The availability of high-frequency data on transactions, quotes and order flow in electronic order-driven markets has revolutionized data processing and statist

Keywords: High-Frequency Trading, Market Microstructure, Electronization, Algorithmic Trading, Time-Series Analysis, Equity / Quantitative Finance

Complexity vs Empirical Score
Math Complexity: 7.5/10
Empirical Rigor: 6.0/10
Quadrant: Holy Grail
Why: The paper involves advanced stochastic calculus and modeling of high-frequency data, indicating high mathematical complexity, while its focus on empirical high-frequency data and statistical methods suggests a strong, though not code-heavy, empirical backing.

flowchart TD
    A["Research Goal: Model High-Frequency Financial Data in Order-Driven Markets"] --> B["Data Collection: Transactions, Quotes, Order Flow"]
    B --> C["Methodology: Time-Series & Statistical Analysis"]
    C --> D["Computational Modeling: Volatility Estimation & Microstructure"]
    D --> E["Key Finding 1: Data Irregularities (Clock Effects)"]
    D --> F["Key Finding 2: Microstructure Noise Bias"]
    D --> G["Key Finding 3: Modeling Challenges & Solutions"]