Neural Networks

Empirical Models of the Time Evolution of SPX Option Prices

ArXiv ID: 2506.17511

Authors: Alessio Brini, David A. Hsieh, Patrick Kuiper, Sean Moushegian, David Ye

Abstract: The key objective of this paper is to develop an empirical model for pricing SPX options that can be simulated over future paths of the SPX. To accomplish this, we formulate and rigorously evaluate several statistical models, including neural network, random forest, and linear regression. These models use the observed characteristics of the options as inputs – their price, moneyness and time-to-maturity, as well as a small set of external inputs, such as the SPX and its past history, dividend yield, and the risk-free rate. Model evaluation is performed on historical options data, spanning 30 years of daily observations. Significant effort is given to understanding the data and ensuring explainability for the neural network. A neural network model with two hidden layers and four neurons per layer, trained with minimal hyperparameter tuning, performs well against the theoretical Black-Scholes-Merton model for European options, as well as two other empirical models based on the random forest and the linear regression. It delivers arbitrage-free option prices without requiring these conditions to be imposed. ...
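For readers who want a concrete reference point for the Black-Scholes-Merton benchmark named in this abstract, here is a minimal sketch of the textbook European option formula with a continuous dividend yield. The function and parameter names (`bsm_price`, `div_yield`, and so on) are illustrative assumptions and are not taken from the paper, whose exact benchmark setup is not given in the abstract.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bsm_price(spot, strike, tau, rate, div_yield, vol, is_call=True):
    """Textbook Black-Scholes-Merton price of a European option.

    tau is time-to-maturity in years; rate and div_yield are continuously
    compounded. All names here are hypothetical, chosen for illustration.
    """
    if tau <= 0 or vol <= 0:
        raise ValueError("tau and vol must be positive")
    sqrt_tau = math.sqrt(tau)
    d1 = (math.log(spot / strike)
          + (rate - div_yield + 0.5 * vol ** 2) * tau) / (vol * sqrt_tau)
    d2 = d1 - vol * sqrt_tau
    disc_spot = spot * math.exp(-div_yield * tau)    # dividend-adjusted spot
    disc_strike = strike * math.exp(-rate * tau)     # present value of strike
    if is_call:
        return disc_spot * norm_cdf(d1) - disc_strike * norm_cdf(d2)
    return disc_strike * norm_cdf(-d2) - disc_spot * norm_cdf(-d1)


call = bsm_price(100.0, 100.0, 1.0, 0.05, 0.0, 0.2, is_call=True)
put = bsm_price(100.0, 100.0, 1.0, 0.05, 0.0, 0.2, is_call=False)
```

An empirical model such as the one described would be compared against prices of this kind; put-call parity (`call - put` equal to the discounted spot minus the discounted strike) is one of the no-arbitrage relations such a comparison can check.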

Model-Free Deep Hedging with Transaction Costs and Light Data Requirements

ArXiv ID: 2505.22836

Authors: Pierre Brugière, Gabriel Turinici

Abstract: Option pricing theory, such as the Black and Scholes (1973) model, provides an explicit solution to construct a strategy that perfectly hedges an option in a continuous-time setting. In practice, however, trading occurs in discrete time and often involves transaction costs, making the direct application of continuous-time solutions potentially suboptimal. Previous studies, such as those by Buehler et al. (2018), Buehler et al. (2019) and Cao et al. (2019), have shown that deep learning or reinforcement learning can be used to derive better hedging strategies than those based on continuous-time models. However, these approaches typically rely on a large number of trajectories (of the order of $10^5$ or $10^6$) to train the model. In this work, we show that using as few as 256 trajectories is sufficient to train a neural network that significantly outperforms, in the Geometric Brownian Motion framework, both the classical Black & Scholes formula and the Leland model, which is arguably one of the most effective explicit alternatives for incorporating transaction costs. The ability to train neural networks with such a small number of trajectories suggests the potential for more practical and simple implementation on real-time financial series. ...
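To make the Leland benchmark mentioned in this abstract concrete, the sketch below computes the adjusted volatility from Leland (1985) for a hedged short option position: the Black-Scholes volatility is inflated to account for proportional transaction costs paid at each discrete rebalance. The abstract does not say which variant of the Leland model the authors use, so this is the standard textbook form, and the names `leland_volatility`, `round_trip_cost`, and `dt` are hypothetical.

```python
import math


def leland_volatility(vol: float, round_trip_cost: float, dt: float) -> float:
    """Leland (1985) adjusted volatility for a hedged short option position.

    vol: volatility of the underlying (annualized)
    round_trip_cost: proportional round-trip transaction cost (e.g. 0.01 = 1%)
    dt: time between hedge rebalances, in years
    Assumed form: sigma_hat^2 = sigma^2 * (1 + sqrt(2/pi) * k / (sigma * sqrt(dt))).
    """
    if vol <= 0 or dt <= 0 or round_trip_cost < 0:
        raise ValueError("vol and dt must be positive, cost non-negative")
    leland_number = math.sqrt(2.0 / math.pi) * round_trip_cost / (vol * math.sqrt(dt))
    return vol * math.sqrt(1.0 + leland_number)


# Weekly rebalancing with a 1% round-trip cost (illustrative values only).
adjusted = leland_volatility(vol=0.2, round_trip_cost=0.01, dt=1.0 / 52.0)
```

The adjusted volatility is then plugged into the usual Black-Scholes price and delta. More frequent rebalancing (smaller `dt`) raises the adjustment, which is the trade-off a learned hedging strategy is meant to handle better.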

Learning the Spoofability of Limit Order Books With Interpretable Probabilistic Neural Networks

ArXiv ID: 2504.15908

Authors: Unknown

Abstract: This paper investigates real-time detection of spoofing activity in limit order books, focusing on cryptocurrency centralized exchanges. We first introduce novel order flow variables based on multi-scale Hawkes processes that account both for the size and placement distance from current best prices of new limit orders. Using a Level-3 data set, we train a neural network model to predict the conditional probability distribution of mid price movements based on these features. Our empirical analysis highlights the critical role of the posting distance of limit orders in the price formation process, showing that spoofing detection models that do not take the posting distance into account are inadequate to describe the data. Next, we propose a spoofing detection framework based on the probabilistic market manipulation gain of a spoofing agent and use the previously trained neural network to compute the expected gain. Running this algorithm on all submitted limit orders in the period 2024-12-04 to 2024-12-07, we find that 31% of large orders could spoof the market. Because of its simple neuronal architecture, our model can be run in real time. This work contributes to enhancing market integrity by providing a robust tool for monitoring and mitigating spoofing in both cryptocurrency exchanges and traditional financial markets. ...

VWAP Execution with Signature-Enhanced Transformers: A Multi-Asset Learning Approach

ArXiv ID: 2503.02680

Authors: Unknown

Abstract: In this paper I propose a novel approach to Volume Weighted Average Price (VWAP) execution that addresses two key practical challenges: the need for asset-specific model training and the capture of complex temporal dependencies. Building upon my recent work in dynamic VWAP execution arXiv:2502.18177, I demonstrate that a single neural network trained across multiple assets can achieve performance comparable to or better than traditional asset-specific models. The proposed architecture combines a transformer-based design inspired by arXiv:2406.02486 with path signatures for capturing geometric features of price-volume trajectories, as in arXiv:2406.17890. The empirical analysis, conducted on hourly cryptocurrency trading data from 80 trading pairs, shows that the globally-fitted model with signature features (GFT-Sig) achieves superior performance in both absolute and quadratic VWAP loss metrics compared to asset-specific approaches. Notably, these improvements persist for out-of-sample assets, demonstrating the model’s ability to generalize across different market conditions. The results suggest that combining global parameter sharing with signature-based feature extraction provides a scalable and robust approach to VWAP execution, offering significant practical advantages over traditional asset-specific implementations. ...
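As a rough illustration of the absolute and quadratic VWAP loss metrics this abstract refers to, the sketch below compares the VWAP achieved by an execution schedule with the market VWAP over the same bars. The paper's exact loss definition (for example, any normalization by price) is not given in the abstract, so the unnormalized form and the names `vwap` and `vwap_losses` are assumptions.

```python
def vwap(prices, volumes):
    """Volume-weighted average price over a sequence of bars."""
    total = sum(volumes)
    if total <= 0:
        raise ValueError("total volume must be positive")
    return sum(p * v for p, v in zip(prices, volumes)) / total


def vwap_losses(prices, market_volumes, allocation):
    """Absolute and quadratic gap between achieved and market VWAP.

    allocation holds the quantity the strategy trades in each bar
    (hypothetical input; a model would produce it).
    """
    error = vwap(prices, allocation) - vwap(prices, market_volumes)
    return abs(error), error ** 2


prices = [100.0, 102.0, 101.0]
market_volumes = [10.0, 30.0, 20.0]

# A naive schedule that trades equal quantities in every bar.
abs_loss, sq_loss = vwap_losses(prices, market_volumes, [1.0, 1.0, 1.0])

# A schedule proportional to market volume tracks the market VWAP exactly.
perfect_abs, perfect_sq = vwap_losses(prices, market_volumes, [1.0, 3.0, 2.0])
```

In practice the market volume profile is not known in advance, which is why the allocation has to be predicted.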

Gaining efficiency in deep policy gradient method for continuous-time optimal control problems

ArXiv ID: 2502.14141

Authors: Unknown

Abstract: In this paper, we propose an efficient implementation of deep policy gradient method (PGM) for optimal control problems in continuous time. The proposed method has the ability to manage the allocation of computational resources, number of trajectories, and complexity of architecture of the neural network. This is, in particular, important for continuous-time problems that require a fine time discretization. Each step of this method focuses on a different time scale and learns a policy, modeled by a neural network, for a discretized optimal control problem. The first step has the coarsest time discretization. As we proceed to other steps, the time discretization becomes finer. The optimal trained policy in each step is also used to provide data for the next step. We accompany the multi-scale deep PGM with a theoretical result on allocation of computational resources to obtain a targeted efficiency and test our methods on the linear-quadratic stochastic optimal control problem. ...

Generalized Factor Neural Network Model for High-dimensional Regression

ArXiv ID: 2502.11310

Authors: Unknown

Abstract: We tackle the challenges of modeling high-dimensional data sets, particularly those with latent low-dimensional structures hidden within complex, non-linear, and noisy relationships. Our approach enables a seamless integration of concepts from non-parametric regression, factor models, and neural networks for high-dimensional regression. Our approach introduces PCA and Soft PCA layers, which can be embedded at any stage of a neural network architecture, allowing the model to alternate between factor modeling and non-linear transformations. This flexibility makes our method especially effective for processing hierarchical compositional data. We explore ours and other techniques for imposing low-rank structures on neural networks and examine how architectural design impacts model performance. The effectiveness of our method is demonstrated through simulation studies, as well as applications to forecasting future price movements of equity ETF indices and nowcasting with macroeconomic data. ...

The AI Black-Scholes: Finance-Informed Neural Network

ArXiv ID: 2412.12213

Authors: Unknown

Abstract: In the realm of option pricing, existing models are typically classified into principle-driven methods, such as solving partial differential equations (PDEs) that the pricing function satisfies, and data-driven approaches, such as machine learning (ML) techniques that parameterize the pricing function directly. While principle-driven models offer a rigorous theoretical framework, they often rely on unrealistic assumptions, such as asset processes adhering to fixed stochastic differential equations (SDEs). Moreover, they can become computationally intensive, particularly in high-dimensional settings when analytical solutions are not available and thus numerical solutions are needed. In contrast, data-driven models excel in capturing market data trends, but they often lack alignment with core financial principles, raising concerns about interpretability and predictive accuracy, especially when dealing with limited or biased datasets. This work proposes a hybrid approach to address these limitations by integrating the strengths of both principled and data-driven methodologies. Our framework combines the theoretical rigor and interpretability of PDE-based models with the adaptability of machine learning techniques, yielding a more versatile methodology for pricing a broad spectrum of options. We validate our approach across different volatility modeling approaches, both with constant volatility (Black-Scholes) and stochastic volatility (Heston), demonstrating that our proposed framework, Finance-Informed Neural Network (FINN), not only enhances predictive accuracy but also maintains adherence to core financial principles. FINN presents a promising tool for practitioners, offering robust performance across a variety of market conditions. ...

Unsupervised learning-based calibration scheme for Rough Bergomi model

ArXiv ID: 2412.02135

Authors: Unknown

Abstract: Current deep learning-based calibration schemes for rough volatility models are based on the supervised learning framework, which can be costly due to a large amount of training data being generated. In this work, we propose a novel unsupervised learning-based scheme for the rough Bergomi (rBergomi) model which does not require accessing training data. The main idea is to use the backward stochastic differential equation (BSDE) derived in [Bayer, Qiu and Yao, SIAM J. Financial Math., 2022] and simultaneously learn the BSDE solutions with the model parameters. We establish that the mean squared error between the option prices under the learned model parameters and the historical data is bounded by the loss function. Moreover, the loss can be made arbitrarily small under suitable conditions on the fitting ability of the rBergomi model to the market and the universal approximation capability of neural networks. Numerical experiments for both simulated and historical data confirm the efficiency of the scheme. ...

Neural and Time-Series Approaches for Pricing Weather Derivatives: Performance and Regime Adaptation Using Satellite Data

ArXiv ID: 2411.12013

Authors: Unknown

Abstract: This paper studies pricing of weather-derivative (WD) contracts on temperature and precipitation. For temperature-linked strangles in Toronto and Chicago, we benchmark a harmonic-regression/ARMA model against a feed-forward neural network (NN), finding that the NN reduces out-of-sample mean-squared error (MSE) and materially shifts December fair values relative to both the time-series model and the industry-standard Historic Burn Approach (HBA). For precipitation, we employ a compound Poisson–Gamma framework: shape and scale parameters are estimated via maximum likelihood estimation (MLE) and via a convolutional neural network (CNN) trained on 30-day rainfall sequences spanning multiple seasons. The CNN adaptively learns season-specific $(\alpha,\beta)$ mappings, thereby capturing heterogeneity across regimes that static i.i.d. fits miss. At valuation, we assume days are i.i.d. $\Gamma(\hat\alpha,\hat\beta)$ within each regime and apply a mean-count approximation (replacing the Poisson count by its mean $n\hat\lambda$) to derive closed-form strangle prices. Exploratory analysis of 1981–2023 NASA POWER data confirms pronounced seasonal heterogeneity in $(\alpha,\beta)$ between summer and winter, demonstrating that static global fits are inadequate. Back-testing on Toronto and Chicago grids shows that our regime-adaptive CNN yields competitive valuations and underscores how model choice can shift strangle prices. Payoffs are evaluated analytically when possible and by simulation elsewhere, enabling a like-for-like comparison of forecasting and valuation methods. ...
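One way to see why the mean-count approximation in this abstract leads to closed-form prices is the worked sketch below. It assumes $\hat\beta$ is a scale parameter (the abstract speaks of "shape and scale parameters"), writes $m = n\hat\lambda$ for the mean count, and uses hypothetical strikes $K_1 < K_2$ with discounting and tick size omitted; the paper's own formulas may differ in detail.

```latex
% Sum of m i.i.d. Gamma(alpha, beta) variables with a common scale is again Gamma:
S \approx \sum_{i=1}^{m} X_i, \qquad X_i \overset{\text{i.i.d.}}{\sim} \Gamma(\hat\alpha, \hat\beta)
\;\Longrightarrow\; S \sim \Gamma(m\hat\alpha, \hat\beta), \qquad m = n\hat\lambda .

% Let F_a denote the CDF of Gamma(a, beta-hat) and a = m * alpha-hat.
% Using s f_a(s) = a \hat\beta f_{a+1}(s), the two option legs are:
\mathbb{E}\big[(S - K_2)^+\big] = a\hat\beta \,\big[1 - F_{a+1}(K_2)\big] - K_2 \,\big[1 - F_a(K_2)\big],
\qquad
\mathbb{E}\big[(K_1 - S)^+\big] = K_1 \, F_a(K_1) - a\hat\beta \, F_{a+1}(K_1).

% Undiscounted strangle value under these assumptions:
V_{\text{strangle}} = \mathbb{E}\big[(K_1 - S)^+\big] + \mathbb{E}\big[(S - K_2)^+\big].
```

Replacing the random Poisson count by its mean is what collapses the compound distribution to a single gamma law, so both legs reduce to evaluations of the gamma CDF.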

Guided Learning: Lubricating End-to-End Modeling for Multi-stage Decision-making

ArXiv ID: 2411.10496

Authors: Unknown

Abstract: Multi-stage decision-making is crucial in various real-world artificial intelligence applications, including recommendation systems, autonomous driving, and quantitative investment systems. In quantitative investment, for example, the process typically involves several sequential stages such as factor mining, alpha prediction, portfolio optimization, and sometimes order execution. While state-of-the-art end-to-end modeling aims to unify these stages into a single global framework, it faces significant challenges: (1) training such a unified neural network consisting of multiple stages between initial inputs and final outputs often leads to suboptimal solutions, or even collapse, and (2) many decision-making scenarios are not easily reducible to standard prediction problems. To overcome these challenges, we propose Guided Learning, a novel methodological framework designed to enhance end-to-end learning in multi-stage decision-making. We introduce the concept of a “guide”, a function that induces the training of intermediate neural network layers towards some phased goals, directing gradients away from suboptimal collapse. For decision scenarios lacking explicit supervisory labels, we incorporate a utility function that quantifies the “reward” of the throughout decision. Additionally, we explore the connections between Guided Learning and classic machine learning paradigms such as supervised, unsupervised, semi-supervised, multi-task, and reinforcement learning. Experiments on quantitative investment strategy building demonstrate that guided learning significantly outperforms both traditional stage-wise approaches and existing end-to-end methods. ...