Neural Networks

On Deep Learning for computing the Dynamic Initial Margin and Margin Value Adjustment

On Deep Learning for computing the Dynamic Initial Margin and Margin Value Adjustment ArXiv ID: 2407.16435 “View on arXiv” Authors: Unknown Abstract The present work addresses the challenge of training neural networks for Dynamic Initial Margin (DIM) computation in counterparty credit risk, a task traditionally burdened by the high costs associated with generating training datasets through nested Monte Carlo (MC) simulations. By condensing the initial market state variables into an input vector, determined through an interest rate model and a parsimonious parameterization of the current interest rate term structure, we construct a training dataset where labels are noisy but unbiased DIM samples derived from single MC paths. A multi-output neural network structure is employed to handle DIM as a time-dependent function, facilitating training across a mesh of monitoring times. The methodology offers significant advantages: it reduces the dataset generation cost to a single MC execution and parameterizes the neural network by initial market state variables, obviating the need for repeated training. Experimental results demonstrate the approach’s convergence properties and robustness across different interest rate models (Vasicek and Hull-White) and portfolio complexities, validating its general applicability and efficiency in more realistic scenarios. ...
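The reliance on noisy but unbiased labels rests on a general property of least-squares fitting: it targets the conditional mean of the label, so zero-mean noise averages out over many samples. The sketch below illustrates only that property. It swaps the paper's multi-output neural network for a one-dimensional linear fit, and the target function `2 + 3x`, the noise level, and all function names are hypothetical choices made for illustration.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def noisy_label_demo(n_samples=20000, noise_sd=5.0, seed=0):
    """Fit on labels that are unbiased but individually very noisy."""
    rng = random.Random(seed)
    # Hypothetical "market state" inputs, uniform on [0, 1].
    xs = [rng.random() for _ in range(n_samples)]
    # Each label is the (assumed) true value 2 + 3x plus zero-mean noise,
    # standing in for a single-path sample.
    ys = [2.0 + 3.0 * x + rng.gauss(0.0, noise_sd) for x in xs]
    return fit_line(xs, ys)


if __name__ == "__main__":
    intercept, slope = noisy_label_demo()
    print(f"intercept={intercept:.3f}, slope={slope:.3f}")
```

Although each label carries noise larger than the signal, the fitted coefficients should land close to the assumed true values, which is why a single Monte Carlo run can suffice as a training set in this kind of setup.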

Leveraging Machine Learning for High-Dimensional Option Pricing within the Uncertain Volatility Model

Leveraging Machine Learning for High-Dimensional Option Pricing within the Uncertain Volatility Model ArXiv ID: 2407.13213 “View on arXiv” Authors: Unknown Abstract This paper explores the application of Machine Learning techniques for pricing high-dimensional options within the framework of the Uncertain Volatility Model (UVM). The UVM is a robust framework that accounts for the inherent unpredictability of market volatility by setting upper and lower bounds on volatility and the correlation among underlying assets. By integrating advanced Machine Learning algorithms, we aim to enhance the accuracy and efficiency of option pricing under the UVM, especially when the option price depends on a large number of variables, such as in basket or path-dependent options. In this paper, we consider two approaches based on Machine Learning. The first one, termed GTU, evolves backward in time, dynamically selecting at each time step the most expensive volatility and correlation for each market state. Specifically, it identifies the particular values of volatility and correlation that maximize the expected option value at the next time step, and therefore, an optimization problem must be solved. This is achieved through the use of Gaussian Process regression, the computation of expectations via a single step of a multidimensional tree and the Sequential Quadratic Programming optimization algorithm. The second approach, referred to as NNU, leverages neural networks and frames pricing in the UVM as a control problem. Specifically, we train a neural network to determine the most adverse volatility and correlation for each simulated market state, generated via random simulations. The option price is then obtained through Monte Carlo simulations, which are performed using the values for the uncertain parameters provided by the neural network. 
The numerical results demonstrate that the proposed approaches can significantly improve the precision of option pricing particularly in high-dimensional contexts. ...
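The backward step described for GTU, choosing the volatility that maximizes the expected option value at the next time step, can be sketched in one dimension. This is a simplified illustration under stated assumptions, not the paper's algorithm: a grid search replaces Sequential Quadratic Programming, a two-point quadrature replaces the multidimensional tree and Gaussian Process regression, and correlation is ignored because there is a single asset. All names and parameter values are hypothetical.

```python
import math


def one_step_expectation(value_fn, spot, vol, rate, dt):
    """Discounted two-point approximation of E[value_fn(S_next)] for a given vol."""
    drift = (rate - 0.5 * vol * vol) * dt
    shock = vol * math.sqrt(dt)
    up = spot * math.exp(drift + shock)
    down = spot * math.exp(drift - shock)
    return math.exp(-rate * dt) * 0.5 * (value_fn(up) + value_fn(down))


def worst_case_vol(value_fn, spot, vol_min, vol_max, rate=0.0, dt=0.01, n_grid=5):
    """Pick the volatility in [vol_min, vol_max] that maximizes the next-step value."""
    grid = [vol_min + i * (vol_max - vol_min) / (n_grid - 1) for i in range(n_grid)]
    return max(grid, key=lambda v: one_step_expectation(value_fn, spot, v, rate, dt))


if __name__ == "__main__":
    long_call = lambda s: max(s - 100.0, 0.0)
    short_call = lambda s: -max(s - 100.0, 0.0)
    print(worst_case_vol(long_call, 100.0, 0.1, 0.3))
    print(worst_case_vol(short_call, 100.0, 0.1, 0.3))
```

For a convex value function such as a long call, the search selects the upper volatility bound, and for a concave one such as a short call, the lower bound. In a full scheme this selection would be repeated for every market state at every time step.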

Machine Learning Methods for Pricing Financial Derivatives

Machine Learning Methods for Pricing Financial Derivatives ArXiv ID: 2406.00459 “View on arXiv” Authors: Unknown Abstract Stochastic differential equation (SDE) models are the foundation for pricing and hedging financial derivatives. The drift and volatility functions in SDE models are typically chosen to be algebraic functions with a small number (fewer than 5) of parameters which can be calibrated to market data. A more flexible approach is to use neural networks to model the drift and volatility functions, which provides more degrees of freedom to match observed market data. Training of models requires optimizing over an SDE, which is computationally challenging. For European options, we develop a fast stochastic gradient descent (SGD) algorithm for training the neural network-SDE model. Our SGD algorithm uses two independent SDE paths to obtain an unbiased estimate of the direction of steepest descent. For American options, we optimize over the corresponding Kolmogorov partial differential equation (PDE). The neural network appears as coefficient functions in the PDE. Models are trained on large datasets (many contracts), requiring either large simulations (many Monte Carlo samples for the stock price paths) or large numbers of PDEs (a PDE must be solved for each contract). Numerical results are presented for real market data including S&P 500 index options, S&P 100 index options, and single-stock American options. The neural-network-based SDE models are compared against the Black-Scholes model, Dupire’s local volatility model, and the Heston model. Models are evaluated in terms of how accurately they price out-of-sample financial derivatives, which is a core task in derivative pricing at financial institutions. ...
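The role of the two independent paths can be seen in a toy calibration problem. Assume a squared-error objective J(θ) = (E[f(X_T)] − y)², whose gradient is the product 2(E[f(X_T)] − y) · E[∂f(X_T)/∂θ]. Estimating both factors from the same path gives a biased product, while using one independent path per factor gives an unbiased one. The sketch below assumes a hypothetical SDE dX = θ dt + σ dW with payoff f(x) = x², chosen so the exact gradient is known; it is not the paper's neural-network SDE model.

```python
import math
import random


def simulate_terminal(theta, sigma, maturity, n_steps, rng):
    """Euler-Maruyama for dX = theta dt + sigma dW with X_0 = 0.

    Returns the terminal value X_T and its pathwise derivative dX_T/dtheta.
    """
    dt = maturity / n_steps
    x, dx_dtheta = 0.0, 0.0
    for _ in range(n_steps):
        x += theta * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        dx_dtheta += dt
    return x, dx_dtheta


def two_path_gradient(theta, target, rng, sigma=1.0, maturity=1.0, n_steps=4):
    """Unbiased estimate: pricing error from path 1, sensitivity from path 2."""
    x1, _ = simulate_terminal(theta, sigma, maturity, n_steps, rng)
    x2, d2 = simulate_terminal(theta, sigma, maturity, n_steps, rng)
    return 2.0 * (x1 * x1 - target) * (2.0 * x2 * d2)


def one_path_gradient(theta, target, rng, sigma=1.0, maturity=1.0, n_steps=4):
    """Biased estimate: both factors reuse the same path."""
    x, d = simulate_terminal(theta, sigma, maturity, n_steps, rng)
    return 2.0 * (x * x - target) * (2.0 * x * d)


def mean_estimate(estimator, theta, target, n_samples, seed):
    """Average an estimator over many draws to expose its expectation."""
    rng = random.Random(seed)
    return sum(estimator(theta, target, rng) for _ in range(n_samples)) / n_samples


if __name__ == "__main__":
    # With theta = 1, sigma = 1, T = 1 and target 1.5, the exact gradient is 2.
    print(mean_estimate(two_path_gradient, 1.0, 1.5, 50000, seed=0))
    print(mean_estimate(one_path_gradient, 1.0, 1.5, 50000, seed=0))
```

In this toy setting the two-path average settles near the exact gradient of 2, whereas the one-path average converges to 10, because the pricing error and the sensitivity are correlated when they share a path.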

Comparative Study of Bitcoin Price Prediction

Comparative Study of Bitcoin Price Prediction ArXiv ID: 2405.08089 “View on arXiv” Authors: Unknown Abstract Prediction of stock prices has been a crucial and challenging task, especially in the case of highly volatile digital currencies such as Bitcoin. This research examines the potential of using neural network models, namely LSTMs and GRUs, to forecast Bitcoin’s price movements. We employ five-fold cross-validation to enhance generalization and utilize L2 regularization to reduce overfitting and noise. Our study demonstrates that GRU models offer better accuracy than LSTM models for predicting Bitcoin’s price. Specifically, the GRU model has an MSE of 4.67, while the LSTM model has an MSE of 6.25 when compared to the actual prices in the test set data. This finding indicates that GRU models are better equipped to process sequential data with long-term dependencies, a characteristic of financial time series data such as Bitcoin prices. In summary, our results provide valuable insights into the potential of neural network models for accurate Bitcoin price prediction and emphasize the importance of employing appropriate regularization techniques to enhance model performance. ...

Neural Network Learning of Black-Scholes Equation for Option Pricing

Neural Network Learning of Black-Scholes Equation for Option Pricing ArXiv ID: 2405.05780 “View on arXiv” Authors: Unknown Abstract One of the most discussed problems in the financial world is stock option pricing. The Black-Scholes Equation is a Parabolic Partial Differential Equation which provides an option pricing model. The present work proposes an approach based on Neural Networks to solve the Black-Scholes Equation. Real-world data from the stock options market were used as the initial boundary to solve the Black-Scholes Equation. In particular, time series of call option prices of the Brazilian companies Petrobras and Vale were employed. The results indicate that the network can learn to solve the Black-Scholes Equation for a specific real-world stock options time series. The experimental results showed that neural network option pricing based on the Black-Scholes Equation solution can produce option price forecasts that are more accurate than the traditional Black-Scholes analytical solution. These results make it possible to use this methodology for short-term call option price forecasts in options markets. ...
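For reference, the analytical Black-Scholes solution that serves as the baseline in such comparisons can be written in a few lines. The sketch below prices a European call on a non-dividend-paying stock; the function and parameter names are illustrative and do not come from the paper.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call_price(spot: float, strike: float, rate: float,
                  vol: float, maturity: float) -> float:
    """Black-Scholes price of a European call with no dividends."""
    sqrt_t = math.sqrt(maturity)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol * vol) * maturity) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    return spot * norm_cdf(d1) - strike * math.exp(-rate * maturity) * norm_cdf(d2)


if __name__ == "__main__":
    # At-the-money call, 5% rate, 20% volatility, one year to expiry.
    print(round(bs_call_price(100.0, 100.0, 0.05, 0.2, 1.0), 4))  # 10.4506
```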

Mathematics of Differential Machine Learning in Derivative Pricing and Hedging

Mathematics of Differential Machine Learning in Derivative Pricing and Hedging ArXiv ID: 2405.01233 “View on arXiv” Authors: Unknown Abstract This article introduces the groundbreaking concept of the financial differential machine learning algorithm through a rigorous mathematical framework. Diverging from existing literature on financial machine learning, the work highlights the profound implications of theoretical assumptions within financial models on the construction of machine learning algorithms. This endeavour is particularly timely as the finance landscape witnesses a surge in interest towards data-driven models for the valuation and hedging of derivative products. Notably, the predictive capabilities of neural networks have garnered substantial attention in both academic research and practical financial applications. The approach offers a unified theoretical foundation that facilitates comprehensive comparisons, both at a theoretical level and in experimental outcomes. Importantly, this theoretical grounding lends substantial weight to the experimental results, affirming the differential machine learning method’s optimality within the prevailing context. By anchoring the insights in rigorous mathematics, the article bridges the gap between abstract financial concepts and practical algorithmic implementations. ...

Application of Deep Learning for Factor Timing in Asset Management

Application of Deep Learning for Factor Timing in Asset Management ArXiv ID: 2404.18017 “View on arXiv” Authors: Unknown Abstract The paper examines the performance of regression models (OLS linear regression, Ridge regression, Random Forest, and Fully-connected Neural Network) in predicting the CMA (Conservative Minus Aggressive) factor premium, and the performance of factor timing investment based on them. Out-of-sample R-squared shows that more flexible models perform better in explaining the variance in the factor premium over the unseen period, and backtesting affirms that factor timing based on more flexible models tends to outperform that based on linear models. However, for flexible models like neural networks, the optimal weights based on their predictions tend to be unstable, which can lead to high transaction costs and market impact. We verify that tilting down the rebalance frequency according to the historical optimal rebalancing scheme can help reduce the transaction costs. ...
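Out-of-sample R-squared compares a model's squared forecast errors on unseen data with those of a naive benchmark. The sketch below assumes the benchmark is a constant forecast, such as the mean premium over the training period. That is a common convention in return prediction, but the abstract does not state which benchmark the paper uses, and the numbers in the example are made up.

```python
def out_of_sample_r2(actual, predicted, benchmark_mean):
    """1 - SSE(model) / SSE(benchmark), evaluated on the unseen period only."""
    sse_model = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    sse_benchmark = sum((a - benchmark_mean) ** 2 for a in actual)
    return 1.0 - sse_model / sse_benchmark


if __name__ == "__main__":
    # Hypothetical monthly factor premia and model forecasts.
    actual = [0.010, -0.020, 0.030, 0.005]
    predicted = [0.008, -0.010, 0.020, 0.000]
    print(out_of_sample_r2(actual, predicted, benchmark_mean=0.002))
```

A value above zero means the model beats the constant benchmark on the unseen period, and a negative value means it does worse.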

Enhancing path-integral approximation for non-linear diffusion with neural network

Enhancing path-integral approximation for non-linear diffusion with neural network ArXiv ID: 2404.08903 “View on arXiv” Authors: Unknown Abstract This work enhances the existing solution for pricing fixed income instruments within the Black-Karasinski model structure with a neural network at various parameterisation points, demonstrating that the method achieves superior outcomes for multiple calibrations across extended projection horizons.

Keywords: Black-Karasinski Model, Fixed Income Pricing, Neural Networks, Interest Rate Models, Fixed Income

Complexity vs Empirical Score: Math Complexity: 8.5/10; Empirical Rigor: 3.0/10; Quadrant: Lab Rats

Why: The paper employs advanced mathematical concepts including path integrals, Taylor series expansions, and PDE approximations, but lacks empirical validation with backtests or statistical metrics, focusing instead on theoretical model formulation.

Flowchart: Research Goal → Data & Calibration and Methodology (inputs) → Path-Integral Approx. → Neural Network Enh. → Computational Process (processing) → Key Outcomes (results).

Non-Parametric Estimation of Multi-dimensional Marked Hawkes Processes

Non-Parametric Estimation of Multi-dimensional Marked Hawkes Processes ArXiv ID: 2402.04740 “View on arXiv” Authors: Unknown Abstract An extension of the Hawkes process, the Marked Hawkes process distinguishes itself by featuring variable jump size across each event, in contrast to the constant jump size observed in a Hawkes process without marks. While extensive literature has been dedicated to the non-parametric estimation of both the linear and non-linear Hawkes process, there remains a significant gap in the literature regarding the marked Hawkes process. In response to this, we propose a methodology for estimating the conditional intensity of the marked Hawkes process. We introduce two distinct models: “Shallow Neural Hawkes with marks” for Hawkes processes with excitatory kernels and “Neural Network for Non-Linear Hawkes with Marks” for non-linear Hawkes processes. Both approaches take the past arrival times and their corresponding marks as the input to obtain the arrival intensity. This approach is entirely non-parametric, preserving the interpretability associated with the marked Hawkes process. To validate the efficacy of our method, we test it on synthetic datasets with known ground truth. Additionally, we apply our method to model cryptocurrency order book data, demonstrating its applicability to real-world scenarios. ...
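To make the inputs and output concrete, the sketch below evaluates the conditional intensity of a one-dimensional marked Hawkes process from past arrival times and marks. It assumes a parametric form, an exponential kernel scaled linearly by the mark, purely for illustration. The paper's models are non-parametric and learn this mapping with neural networks, so the kernel shape, parameter names, and values here are all hypothetical.

```python
import math


def marked_hawkes_intensity(t, event_times, marks, baseline, alpha, beta):
    """Intensity at time t: baseline plus mark-scaled, exponentially decaying kicks.

    lambda(t) = baseline + sum over t_i < t of mark_i * alpha * exp(-beta * (t - t_i))
    """
    intensity = baseline
    for t_i, m_i in zip(event_times, marks):
        if t_i < t:  # only events strictly in the past contribute
            intensity += m_i * alpha * math.exp(-beta * (t - t_i))
    return intensity


if __name__ == "__main__":
    times = [0.0, 0.4, 0.9]
    marks = [2.0, 1.0, 3.0]  # e.g. trade sizes, as an assumed mark
    print(marked_hawkes_intensity(1.0, times, marks, baseline=0.1, alpha=0.5, beta=1.0))
```

With all marks set to 1 the expression reduces to the unmarked Hawkes intensity with constant jump size, which is the contrast the abstract draws.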

Neural Hawkes: Non-Parametric Estimation in High Dimension and Causality Analysis in Cryptocurrency Markets

Neural Hawkes: Non-Parametric Estimation in High Dimension and Causality Analysis in Cryptocurrency Markets ArXiv ID: 2401.09361 “View on arXiv” Authors: Unknown Abstract We propose a novel approach to marked Hawkes kernel inference which we name the moment-based neural Hawkes estimation method. Hawkes processes are fully characterized by their first and second order statistics through a Fredholm integral equation of the second kind. Using recent advances in solving partial differential equations with physics-informed neural networks, we provide a numerical procedure to solve this integral equation in high dimension. Together with an adapted training pipeline, we give a generic set of hyperparameters that produces robust results across a wide range of kernel shapes. We conduct an extensive numerical validation on simulated data. We finally propose two applications of the method to the analysis of the microstructure of cryptocurrency markets. In a first application we extract the influence of volume on the arrival rate of BTC-USD trades and in a second application we analyze the causality relationships and their directions amongst a universe of 15 cryptocurrency pairs in a centralized exchange. ...