DiffVolume: Diffusion Models for Volume Generation in Limit Order Books

ArXiv ID: 2508.08698
Authors: Zhuohan Wang, Carmine Ventre

Abstract: Modeling limit order book (LOB) dynamics is a fundamental problem in market microstructure research. In particular, generating high-dimensional volume snapshots with strong temporal and liquidity-dependent patterns remains a challenging task, despite recent work exploring the application of Generative Adversarial Networks to LOBs. In this work, we propose a conditional Diffusion model for the generation of future LOB Volume snapshots (DiffVolume). We evaluate our model across three axes: (1) realism, where we show that DiffVolume, conditioned on past volume history and time of day, better reproduces statistical properties such as marginal distribution, spatial correlation, and autocorrelation decay; (2) counterfactual generation, allowing for controllable generation under hypothetical liquidity scenarios by additionally conditioning on a target future liquidity profile; and (3) downstream prediction, where we show that the synthetic counterfactual data from our model improves the performance of future liquidity forecasting models. Together, these results suggest that DiffVolume provides a powerful and flexible framework for realistic and controllable LOB volume generation. ...

August 12, 2025 · 2 min · Research Team

Evaluating utility in synthetic banking microdata applications

ArXiv ID: 2410.22519
Authors: Unknown

Abstract: Financial regulators such as central banks collect vast amounts of data, but access to the resulting fine-grained banking microdata is severely restricted by banking secrecy laws. Recent developments have resulted in mechanisms that generate faithful synthetic data, but current evaluation frameworks lack a focus on the specific challenges of banking institutions and microdata. We develop a framework that considers the utility and privacy requirements of regulators, and apply this to financial usage indices, term deposit yield curves, and credit card transition matrices. Using the Central Bank of Paraguay’s data, we provide the first implementation of synthetic banking microdata using a central bank’s collected information, with the resulting synthetic datasets for all three domain applications being publicly available and featuring information not yet released in statistical disclosure. We find that applications less susceptible to post-processing information loss, which are based on frequency tables, are particularly suited for this approach, and that marginal-based inference mechanisms outperform generative adversarial network models for these applications. Our results demonstrate that synthetic data generation is a promising privacy-enhancing technology for financial regulators seeking to complement their statistical disclosure, while highlighting the crucial role of evaluating such endeavors in terms of utility and privacy requirements. ...

October 29, 2024 · 2 min · Research Team

Can GANs Learn the Stylized Facts of Financial Time Series?

ArXiv ID: 2410.09850
Authors: Unknown

Abstract: In the financial sector, a sophisticated financial time series simulator is essential for evaluating financial products and investment strategies. Traditional back-testing methods have mainly relied on historical data-driven approaches or mathematical model-driven approaches, such as various stochastic processes. However, in the current era of AI, data-driven approaches, where models learn the intrinsic characteristics of data directly, have emerged as promising techniques. Generative Adversarial Networks (GANs) have surfaced as promising generative models, capturing data distributions through adversarial learning. Financial time series, characterized by ‘stylized facts’ such as random walks, mean-reverting patterns, unexpected jumps, and time-varying volatility, present significant challenges for deep neural networks to learn their intrinsic characteristics. This study examines the ability of GANs to learn diverse and complex temporal patterns (i.e., stylized facts) of both univariate and multivariate financial time series. Our extensive experiments revealed that GANs can capture various stylized facts of financial time series, but their performance varies significantly depending on the choice of generator architecture. This suggests that naively applying GANs might not effectively capture the intricate characteristics inherent in financial time series, highlighting the importance of carefully considering and validating the modeling choices. ...
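As a rough illustration of what checking a stylized fact can look like in practice, the sketch below computes two common diagnostics on a return series: the lag-1 autocorrelation of raw versus absolute returns (a proxy for time-varying volatility) and excess kurtosis (a proxy for heavy tails and jumps). It is not taken from the paper. The function names, the toy GARCH(1,1) simulator standing in for a generator's output, and all parameter values are assumptions made for this example.

```python
import math
import random


def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at a positive integer `lag`."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    cov = sum((series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag))
    return cov / var


def excess_kurtosis(series):
    """Population excess kurtosis (0 for a Gaussian, positive for heavy tails)."""
    n = len(series)
    mean = sum(series) / n
    m2 = sum((x - mean) ** 2 for x in series) / n
    m4 = sum((x - mean) ** 4 for x in series) / n
    return m4 / (m2 ** 2) - 3.0


def simulate_garch_returns(n, seed=0, omega=1e-5, alpha=0.1, beta=0.85):
    """Toy GARCH(1,1) returns; a hypothetical stand-in for generated data."""
    rng = random.Random(seed)
    var = omega / (1 - alpha - beta)  # start at the unconditional variance
    returns = []
    for _ in range(n):
        r = math.sqrt(var) * rng.gauss(0, 1)
        returns.append(r)
        var = omega + alpha * r * r + beta * var
    return returns


returns = simulate_garch_returns(5000)
abs_returns = [abs(r) for r in returns]

acf_raw = autocorrelation(returns, 1)
acf_abs = autocorrelation(abs_returns, 1)
kurt = excess_kurtosis(returns)

print(f"lag-1 autocorrelation of returns:   {acf_raw:.3f}")
print(f"lag-1 autocorrelation of |returns|: {acf_abs:.3f}")
print(f"excess kurtosis of returns:         {kurt:.3f}")
```

Running the same diagnostics on real and generated series and comparing the numbers is one simple way to judge whether a generator reproduces these properties.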

October 13, 2024 · 2 min · Research Team

Robust Utility Optimization via a GAN Approach

ArXiv ID: 2403.15243
Authors: Unknown

Abstract: Robust utility optimization enables an investor to deal with market uncertainty in a structured way, with the goal of maximizing the worst-case outcome. In this work, we propose a generative adversarial network (GAN) approach to (approximately) solve robust utility optimization problems in general and realistic settings. In particular, we model both the investor and the market by neural networks (NN) and train them in a mini-max zero-sum game. This approach is applicable for any continuous utility function and in realistic market settings with trading costs, where only observable information of the market can be used. A large empirical study shows the versatile usability of our method. Whenever an optimal reference strategy is available, our method performs on par with it, and in the (many) settings without a known optimal strategy, our method outperforms all other reference strategies. Moreover, we can conclude from our study that the trained path-dependent strategies do not outperform Markovian ones. Lastly, we uncover that our generative approach for learning optimal, (non-)robust investments under trading costs generates universally applicable alternatives to well-known asymptotic strategies of idealized settings. ...
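For orientation, the worst-case objective described in the abstract is commonly written as a max-min problem, with the investor choosing a strategy and the market choosing the least favorable model. The notation below is an assumption for illustration and is not taken from the paper: $\pi$ is the investor's strategy from an admissible set $\Pi$, $\mathcal{P}$ is a set of candidate market models, $u$ is the utility function, and $X_T^{\pi}$ is terminal wealth under strategy $\pi$.

```latex
\sup_{\pi \in \Pi} \; \inf_{P \in \mathcal{P}} \; \mathbb{E}_{P}\!\left[ u\!\left( X_T^{\pi} \right) \right]
```

In the setup the abstract describes, both players would be parameterized by neural networks and trained against each other.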

March 22, 2024 · 2 min · Research Team

RAGIC: Risk-Aware Generative Adversarial Model for Stock Interval Construction

ArXiv ID: 2402.10760
Authors: Unknown

Abstract: Efforts to predict stock market outcomes have yielded limited success due to the inherently stochastic nature of the market, influenced by numerous unpredictable factors. Many existing prediction approaches focus on single-point predictions, lacking the depth needed for effective decision-making and often overlooking market risk. To bridge this gap, we propose a novel model, RAGIC, which introduces sequence generation for stock interval prediction to quantify uncertainty more effectively. Our approach leverages a Generative Adversarial Network (GAN) to produce future price sequences infused with randomness inherent in financial markets. RAGIC’s generator includes a risk module, capturing the risk perception of informed investors, and a temporal module, accounting for historical price trends and seasonality. This multi-faceted generator informs the creation of risk-sensitive intervals through statistical inference, incorporating horizon-wise insights. The interval’s width is carefully adjusted to reflect market volatility. Importantly, our approach relies solely on publicly available data and incurs only low computational overhead. RAGIC’s evaluation across globally recognized broad-based indices demonstrates its balanced performance, offering both accuracy and informativeness. Achieving a consistent 95% coverage, RAGIC maintains a narrow interval width. This promising outcome suggests that our approach effectively addresses the challenges of stock market prediction while incorporating vital risk considerations. ...
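To make the ideas of coverage and interval width concrete, here is a minimal sketch of one generic way to turn sampled future prices into a prediction interval and then score it. It uses plain empirical quantiles and is not RAGIC's actual interval construction, which the abstract does not spell out. The function names and the Gaussian sampler standing in for a trained generator are assumptions made for this example.

```python
import random


def empirical_quantile(values, q):
    """Linear-interpolation quantile of `values` for 0 <= q <= 1."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    frac = pos - lo
    return ordered[lo] * (1 - frac) + ordered[hi] * frac


def prediction_interval(samples, coverage=0.95):
    """Central interval that contains `coverage` of the sampled values."""
    tail = (1 - coverage) / 2
    return empirical_quantile(samples, tail), empirical_quantile(samples, 1 - tail)


def coverage_and_width(intervals, realized):
    """Fraction of realized values inside their interval, and mean interval width."""
    hits = sum(lo <= y <= hi for (lo, hi), y in zip(intervals, realized))
    mean_width = sum(hi - lo for lo, hi in intervals) / len(intervals)
    return hits / len(intervals), mean_width


# Hypothetical stand-in for a generator: Gaussian samples around a last price.
rng = random.Random(0)
last_price = 100.0
intervals, realized = [], []
for _ in range(200):
    samples = [last_price + rng.gauss(0, 2.0) for _ in range(500)]
    intervals.append(prediction_interval(samples, coverage=0.95))
    realized.append(last_price + rng.gauss(0, 2.0))

cov, width = coverage_and_width(intervals, realized)
print(f"empirical coverage: {cov:.2%}, mean width: {width:.2f}")
```

The two numbers pull against each other: widening every interval raises coverage trivially, so a useful method keeps coverage near the target while holding the width down.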

February 16, 2024 · 2 min · Research Team