From Deep Learning to LLMs: A survey of AI in Quantitative Investment

From Deep Learning to LLMs: A survey of AI in Quantitative Investment ArXiv ID: 2503.21422 “View on arXiv” Authors: Unknown Abstract Quantitative investment (quant) is an emerging, technology-driven approach in asset management, increasingly shaped by advancements in artificial intelligence. Recent advances in deep learning and large language models (LLMs) for quant finance have improved predictive modeling and enabled agent-based automation, suggesting a potential paradigm shift in this field. In this survey, taking alpha strategy as a representative example, we explore how AI contributes to the quantitative investment pipeline. We first examine the early stage of quant research, centered on human-crafted features and traditional statistical models with an established alpha pipeline. We then discuss the rise of deep learning, which enabled scalable modeling across the entire pipeline from data processing to order execution. Building on this, we highlight the emerging role of LLMs in extending AI beyond prediction, empowering autonomous agents to process unstructured data, generate alphas, and support self-iterative workflows. ...

March 27, 2025 · 2 min · Research Team

Financial Wind Tunnel: A Retrieval-Augmented Market Simulator

Financial Wind Tunnel: A Retrieval-Augmented Market Simulator ArXiv ID: 2503.17909 “View on arXiv” Authors: Unknown Abstract A market simulator aims to create high-quality synthetic financial data that mimics real-world market dynamics, which is crucial for model development and robust assessment. Despite continuous advancements in simulation methodologies, market fluctuations vary in scale and source, and existing frameworks often excel at only specific tasks. To address this challenge, we propose Financial Wind Tunnel (FWT), a retrieval-augmented market simulator designed to generate controllable, reasonable, and adaptable market dynamics for model testing. FWT offers a more comprehensive and systematic generative capability across different data frequencies. By leveraging a retrieval method to discover cross-sectional information as the augmented condition, our diffusion-based simulator seamlessly integrates both macro- and micro-level market patterns. Furthermore, our framework allows the simulation to be controlled with wide applicability, including causal generation through “what-if” prompts or unprecedented cross-market trend synthesis. Additionally, we develop an automated optimizer for downstream quantitative models, using stress testing of simulated scenarios via FWT to enhance returns while controlling risks. Experimental results demonstrate that our approach enables generalizable and reliable market simulation and significantly improves the performance and adaptability of downstream models, particularly in highly complex and volatile market conditions. Our code and data sample are available at https://anonymous.4open.science/r/fwt_-E852 ...

March 23, 2025 · 2 min · Research Team

Bayesian Optimization for CVaR-based portfolio optimization

Bayesian Optimization for CVaR-based portfolio optimization ArXiv ID: 2503.17737 “View on arXiv” Authors: Unknown Abstract Optimal portfolio allocation is often formulated as a constrained risk problem, where one aims to minimize a risk measure subject to some performance constraints. This paper presents new Bayesian Optimization algorithms for such constrained minimization problems, seeking to minimize the conditional value-at-risk (a computationally intensive risk measure) under a minimum expected return constraint. The proposed algorithms utilize a new acquisition function, which drives sampling towards the optimal region. Additionally, a new two-stage procedure is developed, which significantly reduces the number of evaluations of the expensive-to-evaluate objective function. The proposed algorithm’s competitive performance is demonstrated through practical examples. ...

March 22, 2025 · 2 min · Research Team
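The constrained problem described in the abstract above (minimize conditional value-at-risk subject to a minimum expected return) can be illustrated with a small scenario-based sketch. This is not the paper's Bayesian Optimization algorithm or its acquisition function; it only shows the kind of objective and constraint such an optimizer would evaluate. The function names, the scenario-matrix input, and the empirical Rockafellar-Uryasev style estimator are assumptions made for illustration.

```python
import math


def empirical_cvar(losses, alpha=0.95):
    """Empirical CVaR: VaR plus the mean excess over VaR, scaled by 1/(1 - alpha)."""
    ordered = sorted(losses)
    n = len(ordered)
    # Index of the smallest loss whose empirical CDF reaches alpha (the VaR).
    k = min(n - 1, max(0, math.ceil(alpha * n) - 1))
    var = ordered[k]
    mean_excess = sum(max(x - var, 0.0) for x in ordered) / n
    return var + mean_excess / (1.0 - alpha)


def portfolio_losses(weights, scenario_returns):
    """Loss per scenario is the negative weighted return (hypothetical input layout:
    one row per scenario, one column per asset)."""
    return [-sum(w * r for w, r in zip(weights, row)) for row in scenario_returns]


def constrained_cvar(weights, scenario_returns, min_return, alpha=0.95):
    """Return (cvar, feasible), where feasible checks the expected-return constraint."""
    losses = portfolio_losses(weights, scenario_returns)
    expected_return = -sum(losses) / len(losses)
    return empirical_cvar(losses, alpha), expected_return >= min_return
```

An optimizer would call `constrained_cvar` for each candidate weight vector and keep only feasible candidates; with many scenarios each call is costly, which is why the abstract stresses reducing the number of evaluations.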

Dynamic Factor Model-Based Multiperiod Mean-Variance Portfolio Selection with Portfolio Constraints

Dynamic Factor Model-Based Multiperiod Mean-Variance Portfolio Selection with Portfolio Constraints ArXiv ID: 2502.17915 “View on arXiv” Authors: Unknown Abstract Motivated by practical applications, we explore the constrained multi-period mean-variance portfolio selection problem within a market characterized by a dynamic factor model. This model captures predictability in asset returns driven by state variables and incorporates cone-type portfolio constraints that are crucial in practice. The model is broad enough to encompass various dynamic factor frameworks, including practical considerations such as no-short-selling and cardinality constraints. We derive a semi-analytical optimal solution using dynamic programming, revealing it as a piecewise linear feedback policy to wealth, with all factors embedded within the allocation vectors. Additionally, we demonstrate that the portfolio policies are determined by two specific stochastic processes resulting from the stochastic optimizations, for which we provide detailed algorithms. These processes reflect the investor’s assessment of future investment opportunities and play a crucial role in characterizing the time consistency and efficiency of the optimal policy through the variance-optimal signed supermartingale measure of the market. We present numerical examples that illustrate the model’s application in various settings. Using real market data, we investigate how the factors influence portfolio policies and demonstrate that incorporating the factor structure may enhance out-of-sample performance. ...

February 25, 2025 · 2 min · Research Team

Trends and Reversion in Financial Markets on Time Scales from Minutes to Decades

Trends and Reversion in Financial Markets on Time Scales from Minutes to Decades ArXiv ID: 2501.16772 “View on arXiv” Authors: Unknown Abstract We empirically analyze the reversion of financial market trends with time horizons ranging from minutes to decades. The analysis covers equities, interest rates, currencies and commodities and combines 14 years of futures tick data, 30 years of daily futures prices, 330 years of monthly asset prices, and yearly financial data since medieval times. Across asset classes, we find that markets are in a trending regime on time scales that range from a few hours to a few years, while they are in a reversion regime on shorter and longer time scales. In the trending regime, weak trends tend to persist, which can be explained by herding behavior of investors. However, in this regime trends tend to revert before they become strong enough to be statistically significant, which can be interpreted as a return of asset prices to their intrinsic value. In the reversion regime, we find the opposite pattern: weak trends tend to revert, while those trends that become statistically significant tend to persist. Our results provide a set of empirical tests of theoretical models of financial markets. We interpret them in the light of a recently proposed lattice gas model, where the lattice represents the social network of traders, the gas molecules represent the shares of financial assets, and efficient markets correspond to the critical point. If this model is accurate, the lattice gas must be near this critical point on time scales from 1 hour to a few days, with a correlation time of a few years. ...

January 28, 2025 · 3 min · Research Team

Breaking the Dimensional Barrier for Constrained Dynamic Portfolio Choice

Breaking the Dimensional Barrier for Constrained Dynamic Portfolio Choice ArXiv ID: 2501.12600 “View on arXiv” Authors: Unknown Abstract We propose a scalable, policy-centric framework for continuous-time multi-asset portfolio-consumption optimization under inequality constraints. Our method integrates neural policies with Pontryagin’s Maximum Principle (PMP) and enforces feasibility by maximizing a log-barrier-regularized Hamiltonian at each time-state pair, thereby satisfying KKT conditions without value-function grids. Theoretically, we show that the barrier-regularized Hamiltonian yields $O(\varepsilon)$ policy error and a linear Hamiltonian gap (quadratic when the KKT solution is interior), and we extend the BPTT-PMP correspondence to constrained settings with stable costate convergence. Empirically, PG-DPO and its projected variant (P-PGDPO) recover KKT-optimal policies in canonical short-sale and consumption-cap problems while maintaining strict feasibility across dimensions; unlike PDE/BSDE solvers, runtime scales linearly with the number of assets and remains practical at $n = 100$. These results provide a rigorous and scalable foundation for high-dimensional constrained continuous-time portfolio optimization. ...

January 22, 2025 · 2 min · Research Team

Optimizing Portfolio Performance through Clustering and Sharpe Ratio-Based Optimization: A Comparative Backtesting Approach

Optimizing Portfolio Performance through Clustering and Sharpe Ratio-Based Optimization: A Comparative Backtesting Approach ArXiv ID: 2501.12074 “View on arXiv” Authors: Unknown Abstract Optimizing portfolio performance is a fundamental challenge in financial modeling, requiring the integration of advanced clustering techniques and data-driven optimization strategies. This paper introduces a comparative backtesting approach that combines clustering-based portfolio segmentation and Sharpe ratio-based optimization to enhance investment decision-making. First, we segment a diverse set of financial assets into clusters based on their historical log-returns using K-Means clustering. This segmentation enables the grouping of assets with similar return characteristics, facilitating targeted portfolio construction. Next, for each cluster, we apply a Sharpe ratio-based optimization model to derive optimal weights that maximize risk-adjusted returns. Unlike traditional mean-variance optimization, this approach directly incorporates the trade-off between returns and volatility, resulting in a more balanced allocation of resources within each cluster. The proposed framework is evaluated through a backtesting study using historical data spanning multiple asset classes. Optimized portfolios for each cluster are constructed and their cumulative returns are compared over time against a traditional equal-weighted benchmark portfolio. ...

January 21, 2025 · 2 min · Research Team
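The per-cluster step in the abstract above (choose weights that maximize the Sharpe ratio of assets grouped by their log-returns) can be sketched for a single two-asset cluster. The K-Means step is omitted here, and the grid search, the function names, and the zero risk-free rate default are assumptions for illustration rather than the paper's optimization model.

```python
import math
import statistics


def log_returns(prices):
    """Log-returns ln(p_t / p_{t-1}) of a price series."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]


def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return divided by sample standard deviation.
    A zero-volatility series has no defined Sharpe ratio; this sketch returns 0.0."""
    sd = statistics.stdev(returns)
    if sd == 0:
        return 0.0
    return (statistics.mean(returns) - risk_free) / sd


def best_two_asset_weight(returns_a, returns_b, steps=10):
    """Grid-search the weight on asset A (asset B gets 1 - w) that maximizes Sharpe."""
    best_w, best_sharpe = 0.0, -math.inf
    for i in range(steps + 1):
        w = i / steps
        blended = [w * a + (1 - w) * b for a, b in zip(returns_a, returns_b)]
        s = sharpe_ratio(blended)
        if s > best_sharpe:
            best_w, best_sharpe = w, s
    return best_w, best_sharpe
```

Because the grid contains the weights 0, 0.5 and 1, the selected portfolio is never worse in-sample than either single asset or the equal-weighted mix, which is the benchmark comparison the abstract describes. Whether that carries over out of sample is what the backtest has to establish.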

Federated Diffusion Modeling with Differential Privacy for Tabular Data Synthesis

Federated Diffusion Modeling with Differential Privacy for Tabular Data Synthesis ArXiv ID: 2412.16083 “View on arXiv” Authors: Unknown Abstract The increasing demand for privacy-preserving data analytics in various domains necessitates solutions for synthetic data generation that rigorously uphold privacy standards. We introduce the DP-FedTabDiff framework, a novel integration of Differential Privacy, Federated Learning and Denoising Diffusion Probabilistic Models designed to generate high-fidelity synthetic tabular data. This framework ensures compliance with privacy regulations while maintaining data utility. We demonstrate the effectiveness of DP-FedTabDiff on multiple real-world mixed-type tabular datasets, achieving significant improvements in privacy guarantees without compromising data quality. Our empirical evaluations reveal the optimal trade-offs between privacy budgets, client configurations, and federated optimization strategies. The results affirm the potential of DP-FedTabDiff to enable secure data sharing and analytics in highly regulated domains, paving the way for further advances in federated learning and privacy-preserving data synthesis. ...

December 20, 2024 · 2 min · Research Team

Financial Fine-tuning a Large Time Series Model

Financial Fine-tuning a Large Time Series Model ArXiv ID: 2412.09880 “View on arXiv” Authors: Unknown Abstract Large models have shown unprecedented capabilities in natural language processing, image generation, and most recently, time series forecasting. This leads us to ask the question: treating market prices as a time series, can large models be used to predict the market? In this paper, we answer this by evaluating the performance of the latest time series foundation model TimesFM on price prediction. We find that due to the irregular nature of price data, directly applying TimesFM gives unsatisfactory results and propose to fine-tune TimesFM on financial data for the task of price prediction. This is done by continual pre-training of TimesFM on price data containing 100 million time points, covering a range of financial instruments at hourly and daily granularities. The fine-tuned model demonstrates higher price prediction accuracy than the baseline model. We conduct mock trading for our model in various financial markets and show that it outperforms various benchmarks in terms of returns, Sharpe ratio, max drawdown and trading cost. ...

December 13, 2024 · 2 min · Research Team

Efficient and Verified Continuous Double Auctions

Efficient and Verified Continuous Double Auctions ArXiv ID: 2412.08624 “View on arXiv” Authors: Unknown Abstract Continuous double auctions are commonly used to match orders at currency, stock, and commodities exchanges. A verified implementation of continuous double auctions is a useful tool for market regulators as it gives rise to automated checkers that are guaranteed to detect errors in the trade logs of an existing exchange if they contain trades that violate the matching rules. We provide an efficient and formally verified implementation of continuous double auctions that takes $O(n \log n)$ time to match $n$ orders. This improves an earlier $O(n^2)$ verified implementation. We also prove a matching $\Omega(n \log n)$ lower bound on the running time for continuous double auctions. Our new implementation takes only a couple of minutes to run on ten million randomly generated orders as opposed to a few days taken by the earlier implementation. Our new implementation gives rise to an efficient automatic checker. We use the Coq proof assistant for verifying our implementation and extracting a verified OCaml program. While using Coq’s standard library implementation of red-black trees to obtain our improvement, we observed that its specification has serious gaps, which we fill in this work; this might be of independent interest. ...

December 11, 2024 · 2 min · Research Team
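To make the $O(n \log n)$ bound in the abstract above concrete, here is an unverified Python sketch of price-time priority matching. It is not the paper's Coq/OCaml implementation: it swaps in binary heaps where the paper uses red-black trees, and the order format `(side, price, quantity)`, the function name, and the convention that trades execute at the resting order's price are assumptions for illustration.

```python
import heapq


def match_orders(orders):
    """Match orders given in arrival order as (side, price, quantity) tuples.

    Returns (trades, bids, asks); each trade is (buy_id, sell_id, price, quantity),
    where ids are arrival positions. Heap entries are (key, id, remaining_qty) with
    key = -price for bids and +price for asks, so the best price and then the
    earliest arrival sits on top.
    """
    bids, asks, trades = [], [], []
    for oid, (side, price, qty) in enumerate(orders):
        is_buy = side == "buy"
        opposite = asks if is_buy else bids
        while qty > 0 and opposite:
            key, rest_id, rest_qty = opposite[0]
            rest_price = key if is_buy else -key
            crossed = rest_price <= price if is_buy else rest_price >= price
            if not crossed:
                break
            fill = min(qty, rest_qty)
            buy_id, sell_id = (oid, rest_id) if is_buy else (rest_id, oid)
            # Assumed convention: the trade executes at the resting order's price.
            trades.append((buy_id, sell_id, rest_price, fill))
            qty -= fill
            if fill == rest_qty:
                heapq.heappop(opposite)
            else:
                # Partial fill: same key and id, so the entry stays on top.
                heapq.heapreplace(opposite, (key, rest_id, rest_qty - fill))
        if qty > 0:
            own = bids if is_buy else asks
            heapq.heappush(own, (-price if is_buy else price, oid, qty))
    return trades, bids, asks
```

Each order is pushed at most once and popped at most once, and each incoming order causes at most one partial fill of a resting order, so the total work is $O(n \log n)$ heap operations. A trade-log checker in the spirit of the abstract could replay an exchange's order stream through such a matcher and compare the resulting trades with the log.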