
ARL-Based Multi-Action Market Making with Hawkes Processes and Variable Volatility

ArXiv ID: 2508.16589 · Authors: Ziyi Wang, Carmine Ventre, Maria Polukarov · Abstract: We advance market-making strategies by integrating Adversarial Reinforcement Learning (ARL), Hawkes Processes, and variable volatility levels while also expanding the action space available to market makers (MMs). To enhance the adaptability and robustness of these strategies – which can quote always, quote only on one side of the market or not quote at all – we shift from the commonly used Poisson process to the Hawkes process, which better captures real market dynamics and self-exciting behaviors. We then train and evaluate strategies under volatility levels of 2 and 200. Our findings show that the 4-action MM trained in a low-volatility environment effectively adapts to high-volatility conditions, maintaining stable performance and providing two-sided quotes at least 92% of the time. This indicates that incorporating flexible quoting mechanisms and realistic market simulations significantly enhances the effectiveness of market-making strategies. ...

August 7, 2025 · 2 min · Research Team

Robust Market Making: To Quote, or not To Quote

ArXiv ID: 2508.16588 · Authors: Ziyi Wang, Carmine Ventre, Maria Polukarov · Abstract: Market making is a popular trading strategy, which aims to generate profit from the spread between the quotes posted at either side of the market. It has been shown that training market makers (MMs) with adversarial reinforcement learning allows to overcome the risks due to changing market conditions and to lead to robust performances. Prior work assumes, however, that MMs keep quoting throughout the trading process, but in practice this is not required, even for “registered” MMs (that only need to satisfy quoting ratios defined by the market rules). In this paper, we build on this line of work and enrich the strategy space of the MM by allowing to occasionally not quote or provide single-sided quotes. Towards this end, in addition to the MM agents that provide continuous bid-ask quotes, we have designed two new agents with increasingly richer action spaces. The first has the option to provide bid-ask quotes or refuse to quote. The second has the option to provide bid-ask quotes, refuse to quote, or only provide single-sided ask or bid quotes. We employ a model-driven approach to empirically compare the performance of the continuously quoting MM with the two agents above in various types of adversarial environments. We demonstrate how occasional refusal to provide bid-ask quotes improves returns and/or Sharpe ratios. The quoting ratios of well-trained MMs can basically meet any market requirements, reaching up to 99.9% in some cases. ...

August 7, 2025 · 2 min · Research Team

Comparing Normalization Methods for Portfolio Optimization with Reinforcement Learning

ArXiv ID: 2508.03910 · Authors: Caio de Souza Barbosa Costa, Anna Helena Reali Costa · Abstract: Recently, reinforcement learning has achieved remarkable results in various domains, including robotics, games, natural language processing, and finance. In the financial domain, this approach has been applied to tasks such as portfolio optimization, where an agent continuously adjusts the allocation of assets within a financial portfolio to maximize profit. Numerous studies have introduced new simulation environments, neural network architectures, and training algorithms for this purpose. Among these, a domain-specific policy gradient algorithm has gained significant attention in the research community for being lightweight, fast, and for outperforming other approaches. However, recent studies have shown that this algorithm can yield inconsistent results and underperform, especially when the portfolio does not consist of cryptocurrencies. One possible explanation for this issue is that the commonly used state normalization method may cause the agent to lose critical information about the true value of the assets being traded. This paper explores this hypothesis by evaluating two of the most widely used normalization methods across three different markets (IBOVESPA, NYSE, and cryptocurrencies) and comparing them with the standard practice of normalizing data before training. The results indicate that, in this specific domain, the state normalization can indeed degrade the agent’s performance. ...

August 5, 2025 · 2 min · Research Team

Momentum-integrated Multi-task Stock Recommendation with Converge-based Optimization

ArXiv ID: 2509.10461 · Authors: Hao Wang, Jingshu Peng, Yanyan Shen, Xujia Li, Lei Chen · Abstract: Stock recommendation is critical in Fintech applications, which use price series and alternative information to estimate future stock performance. Although deep learning models are prevalent in stock recommendation systems, traditional time-series forecasting training often fails to capture stock trends and rankings simultaneously, which are essential consideration factors for investors. To tackle this issue, we introduce a Multi-Task Learning (MTL) framework for stock recommendation, Momentum-integrated Multi-task Stock Recommendation with Converge-based Optimization (MiM-StocR). To improve the model’s ability to capture short-term trends, we novelly invoke a momentum line indicator in model training. To prioritize top-performing stocks and optimize investment allocation, we propose a list-wise ranking loss function called Adaptive-k ApproxNDCG. Moreover, due to the volatility and uncertainty of the stock market, existing MTL frameworks face overfitting issues when applied to stock time series. To mitigate this issue, we introduce the Converge-based Quad-Balancing (CQB) method. We conducted extensive experiments on three stock benchmarks: SEE50, CSI 100, and CSI 300. MiM-StocR outperforms state-of-the-art MTL baselines across both ranking and profitable evaluations. ...

August 5, 2025 · 2 min · Research Team

To Bubble or Not to Bubble: Asset Price Dynamics and Optimality in OLG Economies

ArXiv ID: 2508.03230 · Authors: Stefano Bosi, Cuong Le Van, Ngoc-Sang Pham · Abstract: We study an overlapping generations (OLG) exchange economy with an asset that yields dividends. First, we derive general conditions, based on exogenous parameters, that give rise to three distinct scenarios: (1) only bubbleless equilibria exist, (2) a bubbleless equilibrium coexists with a continuum of bubbly equilibria, and (3) all equilibria are bubbly. Under stationary endowments and standard assumptions, we provide a complete characterization of the equilibrium set and the associated asset price dynamics. In this setting, a bubbly equilibrium exists if and only if the interest rate in the economy without the asset is strictly lower than the population growth rate and the sum of per capita dividends is finite. Second, we establish necessary and sufficient conditions for Pareto optimality. Finally, we investigate the relationship between asset price behaviors and the optimality of equilibria. ...

August 5, 2025 · 2 min · Research Team

Markowitz Variance May Vastly Undervalue or Overestimate Portfolio Variance and Risks

ArXiv ID: 2507.21824 · Authors: Victor Olkhov · Abstract: We consider the investor who doesn’t trade shares of his portfolio. The investor only observes the current trades made in the market with his securities to estimate the current return, variance, and risks of his unchanged portfolio. We show how the time series of consecutive trades made in the market with the securities of the portfolio can determine the time series that model the trades with the portfolio as with a single security. That establishes the equal description of the market-based variance of the securities and of the portfolio composed of these securities that account for the fluctuations of the volumes of the consecutive trades. We show that Markowitz’s (1952) variance describes only the approximation when all volumes of the consecutive trades with securities are assumed constant. The market-based variance depends on the coefficient of variation of fluctuations of volumes of trades. To emphasize this dependence and to estimate possible deviation from Markowitz variance, we derive the Taylor series of the market-based variance up to the 2nd term by the coefficient of variation, taking Markowitz variance as a zero approximation. We consider three limiting cases with low and high fluctuations of the portfolio returns, and with a zero covariance of trade values and volumes and show that the impact of the coefficient of variation of trade volume fluctuations can cause Markowitz’s assessment to highly undervalue or overestimate the market-based variance of the portfolio. Incorrect assessments of the variances of securities and of the portfolio cause wrong risk estimates, disturb optimal portfolio selection, and result in unexpected losses. The major investors, portfolio managers, and developers of macroeconomic models like BlackRock, JP Morgan, and the U.S. Fed should use market-based variance to adjust their predictions to the randomness of market trades. ...

July 29, 2025 · 3 min · Research Team

Quantum generative modeling for financial time series with temporal correlations

ArXiv ID: 2507.22035 · Authors: David Dechant, Eliot Schwander, Lucas van Drooge, Charles Moussa, Diego Garlaschelli, Vedran Dunjko, Jordi Tura · Abstract: Quantum generative adversarial networks (QGANs) have been investigated as a method for generating synthetic data with the goal of augmenting training data sets for neural networks. This is especially relevant for financial time series, since we only ever observe one realization of the process, namely the historical evolution of the market, which is further limited by data availability and the age of the market. However, for classical generative adversarial networks it has been shown that generated data may (often) not exhibit desired properties (also called stylized facts), such as matching a certain distribution or showing specific temporal correlations. Here, we investigate whether quantum correlations in quantum inspired models of QGANs can help in the generation of financial time series. We train QGANs, composed of a quantum generator and a classical discriminator, and investigate two approaches for simulating the quantum generator: a full simulation of the quantum circuits, and an approximate simulation using tensor network methods. We tested how the choice of hyperparameters, such as the circuit depth and bond dimensions, influenced the quality of the generated time series. The QGAN that we trained generate synthetic financial time series that not only match the target distribution but also exhibit the desired temporal correlations, with the quality of each property depending on the hyperparameters and simulation method. ...

July 29, 2025 · 2 min · Research Team

Your AI, Not Your View: The Bias of LLMs in Investment Analysis

ArXiv ID: 2507.20957 · Authors: Hoyoung Lee, Junhyuk Seo, Suhwan Park, Junhyeong Lee, Wonbin Ahn, Chanyeol Choi, Alejandro Lopez-Lira, Yongjae Lee · Abstract: In finance, Large Language Models (LLMs) face frequent knowledge conflicts arising from discrepancies between their pre-trained parametric knowledge and real-time market data. These conflicts are especially problematic in real-world investment services, where a model’s inherent biases can misalign with institutional objectives, leading to unreliable recommendations. Despite this risk, the intrinsic investment biases of LLMs remain underexplored. We propose an experimental framework to investigate emergent behaviors in such conflict scenarios, offering a quantitative analysis of bias in LLM-based investment analysis. Using hypothetical scenarios with balanced and imbalanced arguments, we extract the latent biases of models and measure their persistence. Our analysis, centered on sector, size, and momentum, reveals distinct, model-specific biases. Across most models, a tendency to prefer technology stocks, large-cap stocks, and contrarian strategies is observed. These foundational biases often escalate into confirmation bias, causing models to cling to initial judgments even when faced with increasing counter-evidence. A public leaderboard benchmarking bias across a broader set of models is available at https://linqalpha.com/leaderboard ...

July 28, 2025 · 2 min · Research Team

Learning from Expert Factors: Trajectory-level Reward Shaping for Formulaic Alpha Mining

ArXiv ID: 2507.20263 · Authors: Junjie Zhao, Chengxi Zhang, Chenkai Wang, Peng Yang · Abstract: Reinforcement learning (RL) has successfully automated the complex process of mining formulaic alpha factors, for creating interpretable and profitable investment strategies. However, existing methods are hampered by the sparse rewards given the underlying Markov Decision Process. This inefficiency limits the exploration of the vast symbolic search space and destabilizes the training process. To address this, Trajectory-level Reward Shaping (TLRS), a novel reward shaping method, is proposed. TLRS provides dense, intermediate rewards by measuring the subsequence-level similarity between partially generated expressions and a set of expert-designed formulas. Furthermore, a reward centering mechanism is introduced to reduce training variance. Extensive experiments on six major Chinese and U.S. stock indices show that TLRS significantly improves the predictive power of mined factors, boosting the Rank Information Coefficient by 9.29% over existing potential-based shaping algorithms. Notably, TLRS achieves a major leap in computational efficiency by reducing its time complexity with respect to the feature dimension from linear to constant, which is a significant improvement over distance-based baselines. ...

July 27, 2025 · 2 min · Research Team

Technical Indicator Networks (TINs): An Interpretable Neural Architecture Modernizing Classical Technical Analysis for Adaptive Algorithmic Trading

ArXiv ID: 2507.20202 · Authors: Longfei Lu · Abstract: Deep neural networks (DNNs) have transformed fields such as computer vision and natural language processing by employing architectures aligned with domain-specific structural patterns. In algorithmic trading, however, there remains a lack of architectures that directly incorporate the logic of traditional technical indicators. This study introduces Technical Indicator Networks (TINs), a structured neural design that reformulates rule-based financial heuristics into trainable and interpretable modules. The architecture preserves the core mathematical definitions of conventional indicators while extending them to multidimensional data and supporting optimization through diverse learning paradigms, including reinforcement learning. Analytical transformations such as averaging, clipping, and ratio computation are expressed as vectorized layer operators, enabling transparent network construction and principled initialization. This formulation retains the clarity and interpretability of classical strategies while allowing adaptive adjustment and data-driven refinement. As a proof of concept, the framework is validated on the Dow Jones Industrial Average constituents using a Moving Average Convergence Divergence (MACD) TIN. Empirical results demonstrate improved risk-adjusted performance relative to traditional indicator-based strategies. Overall, the findings suggest that TINs provide a generalizable foundation for interpretable, adaptive, and extensible learning architectures in structured decision-making domains and indicate substantial commercial potential for upgrading trading platforms with cross-market visibility and enhanced decision-support capabilities. ...

July 27, 2025 · 2 min · Research Team