Exploiting Risk-Aversion and Size-dependent fees in FX Trading with Fitted Natural Actor-Critic

Exploiting Risk-Aversion and Size-dependent fees in FX Trading with Fitted Natural Actor-Critic ArXiv ID: 2410.23294 “View on arXiv” Authors: Unknown Abstract In recent years, the popularity of artificial intelligence has surged due to its widespread application across many fields. The financial sector has harnessed its advantages for multiple purposes, including the development of automated trading systems designed to interact autonomously with markets. In this work, we focus on recognizing and leveraging intraday price patterns in the Foreign Exchange market, known for its extensive liquidity and flexibility. Our approach implements a Reinforcement Learning algorithm called Fitted Natural Actor-Critic, which trains an agent to trade effectively by means of continuous actions, making it possible to execute orders of variable size. This feature is instrumental in modeling transaction costs realistically, since they typically depend on the order size, and it facilitates the integration of risk-averse approaches that induce the agent to adopt more conservative behavior. The proposed approaches have been empirically validated on EUR-USD historical data. ...
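
To make the size-dependent-fee idea concrete, here is a minimal sketch of a per-step reward that charges a fee growing with order size and penalizes P&L variance as a simple stand-in for risk aversion; the fee schedule, constants, and function names are illustrative assumptions, not the paper's specification.

```python
import numpy as np

def size_dependent_fee(size, fee_rate=2e-5, fixed=0.01):
    # Hypothetical fee schedule: a fixed component plus a cost
    # proportional to the notional traded; the paper's actual
    # schedule will differ.
    return fixed + fee_rate * abs(size)

def step_reward(price_change, position, trade_size, risk_lambda=0.1):
    # PnL from holding `position` over one step, minus the fee for
    # the trade that set it up, minus a variance-style risk penalty
    # (a simple stand-in for the paper's risk-averse objective).
    pnl = position * price_change
    return pnl - size_dependent_fee(trade_size) - risk_lambda * pnl ** 2

# Example: go long 0.5 lots into a 10-pip move
print(step_reward(price_change=0.0010, position=0.5, trade_size=0.5))
```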

October 15, 2024 · 2 min · Research Team

Improving Portfolio Optimization Results with Bandit Networks

Improving Portfolio Optimization Results with Bandit Networks ArXiv ID: 2410.04217 “View on arXiv” Authors: Unknown Abstract In Reinforcement Learning (RL), multi-armed Bandit (MAB) problems have found applications across diverse domains such as recommender systems, healthcare, and finance. Traditional MAB algorithms typically assume stationary reward distributions, which limits their effectiveness in real-world scenarios characterized by non-stationary dynamics. This paper addresses this limitation by introducing and evaluating novel Bandit algorithms designed for non-stationary environments. First, we present the Adaptive Discounted Thompson Sampling (ADTS) algorithm, which enhances adaptability through relaxed discounting and sliding window mechanisms to better respond to changes in reward distributions. We then extend this approach to the Portfolio Optimization problem by introducing the Combinatorial Adaptive Discounted Thompson Sampling (CADTS) algorithm, which addresses computational challenges within Combinatorial Bandits and improves dynamic asset allocation. Additionally, we propose a novel architecture called Bandit Networks, which integrates the outputs of ADTS and CADTS, thereby mitigating computational limitations in stock selection. Through extensive experiments using real financial market data, we demonstrate the potential of these algorithms and architectures in adapting to dynamic environments and optimizing decision-making processes. For instance, the proposed Bandit Network instances outperform classic portfolio optimization approaches such as the capital asset pricing model, equal weights, risk parity, and Markowitz, with the best network presenting an out-of-sample Sharpe Ratio 20% higher than the best performing classical model. ...
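
Discounted Thompson Sampling, the base that ADTS refines, can be sketched in a few lines: old evidence is decayed each round so the Beta posteriors can track a drifting reward rate. The decay rule and constants below are a generic textbook variant, not the paper's exact ADTS update.

```python
import numpy as np

rng = np.random.default_rng(0)

class DiscountedTS:
    """Bernoulli Thompson Sampling with exponential discounting.

    A simplified stand-in for ADTS: past evidence is decayed each
    round so the posterior can follow non-stationary reward rates.
    The paper adds relaxed discounting and sliding windows on top.
    """
    def __init__(self, n_arms, gamma=0.95):
        self.gamma = gamma
        self.alpha = np.ones(n_arms)  # Beta posterior successes
        self.beta = np.ones(n_arms)   # Beta posterior failures

    def select(self):
        return int(np.argmax(rng.beta(self.alpha, self.beta)))

    def update(self, arm, reward):
        # Decay all evidence toward the uniform prior, then add the
        # new observation for the pulled arm.
        self.alpha = 1 + self.gamma * (self.alpha - 1)
        self.beta = 1 + self.gamma * (self.beta - 1)
        self.alpha[arm] += reward
        self.beta[arm] += 1 - reward

bandit = DiscountedTS(n_arms=3)
for t in range(1000):
    p = [0.3, 0.5, 0.7] if t < 500 else [0.7, 0.5, 0.3]  # regime shift
    arm = bandit.select()
    bandit.update(arm, rng.binomial(1, p[arm]))
print(bandit.alpha / (bandit.alpha + bandit.beta))  # tracked arm means
```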

October 5, 2024 · 2 min · Research Team

Robust Reinforcement Learning with Dynamic Distortion Risk Measures

Robust Reinforcement Learning with Dynamic Distortion Risk Measures ArXiv ID: 2409.10096 “View on arXiv” Authors: Unknown Abstract In a reinforcement learning (RL) setting, the agent’s optimal strategy heavily depends on her risk preferences and the underlying model dynamics of the training environment. These two aspects influence the agent’s ability to make well-informed and time-consistent decisions when facing testing environments. In this work, we devise a framework to solve robust risk-aware RL problems where we simultaneously account for environmental uncertainty and risk with a class of dynamic robust distortion risk measures. Robustness is introduced by considering all models within a Wasserstein ball around a reference model. We estimate such dynamic robust risk measures using neural networks by making use of strictly consistent scoring functions, derive policy gradient formulae using the quantile representation of distortion risk measures, and construct an actor-critic algorithm to solve this class of robust risk-aware RL problems. We demonstrate the performance of our algorithm on a portfolio allocation example. ...
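
The quantile representation mentioned in the abstract, rho(L) = ∫ gamma(u) q_L(u) du over u in (0, 1), yields a direct sample estimator. The sketch below instantiates it with CVaR as the distortion; it is the static, non-robust version, without the Wasserstein ball or the neural estimation the paper develops.

```python
import numpy as np

def distortion_risk(losses, gamma):
    """Estimate a distortion risk measure from loss samples via its
    quantile representation: rho(L) = integral of gamma(u) * q_L(u) du.

    `gamma` maps quantile levels u in (0, 1) to nonnegative weights
    that integrate to one. Static, non-robust estimator only.
    """
    q = np.sort(losses)                       # empirical quantile function
    u = (np.arange(len(q)) + 0.5) / len(q)    # midpoint quantile levels
    w = gamma(u)
    return np.sum(w * q) / np.sum(w)

# CVaR at level 0.9 as a distortion: weight only the worst 10% of losses
cvar_gamma = lambda u: (u >= 0.9).astype(float) / 0.1

losses = np.random.default_rng(1).normal(0.0, 1.0, 100_000)
print(distortion_risk(losses, cvar_gamma))  # approx. 1.755 for N(0, 1)
```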

September 16, 2024 · 2 min · Research Team

Deviations from the Nash equilibrium and emergence of tacit collusion in a two-player optimal execution game with reinforcement learning

Deviations from the Nash equilibrium and emergence of tacit collusion in a two-player optimal execution game with reinforcement learning ArXiv ID: 2408.11773 “View on arXiv” Authors: Unknown Abstract The use of reinforcement learning algorithms in financial trading is becoming increasingly prevalent. However, the autonomous nature of these algorithms can lead to unexpected outcomes that deviate from traditional game-theoretical predictions and may even destabilize markets. In this study, we examine a scenario in which two autonomous agents, modeled with Double Deep Q-Learning, learn to liquidate the same asset optimally in the presence of market impact, using the Almgren-Chriss (2000) framework. Our results show that the strategies learned by the agents deviate significantly from the Nash equilibrium of the corresponding market impact game. Notably, the learned strategies exhibit tacit collusion, closely aligning with the Pareto-optimal solution. We further explore how different levels of market volatility influence the agents’ performance and the equilibria they discover, including scenarios where volatility differs between the training and testing phases. ...
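
In the Almgren-Chriss setting the two agents interact only through price impact on their aggregate flow, which is precisely what makes slow, collusive schedules mutually profitable. A minimal cost model follows, with illustrative impact parameters rather than the paper's calibration.

```python
import numpy as np

def execution_cost(trades_a, trades_b, eta=0.01, g=0.005, s0=100.0):
    """Implementation cost for agent A when two agents liquidate the
    same asset under a linear Almgren-Chriss impact model.

    Permanent impact g and temporary impact eta act on the aggregate
    flow, so each agent's schedule degrades the other's prices: the
    channel through which tacit slow-trading becomes profitable.
    """
    total = trades_a + trades_b
    perm = s0 - g * np.cumsum(total)   # price path after permanent impact
    exec_price = perm - eta * total    # temporary impact on each slice
    return np.sum(trades_a * (s0 - exec_price))

n = 10
twap = np.full(n, 100.0 / n)  # both agents liquidate 100 shares via TWAP
print(execution_cost(twap, twap))
```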

August 21, 2024 · 2 min · Research Team

Optimizing Portfolio with Two-Sided Transactions and Lending: A Reinforcement Learning Framework

Optimizing Portfolio with Two-Sided Transactions and Lending: A Reinforcement Learning Framework ArXiv ID: 2408.05382 “View on arXiv” Authors: Unknown Abstract This study presents a Reinforcement Learning (RL)-based portfolio management model tailored for high-risk environments, addressing the limitations of traditional RL models and exploiting market opportunities through two-sided transactions and lending. Our approach integrates a new environmental formulation with a Profit and Loss (PnL)-based reward function, enhancing the RL agent’s ability to manage downside risk and optimize capital. We implemented the model using the Soft Actor-Critic (SAC) agent with a Convolutional Neural Network with Multi-Head Attention (CNN-MHA). This setup effectively manages a diversified portfolio of 12 crypto assets in the Binance perpetual futures market, leveraging USDT for both granting and receiving loans and rebalancing every 4 hours using market data from the preceding 48 hours. Tested over two 16-month periods of varying market volatility, the model significantly outperformed benchmarks, particularly in high-volatility scenarios, achieving higher return-to-risk ratios and demonstrating robust profitability. These results confirm the model’s effectiveness in leveraging market dynamics and managing risks in volatile environments like the cryptocurrency market. ...
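
A rough sketch of a PnL-style reward with two-sided positions and a USDT lending/borrowing carry term; the rates and reward shape are assumptions for illustration, and the paper's formulation (downside-risk handling, 4-hour rebalancing on 48 hours of features) is considerably richer.

```python
import numpy as np

def pnl_reward(weights, returns, lend_rate=0.0001, borrow_rate=0.0002):
    """One-step PnL-style reward for a two-sided perpetual-futures book.

    `weights` may be negative (shorts); unused USDT earns interest and
    borrowed USDT pays it. Rates and the reward shape are assumed
    purely for illustration.
    """
    gross = np.abs(weights).sum()
    cash = 1.0 - gross                  # > 0: lent out, < 0: borrowed
    carry = cash * (lend_rate if cash > 0 else borrow_rate)
    return float(weights @ returns + carry)

w = np.array([0.4, -0.3, 0.2])          # long/short across 3 assets
r = np.array([0.01, -0.02, 0.005])      # next-period returns
print(pnl_reward(w, r))
```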

August 9, 2024 · 2 min · Research Team

Enhancing Deep Hedging of Options with Implied Volatility Surface Feedback Information

Enhancing Deep Hedging of Options with Implied Volatility Surface Feedback Information ArXiv ID: 2407.21138 “View on arXiv” Authors: Unknown Abstract We present a dynamic hedging scheme for S&P 500 options, where rebalancing decisions are enhanced by integrating information about the implied volatility surface dynamics. The optimal hedging strategy is obtained through a deep policy gradient-type reinforcement learning algorithm. Including the forward-looking information embedded in the volatility surface allows our procedure to outperform several conventional benchmarks, such as practitioner and smile-implied delta hedging procedures, both in simulation and backtesting experiments. The outperformance is more pronounced in the presence of transaction costs. ...
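
The mechanism is easy to picture: the hedging policy simply consumes extra state features extracted from the implied volatility surface. Below is a toy linear stand-in for the deep policy, with assumed feature choices; the paper's network and training loop are of course far more elaborate.

```python
import numpy as np

def hedge_policy(state, theta):
    """Tiny linear stand-in for a deep hedging policy: map the hedging
    state, augmented with implied-volatility-surface features (e.g. ATM
    level, skew, term-structure slope), to a hedge ratio in [0, 1].
    Feature choice and the squashing are our assumptions.
    """
    z = theta @ state
    return 1.0 / (1.0 + np.exp(-z))  # sigmoid keeps the delta bounded

# state = [moneyness, time-to-maturity, current position,
#          ATM implied vol, 25-delta skew, term-structure slope]
state = np.array([1.0, 0.25, 0.5, 0.18, -0.02, 0.01])
theta = np.zeros(6)  # untrained; gradients of the hedging P&L train it
print(hedge_policy(state, theta))  # 0.5 before any training
```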

July 30, 2024 · 2 min · Research Team

Advanced Financial Fraud Detection Using GNN-CL Model

Advanced Financial Fraud Detection Using GNN-CL Model ArXiv ID: 2407.06529 “View on arXiv” Authors: Unknown Abstract The innovative GNN-CL model proposed in this paper marks a breakthrough in financial fraud detection by synergistically combining the advantages of graph neural networks (GNN), convolutional neural networks (CNN), and long short-term memory (LSTM) networks. This convergence enables multifaceted analysis of complex transaction patterns, improving detection accuracy and resilience against sophisticated fraudulent activities. A key novelty of this paper is the use of multilayer perceptrons (MLPs) to estimate node similarity, effectively filtering out neighborhood noise that can lead to false positives. This purification mechanism ensures that only the most relevant information is considered, improving the model’s understanding of the network structure. To address feature weakening, the dilution of key signals that often plagues graph-based models, GNN-CL adopts reinforcement learning strategies: by dynamically adjusting the weights assigned to central nodes, it reinforces the importance of these influential entities and retains important clues of fraud even in less informative data. Experimental evaluations on Yelp datasets highlight the superior performance of GNN-CL compared to existing methods. ...
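
The neighborhood purification step can be sketched generically: score each neighbor's similarity to the center node with a small MLP and keep only high scorers before message passing. The MLP below is random and untrained, purely to show the data flow; the threshold and architecture are assumptions, not the paper's.

```python
import numpy as np

def filter_neighbors(h_center, h_neighbors, mlp, tau=0.5):
    """Neighborhood purification in the spirit of GNN-CL: score each
    neighbor's similarity to the center node with a small MLP and keep
    only those above a threshold before aggregating messages.
    """
    pairs = np.concatenate(
        [np.repeat(h_center[None, :], len(h_neighbors), axis=0),
         h_neighbors],
        axis=1,
    )
    scores = mlp(pairs)                 # similarity scores in (0, 1)
    return h_neighbors[scores > tau]

rng = np.random.default_rng(2)
W1, W2 = rng.normal(size=(16, 8)), rng.normal(size=(8,))
# One-hidden-layer MLP with ReLU and a sigmoid output, random weights.
mlp = lambda x: 1 / (1 + np.exp(-(np.maximum(x @ W1, 0) @ W2)))

center, neigh = rng.normal(size=8), rng.normal(size=(5, 8))
print(filter_neighbors(center, neigh, mlp).shape)  # surviving neighbors
```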

July 9, 2024 · 2 min · Research Team

AlphaForge: A Framework to Mine and Dynamically Combine Formulaic Alpha Factors

AlphaForge: A Framework to Mine and Dynamically Combine Formulaic Alpha Factors ArXiv ID: 2406.18394 “View on arXiv” Authors: Unknown Abstract The complexity of financial data, characterized by its variability and low signal-to-noise ratio, necessitates advanced methods in quantitative investment that prioritize both performance and interpretability. Transitioning from early manual extraction to genetic programming, the most advanced approach in the alpha factor mining domain currently employs reinforcement learning to mine a set of combination factors with fixed weights. However, the performance of the resultant alpha factors is inconsistent, and the inflexibility of fixed factor weights proves insufficient in adapting to the dynamic nature of financial markets. To address this issue, this paper proposes AlphaForge, a two-stage formulaic alpha generating framework for alpha factor mining and factor combination. The framework employs a generative-predictive neural network to generate factors, leveraging the robust spatial exploration capabilities inherent in deep learning while preserving diversity. The combination model within the framework incorporates the temporal performance of factors for selection and dynamically adjusts the weights assigned to each component alpha factor. Experiments conducted on real-world datasets demonstrate that our proposed model outperforms contemporary benchmarks in formulaic alpha factor mining. Furthermore, our model exhibits a notable enhancement in portfolio returns within the realm of quantitative investment and real-money investment. ...
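
The dynamic combination stage can be approximated with a common simplification: weight each mined factor by its rolling information coefficient (rank correlation with next-period returns). This IC-proportional rule is our stand-in, not the paper's exact combination model.

```python
import numpy as np
from scipy.stats import spearmanr

def dynamic_weights(factor_values, future_returns, window=60):
    """Combine alpha factors with weights that track recent performance:
    each factor is weighted by its rolling information coefficient.
    Factors with negative recent IC are dropped (clipped to zero).
    """
    ics = np.array([
        spearmanr(f[-window:], future_returns[-window:])[0]
        for f in factor_values
    ])
    ics = np.nan_to_num(ics)
    w = np.clip(ics, 0, None)
    return w / w.sum() if w.sum() > 0 else np.full(len(w), 1 / len(w))

rng = np.random.default_rng(3)
rets = rng.normal(size=250)
factors = [rets + rng.normal(size=250),   # informative factor
           rng.normal(size=250)]          # pure noise factor
print(dynamic_weights(factors, rets))     # weight concentrates on the first
```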

June 26, 2024 · 2 min · Research Team

Reinforcement Learning for Corporate Bond Trading: A Sell Side Perspective

Reinforcement Learning for Corporate Bond Trading: A Sell Side Perspective ArXiv ID: 2406.12983 “View on arXiv” Authors: Unknown Abstract A corporate bond trader in a typical sell side institution such as a bank provides liquidity to the market participants by buying/selling securities and maintaining an inventory. Upon receiving a request for a buy/sell price quote (RFQ), the trader provides a quote by adding a spread over a “prevalent market price”. For illiquid bonds, the market price is harder to observe, and traders often resort to available benchmark bond prices (such as MarketAxess, Bloomberg, etc.). In Bergault et al. (2023), the concept of a “Fair Transfer Price” for an illiquid corporate bond was introduced, derived from an infinite horizon stochastic optimal control problem (maximizing the trader’s expected P&L, regularized by the quadratic variation). In this paper, we consider the same optimization objective; however, we approach the estimation of an optimal bid-ask spread quoting strategy in a data-driven manner and show that it can be learned using Reinforcement Learning. Furthermore, we perform extensive outcome analysis to examine the reasonableness of the trained agent’s behavior. ...
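
A stylized RFQ loop makes the objective tangible: quote a spread, get filled with a probability decaying in the spread, and pay a quadratic inventory penalty (the regularizer in the cited objective). The fill model and all constants below are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)

def rfq_step(spread, inventory, k=5.0, phi=0.01):
    """One RFQ interaction in a stylized dealer loop: the dealer quotes
    `spread` over the reference price, the client accepts with a
    probability that decays exponentially in the spread, and the reward
    is the earned spread minus a quadratic inventory penalty.
    """
    side = rng.choice([-1, 1])                 # client sell (+1) or buy (-1)
    filled = rng.random() < np.exp(-k * spread)
    if filled:
        inventory += side
    reward = (spread if filled else 0.0) - phi * inventory ** 2
    return reward, inventory

inv, total = 0, 0.0
for _ in range(100):
    r, inv = rfq_step(spread=0.2, inventory=inv)
    total += r
print(total, inv)  # cumulative reward and terminal inventory
```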

June 18, 2024 · 2 min · Research Team

MOT: A Mixture of Actors Reinforcement Learning Method by Optimal Transport for Algorithmic Trading

MOT: A Mixture of Actors Reinforcement Learning Method by Optimal Transport for Algorithmic Trading ArXiv ID: 2407.01577 “View on arXiv” Authors: Unknown Abstract Algorithmic trading refers to executing buy and sell orders for specific assets based on automatically identified trading opportunities. Strategies based on reinforcement learning (RL) have demonstrated remarkable capabilities in addressing algorithmic trading problems. However, trading patterns differ across market conditions due to shifts in the data distribution, and ignoring these multiple patterns undermines the performance of RL. In this paper, we propose MOT, which designs multiple actors with disentangled representation learning to model the different patterns of the market. Furthermore, we incorporate the Optimal Transport (OT) algorithm to allocate samples to the appropriate actor by introducing a regularization loss term. Additionally, we propose a Pretrain Module that facilitates imitation learning by aligning the outputs of the actors with an expert strategy, better balancing exploration and exploitation in RL. Experimental results on real futures market data demonstrate that MOT exhibits excellent profit capabilities while balancing risks. Ablation studies validate the effectiveness of the components of MOT. ...
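
The OT allocation step can be sketched with a generic entropic (Sinkhorn) solver: a cost matrix between samples and actors is turned into a balanced soft assignment so that no actor starves. This is a standard Sinkhorn iteration under assumed marginals, not the paper's exact regularization term.

```python
import numpy as np

def sinkhorn_assign(costs, n_iters=50, eps=0.1):
    """Soft-assign samples to actors with entropic optimal transport:
    rows are samples, columns are actors, and uniform marginals force
    a balanced split so no actor collapses to zero usage.
    """
    n, m = costs.shape
    K = np.exp(-costs / eps)            # Gibbs kernel of the cost matrix
    u, v = np.ones(n) / n, np.ones(m) / m
    for _ in range(n_iters):
        u = (1 / n) / (K @ v)           # match row marginals (samples)
        v = (1 / m) / (K.T @ u)         # match column marginals (actors)
    return u[:, None] * K * v[None, :]  # transport plan

rng = np.random.default_rng(5)
# cost = disagreement between each sample and each actor (illustrative)
costs = rng.random((8, 3))
plan = sinkhorn_assign(costs)
print(plan.sum(axis=1))                 # each sample carries mass 1/8
```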

June 3, 2024 · 2 min · Research Team