Multi-Objective Bayesian Optimization of Deep Reinforcement Learning for Environmental, Social, and Governance (ESG) Financial Portfolio Management

ArXiv ID: 2512.14992. Authors: M. Coronado-Vaca. Abstract: DRL agents circumvent a limitation of classic models: they do not assume, for example, that financial returns are normally distributed, and they can handle any information, such as the ESG score, if they are configured to earn a reward that improves an objective. However, the performance of DRL agents has high variability and is very sensitive to the value of their hyperparameters. Bayesian optimization is a class of methods suited to the optimization of black-box functions, that is, functions whose analytical expression is unknown and that are noisy and expensive to evaluate. The hyperparameter tuning problem of DRL algorithms perfectly suits this scenario. As training an agent for even one objective is a very expensive process, requiring millions of timesteps, instead of optimizing a single objective that mixes a risk-performance metric and an ESG metric, we choose to separate the objectives and solve the multi-objective scenario to obtain an optimal Pareto set of portfolios representing the best trade-off between the Sharpe ratio and the mean ESG score of the portfolio, leaving the choice of the final portfolio to the investor. We conducted our experiments using environments encoded within the OpenAI Gym, adapted from the FinRL platform. The experiments are carried out in the Dow Jones Industrial Average (DJIA) and the NASDAQ markets in terms of the Sharpe ratio achieved by the agent and the mean ESG score of the portfolio. We compare the performance of the obtained Pareto sets in hypervolume terms, illustrating how portfolios are the best trade-off between the Sharpe ratio and mean ESG score. Also, we show the usefulness of our proposed methodology by comparing the obtained hypervolume with one achieved by a Random Search methodology on the DRL hyperparameter space. ...
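
The summary above compares Pareto sets of (Sharpe ratio, mean ESG score) pairs by their hypervolume. As a rough illustration of what that comparison involves, the sketch below extracts the non-dominated points from a list of two-objective results and computes the area they dominate relative to a reference point. It is not the paper's implementation: the function names, the reference point, and the sample values are hypothetical, and both objectives are assumed to be maximized.

```python
from typing import List, Tuple

# Hypothetical representation: (sharpe_ratio, mean_esg_score), both maximized.
Point = Tuple[float, float]


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the non-dominated points, sorted by the first objective descending."""
    front: List[Point] = []
    best_second = float("-inf")
    # Visit points from the highest first objective downwards; a point is kept
    # only if it strictly improves the second objective over every point seen so far.
    for p in sorted(set(points), key=lambda q: (-q[0], -q[1])):
        if p[1] > best_second:
            front.append(p)
            best_second = p[1]
    return front


def hypervolume_2d(points: List[Point], ref: Point) -> float:
    """Area dominated by the Pareto front and bounded below by the reference point."""
    # Points that do not strictly beat the reference point contribute no area.
    front = [p for p in pareto_front(points) if p[0] > ref[0] and p[1] > ref[1]]
    area = 0.0
    prev_second = ref[1]
    # Along the front the first objective decreases while the second increases,
    # so each point adds one horizontal strip on top of the previous ones.
    for first, second in front:
        area += (first - ref[0]) * (second - prev_second)
        prev_second = second
    return area


# Illustrative values only, not results from the paper.
candidates = [(3.0, 1.0), (2.0, 2.0), (1.0, 3.0), (1.0, 1.0)]
front = pareto_front(candidates)
volume = hypervolume_2d(candidates, ref=(0.0, 0.0))
```

Under these assumptions, a larger hypervolume for one tuning method than for another (for example, Bayesian optimization versus random search) means its set of portfolios dominates a larger region of the objective space measured from the same reference point.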

December 17, 2025 · 3 min · Research Team

FX Market Making with Internal Liquidity

ArXiv ID: 2512.04603. Authors: Alexander Barzykin, Robert Boyce, Eyal Neuman. Abstract: As the FX markets continue to evolve, many institutions have started offering passive access to their internal liquidity pools. Market makers act as principal and have the opportunity to fill those orders as part of their risk management, or they may choose to adjust pricing to their external OTC franchise to facilitate the matching flow. It is, a priori, unclear how the strategies managing internal liquidity should depend on market conditions, the market maker’s risk appetite, and the placement algorithms deployed by participating clients. The market maker’s actions in the presence of passive orders are relevant not only for their own objectives, but also for those liquidity providers who have certain expectations of the execution speed. In this work, we investigate the optimal multi-objective strategy of a market maker with an option to take liquidity on an internal exchange, and draw important qualitative insights for real-world trading. ...

December 4, 2025 · 2 min · Research Team

NEAT Algorithm-based Stock Trading Strategy with Multiple Technical Indicators Resonance

ArXiv ID: 2501.14736. Authors: Unknown. Abstract: In this study, we applied the NEAT (NeuroEvolution of Augmenting Topologies) algorithm to stock trading using multiple technical indicators. Our approach focused on maximizing earnings, avoiding risk, and outperforming the Buy & Hold strategy. We used progressive training data and a multi-objective fitness function to guide the evolution of the population towards these objectives. The results of our study showed that the NEAT model achieved similar returns to the Buy & Hold strategy, but with lower risk exposure and greater stability. We also identified some challenges in the training process, including the presence of a large number of unused nodes and connections in the model architecture. In future work, it may be worthwhile to explore ways to improve the NEAT algorithm and apply it to shorter interval data in order to assess the potential impact on performance. ...

December 11, 2024 · 2 min · Research Team

MILLION: A General Multi-Objective Framework with Controllable Risk for Portfolio Management

ArXiv ID: 2412.03038. Authors: Unknown. Abstract: Portfolio management is an important yet challenging task in AI for FinTech, which aims to allocate investors’ budgets among different assets to balance the risk and return of an investment. In this study, we propose a general Multi-objectIve framework with controLLable rIsk for pOrtfolio maNagement (MILLION), which consists of two main phases, i.e., return-related maximization and risk control. Specifically, in the return-related maximization phase, we introduce two auxiliary objectives, i.e., return rate prediction and return rate ranking, combined with portfolio optimization to mitigate the overfitting problem and improve the generalization of the trained model to future markets. Subsequently, in the risk control phase, we propose two methods, i.e., portfolio interpolation and portfolio improvement, to achieve fine-grained risk control and fast risk adaptation to a user-specified risk level. For the portfolio interpolation method, we theoretically prove that the risk can be perfectly controlled if the to-be-set risk level is in a proper interval. In addition, we also show that the return rate of the adjusted portfolio after portfolio interpolation is no less than that of the min-variance optimization, as long as the model in the reward maximization phase is effective. Furthermore, the portfolio improvement method can achieve greater return rates while keeping the same risk level compared to portfolio interpolation. Extensive experiments are conducted on three real-world datasets. The results demonstrate the effectiveness and efficiency of the proposed framework. ...

December 4, 2024 · 2 min · Research Team

Risk management in multi-objective portfolio optimization under uncertainty

ArXiv ID: 2407.19936. Authors: Unknown. Abstract: In portfolio optimization, decision makers face difficulties from uncertainties inherent in real-world scenarios. These uncertainties significantly influence portfolio outcomes in both classical and multi-objective Markowitz models. To address these challenges, our research explores the power of robust multi-objective optimization. Since portfolio managers frequently measure their solutions against benchmarks, we enhance the multi-objective min-regret robustness concept by incorporating these benchmark comparisons. This approach bridges the gap between theoretical models and real-world investment scenarios, offering portfolio managers more reliable and adaptable strategies for navigating market uncertainties. Our framework provides a more nuanced and practical approach to portfolio optimization under real-world conditions. ...

July 29, 2024 · 2 min · Research Team

Vector-valued robust stochastic control

ArXiv ID: 2407.00266. Authors: Unknown. Abstract: We study a dynamic stochastic control problem subject to Knightian uncertainty with multi-objective (vector-valued) criteria. Assuming the preferences across expected multi-loss vectors are represented by a given, yet general, preorder, we address the model uncertainty by adopting a robust or minimax perspective, minimizing expected loss across the worst-case model. For loss functions taking real (or scalar) values, there is no ambiguity in interpreting supremum and infimum. In contrast to the scalar case, major challenges for multi-loss control problems include properly defining and interpreting the notions of supremum and infimum, and addressing the non-uniqueness of these suprema and infima. To deal with these, we employ the notion of an ideal point vector-valued supremum for the robust part of the problem, while we view the control part as a multi-objective (or vector) optimization problem. Using a set-valued framework, we derive both a weak and strong version of the dynamic programming principle (DPP) or Bellman equations by taking the value function as the collection of all worst expected losses across all feasible actions. The weak version of Bellman’s principle is proved under minimal assumptions. To establish a stronger version of DPP, we introduce the rectangularity property with respect to a general preorder. We also further study a particular, but important, case of component-wise partial order of vectors, for which we additionally derive DPP under a different set-valued notion for the value function, the so-called upper image of the multi-objective problem. Finally, we provide illustrative examples motivated by financial problems. These results will serve as a foundation for addressing time-inconsistent problems subject to model uncertainty through the lens of a set-valued framework, as well as for studying multi-portfolio allocation problems under model uncertainty. ...

June 29, 2024 · 2 min · Research Team