
Smart Predict–then–Optimize Paradigm for Portfolio Optimization in Real Markets

ArXiv ID: 2601.04062 · Authors: Wang Yi, Takashi Hasuike

Abstract: Improvements in return forecast accuracy do not always lead to proportional improvements in portfolio decision quality, especially under realistic trading frictions and constraints. This paper adopts the Smart Predict–then–Optimize (SPO) paradigm for portfolio optimization in real markets, which explicitly aligns the learning objective with downstream portfolio decision quality rather than pointwise prediction accuracy. Within this paradigm, predictive models are trained using an SPO-based surrogate loss that directly reflects the performance of the resulting investment decisions. To preserve interpretability and robustness, we employ linear predictors built on return-based and technical-indicator features and integrate them with portfolio optimization models that incorporate transaction costs, turnover control, and regularization. We evaluate the proposed approach on U.S. ETF data (2015–2025) using a rolling-window backtest with monthly rebalancing. Empirical results show that decision-focused training consistently improves risk-adjusted performance over predict–then–optimize baselines and classical optimization benchmarks, and yields strong robustness during adverse market regimes (e.g., the 2020 COVID-19 period). These findings highlight the practical value of the Smart Predict–then–Optimize paradigm for portfolio optimization in realistic and non-stationary financial environments. ...
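To make the idea of a decision-focused surrogate loss concrete, here is a minimal sketch of the SPO+ loss for a deliberately simplified portfolio problem. It assumes a long-only, fully invested portfolio (the feasible set is the simplex) with no transaction costs or turnover control, so every inner optimization is solved by putting all weight on a single asset. The function name `spo_plus_loss` and this simplified feasible set are illustrative assumptions, not the paper's actual model.

```python
import numpy as np


def spo_plus_loss(pred_returns, true_returns):
    """SPO+ surrogate loss on the simplex (illustrative assumption).

    The portfolio problem "maximise return" is written as "minimise cost"
    with cost vector c = -returns. On the simplex every linear objective
    is optimised at a vertex, so no LP solver is needed.
    """
    c_hat = -np.asarray(pred_returns, dtype=float)  # predicted costs
    c = -np.asarray(true_returns, dtype=float)      # realised costs

    best = int(np.argmin(c))  # vertex chosen with perfect foresight, w*(c)
    z_star = c[best]          # optimal realised cost, z*(c)

    # max over the simplex of (c - 2*c_hat)^T w is the largest coordinate
    support = np.max(c - 2.0 * c_hat)

    # SPO+ = max_w {(c - 2 c_hat)^T w} + 2 c_hat^T w*(c) - z*(c)
    return float(support + 2.0 * c_hat[best] - z_star)
```

The loss is zero when the prediction equals the realised returns and grows when the predicted ranking would steer the optimizer toward a worse asset, which is the property that distinguishes it from a pointwise error such as mean squared error.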

January 7, 2026 · 2 min · Research Team

Deep Hedging with Reinforcement Learning: A Practical Framework for Option Risk Management

ArXiv ID: 2512.12420 · Authors: Travon Lucius, Christian Koch, Jacob Starling, Julia Zhu, Miguel Urena, Carrie Hu

Abstract: We present a reinforcement-learning (RL) framework for dynamic hedging of equity index option exposures under realistic transaction costs and position limits. We hedge a normalized option-implied equity exposure (one unit of underlying delta, offset via SPY) by trading the underlying index ETF, using the option surface and macro variables only as state information and not as a direct pricing engine. Building on the “deep hedging” paradigm of Buehler et al. (2019), we design a leak-free environment, a cost-aware reward function, and a lightweight stochastic actor-critic agent trained on daily end-of-day panel data constructed from SPX/SPY implied volatility term structure, skew, realized volatility, and macro rate context. On a fixed train/validation/test split, the learned policy improves risk-adjusted performance versus no-hedge, momentum, and volatility-targeting baselines (higher point-estimate Sharpe); only the GAE policy’s test-sample Sharpe is statistically distinguishable from zero, although confidence intervals overlap with a long-SPY benchmark so we stop short of claiming formal dominance. Turnover remains controlled and the policy is robust to doubled transaction costs. The modular codebase, comprising a data pipeline, simulator, and training scripts, is engineered for extensibility to multi-asset overlays, alternative objectives (e.g., drawdown or CVaR), and intraday data. From a portfolio management perspective, the learned overlay is designed to sit on top of an existing SPX or SPY allocation, improving the portfolio’s mean-variance trade-off with controlled turnover and drawdowns. We discuss practical implications for portfolio overlays and outline avenues for future work. ...

December 13, 2025 · 2 min · Research Team

Goal-based portfolio selection with fixed transaction costs

ArXiv ID: 2510.21650 · Authors: Erhan Bayraktar, Bingyan Han, Jingjie Zhang

Abstract: We study a goal-based portfolio selection problem in which an investor aims to meet multiple financial goals, each with a specific deadline and target amount. Trading the stock incurs a strictly positive transaction cost. Using the stochastic Perron’s method, we show that the value function is the unique viscosity solution to a system of quasi-variational inequalities. The existence of an optimal trading strategy and goal funding scheme is established. Numerical results reveal complex optimal trading regions and show that the optimal investment strategy differs substantially from the V-shaped strategy observed in the frictionless case. ...

October 24, 2025 · 2 min · Research Team

FR-LUX: Friction-Aware, Regime-Conditioned Policy Optimization for Implementable Portfolio Management

ArXiv ID: 2510.02986 · Authors: Jian’an Zhang

Abstract: Transaction costs and regime shifts are major reasons why paper portfolios fail in live trading. We introduce FR-LUX (Friction-aware, Regime-conditioned Learning under eXecution costs), a reinforcement learning framework that learns after-cost trading policies and remains robust across volatility-liquidity regimes. FR-LUX integrates three ingredients: (i) a microstructure-consistent execution model combining proportional and impact costs, directly embedded in the reward; (ii) a trade-space trust region that constrains changes in inventory flow rather than logits, yielding stable low-turnover updates; and (iii) explicit regime conditioning so the policy specializes to LL/LH/HL/HH states without fragmenting the data. On a 4 x 5 grid of regimes and cost levels with multiple random seeds, FR-LUX achieves the top average Sharpe ratio with narrow bootstrap confidence intervals, maintains a flatter cost-performance slope than strong baselines, and attains superior risk-return efficiency for a given turnover budget. Pairwise scenario-level improvements are strictly positive and remain statistically significant after multiple-testing corrections. We provide formal guarantees on optimality under convex frictions, monotonic improvement under a KL trust region, long-run turnover bounds and induced inaction bands due to proportional costs, positive value advantage for regime-conditioned policies, and robustness to cost misspecification. The methodology is implementable: costs are calibrated from standard liquidity proxies, scenario-level inference avoids pseudo-replication, and all figures and tables are reproducible from released artifacts. ...
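The inaction band that proportional costs induce can be illustrated with a one-period toy problem. The sketch below assumes a quadratic penalty for deviating from a frictionless target position plus a proportional cost on the trade. This is a textbook simplification chosen for illustration, not the FR-LUX model or its formal result, and the names `one_period_trade`, `cost_rate`, and `risk_penalty` are hypothetical.

```python
def one_period_trade(current, target, cost_rate, risk_penalty):
    """Position minimising a one-period cost-aware objective (toy sketch).

    Objective over the new position x (an assumed, simplified form):
        0.5 * risk_penalty * (x - target)**2 + cost_rate * abs(x - current)

    The minimiser is a soft-threshold: do nothing while the gap to the
    target is inside the band cost_rate / risk_penalty, otherwise trade
    only up to the edge of the band around the target.
    """
    gap = target - current
    band = cost_rate / risk_penalty  # half-width of the no-trade region

    if abs(gap) <= band:
        return current  # inside the band: trading costs more than it saves

    # Outside the band: stop short of the target by exactly the band width
    return target - band if gap > 0 else target + band
```

With `cost_rate=0.001` and `risk_penalty=0.02` the band half-width is 0.05, so a position of 0.10 is left alone when the target is 0.12 but is moved to 0.25 when the target is 0.30.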

October 3, 2025 · 2 min · Research Team

Functionally Generated Portfolios Under Stochastic Transaction Costs: Theory and Empirical Evidence

ArXiv ID: 2507.09196 · Authors: Nader Karimi, Erfan Salavati

Abstract: Assuming frictionless trading, classical stochastic portfolio theory (SPT) provides relative arbitrage strategies. However, the costs associated with real-world execution are state-dependent, volatile, and under increasing stress during liquidity shocks. Using an Ito diffusion that may be connected with asset prices, we extend SPT to a continuous-time equity market with proportional, stochastic transaction costs. We derive closed-form lower bounds on cost-adjusted relative wealth for a large class of functionally generated portfolios; these bounds provide sufficient conditions for relative arbitrage to survive random costs. A limit-order-book cost proxy in conjunction with a Milstein scheme validates the theoretical order-of-magnitude estimates. Finally, we use intraday bid-ask spreads as a stand-in for cost volatility in a back-test of CRSP small-cap data (1994–2024). Despite experiencing larger declines during the 2008 and 2020 liquidity crises, diversity- and entropy-weighted portfolios continue to beat the value-weighted benchmark by 3.6 and 2.9 percentage points annually, respectively, after cost deduction. ...
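For readers unfamiliar with the two functionally generated portfolios named in the abstract, the following sketch computes their weights from market capitalisations using the standard definitions from stochastic portfolio theory. The diversity exponent `p=0.5` and the function names are illustrative assumptions; the abstract does not state which parameters the back-test uses, and the sketch ignores transaction costs entirely.

```python
import numpy as np


def market_weights(caps):
    """Value weights mu_i = cap_i / sum(caps); caps must be positive."""
    caps = np.asarray(caps, dtype=float)
    return caps / caps.sum()


def diversity_weights(caps, p=0.5):
    """Diversity-weighted portfolio: pi_i proportional to mu_i**p, 0 < p <= 1."""
    powered = market_weights(caps) ** p
    return powered / powered.sum()


def entropy_weights(caps):
    """Entropy-weighted portfolio: pi_i proportional to -mu_i * log(mu_i).

    Needs at least two assets, otherwise the entropy is zero.
    """
    mu = market_weights(caps)
    terms = -mu * np.log(mu)
    return terms / terms.sum()
```

Both rules shift weight from the largest names toward smaller ones relative to the value-weighted benchmark, and `diversity_weights(caps, p=1.0)` recovers the benchmark itself.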

July 12, 2025 · 2 min · Research Team

Model-Free Deep Hedging with Transaction Costs and Light Data Requirements

ArXiv ID: 2505.22836 · Authors: Pierre Brugière, Gabriel Turinici

Abstract: Option pricing theory, such as the Black and Scholes (1973) model, provides an explicit solution to construct a strategy that perfectly hedges an option in a continuous-time setting. In practice, however, trading occurs in discrete time and often involves transaction costs, making the direct application of continuous-time solutions potentially suboptimal. Previous studies, such as those by Buehler et al. (2018), Buehler et al. (2019) and Cao et al. (2019), have shown that deep learning or reinforcement learning can be used to derive better hedging strategies than those based on continuous-time models. However, these approaches typically rely on a large number of trajectories (of the order of $10^5$ or $10^6$) to train the model. In this work, we show that using as few as 256 trajectories is sufficient to train a neural network that significantly outperforms, in the Geometric Brownian Motion framework, both the classical Black & Scholes formula and the Leland model, which is arguably one of the most effective explicit alternatives for incorporating transaction costs. The ability to train neural networks with such a small number of trajectories suggests the potential for more practical and simple implementation on real-time financial series. ...
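As background for the Leland benchmark mentioned in the abstract, here is a minimal sketch of the Leland volatility adjustment and the resulting call delta. It assumes the usual convention for hedging a short option position, where `round_trip_cost` is the proportional round-trip transaction cost and `dt` is the rebalancing interval in years. The function names and parameter choices are illustrative, and the paper's exact benchmark setup may differ.

```python
import math


def leland_volatility(sigma, round_trip_cost, dt):
    """Leland-adjusted volatility for hedging a short option position.

    sigma_L^2 = sigma^2 * (1 + sqrt(2/pi) * k / (sigma * sqrt(dt)))
    with k the round-trip proportional cost (assumed convention).
    """
    adjustment = math.sqrt(2.0 / math.pi) * round_trip_cost / (sigma * math.sqrt(dt))
    return sigma * math.sqrt(1.0 + adjustment)


def bs_call_delta(spot, strike, rate, sigma, maturity):
    """Black-Scholes delta of a European call, N(d1)."""
    d1 = (math.log(spot / strike) + (rate + 0.5 * sigma ** 2) * maturity) / (
        sigma * math.sqrt(maturity)
    )
    return 0.5 * (1.0 + math.erf(d1 / math.sqrt(2.0)))


# Leland hedge ratio: the Black-Scholes delta evaluated at the adjusted volatility
sigma_l = leland_volatility(sigma=0.2, round_trip_cost=0.01, dt=1.0 / 252.0)
leland_delta = bs_call_delta(spot=100.0, strike=100.0, rate=0.0, sigma=sigma_l, maturity=0.25)
```

The adjusted volatility grows as rebalancing becomes more frequent or costs rise, which is how the Leland approach prices in the expected cost of the hedging trades.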

May 28, 2025 · 2 min · Research Team

Consumption-portfolio choice with preferences for liquid assets

ArXiv ID: 2503.02697 · Authors: Unknown

Abstract: This paper investigates an infinite horizon, discounted, consumption-portfolio problem in a market with one bond, one liquid risky asset, and one illiquid risky asset with proportional transaction costs. We consider an agent with liquidity preference, modeled by a Cobb-Douglas utility function that includes the liquid wealth. We analyze the properties of the value function and divide the solvency region into three regions: the buying region, the no-trading region, and the selling region, and prove that all three regions are non-empty. We mathematically characterize and numerically solve the optimal policy and prove its optimality. Our numerical analysis sheds light on the impact of various parameters on the optimal policy, and some intuition and economic insights behind it are also analyzed. We find that liquidity preference encourages agents to retain more liquid wealth and inhibits consumption, and may even result in a negative allocation to the illiquid asset. The liquid risky asset not only affects the location of the three regions but also has an impact on consumption. However, whether this impact on consumption is promoted or inhibited depends on the degree of risk aversion of agents. ...

March 4, 2025 · 2 min · Research Team

To Hedge or Not to Hedge: Optimal Strategies for Stochastic Trade Flow Management

ArXiv ID: 2503.02496 · Authors: Unknown

Abstract: This paper addresses the trade-off between internalisation and externalisation in the management of stochastic trade flows. We consider agents who must absorb flows and manage risk by deciding whether to warehouse it or hedge in the market, thereby incurring transaction costs and market impact. Unlike market makers, these agents cannot skew their quotes to attract offsetting flows and deter risk-increasing ones, leading to a fundamentally different problem. Within the Almgren-Chriss framework, we derive almost-closed-form solutions in the case of quadratic execution costs, while more general cases require numerical methods. In particular, we discuss the challenges posed by artificial boundary conditions when using classical grid-based numerical PDE techniques and propose reinforcement learning methods as an alternative. ...

March 4, 2025 · 2 min · Research Team

Why is the estimation of metaorder impact with public market data so challenging?

ArXiv ID: 2501.17096 · Authors: Unknown

Abstract: Estimating market impact and transaction costs of large trades (metaorders) is a very important topic in finance. However, models of price and trade based on public market data produce average price trajectories that are qualitatively different from what is observed during real metaorder executions: the price increases linearly, rather than in a concave way, during the execution, and the amount of reversion after its end is very limited. We claim that this is a generic phenomenon due to the fact that even sophisticated statistical models are unable to correctly describe the origin of the autocorrelation of the order flow. We propose a modified Transient Impact Model which provides more realistic trajectories by assuming that only a fraction of the metaorder trading triggers market order flow. Interestingly, in our model there is a critical condition on the kernels of the price and order flow equations in which market impact becomes permanent. ...

January 28, 2025 · 2 min · Research Team

Robust and Sparse Portfolio Selection: Quantitative Insights and Efficient Algorithms

ArXiv ID: 2412.19462 · Authors: Unknown

Abstract: We extend the classical mean-variance (MV) framework and propose a robust and sparse portfolio selection model incorporating an ellipsoidal uncertainty set to reduce the impact of estimation errors, and fixed transaction costs to penalize over-diversification. In the literature, the MV model under fixed transaction costs is referred to as the sparse or cardinality-constrained MV optimization, which is a mixed integer problem and is challenging to solve when the number of assets is large. We develop an efficient semismooth Newton-based proximal difference-of-convex algorithm to solve the proposed model and prove its convergence to at least a local minimizer with a locally linear convergence rate. We explore properties of the robust and sparse portfolio both analytically and numerically. In particular, we show that the MV optimization is indeed a robust procedure as long as an investor makes the proper choice of the risk-aversion coefficient. We contribute to the literature by proving that there is a one-to-one correspondence between the risk-aversion coefficient and the level of robustness. Moreover, we characterize how the number of traded assets changes with respect to the interaction between the level of uncertainty on model parameters and the magnitude of transaction cost. ...
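The role of an ellipsoidal uncertainty set can be sketched with the standard closed form for the worst-case expected return. The example assumes the set {mu : (mu - mu_hat)^T Omega^{-1} (mu - mu_hat) <= delta^2}, for which the worst case over the set is mu_hat^T w - delta * sqrt(w^T Omega w). The abstract does not spell out the paper's exact set, so the shape matrix `omega`, the radius `delta`, and the function name are assumptions made for illustration.

```python
import numpy as np


def worst_case_return(weights, mu_hat, omega, delta):
    """Worst-case expected return over an ellipsoidal set (assumed form).

    min over mu with (mu - mu_hat)^T omega^{-1} (mu - mu_hat) <= delta**2
    of mu^T w equals mu_hat^T w - delta * sqrt(w^T omega w).
    """
    w = np.asarray(weights, dtype=float)
    mu_hat = np.asarray(mu_hat, dtype=float)
    omega = np.asarray(omega, dtype=float)

    nominal = mu_hat @ w                      # return under the point estimate
    penalty = delta * np.sqrt(w @ omega @ w)  # price of estimation uncertainty
    return float(nominal - penalty)
```

The penalty has the same structure as a risk term scaled by `delta`, which gives some intuition for the abstract's claim that the level of robustness and the risk-aversion coefficient correspond to one another.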

December 27, 2024 · 2 min · Research Team