
Trading with market resistance and concave price impact

ArXiv ID: 2601.03215

Authors: Youssef Ouazzani Chahdi, Nathan De Carvalho, Grégoire Szymanski

Abstract: We consider an optimal trading problem under a market impact model with endogenous market resistance generated by a sophisticated trader who (partially) detects metaorders and trades against them to exploit price overreactions induced by the order flow. The model features a concave transient impact driven by a power-law propagator with a resistance term responding to the trader’s rate via a fixed-point equation involving a general resistance function. We derive a (non)linear stochastic Fredholm equation as the first-order optimality condition satisfied by optimal trading strategies. Existence and uniqueness of the optimal control are established when the resistance function is linear, and an existence result is obtained when it is strictly convex using coercivity and weak lower semicontinuity of the associated profit-and-loss functional. We also propose an iterative scheme to solve the nonlinear stochastic Fredholm equation and prove an exponential convergence rate. Numerical experiments confirm this behavior and illustrate optimal round-trip strategies under “buy” signals with various decay profiles and different market resistance specifications. ...
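To give a feel for what an iterative scheme with an exponential convergence rate looks like, here is a minimal sketch of a Picard (fixed-point) iteration on a deterministic toy analogue, u(t) = f(t) - λ ∫ K(t,s) g(u(s)) ds on [0, 1]. This is not the paper's scheme or its stochastic equation: the function name `solve_fredholm_picard`, the regularized power-law kernel, the `tanh` nonlinearity, the exponentially decaying signal, and the value of λ are all assumptions chosen so that the iteration map is a contraction.

```python
import math

def solve_fredholm_picard(f, kernel, g, lam, n_iter=30):
    """Picard iteration u <- f - lam * dt * K g(u) on a uniform grid over [0, 1].

    Returns the final iterate and the sup-norm gap between successive iterates.
    """
    n = len(f)
    dt = 1.0 / (n - 1)
    u = list(f)  # start from the forcing term
    diffs = []
    for _ in range(n_iter):
        gu = [g(x) for x in u]
        new_u = [
            f[i] - lam * dt * sum(kernel[i][j] * gu[j] for j in range(n))
            for i in range(n)
        ]
        diffs.append(max(abs(a - b) for a, b in zip(new_u, u)))
        u = new_u
    return u, diffs

# Hypothetical toy inputs (assumptions, not taken from the paper).
n = 101
grid = [i / (n - 1) for i in range(n)]
signal = [math.exp(-t) for t in grid]  # decaying "buy" signal
kernel = [[(abs(t - s) + 0.1) ** -0.5 for s in grid] for t in grid]  # regularized power law

u, diffs = solve_fredholm_picard(signal, kernel, math.tanh, lam=0.2)
print(diffs[:5])  # successive gaps shrink geometrically
```

Because `tanh` is 1-Lipschitz and λ times the kernel's row integrals stays well below 1 here, each iteration shrinks the gap between successive iterates by a roughly constant factor, which is the geometric (exponential) decay one would expect from a contraction argument.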

January 6, 2026 · 2 min · Research Team

RL-Exec: Impact-Aware Reinforcement Learning for Opportunistic Optimal Liquidation, Outperforms TWAP and a Book-Liquidity VWAP on BTC-USD Replays

ArXiv ID: 2511.07434

Authors: Enzo Duflot, Stanislas Robineau

Abstract: We study opportunistic optimal liquidation over fixed deadlines on BTC-USD limit-order books (LOB). We present RL-Exec, a PPO agent trained on historical replays augmented with endogenous transient impact (resilience), partial fills, maker/taker fees, and latency. The policy observes depth-20 LOB features plus microstructure indicators and acts under a sell-only inventory constraint to reach a residual target. Evaluation follows a strict time split (train: Jan-2020; test: Feb-2020) and a per-day protocol: for each test day we run ten independent start times and aggregate to a single daily score, avoiding pseudo-replication. We compare the agent to (i) TWAP and (ii) a VWAP-like baseline allocating using opposite-side order-book liquidity (top-20 levels), both executed on identical timestamps and costs. Statistical inference uses one-sided Wilcoxon signed-rank tests on daily RL-baseline differences with Benjamini-Hochberg FDR correction and bootstrap confidence intervals. On the Feb-2020 test set, RL-Exec significantly outperforms both baselines and the gap increases with the execution horizon (+2-3 bps at 30 min, +7-8 bps at 60 min, +23 bps at 120 min). Code: github.com/Giafferri/RL-Exec ...
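The inference steps named in the abstract can be sketched in a few lines. The following is a minimal illustration, not the authors' code: it shows Benjamini-Hochberg adjusted p-values and a percentile bootstrap confidence interval for the mean daily difference. The function names, the p-values, and the daily differences are all made-up placeholders; the p-values are assumed to come from one-sided Wilcoxon signed-rank tests computed elsewhere (for example with `scipy.stats.wilcoxon(..., alternative="greater")`).

```python
import random
import statistics

def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

def bootstrap_mean_ci(values, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`."""
    rng = random.Random(seed)
    n = len(values)
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    lo_idx = int((1 - level) / 2 * n_resamples)
    hi_idx = int((1 + level) / 2 * n_resamples) - 1
    return means[lo_idx], means[hi_idx]

# Hypothetical inputs: one p-value per (baseline, horizon) comparison, and
# one RL-minus-baseline difference per test day, in basis points.
raw_p = [0.01, 0.04, 0.03, 0.005]
daily_diff_bps = [2.1, -0.4, 3.0, 1.2, 0.8, 2.5, -1.1, 1.9, 0.6, 2.2]

adj_p = benjamini_hochberg(raw_p)
ci_low, ci_high = bootstrap_mean_ci(daily_diff_bps)
print(adj_p, (ci_low, ci_high))
```

Resampling whole days rather than individual runs matches the per-day aggregation described above: each day contributes a single score, so the day is the unit of resampling.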

October 30, 2025 · 2 min · Research Team