Trading Volume Maximization with Online Learning
ArXiv ID: 2405.13102 “View on arXiv”
Authors: Unknown
Abstract
We explore brokerage between traders in an online learning framework. At any round $t$, two traders meet to exchange an asset, provided the exchange is mutually beneficial. The broker proposes a trading price, and each trader tries to sell their asset or buy the asset from the other party, depending on whether the price is higher or lower than their private valuations. A trade happens if one trader is willing to sell and the other is willing to buy at the proposed price. Previous work provided guidance to a broker aiming at enhancing traders’ total earnings by maximizing the gain from trade, defined as the sum of the traders’ net utilities after each interaction. In contrast, we investigate how the broker should behave to maximize the trading volume, i.e., the total number of trades. We model the traders’ valuations as an i.i.d. process with an unknown distribution. If the traders’ valuations are revealed after each interaction (full-feedback), and the traders’ valuations cumulative distribution function (cdf) is continuous, we provide an algorithm achieving logarithmic regret and show its optimality up to constant factors. If only their willingness to sell or buy at the proposed price is revealed after each interaction ($2$-bit feedback), we provide an algorithm achieving poly-logarithmic regret when the traders’ valuations cdf is Lipschitz and show that this rate is near-optimal. We complement our results by analyzing the implications of dropping the regularity assumptions on the unknown traders’ valuations cdf. If we drop the continuous cdf assumption, the regret rate degrades to $Θ(\sqrt{“T”})$ in the full-feedback case, where $T$ is the time horizon. If we drop the Lipschitz cdf assumption, learning becomes impossible in the $2$-bit feedback case.
Keywords: Market making, Online learning, Brokerage, Regret minimization, Cumulative distribution function (cdf), General/Trading Strategy
Complexity vs Empirical Score
- Math Complexity: 7.5/10
- Empirical Rigor: 2.0/10
- Quadrant: Lab Rats
- Why: The paper is dense with advanced mathematical theory, including regret analysis, lower bounds, and algorithm design in an online learning setting, but lacks any empirical validation, backtesting, or implementation details beyond theoretical algorithms.
flowchart TD
A["Research Goal: How should a broker maximize<br/>trading volume in an online learning setting?"] --> B["Model Setup"]
B --> C{"Trader Valuation CDF Type"}
C -->|Continuous & Known| D["Full-Feedback Setting"]
C -->|Lipschitz & Known| E["2-Bit Feedback Setting"]
C -->|Discontinuous / Unknown| F["Drop Regularity Assumptions"]
D --> G["Algorithm: Propose Prices at<br/>Quantiles of Empirical CDF"]
E --> H["Algorithm: Discretize Valuations<br/>& Explore-Exploit Strategy"]
F --> I["Regret increases to Θ√T or<br/>Learning becomes impossible"]
G --> J["Outcome: Logarithmic Regret<br/>(Optimal up to constants)"]
H --> K["Outcome: Poly-Logarithmic Regret<br/>(Near-optimal)"]
I --> L["Outcome: Degraded Performance<br/>No poly-log regret possible"]