Temporal distribution of clusters of investors and their application in prediction with expert advice

ArXiv ID: 2406.19403

Authors: Unknown

Abstract

Financial organisations such as brokers face a significant challenge in servicing the investment needs of thousands of their traders worldwide. The challenge is compounded because individual traders have their own risk appetite and investment goals. Traders may look to capture short-term trends in the market which last only seconds to minutes, or they may have longer-term views which last several days to months. To reduce the complexity of this task, client trades can be clustered. By examining such clusters, we would likely observe many traders following common patterns of investment, but how do these patterns vary through time? Knowledge of the temporal distributions of such clusters may help financial institutions manage the overall portfolio of risk that accumulates from underlying trader positions. This study contributes to the field by demonstrating that the distribution of clusters derived from the real-world trades of 20k Foreign Exchange (FX) traders (from 2015 to 2017) follows Ewens’ Sampling Distribution. Further, we show that the Aggregating Algorithm (AA), an on-line prediction with expert advice algorithm, can be applied to the aforementioned real-world data in order to improve the returns of portfolios of trader risk. However, we found that the AA ‘struggles’ when presented with too many trader “experts”, especially when there are many trades with similar overall patterns. To help overcome this challenge, we have applied and compared the use of Statistically Validated Networks (SVN) with a hierarchical clustering approach on a subset of the data, demonstrating that both approaches can be used to significantly improve results of the AA in terms of profitability and smoothness of returns.
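The weight update at the heart of exponential-weights expert aggregation can be sketched as follows. This is a minimal illustration, not the paper's configuration: the function name `exp_weights_history`, the learning rate `eta`, and the toy losses are assumptions, and the substitution step that the full Aggregating Algorithm uses to turn mixed weights into a single prediction is omitted.

```python
import math

def exp_weights_history(expert_losses, eta=1.0):
    """Return the normalised expert weights used at each round.

    expert_losses: list of rounds, each a list with one loss per expert.
    eta: learning rate (hypothetical default, not taken from the paper).
    """
    n_experts = len(expert_losses[0])
    log_w = [0.0] * n_experts  # uniform prior, kept in log space for stability
    history = []
    for losses in expert_losses:
        # Normalise the current weights before this round's losses are seen.
        shift = max(log_w)
        unnorm = [math.exp(lw - shift) for lw in log_w]
        total = sum(unnorm)
        history.append([u / total for u in unnorm])
        # Penalise each expert exponentially in the loss it just incurred.
        log_w = [lw - eta * loss for lw, loss in zip(log_w, losses)]
    return history

# Toy data: expert 0 always incurs the lower loss.
rounds = [[0.0, 1.0], [0.0, 1.0], [0.0, 1.0]]
weights = exp_weights_history(rounds)
```

With two experts where the first always incurs the lower loss, the weights start uniform and shift toward the first expert in later rounds.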

Keywords: Online Prediction, Expert Advice, Statistically Validated Networks, Clustering Analysis, Portfolio Risk Management, Foreign Exchange (FX)

Complexity vs Empirical Score

  • Math Complexity: 4.5/10
  • Empirical Rigor: 7.0/10
  • Quadrant: Street Traders
  • Why: The paper uses advanced statistical concepts like Ewens’ Sampling Distribution and hypergeometric tests, but the math is presented at a high level without deep derivations, keeping complexity moderate. It demonstrates strong empirical rigor by analyzing a large proprietary dataset (20k traders, 3 years of FX data), applying clustering algorithms (SVN, hierarchical), and testing prediction algorithms (Aggregating Algorithm) with clear profitability metrics.
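For reference, Ewens' sampling formula assigns a probability to a partition of n items through the number a_j of clusters of size j and a single concentration parameter theta. The sketch below evaluates its log-probability; the function name `ewens_log_prob` and the example cluster sizes are illustrative assumptions, and the paper's fitting procedure is not reproduced here.

```python
import math
from collections import Counter

def ewens_log_prob(cluster_sizes, theta):
    """Log-probability of a partition under Ewens' sampling formula.

    cluster_sizes: list of positive cluster sizes summing to n.
    theta: positive concentration parameter.
    """
    n = sum(cluster_sizes)
    size_counts = Counter(cluster_sizes)  # a_j = number of clusters of size j
    # log of n! / (theta * (theta + 1) * ... * (theta + n - 1))
    log_p = math.lgamma(n + 1) - sum(math.log(theta + i) for i in range(n))
    # log of the product over j of theta^a_j / (j^a_j * a_j!)
    for j, a_j in size_counts.items():
        log_p += a_j * (math.log(theta) - math.log(j)) - math.lgamma(a_j + 1)
    return log_p

# Sanity check: probabilities over all partitions of n = 3 sum to one.
partitions_of_3 = [[3], [2, 1], [1, 1, 1]]
total_prob = sum(math.exp(ewens_log_prob(p, theta=2.0)) for p in partitions_of_3)
```

Larger values of theta favour partitions with many small clusters, while smaller values favour a few large ones.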
  flowchart TD
    A["Research Goal<br>Cluster 20k FX Traders & Predict Risk Portfolios<br>using Expert Advice"] --> B{"Key Data & Inputs<br>20k FX Traders (2015-2017)<br>Transaction Trades"}
    B --> C["Computation: Clustering<br>Derive Temporal Clusters &<br>Validate against Ewens' Sampling Distribution"]
    C --> D{"Computation: Prediction<br>Aggregating Algorithm (AA)<br>Applied to Real-World Data"}
    D --> E{"Key Findings / Outcomes"}
    E --> F["Outcome 1<br>Clusters follow Ewens' Distribution"]
    E --> G["Outcome 2<br>AA 'struggles' with too many experts<br>(especially many similar trade patterns)"]
    E --> H["Solution Applied<br>Statistically Validated Networks (SVN)<br>& Hierarchical Clustering"]
    H --> I["Final Outcome<br>SVN/Clustering significantly improves<br>AA Profitability & Return Smoothness"]
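
Statistically Validated Networks are typically built by testing, for each pair of nodes, whether their co-occurrence count exceeds what a hypergeometric null would produce, with a multiple-comparison correction. A minimal sketch under those assumptions follows; the function names, the `activity` layout, the Bonferroni correction, and `alpha` are hypothetical choices, not details taken from the paper.

```python
import math
from itertools import combinations

def hypergeom_tail(k, total, n_i, n_j):
    """P(X >= k), where X counts the periods shared by two traders who are
    active in n_i and n_j of `total` periods under random placement."""
    denom = math.comb(total, n_j)
    upper = min(n_i, n_j)
    numer = sum(
        math.comb(n_i, x) * math.comb(total - n_i, n_j - x)
        for x in range(k, upper + 1)
    )
    return numer / denom

def validated_links(activity, total, alpha=0.01):
    """Keep trader pairs whose overlap is significant after Bonferroni.

    activity: dict mapping trader id -> set of active period indices
              (an assumed layout for illustration).
    """
    pairs = list(combinations(sorted(activity), 2))
    threshold = alpha / len(pairs) if pairs else alpha
    links = []
    for a, b in pairs:
        shared = len(activity[a] & activity[b])
        p_value = hypergeom_tail(shared, total, len(activity[a]), len(activity[b]))
        if p_value < threshold:
            links.append((a, b))
    return links

# Toy data: t1 and t2 are active in the same 10 of 100 periods.
toy_activity = {
    "t1": set(range(10)),
    "t2": set(range(10)),
    "t3": {50, 60, 70},
}
links = validated_links(toy_activity, total=100)
```

In this toy example only the pair with fully overlapping activity survives the corrected threshold; traders joined by validated links could then be grouped and treated as a single expert.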