From sectorial coarse graining to extreme coarse graining of S&P 500 correlation matrices

ArXiv ID: 2511.05463 · Authors: Manan Vyas, M. Mijaíl Martínez-Ramos, Parisa Majari, Thomas H. Seligman

Abstract: Starting from the Pearson correlation matrix of stock returns and from the desire to obtain a reduced number of parameters relevant for the dynamics of a financial market, we propose to take the idea of a sectorial matrix, which would have a large number of parameters, to the reduced picture of a real symmetric $2 \times 2$ matrix, the extreme case, which still conserves the desirable feature that the average correlation can be one of the parameters. This is achieved by choosing two subsets of stocks for rows and columns and averaging the correlation matrix over each of the resulting blocks. Averaging over these blocks, we retain the average of the correlation matrix. We shall use a random selection for two equal block sizes as well as two specific, hopefully relevant, ones that do not produce equal block sizes. The results show that one of the non-random choices has somewhat different properties, whose meaning will have to be analyzed from an economic point of view. ...
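The block averaging described in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the synthetic returns, the helper name `coarse_grain`, and the choice to keep the diagonal ones inside the block averages are all assumptions made for illustration.

```python
import numpy as np


def coarse_grain(corr, groups):
    """Average a correlation matrix over the blocks defined by index groups.

    `coarse_grain` is a hypothetical helper, not a function from the paper.
    Diagonal entries (all equal to 1) are included in the averages, which is
    an assumption of this sketch.
    """
    k = len(groups)
    reduced = np.empty((k, k))
    for a, rows in enumerate(groups):
        for b, cols in enumerate(groups):
            reduced[a, b] = corr[np.ix_(rows, cols)].mean()
    return reduced


rng = np.random.default_rng(0)
returns = rng.standard_normal((250, 10))    # synthetic data: 250 days x 10 stocks
corr = np.corrcoef(returns, rowvar=False)   # 10 x 10 Pearson correlation matrix

perm = rng.permutation(corr.shape[0])
groups = [perm[:5], perm[5:]]               # random split into two equal blocks
reduced = coarse_grain(corr, groups)        # real symmetric 2 x 2 matrix

# Weighting each block by its share of matrix entries recovers the full average.
sizes = np.array([len(g) for g in groups])
weights = np.outer(sizes, sizes) / sizes.sum() ** 2
print(reduced)
print((reduced * weights).sum(), corr.mean())
```

With two equal blocks the weights are all 1/4, so the plain mean of the four entries of `reduced` equals the average correlation. With unequal blocks the size-weighted sum is the quantity that matches it.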

November 7, 2025 · 2 min · Research Team

Axes that matter: PCA with a difference

ArXiv ID: 2503.06707 · Authors: Unknown

Abstract: We extend the scope of differential machine learning and introduce a new breed of supervised principal component analysis to reduce dimensionality of Derivatives problems. Applications include the specification and calibration of pricing models, the identification of regression features in least-square Monte-Carlo, and the pre-processing of simulated datasets for (differential) machine learning.

Keywords: differential machine learning, principal component analysis, derivatives pricing, least-square Monte-Carlo, dimensionality reduction ...

March 9, 2025 · 1 min · Research Team

Unlocking NACE Classification Embeddings with OpenAI for Enhanced Analysis and Processing

ArXiv ID: 2409.11524 · Authors: Unknown

Abstract: The Statistical Classification of Economic Activities in the European Community (NACE) is the standard classification system for the categorization of economic and industrial activities within the European Union. This paper proposes a novel approach to transform the NACE classification into low-dimensional embeddings, using state-of-the-art models and dimensionality reduction techniques. The primary challenge is the preservation of the hierarchical structure inherent within the original NACE classification while reducing the number of dimensions. To address this issue, we introduce custom metrics designed to quantify the retention of hierarchical relationships throughout the embedding and reduction processes. The evaluation of these metrics demonstrates the effectiveness of the proposed methodology in retaining the structural information essential for insightful analysis. This approach not only facilitates the visual exploration of economic activity relationships, but also increases the efficacy of downstream tasks, including clustering, classification, integration with other classifications, and others. Through experimental validation, the utility of our proposed framework in preserving hierarchical structures within the NACE classification is showcased, thereby providing a valuable tool for researchers and policymakers to understand and leverage any hierarchical data. ...

September 17, 2024 · 2 min · Research Team

Low-dimensional approximations of the conditional law of Volterra processes: a non-positive curvature approach

ArXiv ID: 2405.20094 · Authors: Unknown

Abstract: Predicting the conditional evolution of Volterra processes with stochastic volatility is a crucial challenge in mathematical finance. While deep neural network models offer promise in approximating the conditional law of such processes, their effectiveness is hindered by the curse of dimensionality caused by the infinite dimensionality and non-smooth nature of these problems. To address this, we propose a two-step solution. Firstly, we develop a stable dimension reduction technique, projecting the law of a reasonably broad class of Volterra processes onto a low-dimensional statistical manifold of non-positive sectional curvature. Next, we introduce a sequentially deep learning model tailored to the manifold’s geometry, which we show can approximate the projected conditional law of the Volterra process. Our model leverages an auxiliary hypernetwork to dynamically update its internal parameters, allowing it to encode non-stationary dynamics of the Volterra process, and it can be interpreted as a gating mechanism in a mixture of expert models where each expert is specialized at a specific point in time. Our hypernetwork further allows us to achieve approximation rates that would seemingly only be possible with very large networks. ...

May 30, 2024 · 2 min · Research Team

Kernel Three Pass Regression Filter

ArXiv ID: 2405.07292 · Authors: Unknown

Abstract: We forecast a single time series using a high-dimensional set of predictors. When these predictors share common underlying dynamics, an approximate latent factor model provides a powerful characterization of their co-movements (Bai, 2003). These latent factors succinctly summarize the data and can also be used for prediction, alleviating the curse of dimensionality in high-dimensional prediction exercises; see Stock & Watson (2002a). However, forecasting using these latent factors suffers from two potential drawbacks. First, not all pervasive factors among the set of predictors may be relevant, and using all of them can lead to inefficient forecasts. The second shortcoming is the assumption of linear dependence of predictors on the underlying factors. The first issue can be addressed by using some form of supervision, which leads to the omission of irrelevant information. One example is the three-pass regression filter proposed by Kelly & Pruitt (2015). We extend their framework to cases where the form of dependence might be nonlinear by developing a new estimator, which we refer to as the Kernel Three-Pass Regression Filter (K3PRF). This alleviates the aforementioned second shortcoming. The estimator is computationally efficient and performs well empirically. The short-term performance matches or exceeds that of established models, while the long-term performance shows significant improvement. ...

May 12, 2024 · 2 min · Research Team

Combating Financial Crimes with Unsupervised Learning Techniques: Clustering and Dimensionality Reduction for Anti-Money Laundering

ArXiv ID: 2403.00777 · Authors: Unknown

Abstract: Anti-Money Laundering (AML) is a crucial task in ensuring the integrity of financial systems. One key challenge in AML is identifying high-risk groups based on their behavior. Unsupervised learning, particularly clustering, is a promising solution for this task. However, the use of hundreds of features to describe behavior results in a high-dimensional dataset that negatively impacts clustering performance. In this paper, we investigate the effectiveness of combining agglomerative hierarchical clustering with four dimensionality reduction techniques, namely Independent Component Analysis (ICA), Kernel Principal Component Analysis (KPCA), Singular Value Decomposition (SVD), and Locality Preserving Projections (LPP), to overcome the issue of high dimensionality in AML data and improve clustering results. This study aims to provide insights into the most effective way of reducing the dimensionality of AML data and enhancing the accuracy of clustering-based AML systems. The experimental results demonstrate that KPCA outperforms the other dimension reduction techniques when combined with agglomerative hierarchical clustering. This superiority is observed in the majority of situations, as confirmed by three distinct validation indices. ...
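As a rough illustration of the pipeline this abstract describes (dimensionality reduction followed by agglomerative hierarchical clustering), here is a minimal sketch using scikit-learn. It is not the paper's experimental setup: the synthetic blobs stand in for AML behavioral features, the RBF kernel, `gamma`, component count, and Ward linkage are assumed settings, and the silhouette score is only one possible validation index, since the abstract does not name its three.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import make_blobs
from sklearn.decomposition import KernelPCA
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a high-dimensional behavioral dataset (assumption).
X, _ = make_blobs(n_samples=300, n_features=50, centers=3, random_state=0)
X = StandardScaler().fit_transform(X)

# Step 1: reduce 50 features to 5 components with kernel PCA (assumed settings).
kpca = KernelPCA(n_components=5, kernel="rbf", gamma=1.0 / X.shape[1])
embedding = kpca.fit_transform(X)

# Step 2: agglomerative hierarchical clustering on the reduced data.
clusterer = AgglomerativeClustering(n_clusters=3, linkage="ward")
labels = clusterer.fit_predict(embedding)

# Step 3: score the partition with one internal validation index.
score = silhouette_score(embedding, labels)
print(embedding.shape, np.bincount(labels), round(score, 3))
```

Swapping `KernelPCA` for `FastICA` or `TruncatedSVD` from `sklearn.decomposition` gives the corresponding ICA and SVD variants of the same comparison. scikit-learn has no built-in LPP estimator, so that variant would need a separate implementation.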

February 14, 2024 · 2 min · Research Team