
Integration of Wavelet Transform Convolution and Channel Attention with LSTM for Stock Price Prediction based Portfolio Allocation

Integration of Wavelet Transform Convolution and Channel Attention with LSTM for Stock Price Prediction based Portfolio Allocation ArXiv ID: 2507.01973 “View on arXiv” Authors: Junjie Guo Abstract Portfolio allocation via stock price prediction is inherently difficult due to the notoriously low signal-to-noise ratio of stock time series. This paper proposes a method that integrates wavelet transform convolution and channel attention with LSTM to implement stock price prediction based portfolio allocation. Stock time series data are first processed by wavelet transform convolution to reduce the noise. The processed features are then reconstructed by channel attention, and an LSTM predicts the stock price from the final features. We construct a portfolio consisting of four stocks with trading signals predicted by the model. Experiments are conducted by evaluating return, Sharpe ratio and maximum drawdown. The results indicate that our method achieves robust performance even during the post-pandemic downward market period. ...
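A minimal PyTorch sketch of the pipeline shape described in the abstract: convolutional denoising, channel re-weighting, then an LSTM forecaster. The use of a plain 1-D convolution as a stand-in for the paper's wavelet transform convolution, the squeeze-and-excitation style attention, and all layer sizes are illustrative assumptions, not the author's implementation.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):                      # x: (batch, channels, time)
        weights = self.fc(x.mean(dim=-1))      # squeeze over time -> (batch, channels)
        return x * weights.unsqueeze(-1)       # re-weight each channel

class WTConvAttnLSTM(nn.Module):
    def __init__(self, in_features: int = 5, channels: int = 16, hidden: int = 32):
        super().__init__()
        # Stand-in for the wavelet transform convolution: a learnable 1-D filter bank.
        self.denoise = nn.Conv1d(in_features, channels, kernel_size=5, padding=2)
        self.attn = ChannelAttention(channels)
        self.lstm = nn.LSTM(channels, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)       # next-step price (or return) estimate

    def forward(self, x):                      # x: (batch, time, in_features)
        h = self.denoise(x.transpose(1, 2))    # -> (batch, channels, time)
        h = self.attn(h).transpose(1, 2)       # -> (batch, time, channels)
        out, _ = self.lstm(h)
        return self.head(out[:, -1])           # prediction from the last time step

prices = torch.randn(8, 60, 5)                 # toy batch: 8 windows, 60 days, 5 features
print(WTConvAttnLSTM()(prices).shape)          # torch.Size([8, 1])
```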

June 23, 2025 · 2 min · Research Team

When Dimensionality Hurts: The Role of LLM Embedding Compression for Noisy Regression Tasks

When Dimensionality Hurts: The Role of LLM Embedding Compression for Noisy Regression Tasks ArXiv ID: 2502.02199 “View on arXiv” Authors: Unknown Abstract Large language models (LLMs) have shown remarkable success in language modelling due to scaling laws found in model size and the hidden dimension of the model’s text representation. Yet, we demonstrate that compressed representations of text can yield better performance in LLM-based regression tasks. In this paper, we compare the relative performance of embedding compression in three different signal-to-noise contexts: financial return prediction, writing quality assessment and review scoring. Our results show that compressing embeddings, in a minimally supervised manner using an autoencoder’s hidden representation, can mitigate overfitting and improve performance on noisy tasks, such as financial return prediction; but that compression reduces performance on tasks that have high causal dependencies between the input and target data. Our results suggest that the success of interpretable compressed representations such as sentiment may be due to a regularising effect. ...
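A minimal sketch of the compression idea, assuming a single-layer autoencoder bottleneck trained only on reconstruction and a ridge regressor fit on the compressed features; the embedding size, bottleneck size, and regressor choice are assumptions, not the paper's exact setup.

```python
import torch
import torch.nn as nn
from sklearn.linear_model import Ridge

emb_dim, bottleneck = 768, 32
X = torch.randn(1000, emb_dim)                 # stand-in for LLM text embeddings
y = torch.randn(1000)                          # noisy target, e.g. next-period return

encoder = nn.Sequential(nn.Linear(emb_dim, bottleneck), nn.Tanh())
decoder = nn.Linear(bottleneck, emb_dim)
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)

for _ in range(200):                           # minimally supervised: reconstruction only
    opt.zero_grad()
    loss = nn.functional.mse_loss(decoder(encoder(X)), X)
    loss.backward()
    opt.step()

with torch.no_grad():
    Z = encoder(X).numpy()                     # compressed representation

# Regressing on the bottleneck rather than the raw embedding acts as regularisation
# on noisy targets, at the cost of discarding detail useful for high-signal tasks.
reg = Ridge(alpha=1.0).fit(Z, y.numpy())
print(reg.score(Z, y.numpy()))
```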

February 4, 2025 · 2 min · Research Team

Strong denoising of financial time-series

Strong denoising of financial time-series ArXiv ID: 2408.05690 “View on arXiv” Authors: Unknown Abstract In this paper we introduce a method for significantly improving the signal-to-noise ratio in financial data. The approach relies on combining a target variable with different context variables and using auto-encoders (AEs) to learn reconstructions of the combined inputs. The objective is to obtain agreement among pairs of AEs which are trained on related but different inputs and which are forced to find common ground. The training process is set up as a “conversation” where the models take turns at producing a prediction (speaking) and reconciling their own predictions with the output of the other AE (listening), until an agreement is reached. This leads to a new way of constraining the complexity of the data representation generated by the AE. Unlike standard regularization, whose strength must be set by the designer, the proposed mutual regularization uses the partner network to detect and amend the lack of generality of the learned representation of the data. The integration of alternative perspectives enhances the de-noising capacity of a single AE and allows us to discover new regularities in financial time-series which can be converted into profitable trading strategies. ...
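A minimal sketch of the mutual-regularization idea, assuming two small autoencoders that each see the target series paired with a different context set, with a loss that adds an agreement term on the shared target; the equal loss weighting and the joint (rather than strictly alternating) update are simplifying assumptions.

```python
import torch
import torch.nn as nn

def make_ae(dim):
    return nn.Sequential(nn.Linear(dim, 8), nn.Tanh(), nn.Linear(8, dim))

T = 256
target = torch.randn(T, 1)                     # target series (e.g. one asset's returns)
ctx_a = torch.randn(T, 3)                      # context set A (related series)
ctx_b = torch.randn(T, 3)                      # context set B (different related series)

xa = torch.cat([target, ctx_a], dim=1)         # each AE sees the target plus its own context
xb = torch.cat([target, ctx_b], dim=1)
ae_a, ae_b = make_ae(4), make_ae(4)
opt = torch.optim.Adam(list(ae_a.parameters()) + list(ae_b.parameters()), lr=1e-3)
mse = nn.functional.mse_loss

for step in range(500):
    opt.zero_grad()
    ra, rb = ae_a(xa), ae_b(xb)
    recon = mse(ra, xa) + mse(rb, xb)          # "speak": reconstruct own input
    agree = mse(ra[:, :1], rb[:, :1].detach()) + mse(rb[:, :1], ra[:, :1].detach())
    (recon + agree).backward()                 # "listen": reconcile the shared target column
    opt.step()
```

The agreement term plays the role of the partner-supplied regularizer: neither network chooses a regularization strength by hand; each is pulled toward the other's reconstruction of the common target.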

August 11, 2024 · 2 min · Research Team

When can weak latent factors be statistically inferred?

When can weak latent factors be statistically inferred? ArXiv ID: 2407.03616 “View on arXiv” Authors: Unknown Abstract This article establishes a new and comprehensive estimation and inference theory for principal component analysis (PCA) under a weak factor model that allows for cross-sectionally dependent idiosyncratic components, requiring only nearly minimal factor strength relative to the noise level, i.e., the signal-to-noise ratio. Our theory is applicable regardless of the relative growth rate between the cross-sectional dimension $N$ and temporal dimension $T$. This more realistic assumption and notable result require a completely new technical device, as the commonly used leave-one-out trick is no longer applicable in the presence of cross-sectional dependence. Another notable advancement of our theory concerns PCA inference: for example, under the regime where $N\asymp T$, we show that asymptotic normality of the PCA-based estimator holds as long as the signal-to-noise ratio (SNR) grows faster than a polynomial rate of $\log N$. This finding significantly surpasses prior work that required a polynomial rate of $N$. Our theory is entirely non-asymptotic, offering finite-sample characterizations of both the estimation error and the uncertainty level of statistical inference. A notable technical innovation is our closed-form first-order approximation of the PCA-based estimator, which paves the way for various statistical tests. Furthermore, we apply our theory to design easy-to-implement statistics for validating whether given factors fall in the linear spans of unknown latent factors, testing structural breaks in the factor loadings for an individual unit, checking whether two units have the same risk exposures, and constructing confidence intervals for systematic risks. Our empirical studies uncover insightful correlations between our test results and economic cycles. ...
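A toy numerical illustration (not the paper's finite-sample theory), assuming a simulated panel $X = F\Lambda^\top + E$ with unit-variance noise, showing how PCA-based recovery of the factor space degrades as factor strength relative to the noise level shrinks; the dimensions and the canonical-correlation diagnostic are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T, r = 200, 200, 2                          # cross-section, time, number of factors

def pca_recovery(strength):
    F = rng.standard_normal((T, r))            # latent factors
    L = strength * rng.standard_normal((N, r)) # loadings scaled by factor strength
    X = F @ L.T + rng.standard_normal((T, N))  # observed panel with unit-variance noise
    # PCA estimator: top-r left singular directions of X.
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    F_hat = U[:, :r] * np.sqrt(T)
    # Smallest canonical correlation between true and estimated factor spaces (1 = perfect).
    Qf, _ = np.linalg.qr(F)
    Qh, _ = np.linalg.qr(F_hat)
    return np.linalg.svd(Qf.T @ Qh, compute_uv=False).min()

for strength in [0.05, 0.2, 1.0]:              # weaker factors -> harder recovery
    print(strength, round(pca_recovery(strength), 3))
```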

July 4, 2024 · 2 min · Research Team