Quant Finance Research Hub

SusGen-GPT: A Data-Centric LLM for Financial NLP and Sustainability Report Generation

ArXiv ID: 2412.10906
Authors: Unknown
Abstract: The rapid growth of the financial sector and the rising focus on Environmental, Social, and Governance (ESG) considerations highlight the need for advanced NLP tools. However, open-source LLMs proficient in both finance and ESG domains remain scarce. To address this gap, we introduce SusGen-30K, a category-balanced dataset comprising seven financial NLP tasks and ESG report generation, and propose TCFD-Bench, a benchmark for evaluating sustainability report generation. Leveraging this dataset, we developed SusGen-GPT, a suite of models achieving state-of-the-art performance across six adapted and two off-the-shelf tasks, trailing GPT-4 by only 2% despite using 7-8B parameters compared to GPT-4’s 1,700B. Based on this, we propose the SusGen system, integrated with Retrieval-Augmented Generation (RAG), to assist in sustainability report generation. This work demonstrates the efficiency of our approach, advancing research in finance and ESG. ...

Financial Fine-tuning a Large Time Series Model

ArXiv ID: 2412.09880
Authors: Unknown
Abstract: Large models have shown unprecedented capabilities in natural language processing, image generation, and most recently, time series forecasting. This leads us to ask the question: treating market prices as a time series, can large models be used to predict the market? In this paper, we answer this by evaluating the performance of the latest time series foundation model TimesFM on price prediction. We find that due to the irregular nature of price data, directly applying TimesFM gives unsatisfactory results, and we propose to fine-tune TimesFM on financial data for the task of price prediction. This is done by continual pre-training of TimesFM on price data containing 100 million time points, covering a range of financial instruments at hourly and daily granularities. The fine-tuned model demonstrates higher price prediction accuracy than the baseline model. We conduct mock trading for our model in various financial markets and show that it outperforms various benchmarks in terms of returns, Sharpe ratio, max drawdown and trading cost. ...
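
The abstract above evaluates mock trading by Sharpe ratio and max drawdown without defining them. As a rough illustration only, the sketch below computes both from a list of per-period returns using common textbook conventions. The function names, the zero risk-free rate, and the 252-period annualization are assumptions made here, not details taken from the paper.

```python
import math

def sharpe_ratio(returns, periods_per_year=252):
    """Annualized Sharpe ratio of per-period returns.

    Assumes a zero risk-free rate and uses the sample standard deviation.
    """
    n = len(returns)
    mean = sum(returns) / n
    variance = sum((r - mean) ** 2 for r in returns) / (n - 1)
    return mean / math.sqrt(variance) * math.sqrt(periods_per_year)

def max_drawdown(returns):
    """Largest peak-to-trough loss of the compounded equity curve (positive fraction)."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        equity *= 1.0 + r                 # compound the period return
        peak = max(peak, equity)          # running high-water mark
        worst = max(worst, 1.0 - equity / peak)
    return worst

# Hypothetical daily returns, for illustration only.
daily_returns = [0.01, -0.02, 0.015, -0.005, 0.02]
print(sharpe_ratio(daily_returns), max_drawdown(daily_returns))
```

Other conventions exist (excess returns over a benchmark, log returns, drawdown in currency units), so results are only comparable when the same convention is used throughout.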

Higher Order Transformers: Enhancing Stock Movement Prediction On Multimodal Time-Series Data

ArXiv ID: 2412.10540
Authors: Unknown
Abstract: In this paper, we tackle the challenge of predicting stock movements in financial markets by introducing Higher Order Transformers, a novel architecture designed for processing multivariate time-series data. We extend the self-attention mechanism and the transformer architecture to a higher order, effectively capturing complex market dynamics across time and variables. To manage computational complexity, we propose a low-rank approximation of the potentially large attention tensor using tensor decomposition and employ kernel attention, reducing complexity to linear with respect to the data size. Additionally, we present an encoder-decoder model that integrates technical and fundamental analysis, utilizing multimodal signals from historical prices and related tweets. Our experiments on the Stocknet dataset demonstrate the effectiveness of our method, highlighting its potential for enhancing stock movement prediction in financial markets. ...

Integrative Analysis of Financial Market Sentiment Using CNN and GRU for Risk Prediction and Alert Systems

ArXiv ID: 2412.10199
Authors: Unknown
Abstract: This document presents an in-depth examination of stock market sentiment through the integration of Convolutional Neural Networks (CNN) and Gated Recurrent Units (GRU), enabling precise risk alerts. The robust feature extraction capability of CNN is utilized to preprocess and analyze extensive network text data, identifying local features and patterns. The extracted feature sequences are then input into the GRU model to understand the progression of emotional states over time and their potential impact on future market sentiment and risk. This approach addresses the order dependence and long-term dependencies inherent in time series data, resulting in a detailed analysis of stock market sentiment and effective early warnings of future risks. ...

Reciprocity in Interbank Markets

ArXiv ID: 2412.10329
Authors: Unknown
Abstract: Weighted reciprocity between two agents can be defined as the minimum of sending and receiving value in their bilateral relationship. In financial networks, such reciprocity characterizes the importance of individual banks as both liquidity absorber and provider, a feature typically attributed to large, intermediating dealer banks. In this paper we develop an exponential random graph model that can account for reciprocal links of each node simultaneously on the topological as well as on the weighted level. We provide an exact expression for the normalizing constant and thus a closed-form solution for the graph probability distribution. Applying this statistical null model to Italian interbank data, we find that before the great financial crisis (i) banks displayed significantly more weighted reciprocity compared to what the lower-order network features (size and volume distributions) would predict (ii) with a disappearance of this deviation once the early periods of the crisis set in, (iii) a trend which can be attributed in particular to smaller banks (dis)engaging in bilateral high-value trading relationships. Moreover, we show that neglecting reciprocal links and weights can lead to spurious findings of triadic relationships. As the hierarchical structure in the network is found to be compatible with its transitive but not with its intransitive triadic sub-graphs, the interbank market seems to be well-characterized by a hierarchical core-periphery structure enhanced by non-hierarchical reciprocal trading relationships. ...
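
The definition in the first sentence of this abstract (weighted reciprocity as the minimum of the value sent and the value received in a bilateral relationship) is simple enough to show directly. The sketch below is only an illustration of that definition: the dictionary layout, the function names, the node-level sum, and the example figures are all assumptions introduced here, and it does not implement the paper's exponential random graph model.

```python
def pair_reciprocity(weights):
    """Weighted reciprocity per unordered pair: min(w_ij, w_ji).

    `weights` maps (sender, receiver) to the value sent; a missing
    direction counts as zero.
    """
    result = {}
    for (i, j), w_ij in weights.items():
        if i == j:
            continue  # ignore self-loops
        w_ji = weights.get((j, i), 0.0)
        pair = (i, j) if i < j else (j, i)
        result[pair] = min(w_ij, w_ji)
    return result

def node_reciprocity(weights):
    """Sum of a node's pairwise reciprocated values (an assumed node-level summary)."""
    totals = {}
    for (i, j), r in pair_reciprocity(weights).items():
        totals[i] = totals.get(i, 0.0) + r
        totals[j] = totals.get(j, 0.0) + r
    return totals

# Hypothetical interbank exposures, for illustration only.
loans = {("A", "B"): 100, ("B", "A"): 40, ("A", "C"): 30}
print(pair_reciprocity(loans))   # A-B reciprocates 40; A-C reciprocates nothing
print(node_reciprocity(loans))
```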

Geometric Deep Learning for Realized Covariance Matrix Forecasting

ArXiv ID: 2412.09517
Authors: Unknown
Abstract: Traditional methods employed in matrix volatility forecasting often overlook the inherent Riemannian manifold structure of symmetric positive definite matrices, treating them as elements of Euclidean space, which can lead to suboptimal predictive performance. Moreover, they often struggle to handle high-dimensional matrices. In this paper, we propose a novel approach for forecasting realized covariance matrices of asset returns using a Riemannian-geometry-aware deep learning framework. In this way, we account for the geometric properties of the covariance matrices, including possible non-linear dynamics and efficient handling of high-dimensionality. Moreover, building upon a Fréchet sample mean of realized covariance matrices, we are able to extend the HAR model to the matrix-variate. We demonstrate the efficacy of our approach using daily realized covariance matrices for the 50 most capitalized companies in the S&P 500 index, showing that our method outperforms traditional approaches in terms of predictive accuracy. ...

Isogeometric Analysis for the Pricing of Financial Derivatives with Nonlinear Models: Convertible Bonds and Options

ArXiv ID: 2412.08987
Authors: Unknown
Abstract: Computational efficiency is essential for enhancing the accuracy and practicality of pricing complex financial derivatives. In this paper, we discuss Isogeometric Analysis (IGA) for valuing financial derivatives, modeled by two nonlinear Black-Scholes PDEs: the Leland model for a European call with transaction costs and the AFV model for convertible bonds with default options. We compare the solutions of IGA with finite difference methods (FDM) and finite element methods (FEM). In particular, very accurate solutions can be numerically calculated on far fewer mesh points (knots) than with FDM or FEM, by using non-uniform knots and weighted cubic NURBS, which in turn reduces the computational time significantly. ...

LLMs for Time Series: an Application for Single Stocks and Statistical Arbitrage

ArXiv ID: 2412.09394
Authors: Unknown
Abstract: Recently, LLMs (Large Language Models) have been adapted for time series prediction with significant success in pattern recognition. However, the common belief is that these models are not suitable for predicting financial market returns, which are known to be almost random. We aim to challenge this misconception through a counterexample. Specifically, we utilized the Chronos model from Ansari et al. (2024) and tested both pretrained configurations and fine-tuned supervised forecasts on the largest American single stocks using data from Guijarro-Ordonnez et al. (2022). We constructed a long/short portfolio, and the performance simulation indicates that LLMs can in reality handle time series that are nearly indistinguishable from noise, demonstrating an ability to identify inefficiencies amidst randomness and generate alpha. Finally, we compared these results with those of specialized models and smaller deep learning models, highlighting significant room for improvement in LLM performance to further enhance their predictive capabilities. ...

Efficient and Verified Continuous Double Auctions

ArXiv ID: 2412.08624
Authors: Unknown
Abstract: Continuous double auctions are commonly used to match orders at currency, stock, and commodities exchanges. A verified implementation of continuous double auctions is a useful tool for market regulators, as it gives rise to automated checkers that are guaranteed to detect errors in the trade logs of an existing exchange if they contain trades that violate the matching rules. We provide an efficient and formally verified implementation of continuous double auctions that takes $O(n \log n)$ time to match $n$ orders. This improves an earlier $O(n^2)$ verified implementation. We also prove a matching $\Omega(n \log n)$ lower bound on the running time for continuous double auctions. Our new implementation takes only a couple of minutes to run on ten million randomly generated orders as opposed to a few days taken by the earlier implementation. Our new implementation gives rise to an efficient automatic checker. We use the Coq proof assistant for verifying our implementation and extracting a verified OCaml program. While using Coq’s standard library implementation of red-black trees to obtain our improvement, we observed that its specification has serious gaps, which we fill in this work; this might be of independent interest. ...
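
To make the $O(n \log n)$ bound in this abstract more concrete, here is a minimal, unverified Python sketch of continuous double auction matching. It keeps resting bids and asks in binary heaps rather than the red-black trees of the paper's Coq development, and it assumes price-time priority with trades executing at the resting order's price. The order tuple layout, the function name, and the pricing rule are assumptions made for this illustration, not the authors' specification.

```python
import heapq

def match_orders(orders):
    """Match orders given in arrival order.

    Each order is (order_id, side, limit_price, quantity), side being
    "buy" or "sell". Returns trades as (buy_id, sell_id, price, quantity).
    """
    bids, asks = [], []  # heaps of [sort_key, arrival_seq, order_id, price, qty]
    trades = []
    for seq, (oid, side, price, qty) in enumerate(orders):
        is_buy = side == "buy"
        book, opposite = (bids, asks) if is_buy else (asks, bids)
        while qty > 0 and opposite:
            rest = opposite[0]  # best-priced, earliest resting order
            crosses = rest[3] <= price if is_buy else rest[3] >= price
            if not crosses:
                break
            fill = min(qty, rest[4])
            buy_id, sell_id = (oid, rest[2]) if is_buy else (rest[2], oid)
            trades.append((buy_id, sell_id, rest[3], fill))
            qty -= fill
            rest[4] -= fill  # quantity is not part of the heap ordering
            if rest[4] == 0:
                heapq.heappop(opposite)
        if qty > 0:
            # Negate bid prices so the highest bid sits at the heap root.
            key = -price if is_buy else price
            heapq.heappush(book, [key, seq, oid, price, qty])
    return trades

# Hypothetical order flow, for illustration only.
orders = [(1, "sell", 101, 5), (2, "sell", 100, 5), (3, "buy", 101, 7),
          (4, "buy", 99, 3), (5, "sell", 99, 4)]
print(match_orders(orders))
```

Every loop iteration either removes a resting order from a heap or exhausts the incoming order, so $n$ orders cause at most $n$ pushes and $n$ pops of $O(\log n)$ each.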

High-dimensional covariance matrix estimators on simulated portfolios with complex structures

ArXiv ID: 2412.08756
Authors: Unknown
Abstract: We study the allocation of synthetic portfolios under hierarchical nested, one-factor, and diagonal structures of the population covariance matrix in a high-dimensional scenario. The noise reduction approaches for the sample realizations are based on random matrices, free probability, deterministic equivalents, and their combination with a data science hierarchical method known as two-step covariance estimators. The financial performance metrics from the simulations are compared with empirical data from companies comprising the S&P 500 index using a moving window and walk-forward analysis. The portfolio allocation strategies analyzed include the minimum variance portfolio (both with and without short-selling constraints) and the hierarchical risk parity approach. Our proposed hierarchical nested covariance model shows signatures of complex system interactions. The empirical financial data reproduces stylized portfolio facts observed in the complex and one-factor covariance models. The two-step estimators proposed here improve several financial metrics under the analyzed investment strategies. The results pave the way for new risk management and diversification approaches when the number of assets is of the same order as the number of transaction days in the investment portfolio. ...
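
For readers unfamiliar with the minimum variance portfolio named in this abstract, the sketch below computes the standard closed-form weights $w = \Sigma^{-1}\mathbf{1} / (\mathbf{1}^\top \Sigma^{-1}\mathbf{1})$ for the case where short positions are allowed. It is a plain-Python illustration under that assumption; the helper names and the example covariance matrix are hypothetical, and none of the paper's covariance estimators or the constrained variant are reproduced here.

```python
def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def min_variance_weights(cov):
    """Weights proportional to inverse(cov) @ ones, rescaled to sum to one."""
    y = solve(cov, [1.0] * len(cov))
    total = sum(y)
    return [v / total for v in y]

# Hypothetical 2-asset covariance matrix, for illustration only.
cov = [[0.04, 0.01],
       [0.01, 0.09]]
print(min_variance_weights(cov))
```

With a noisy sample covariance matrix, the solve step amplifies estimation error, which is why the choice of covariance estimator matters when the number of assets is comparable to the number of observations.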