Quant Finance Research Hub

The Construction of Instruction-tuned LLMs for Finance without Instruction Data Using Continual Pretraining and Model Merging

ArXiv ID: 2409.19854
Authors: Unknown

Abstract: This paper proposes a novel method for constructing instruction-tuned large language models (LLMs) for finance without instruction data. Traditionally, developing such domain-specific LLMs has been resource-intensive, requiring a large dataset and significant computational power for continual pretraining and instruction tuning. Our study proposes a simpler approach that combines domain-specific continual pretraining with model merging. Given that general-purpose pretrained LLMs and their instruction-tuned LLMs are often publicly available, they can be leveraged to obtain the necessary instruction task vector. By merging this with a domain-specific pretrained vector, we can effectively create instruction-tuned LLMs for finance without additional instruction data. Our process involves two steps: first, we perform continual pretraining on financial data; second, we merge the instruction-tuned vector with the domain-specific pretrained vector. Our experiments demonstrate the successful construction of instruction-tuned LLMs for finance. One major advantage of our method is that the instruction-tuned and domain-specific pretrained vectors are nearly independent. This independence makes our approach highly effective. The Japanese financial instruction-tuned LLMs we developed in this study are available at https://huggingface.co/pfnet/nekomata-14b-pfn-qfin-inst-merge. ...
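The merging step described in this abstract can be illustrated with a minimal sketch of task-vector arithmetic. It assumes each model is a dictionary of NumPy arrays with identical keys and shapes; the function names, the `scale` parameter, and the toy weights are hypothetical and are not taken from the paper.

```python
import numpy as np

def merge_task_vectors(base, instruct, domain, scale=1.0):
    """Add the instruction task vector (instruct - base) to domain-pretrained weights."""
    merged = {}
    for name, w_base in base.items():
        task_vector = instruct[name] - w_base  # what instruction tuning changed
        merged[name] = domain[name] + scale * task_vector
    return merged

def task_vector_cosine(base, model_a, model_b):
    """Cosine similarity between two task vectors, flattened over all parameters."""
    va = np.concatenate([(model_a[n] - base[n]).ravel() for n in base])
    vb = np.concatenate([(model_b[n] - base[n]).ravel() for n in base])
    return float(va @ vb / (np.linalg.norm(va) * np.linalg.norm(vb)))

# Toy weights standing in for real checkpoints (hypothetical values).
rng = np.random.default_rng(0)
base = {"layer.weight": rng.normal(size=(4, 4)), "layer.bias": rng.normal(size=4)}
instruct = {k: v + 0.01 * rng.normal(size=v.shape) for k, v in base.items()}
domain = {k: v + 0.01 * rng.normal(size=v.shape) for k, v in base.items()}

merged = merge_task_vectors(base, instruct, domain)
similarity = task_vector_cosine(base, instruct, domain)
```

A cosine similarity near zero between the two task vectors would be one simple way to check the near-independence the abstract refers to, although the paper may measure it differently.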

American Call Options Pricing With Modular Neural Networks

ArXiv ID: 2409.19706
Authors: Unknown

Abstract: An accurate valuation of American call options is critical in most financial decision-making environments. However, traditional models like the Barone-Adesi Whaley (B-AW) and Binomial Option Pricing (BOP) methods fall short in handling the complexities of early exercise and market dynamics present in American options. This paper proposes a Modular Neural Network (MNN) model which aims to capture the key aspects of American options pricing. By dividing the prediction process into specialized modules, the MNN effectively models the non-linear interactions that drive American call options pricing. Experimental results indicate that the MNN model outperforms both the traditional models and a simpler Feed-forward Neural Network (FNN) across multiple stocks (AAPL, NVDA, QQQ), with significantly lower RMSE and nRMSE (by mean). These findings highlight the potential of MNNs as a powerful tool to improve the accuracy of predicting option prices. ...
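For context on the binomial baseline named in this abstract, here is a minimal sketch of a Cox-Ross-Rubinstein tree for an American call, together with RMSE and a mean-normalised RMSE. The parameterisation, function names, and example inputs are assumptions made for illustration; the paper's exact baseline configuration is not given in the abstract.

```python
import math

def american_call_binomial(spot, strike, rate, sigma, maturity,
                           steps=200, dividend_yield=0.0):
    """Price an American call on a Cox-Ross-Rubinstein binomial tree."""
    dt = maturity / steps
    up = math.exp(sigma * math.sqrt(dt))
    down = 1.0 / up
    prob_up = (math.exp((rate - dividend_yield) * dt) - down) / (up - down)
    discount = math.exp(-rate * dt)

    # Payoffs at maturity, indexed by the number of up moves j.
    values = [max(spot * up**j * down**(steps - j) - strike, 0.0)
              for j in range(steps + 1)]

    # Step backwards, taking the better of continuing and exercising early.
    for i in range(steps - 1, -1, -1):
        for j in range(i + 1):
            continuation = discount * (prob_up * values[j + 1]
                                       + (1.0 - prob_up) * values[j])
            exercise = spot * up**j * down**(i - j) - strike
            values[j] = max(continuation, exercise)
    return values[0]

def rmse(predicted, observed):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed))
                     / len(observed))

def nrmse_by_mean(predicted, observed):
    """RMSE divided by the mean of the observed values."""
    return rmse(predicted, observed) / (sum(observed) / len(observed))

# Hypothetical inputs: at-the-money call, one year to expiry, no dividends.
price = american_call_binomial(spot=100.0, strike=100.0, rate=0.05,
                               sigma=0.2, maturity=1.0)
```

Without dividends, early exercise of a call is never optimal, so the tree price should be close to the Black-Scholes European value for the same inputs.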

Signal inference in financial stock return correlations through phase-ordering kinetics in the quenched regime

ArXiv ID: 2409.19711
Authors: Unknown

Abstract: Financial stock return correlations have been analyzed through the lens of random matrix theory to differentiate the underlying signal from spurious correlations. The continuous spectrum of the eigenvalue distribution derived from the stock return correlation matrix typically aligns with a rescaled Marchenko-Pastur distribution, indicating no detectable signal. In this study, we introduce a stochastic field theory model to establish a detection threshold for signals present in the limit where the eigenvalues are within the continuous spectrum, which itself closely resembles that of a random matrix where standard methods such as principal component analysis fail to infer a signal. We then apply our method to Standard & Poor’s 500 financial stocks’ return correlations, detecting the presence of a signal in the largest eigenvalues within the continuous spectrum. ...
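The standard random-matrix baseline that this abstract builds on can be sketched as follows: compute the Marchenko-Pastur edges for a correlation matrix and flag eigenvalues above the upper edge. This is the conventional filter, not the paper's field-theory method, and the simulated one-factor returns are hypothetical.

```python
import numpy as np

def marchenko_pastur_edges(n_assets, n_obs, sigma2=1.0):
    """Lower and upper edges of the Marchenko-Pastur bulk for q = N / T."""
    q = n_assets / n_obs
    return sigma2 * (1 - np.sqrt(q)) ** 2, sigma2 * (1 + np.sqrt(q)) ** 2

def eigenvalues_above_edge(returns):
    """Eigenvalues of the correlation matrix and those above the upper edge."""
    n_obs, n_assets = returns.shape
    corr = np.corrcoef(returns, rowvar=False)
    eigvals = np.linalg.eigvalsh(corr)
    _, upper = marchenko_pastur_edges(n_assets, n_obs)
    return eigvals, eigvals[eigvals > upper]

# Simulated returns: independent noise plus one common "market" factor.
rng = np.random.default_rng(1)
n_obs, n_assets = 1000, 50
market = rng.normal(size=(n_obs, 1))
returns = 0.5 * market + rng.normal(size=(n_obs, n_assets))

eigvals, outliers = eigenvalues_above_edge(returns)
lower_edge, upper_edge = marchenko_pastur_edges(n_assets, n_obs)
```

A strong common factor produces an eigenvalue far outside the bulk, which this filter catches easily. The abstract concerns the harder case where the signal sits inside the continuous spectrum and this kind of threshold finds nothing.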

Stock Price Prediction and Traditional Models: An Approach to Achieve Short-, Medium- and Long-Term Goals

ArXiv ID: 2410.07220
Authors: Unknown

Abstract: A comparative analysis of deep learning models and traditional statistical methods for stock price prediction uses data from the Nigerian stock exchange. Historical data, including daily prices and trading volumes, are employed to implement models such as Long Short-Term Memory (LSTM) networks, Gated Recurrent Units (GRUs), Autoregressive Integrated Moving Average (ARIMA), and Autoregressive Moving Average (ARMA). These models are assessed over three time horizons: short-term (1 year), medium-term (2.5 years), and long-term (5 years), with performance measured by Mean Squared Error (MSE) and Mean Absolute Error (MAE). The stability of the time series is tested using the Augmented Dickey-Fuller (ADF) test. Results reveal that deep learning models, particularly LSTM, outperform traditional methods by capturing complex, nonlinear patterns in the data, resulting in more accurate predictions. However, these models require greater computational resources and offer less interpretability than traditional approaches. The findings highlight the potential of deep learning for improving financial forecasting and investment strategies. Future research could incorporate external factors such as social media sentiment and economic indicators, refine model architectures, and explore real-time applications to enhance prediction accuracy and scalability. ...

Evaluating Financial Relational Graphs: Interpretation Before Prediction

ArXiv ID: 2410.07216
Authors: Unknown

Abstract: Accurate and robust stock trend forecasting has been a crucial and challenging task, as stock price changes are influenced by multiple factors. Graph neural network-based methods have recently achieved remarkable success in this domain by constructing stock relationship graphs that reflect internal factors and relationships between stocks. However, most of these methods rely on predefined factors to construct static stock relationship graphs due to the lack of suitable datasets, failing to capture the dynamic changes in stock relationships. Moreover, the evaluation of relationship graphs in these methods is often tied to the performance of neural network models on downstream tasks, leading to confusion and imprecision. To address these issues, we introduce the SPNews dataset, collected based on S&P 500 Index stocks, to facilitate the construction of dynamic relationship graphs. Furthermore, we propose a novel set of financial relationship graph evaluation methods that are independent of downstream tasks. By using the relationship graph to explain historical financial phenomena, we assess its validity before constructing a graph neural network, ensuring the graph’s effectiveness in capturing relevant financial relationships. Experimental results demonstrate that our evaluation methods can effectively differentiate between various financial relationship graphs, yielding more interpretable results compared to traditional approaches. We make our source code publicly available on GitHub to promote reproducibility and further research in this area. ...

Multi-Factor Polynomial Diffusion Models and Inter-Temporal Futures Dynamics

ArXiv ID: 2409.19386
Authors: Unknown

Abstract: In stochastic multi-factor commodity models, it is often the case that futures prices are explained by two latent state variables which represent the short and long term stochastic factors. In this work, we develop a family of stochastic models using polynomial diffusion to obtain the unobservable spot price to be used for modelling futures curve dynamics. The polynomial family of diffusion models allows one to incorporate a variety of non-linear, higher-order effects into a multi-factor stochastic model, which is a generalisation of the Schwartz and Smith (2000) two-factor model. Two filtering methods are used for the parameter and the latent factor estimation to address the non-linearity. We provide a comparative analysis of the performance of the estimation procedures. We discuss the parameter identification problem present in the polynomial diffusion case; regardless, the futures prices can still be estimated accurately. Moreover, we study the effects of different methods of calculating the matrix exponential in the polynomial diffusion model. As the polynomial order increases, accurately and efficiently approximating the high-dimensional matrix exponential becomes essential in the polynomial diffusion model. ...
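To make the role of the matrix exponential concrete, the sketch below computes conditional moments of a one-dimensional Ornstein-Uhlenbeck process from the matrix representation of its generator on the basis (1, x, x^2). The process, parameter values, and both exponential routines are illustrative assumptions; the paper's multi-factor model and its choice of numerical methods are not reproduced here.

```python
import numpy as np

def expm_taylor(a, terms=20):
    """Truncated Taylor series I + A + A^2/2! + ... + A^terms/terms!."""
    result = np.eye(a.shape[0])
    term = np.eye(a.shape[0])
    for k in range(1, terms + 1):
        term = term @ a / k
        result = result + term
    return result

def expm_scaling_squaring(a, terms=12):
    """Scale A down by a power of two, apply the Taylor series, then square back."""
    norm = np.linalg.norm(a, 1)
    squarings = max(0, int(np.ceil(np.log2(norm))) + 1) if norm > 0 else 0
    result = expm_taylor(a / 2**squarings, terms)
    for _ in range(squarings):
        result = result @ result
    return result

# Hypothetical OU process dX = kappa * (theta - X) dt + sigma dW.
kappa, theta, sigma, horizon, x0 = 1.5, 0.2, 0.3, 2.0, 1.0

# Column k holds the coordinates of (generator applied to x^k) in (1, x, x^2).
G = np.array([[0.0, kappa * theta, sigma**2],
              [0.0, -kappa, 2.0 * kappa * theta],
              [0.0, 0.0, -2.0 * kappa]])

# Conditional moments E[1], E[X_T], E[X_T^2] given X_0 = x0.
basis_at_x0 = np.array([1.0, x0, x0**2])
moments = basis_at_x0 @ expm_scaling_squaring(horizon * G)

# Closed-form OU mean and variance for comparison.
mean_closed_form = theta + (x0 - theta) * np.exp(-kappa * horizon)
var_closed_form = sigma**2 / (2.0 * kappa) * (1.0 - np.exp(-2.0 * kappa * horizon))
```

The generator matrix grows with the polynomial order and the number of factors, and a plain truncated series can lose accuracy when the matrix norm is large, which is one reason the choice of method matters.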

Optimizing Time Series Forecasting: A Comparative Study of Adam and Nesterov Accelerated Gradient on LSTM and GRU networks Using Stock Market data

ArXiv ID: 2410.01843
Authors: Unknown

Abstract: Several studies have discussed the impact of different optimization techniques in the context of time series forecasting across different neural network architectures. This paper examines the effectiveness of Adam and Nesterov’s Accelerated Gradient (NAG) optimization techniques on LSTM and GRU neural networks for time series prediction, specifically stock market time series. Our study was done by training LSTM and GRU models with two different optimization techniques, Adam and Nesterov Accelerated Gradient (NAG), and comparing and evaluating their performance on Apple Inc.’s closing price data over the last decade. The GRU model optimized with Adam produced the lowest RMSE, outperforming the other model-optimizer combinations in both accuracy and convergence speed. The GRU models with both optimizers outperformed the LSTM models, whilst the Adam optimizer outperformed the NAG optimizer for both model architectures. The results suggest that GRU models optimized with Adam are well-suited for practitioners in time-series prediction, more specifically stock price time series prediction, producing accurate and computationally efficient models. The code for the experiments in this project can be found at https://github.com/AhmadMak/Time-Series-Optimization-Research

Keywords: Time-series Forecasting, Neural Network, LSTM, GRU, Adam Optimizer, Nesterov Accelerated Gradient (NAG) Optimizer ...
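The two update rules compared in this abstract can be written out on a toy problem. The sketch below applies the textbook forms of Nesterov accelerated gradient and Adam to a small quadratic loss; the loss, starting point, and hyperparameters are hypothetical and unrelated to the paper's LSTM and GRU experiments.

```python
import numpy as np

A = np.diag([1.0, 10.0])  # toy ill-conditioned quadratic

def loss(theta):
    return 0.5 * theta @ A @ theta

def grad(theta):
    return A @ theta

def run_nag(theta0, lr=0.05, momentum=0.9, steps=200):
    """Nesterov accelerated gradient: evaluate the gradient at a look-ahead point."""
    theta = theta0.copy()
    velocity = np.zeros_like(theta)
    for _ in range(steps):
        lookahead = theta + momentum * velocity
        velocity = momentum * velocity - lr * grad(lookahead)
        theta = theta + velocity
    return theta

def run_adam(theta0, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=500):
    """Adam: bias-corrected first and second moment estimates scale each step."""
    theta = theta0.copy()
    m = np.zeros_like(theta)
    v = np.zeros_like(theta)
    for t in range(1, steps + 1):
        g = grad(theta)
        m = beta1 * m + (1.0 - beta1) * g
        v = beta2 * v + (1.0 - beta2) * g**2
        m_hat = m / (1.0 - beta1**t)
        v_hat = v / (1.0 - beta2**t)
        theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta

start = np.array([5.0, 5.0])
initial_loss = loss(start)
nag_loss = loss(run_nag(start))
adam_loss = loss(run_adam(start))
```

Results on a convex toy problem say little about behaviour on recurrent networks, so this only shows how the updates differ mechanically.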

PDSim: A Shiny App for Simulating and Estimating Polynomial Diffusion Models in Commodity Futures

ArXiv ID: 2409.19385
Authors: Unknown

Abstract: PDSim is an R package that enables users to simulate commodity futures prices using the polynomial diffusion model introduced in Filipovic & Larsson (2016) through both a Shiny web application and R scripts. For user-supplied data, a standalone R routine has been developed to provide joint estimation of state variables and model parameters via the Extended Kalman Filter (EKF) or Unscented Kalman Filter (UKF). With its user-friendly interface, PDSim makes the features of simulations and estimations accessible. To date, it is the only package specifically designed for the simulation and estimation of the polynomial diffusion model. The Schwartz-Smith two-factor model (Schwartz & Smith, 2000) is also available within this package for both simulation and calibration. The package is validated through several tests, including replication of the results in Schwartz & Smith (2000), unit testing of the coverage rate, and verification of the outputs of the main functions. ...

Modern Portfolio Diversification with Arte-Blue Chip Index

ArXiv ID: 2409.18816
Authors: Unknown

Abstract: This paper presents a novel approach to evaluating blue-chip art as a viable asset class for portfolio diversification. We present the Arte-Blue Chip Index, an index that tracks 100 top-performing artists based on 81,891 public transactions from 157 artists across 584 auction houses over the period 1990 to 2024. By comparing blue-chip art price trends with stock market fluctuations, our index provides insights into the risk and return profile of blue-chip art investments. Our analysis demonstrates that a 20% allocation of blue-chip art in a diversified portfolio enhances risk-adjusted returns by around 20%, while maintaining volatility levels similar to the S&P 500. ...
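The kind of comparison behind an allocation claim like this one can be sketched with a two-asset Sharpe ratio calculation. Every number below (expected returns, volatilities, correlation, risk-free rate) is a hypothetical placeholder chosen for illustration, not a figure from the paper, and the Sharpe ratio is only one possible measure of risk-adjusted return.

```python
import numpy as np

def portfolio_stats(weights, mean_returns, vols, corr, risk_free=0.0):
    """Expected return, volatility, and Sharpe ratio of a weighted portfolio."""
    weights = np.asarray(weights, dtype=float)
    vols = np.asarray(vols, dtype=float)
    cov = np.outer(vols, vols) * np.asarray(corr, dtype=float)
    exp_return = float(weights @ np.asarray(mean_returns, dtype=float))
    volatility = float(np.sqrt(weights @ cov @ weights))
    sharpe = (exp_return - risk_free) / volatility
    return exp_return, volatility, sharpe

# Hypothetical annual inputs: asset 0 is an equity index, asset 1 an art index.
mean_returns = [0.08, 0.06]
vols = [0.16, 0.12]
corr = [[1.0, 0.1],
        [0.1, 1.0]]

stock_only = portfolio_stats([1.0, 0.0], mean_returns, vols, corr, risk_free=0.02)
mixed = portfolio_stats([0.8, 0.2], mean_returns, vols, corr, risk_free=0.02)
```

With a low correlation between the two assets, the mixed portfolio can show a higher Sharpe ratio than the equity-only portfolio even when the second asset has a lower expected return.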

Volatility Forecasting in Global Financial Markets Using TimeMixer

ArXiv ID: 2410.09062
Authors: Unknown

Abstract: Predicting volatility in financial markets, including stocks, index ETFs, foreign exchange, and cryptocurrencies, remains a challenging task due to the inherent complexity and non-linear dynamics of these time series. In this study, I apply TimeMixer, a state-of-the-art time series forecasting model, to predict the volatility of global financial assets. TimeMixer utilizes a multiscale-mixing approach that effectively captures both short-term and long-term temporal patterns by analyzing data across different scales. My empirical results reveal that while TimeMixer performs exceptionally well in short-term volatility forecasting, its accuracy diminishes for longer-term predictions, particularly in highly volatile markets. These findings highlight TimeMixer’s strength in capturing short-term volatility, making it highly suitable for practical applications in financial risk management, where precise short-term forecasts are critical. However, the model’s limitations in long-term forecasting point to potential areas for further refinement. ...