Equities

Anomalous diffusion and price impact in the fluid-limit of an order book

ArXiv ID: 2310.06079. Authors: Unknown. Abstract: We extend a Discrete Time Random Walk (DTRW) numerical scheme to simulate the anomalous diffusion of financial market orders in a simulated order book. Here we use random walks with Sibuya waiting times to include a time-dependent stochastic forcing function with non-uniformly sampled times between order book events in the setting of fractional diffusion. This models the fluid limit of an order book by modelling the continuous arrival, cancellation and diffusion of orders in the presence of information shocks. We study the impulse response and stylised facts of orders undergoing anomalous diffusion for different forcing functions and model parameters. Concretely, we demonstrate the price impact for flash limit-orders and market orders and show how the numerical method generates kinks in the price impact. We use cubic spline interpolation to generate smoothed price impact curves. The work promotes the use of non-uniform sampling in the presence of diffusive dynamics as the preferred simulation method. ...
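To make the idea of Sibuya waiting times concrete, here is a minimal sketch of how non-uniformly spaced event times could be drawn. It is not the paper's DTRW scheme: the function names, the `max_wait` truncation, and the choice of `alpha` are assumptions made for illustration. It relies only on the standard property that a Sibuya(alpha) waiting time ends at step n with probability alpha / n, given that it has not ended earlier.

```python
import random


def sample_sibuya(alpha, rng, max_wait=10_000):
    """Draw one Sibuya(alpha) waiting time (a positive integer).

    At step n the wait ends with probability alpha / n, otherwise it
    continues. `max_wait` is a hypothetical truncation that only bounds
    the loop; it is not part of the distribution itself.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    n = 1
    while n < max_wait and rng.random() >= alpha / n:
        n += 1
    return n


def event_times(alpha, n_events, seed=0):
    """Cumulative event times built from independent Sibuya waits."""
    rng = random.Random(seed)
    t, times = 0, []
    for _ in range(n_events):
        t += sample_sibuya(alpha, rng)
        times.append(t)
    return times


if __name__ == "__main__":
    # Heavy-tailed gaps: most waits are short, a few are very long.
    print(event_times(alpha=0.8, n_events=10))
```

Smaller values of `alpha` give heavier-tailed gaps between events, which is the mechanism that produces sub-diffusive behaviour in random walks of this type.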

Dual-Class Stocks: Can They Serve as Effective Predictors?

ArXiv ID: 2310.16845. Authors: Unknown. Abstract: Kardemir Karabuk Iron Steel Industry Trade & Co. Inc., ranked as the 24th largest industrial company in Turkey, offers three distinct stocks listed on the Borsa Istanbul: KRDMA, KRDMB, and KRDMD. These stocks, which differ only in voting power, have exhibited significant price divergence over an extended period. This paper conducts an in-depth analysis of the divergence patterns observed in these three stock prices from January 2001 to July 2023. Additionally, it introduces an innovative training set selection rule tailored for LSTM models, incorporating a rolling training set, and demonstrates its significant predictive superiority over the conventional use of LSTM models with large training sets. Despite their strong correlation, the study found no compelling evidence supporting the efficiency of dual-class stocks as predictors of each other’s performance. ...

Integrating Stock Features and Global Information via Large Language Models for Enhanced Stock Return Prediction

ArXiv ID: 2310.05627. Authors: Unknown. Abstract: The remarkable achievements and rapid advancements of Large Language Models (LLMs) such as ChatGPT and GPT-4 have showcased their immense potential in quantitative investment. Traders can effectively leverage these LLMs to analyze financial news and predict stock returns accurately. However, integrating LLMs into existing quantitative models presents two primary challenges: the insufficient utilization of semantic information embedded within LLMs and the difficulties in aligning the latent information within LLMs with pre-existing quantitative stock features. We propose a novel framework consisting of two components to surmount these challenges. The first component, the Local-Global (LG) model, introduces three distinct strategies for modeling global information. These approaches are grounded respectively on stock features, the capabilities of LLMs, and a hybrid method combining the two paradigms. The second component, Self-Correlated Reinforcement Learning (SCRL), focuses on aligning the embeddings of financial news generated by LLMs with stock features within the same semantic space. By implementing our framework, we have demonstrated superior performance in Rank Information Coefficient and returns, particularly compared to models relying only on stock features in the China A-share market. ...
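The Rank Information Coefficient mentioned in the abstract is commonly computed as the Spearman rank correlation between predicted and realized returns across stocks on a given date. The sketch below shows that calculation in plain Python. The function names and sample numbers are illustrative assumptions, and the paper's exact evaluation protocol (universe, horizon, averaging over dates) is not specified here.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with ties sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_ic(predicted, realized):
    """Spearman correlation between predicted and realized returns."""
    if len(predicted) != len(realized) or len(predicted) < 2:
        raise ValueError("need two equal-length sequences with >= 2 items")
    rp, rr = average_ranks(predicted), average_ranks(realized)
    n = len(rp)
    mp, mr = sum(rp) / n, sum(rr) / n
    cov = sum((a - mp) * (b - mr) for a, b in zip(rp, rr))
    var_p = sum((a - mp) ** 2 for a in rp)
    var_r = sum((b - mr) ** 2 for b in rr)
    if var_p == 0.0 or var_r == 0.0:
        return float("nan")  # undefined when one side is constant
    return cov / math.sqrt(var_p * var_r)


if __name__ == "__main__":
    # Hypothetical one-day cross-section of four stocks.
    print(rank_ic([0.1, 0.3, 0.2, 0.5], [0.01, 0.04, 0.02, 0.09]))
```

Because only the ordering matters, the measure is insensitive to the scale of the model's scores, which is one reason it is popular for evaluating cross-sectional stock rankings.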

FinGPT: Instruction Tuning Benchmark for Open-Source Large Language Models in Financial Datasets

ArXiv ID: 2310.04793. Authors: Unknown. Abstract: In the swiftly expanding domain of Natural Language Processing (NLP), the potential of GPT-based models for the financial sector is increasingly evident. However, the integration of these models with financial datasets presents challenges, notably in determining their adeptness and relevance. This paper introduces a distinctive approach anchored in the Instruction Tuning paradigm for open-source large language models, specifically adapted for financial contexts. Through this methodology, we capitalize on the interoperability of open-source models, ensuring a seamless and transparent integration. We begin by explaining the Instruction Tuning paradigm, highlighting its effectiveness for immediate integration. The paper presents a benchmarking scheme designed for end-to-end training and testing, employing a cost-effective progression. Firstly, we assess basic competencies and fundamental tasks, such as Named Entity Recognition (NER) and sentiment analysis to enhance specialization. Next, we delve into a comprehensive model, executing multi-task operations by amalgamating all instructional tunings to examine versatility. Finally, we explore the zero-shot capabilities by earmarking unseen tasks and incorporating novel datasets to understand adaptability in uncharted terrains. Such a paradigm fortifies the principles of openness and reproducibility, laying a robust foundation for future investigations in open-source financial large language models (FinLLMs). ...

Enhancing Financial Sentiment Analysis via Retrieval Augmented Large Language Models

ArXiv ID: 2310.04027. Authors: Unknown. Abstract: Financial sentiment analysis is critical for valuation and investment decision-making. Traditional NLP models, however, are limited by their parameter size and the scope of their training datasets, which hampers their generalization capabilities and effectiveness in this field. Recently, Large Language Models (LLMs) pre-trained on extensive corpora have demonstrated superior performance across various NLP tasks due to their commendable zero-shot abilities. Yet, directly applying LLMs to financial sentiment analysis presents challenges: the discrepancy between the pre-training objective of LLMs and predicting the sentiment label can compromise their predictive performance. Furthermore, the succinct nature of financial news, often devoid of sufficient context, can significantly diminish the reliability of LLMs’ sentiment analysis. To address these challenges, we introduce a retrieval-augmented LLM framework for financial sentiment analysis. This framework includes an instruction-tuned LLM module, which ensures LLMs behave as predictors of sentiment labels, and a retrieval-augmentation module which retrieves additional context from reliable external sources. Benchmarked against traditional models and LLMs like ChatGPT and LLaMA, our approach achieves 15% to 48% performance gain in accuracy and F1 score. ...

Estimation of market efficiency process within time-varying autoregressive models by extended Kalman filtering approach

ArXiv ID: 2310.04125. Authors: Unknown. Abstract: This paper explores a time-varying version of weak-form market efficiency that is a key component of the so-called Adaptive Market Hypothesis (AMH). One of the most common methodologies used for modeling and estimating a degree of market efficiency lies in an analysis of the serial autocorrelation in observed return series. Under the AMH, a time-varying market efficiency level is modeled by a time-varying autoregressive (AR) process and traditionally estimated by the Kalman filter (KF). Being a linear estimator, the KF is hardly capable of tracking the hidden nonlinear dynamics that are an essential feature of the models under investigation. The contribution of this paper is threefold. We first provide a brief overview of time-varying AR models and estimation methods utilized for testing weak-form market efficiency in the econometrics literature. Secondly, we propose a novel, accurate estimation approach for recovering the hidden process of the evolving market efficiency level by the extended Kalman filter (EKF). Thirdly, our empirical study concerns an examination of the Standard and Poor’s 500 Composite stock index and the Dow Jones Industrial Average index. Monthly data covers the period from November 1927 to June 2020, which includes the U.S. Great Depression, the 2008-2009 global financial crisis and the first wave of the recent COVID-19 recession. The results reveal that the U.S. market was affected during all these periods, but generally remained weak-form efficient since mid-1946 as detected by the estimator. ...
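As background for the traditional approach the abstract refers to, the sketch below runs a scalar linear Kalman filter on a time-varying AR(1) model, r_t = beta_t * r_{t-1} + eps_t, with the coefficient beta_t following a random walk. This is the baseline KF, not the paper's EKF. The random-walk state equation, the noise variances `q` and `r`, and all function names are assumptions chosen for illustration.

```python
import random


def kalman_tvar1(returns, q=1e-4, r=1.0, beta0=0.0, p0=1.0):
    """Filtered path of the AR(1) coefficient beta_t.

    State:       beta_t = beta_{t-1} + eta_t,        Var(eta) = q
    Observation: r_t    = beta_t * r_{t-1} + eps_t,  Var(eps) = r
    """
    beta, p = beta0, p0
    path = []
    for prev, curr in zip(returns[:-1], returns[1:]):
        p = p + q                        # predict step (random-walk state)
        innovation = curr - beta * prev  # one-step-ahead forecast error
        s = prev * prev * p + r          # innovation variance
        gain = p * prev / s              # Kalman gain
        beta = beta + gain * innovation  # update the state estimate
        p = (1.0 - gain * prev) * p      # update the state variance
        path.append(beta)
    return path


def simulate_ar1(beta, n, seed=0):
    """Hypothetical test data: AR(1) with constant beta and unit noise."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for _ in range(n):
        x = beta * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out


if __name__ == "__main__":
    series = simulate_ar1(beta=0.5, n=3000)
    print(kalman_tvar1(series, q=1e-6)[-1])  # should settle near 0.5
```

Under weak-form efficiency the filtered coefficient stays close to zero, so persistent departures from zero in `path` are the kind of signal such models look for.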

Multi-Industry Simplex: A Probabilistic Extension of GICS

ArXiv ID: 2310.04280. Authors: Unknown. Abstract: Accurate industry classification is a critical tool for many asset management applications. While the current industry gold-standard GICS (Global Industry Classification Standard) has proven to be reliable and robust in many settings, it has limitations that cannot be ignored. Fundamentally, GICS is a single-industry model, in which every firm is assigned to exactly one group, regardless of how diversified that firm may be. This approach breaks down for large conglomerates like Amazon, which have risk exposure spread out across multiple sectors. We attempt to overcome these limitations by developing MIS (Multi-Industry Simplex), a probabilistic model that can flexibly assign a firm to as many industries as can be supported by the data. In particular, we utilize topic modeling, a natural language processing approach that utilizes business descriptions to extract and identify corresponding industries. Each identified industry comes with a relevance probability, allowing for high interpretability and easy auditing, circumventing the black-box nature of alternative machine learning approaches. We describe this model in detail and provide two use-cases that are relevant to asset management: thematic portfolios and nearest neighbor identification. While our approach has limitations of its own, we demonstrate the viability of probabilistic industry classification and hope to inspire future research in this field. ...

Navigating Uncertainty in ESG Investing

ArXiv ID: 2310.02163. Authors: Unknown. Abstract: The widespread confusion among investors regarding Environmental, Social, and Governance (ESG) rankings assigned by rating agencies has underscored a critical issue in sustainable investing. To address this uncertainty, our research has devised methods that not only recognize this ambiguity but also offer tailored investment strategies for different investor profiles. By developing ESG ensemble strategies and integrating ESG scores into a Reinforcement Learning (RL) model, we aim to optimize portfolios that cater to both financial returns and ESG-focused outcomes. Additionally, by proposing the Double-Mean-Variance model, we classify three types of investors based on their risk preferences. We also introduce ESG-adjusted Capital Asset Pricing Models (CAPMs) to assess the performance of these optimized portfolios. Ultimately, our comprehensive approach provides investors with tools to navigate the inherent ambiguities of ESG ratings, facilitating more informed investment decisions. ...

Robust Long-Term Growth Rate of Expected Utility for Leveraged ETFs

ArXiv ID: 2310.02084. Authors: Unknown. Abstract: This paper analyzes the robust long-term growth rate of expected utility and expected return from holding a leveraged exchange-traded fund (LETF). When the Markovian model parameters in the reference asset are uncertain, the robust long-term growth rate is derived by analyzing the worst-case parameters among an uncertainty set. We compute the growth rate and describe the optimal leverage ratio maximizing the robust long-term growth rate. To achieve this, the worst-case parameters are analyzed by the comparison principle, and the growth rate of the worst-case is computed using the martingale extraction method. The robust long-term growth rates are obtained explicitly under a number of models for the reference asset, including the geometric Brownian motion (GBM), Cox–Ingersoll–Ross (CIR), 3/2, and Heston and 3/2 stochastic volatility models. Additionally, we demonstrate the impact of stochastic interest rates, such as the Vasicek and inverse GARCH short rate models. This paper extends the work of \citet{Leung2017}. ...
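The worst-case idea can be illustrated in the simplest setting: a GBM reference asset with drift mu and volatility sigma, a constant rate r, no fees, and log growth in place of the paper's expected-utility criterion. Under those assumptions a constant-leverage fund with ratio beta grows at r + beta (mu - r) - beta^2 sigma^2 / 2. The sketch below takes the minimum over a box of (mu, sigma) values and searches a grid of leverage ratios. The parameter ranges, the grid, and the function names are all hypothetical, and this is not the paper's derivation.

```python
def log_growth(beta, mu, sigma, r):
    """Long-run log growth of a constant-leverage fund under GBM, no fees."""
    return r + beta * (mu - r) - 0.5 * beta ** 2 * sigma ** 2


def robust_growth(beta, mu_range, sigma_range, r):
    """Worst-case growth over a box of parameters.

    The growth rate is linear in mu and decreasing in sigma for sigma > 0,
    so the minimum over the box is attained at one of its corners.
    """
    return min(
        log_growth(beta, mu, sigma, r)
        for mu in mu_range
        for sigma in sigma_range
    )


def best_leverage(mu_range, sigma_range, r, betas):
    """Grid search for the leverage ratio with the best worst-case growth."""
    return max(betas, key=lambda b: robust_growth(b, mu_range, sigma_range, r))


if __name__ == "__main__":
    mu_range = (0.06, 0.10)      # hypothetical drift bounds
    sigma_range = (0.15, 0.25)   # hypothetical volatility bounds
    grid = [i / 100 for i in range(-300, 301)]
    beta_star = best_leverage(mu_range, sigma_range, 0.02, grid)
    print(beta_star, robust_growth(beta_star, mu_range, sigma_range, 0.02))
```

For positive leverage the binding corner is the lowest drift and the highest volatility, so in this toy case the robust choice coincides with (mu_low - r) / sigma_high^2.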

Signature Methods in Stochastic Portfolio Theory

ArXiv ID: 2310.02322. Authors: Unknown. Abstract: In the context of stochastic portfolio theory we introduce a novel class of portfolios which we call linear path-functional portfolios. These are portfolios which are determined by certain transformations of linear functions of a collection of feature maps that are non-anticipative path functionals of an underlying semimartingale. As the main example for such feature maps we consider the signature of the (ranked) market weights. We prove that these portfolios are universal in the sense that every continuous, possibly path-dependent, portfolio function of the market weights can be uniformly approximated by signature portfolios. We also show that signature portfolios can approximate the growth-optimal portfolio in several classes of non-Markovian market models arbitrarily well and illustrate numerically that the trained signature portfolios are remarkably close to the theoretical growth-optimal portfolios. Besides these universality features, the main numerical advantage lies in the fact that several optimization tasks like maximizing (expected) logarithmic wealth or mean-variance optimization within the class of linear path-functional portfolios reduce to a convex quadratic optimization problem, thus making it computationally highly tractable. We apply our method also to real market data based on several indices. Our results point towards out-performance on the considered out-of-sample data, also in the presence of transaction costs. ...
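For readers unfamiliar with signatures, the sketch below computes the first two levels of the signature of a piecewise-linear path: the total increments and the second-order iterated integrals. It illustrates only the feature map, not the paper's portfolio construction or its use of ranked market weights. The function name and the sample path are assumptions for illustration.

```python
def signature_level2(path):
    """Levels 1 and 2 of the signature of a piecewise-linear path.

    `path` is a sequence of points, each a sequence of d coordinates.
    level1[i]    = total increment of coordinate i
    level2[i][j] = iterated integral of (X^i - X^i_0) dX^j along the path
    """
    d = len(path[0])
    level1 = [0.0] * d
    level2 = [[0.0] * d for _ in range(d)]
    for start, end in zip(path[:-1], path[1:]):
        delta = [e - s for s, e in zip(start, end)]
        for i in range(d):
            for j in range(d):
                # On a linear segment the integral splits into the part
                # accumulated before the segment and the part inside it.
                level2[i][j] += level1[i] * delta[j] + 0.5 * delta[i] * delta[j]
        for i in range(d):
            level1[i] += delta[i]
    return level1, level2


if __name__ == "__main__":
    # Hypothetical 2-d path: move right, then up.
    s1, s2 = signature_level2([(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)])
    print(s1, s2)
```

The off-diagonal terms depend on the order in which the coordinates move, which is how signature features capture path dependence that the end-point increments alone cannot.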