
Refined and Segmented Price Sentiment Indices from Survey Comments

ArXiv ID: 2411.09937 · Authors: Unknown

Abstract: We aim to enhance a price sentiment index and to more precisely understand price trends from the perspective of not only consumers but also businesses. We extract comments related to prices from the Economy Watchers Survey conducted by the Cabinet Office of Japan and classify price trends using a large language model (LLM). We classify whether the survey sample reflects the perspective of consumers or businesses, and whether the comments pertain to goods or services, by utilizing information on the fields of comments and the industries of respondents included in the Economy Watchers Survey. From these classified price-related comments, we construct price sentiment indices not only for a general purpose but also for more specific objectives by combining perspectives on consumers and prices, as well as goods and services. Employing an LLM for classification makes it possible to classify price directions more accurately. Furthermore, integrating the outputs of multiple LLMs suggests the potential for better classification performance. The use of more accurately classified comments allows for the construction of an index with a higher correlation to existing indices than in previous studies. We demonstrate that the correlation of the price index for consumers, which has a larger sample size, is further enhanced by selecting comments for aggregation based on the industry of the survey respondents. ...

November 15, 2024 · 2 min · Research Team
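The abstract above describes building a price sentiment index from comments that have already been classified by price direction, but it does not state the aggregation formula. As a minimal sketch only, the snippet below assumes a net-balance (diffusion-style) index per period: the share of "up" comments minus the share of "down" comments. The function name `net_balance_index`, the three labels, and the sample records are all hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


def net_balance_index(records):
    """Aggregate classified comments into one index value per period.

    records: iterable of (period, label) pairs, where label is one of
    "up", "flat", or "down" (hypothetical labels for price direction).
    Returns a dict mapping period -> (n_up - n_down) / n_total.
    """
    counts = defaultdict(Counter)
    for period, label in records:
        counts[period][label] += 1

    index = {}
    for period, label_counts in counts.items():
        total = sum(label_counts.values())
        # Counter returns 0 for labels that never occurred in this period.
        index[period] = (label_counts["up"] - label_counts["down"]) / total
    return index


# Toy data, invented purely for illustration.
sample = [
    ("2024-01", "up"),
    ("2024-01", "up"),
    ("2024-01", "down"),
    ("2024-01", "flat"),
    ("2024-02", "down"),
    ("2024-02", "flat"),
]
index = net_balance_index(sample)
```

Segmented indices of the kind the abstract mentions (consumer versus business, goods versus services) could be obtained under the same assumption by filtering `records` before aggregation.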

Analyst Reports and Stock Performance: Evidence from the Chinese Market

ArXiv ID: 2411.08726 · Authors: Unknown

Abstract: This article applies natural language processing (NLP) to extract and quantify textual information to predict stock performance. Using an extensive dataset of Chinese analyst reports and employing a customized BERT deep learning model for Chinese text, this study categorizes the sentiment of the reports as positive, neutral, or negative. The findings underscore the predictive capacity of this sentiment indicator for stock volatility, excess returns, and trading volume. Specifically, analyst reports with strong positive sentiment increase excess return and intraday volatility; conversely, reports with strong negative sentiment also increase volatility and trading volume but decrease future excess return. The magnitude of this effect is greater for positive sentiment reports than for negative sentiment reports. This article contributes to the empirical literature on sentiment analysis and the response of the stock market to news in the Chinese stock market. ...

November 13, 2024 · 2 min · Research Team

Climate AI for Corporate Decarbonization Metrics Extraction

ArXiv ID: 2411.03402 · Authors: Unknown

Abstract: Corporate Greenhouse Gas (GHG) emission targets are important metrics in sustainable investing [12, 16]. To provide a comprehensive view of company emission objectives, we propose an approach to source these metrics from company public disclosures. Without automation, curating these metrics manually is a labor-intensive process that requires combing through lengthy corporate sustainability disclosures that often do not follow a standard format. Furthermore, the resulting dataset needs to be validated thoroughly by Subject Matter Experts (SMEs), further lengthening the time-to-market. We introduce the Climate Artificial Intelligence for Corporate Decarbonization Metrics Extraction (CAI) model and pipeline, a novel approach utilizing Large Language Models (LLMs) to extract and validate linked metrics from corporate disclosures. We demonstrate that the process improves data collection efficiency and accuracy by automating data curation, validation, and metric scoring from public corporate disclosures. We further show that our results are agnostic to the choice of LLMs. This framework can be applied broadly to information extraction from textual data. ...

November 5, 2024 · 2 min · Research Team

Continuous Risk Factor Models: Analyzing Asset Correlations through Energy Distance

ArXiv ID: 2410.23447 · Authors: Unknown

Abstract: This paper introduces a novel approach to financial risk analysis that does not rely on traditional price and market data, instead using market news to model assets as distributions over a metric space of risk factors. By representing asset returns as integrals over the scalar field of these risk factors, we derive the covariance structure between asset returns. Utilizing encoder-only language models to embed this news data, we explore the relationships between asset return distributions through the concept of Energy Distance, establishing connections between distributional differences and excess returns co-movements. This data-agnostic approach provides new insights into portfolio diversification, risk management, and the construction of hedging strategies. Our findings have significant implications for both theoretical finance and practical risk management, offering a more robust framework for modelling complex financial systems without depending on conventional market data. ...

October 30, 2024 · 2 min · Research Team
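The Energy Distance named in the abstract above has a standard sample form: for two samples X and Y, the squared energy distance is 2·E‖X−Y‖ − E‖X−X′‖ − E‖Y−Y′‖, with expectations replaced by averages over all pairs. The sketch below computes that plain estimate on toy vectors standing in for news embeddings. The function names and the sample points are hypothetical, and the paper's own estimator and embedding model are not specified in the abstract.

```python
import math
from itertools import product


def mean_pairwise_distance(a, b):
    """Average Euclidean distance over all pairs (x, y) with x in a, y in b."""
    return sum(math.dist(x, y) for x, y in product(a, b)) / (len(a) * len(b))


def energy_distance_squared(xs, ys):
    """Sample estimate of the squared energy distance between two point sets.

    Uses all pairs, including a point paired with itself, so the result is
    exactly zero when both samples are identical.
    """
    return (
        2 * mean_pairwise_distance(xs, ys)
        - mean_pairwise_distance(xs, xs)
        - mean_pairwise_distance(ys, ys)
    )


# Toy 2-D "embeddings" for two assets, invented for illustration.
asset_a = [(0.0, 0.0), (1.0, 0.0)]
asset_b = [(3.0, 0.0), (4.0, 0.0)]

same = energy_distance_squared(asset_a, asset_a)
shifted = energy_distance_squared(asset_a, asset_b)
```

Under this estimate, identical samples give zero and the value grows as the two point clouds move apart, which is the property that makes the quantity usable as a measure of distributional difference.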

Enhancing literature review with LLM and NLP methods. Algorithmic trading case

ArXiv ID: 2411.05013 · Authors: Unknown

Abstract: This study utilizes machine learning algorithms to analyze and organize knowledge in the field of algorithmic trading. By filtering a dataset of 136 million research papers, we identified 14,342 relevant articles published between 1956 and Q1 2020. We compare traditional practices, such as keyword-based algorithms and embedding techniques, with state-of-the-art topic modeling methods that employ dimensionality reduction and clustering. This comparison allows us to assess the popularity and evolution of different approaches and themes within algorithmic trading. We demonstrate the usefulness of Natural Language Processing (NLP) in the automatic extraction of knowledge, highlighting the new possibilities created by the latest iterations of Large Language Models (LLMs) like ChatGPT. The rationale for focusing on this topic stems from our analysis, which reveals that research articles on algorithmic trading are increasing at a faster rate than the overall number of publications. While stocks and main indices comprise more than half of all assets considered, certain asset classes, such as cryptocurrencies, exhibit a much stronger growth trend. Machine learning models have become the most popular methods in recent years. The study demonstrates the efficacy of LLMs in refining datasets and addressing intricate questions about the analyzed articles, such as comparing the efficiency of different models. Our research shows that by decomposing tasks into smaller components and incorporating reasoning steps, we can effectively tackle complex questions supported by case analyses. This approach contributes to a deeper understanding of algorithmic trading methodologies and underscores the potential of advanced NLP techniques in literature reviews. ...

October 23, 2024 · 2 min · Research Team

Large Language Model Agent in Financial Trading: A Survey

ArXiv ID: 2408.06361 · Authors: Unknown

Abstract: Trading is a highly competitive task that requires a combination of strategy, knowledge, and psychological fortitude. With the recent success of large language models (LLMs), it is appealing to apply the emerging intelligence of LLM agents in this competitive arena and to understand whether they can outperform professional traders. In this survey, we provide a comprehensive review of the current research on using LLMs as agents in financial trading. We summarize the common architecture used in the agent, the data inputs, and the performance of LLM trading agents in backtesting, as well as the challenges presented in this research. This survey aims to provide insights into the current state of LLM-based financial trading agents and outline future research directions in this field. ...

July 26, 2024 · 2 min · Research Team

AMA-LSTM: Pioneering Robust and Fair Financial Audio Analysis for Stock Volatility Prediction

ArXiv ID: 2407.18324 · Authors: Unknown

Abstract: Stock volatility prediction is an important task in the financial industry. Recent advancements in multimodal methodologies, which integrate both textual and auditory data such as earnings calls, have demonstrated significant improvements in this domain. (Earnings calls are publicly available and often involve the management team of a public company and interested parties discussing the company's earnings.) However, these multimodal methods have two drawbacks. First, they often fail to yield reliable models and overfit the data due to their absorption of stochastic information from the stock market. Moreover, using multimodal models to predict stock volatility suffers from gender bias and lacks an efficient way to eliminate such bias. To address these problems, we use adversarial training to generate perturbations that simulate the inherent stochasticity and bias, creating areas resistant to random information around the input space to improve model robustness and fairness. Our comprehensive experiments on two real-world financial audio datasets reveal that this method exceeds the performance of the current state-of-the-art solution. This confirms the value of adversarial training in reducing stochasticity and bias for stock volatility prediction tasks. ...

July 3, 2024 · 2 min · Research Team

BERT vs GPT for financial engineering

ArXiv ID: 2405.12990 · Authors: Unknown

Abstract: The paper benchmarks several Transformer models [4] to show how these models can judge sentiment from a news event. This signal can then be used for downstream modelling and signal identification for commodity trading. We find that fine-tuned BERT models outperform fine-tuned or vanilla GPT models on this task. Transformer models have revolutionized the field of natural language processing (NLP) in recent years, achieving state-of-the-art results on various tasks such as machine translation, text summarization, question answering, and natural language generation. Among the most prominent transformer models are Bidirectional Encoder Representations from Transformers (BERT) and Generative Pre-trained Transformer (GPT), which differ in their architectures and objectives. An overview of the CopBERT model's training data and process is provided. The CopBERT model outperforms similar domain-specific BERT models such as FinBERT. The confusion matrices below show the performance of CopBERT and CopGPT respectively. We see a ~10 percent increase in f1_score when comparing CopBERT with GPT4 and a 16 percent increase versus CopGPT. Whilst GPT4 is dominant, this highlights the importance of considering alternatives to GPT models for financial engineering tasks, given the risks of hallucinations and challenges with interpretability. We unsurprisingly see the larger LLMs outperform the BERT models in predictive power. In summary, BERT is partially the new XGBoost: what it lacks in predictive power it makes up for with higher levels of interpretability. We conclude that BERT models might not be the next XGBoost [2], but represent an interesting alternative for financial engineering tasks that require a blend of interpretability and accuracy. ...

April 24, 2024 · 2 min · Research Team
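The abstract above compares models by f1_score read off confusion matrices. As a rough illustration of that arithmetic, the sketch below derives per-class F1 and an unweighted (macro) average from a three-class confusion matrix. The matrix values, the class order, and the choice of macro averaging are assumptions made for the example; the abstract does not say which averaging the paper uses, and none of the numbers come from the paper.

```python
def per_class_f1(confusion):
    """Per-class F1 from a square confusion matrix.

    confusion[i][j] is the number of items with true class i that were
    predicted as class j (rows = true labels, columns = predictions).
    """
    n = len(confusion)
    scores = []
    for k in range(n):
        true_positive = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n))  # column sum
        actual_k = sum(confusion[k])                          # row sum
        precision = true_positive / predicted_k if predicted_k else 0.0
        recall = true_positive / actual_k if actual_k else 0.0
        if precision + recall == 0:
            scores.append(0.0)
        else:
            scores.append(2 * precision * recall / (precision + recall))
    return scores


def macro_f1(confusion):
    """Unweighted mean of the per-class F1 scores."""
    scores = per_class_f1(confusion)
    return sum(scores) / len(scores)


# Hypothetical matrix for classes (negative, neutral, positive).
toy_confusion = [
    [8, 1, 1],
    [2, 6, 2],
    [0, 2, 8],
]
scores = per_class_f1(toy_confusion)
average = macro_f1(toy_confusion)
```

Because F1 combines precision and recall, a model can trail on raw accuracy and still score well on a minority class, which is one reason sentiment benchmarks tend to report it per class or averaged.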

BERTopic-Driven Stock Market Predictions: Unraveling Sentiment Insights

ArXiv ID: 2404.02053 · Authors: Unknown

Abstract: This paper explores the intersection of Natural Language Processing (NLP) and financial analysis, focusing on the impact of sentiment analysis in stock price prediction. We employ BERTopic, an advanced NLP technique, to analyze the sentiment of topics derived from stock market comments. Our methodology integrates this sentiment analysis with various deep learning models, renowned for their effectiveness in time series and stock prediction tasks. Through comprehensive experiments, we demonstrate that incorporating topic sentiment notably enhances the performance of these models. The results indicate that topics in stock market comments provide implicit, valuable insights into stock market volatility and price trends. This study contributes to the field by showcasing the potential of NLP in enriching financial analysis and opens up avenues for further research into real-time sentiment analysis and the exploration of emotional and contextual aspects of market sentiment. The integration of advanced NLP techniques like BERTopic with traditional financial analysis methods marks a step forward in developing more sophisticated tools for understanding and predicting market behaviors. ...

April 2, 2024 · 2 min · Research Team

Regional inflation analysis using social network data

ArXiv ID: 2403.00774 · Authors: Unknown

Abstract: Inflation is one of the most important macroeconomic indicators and has a great impact on the population of any country and region. Inflation is influenced by a range of factors, one of which is inflation expectations. Many central banks take this factor into consideration while implementing monetary policy within the inflation targeting regime. Nowadays, a lot of people are active users of the Internet, especially social networks. There is a hypothesis that people search for, read, and discuss mainly those issues that are of particular interest to them. It is logical to assume that the dynamics of prices may also be a focus of user discussions. Such discussions could therefore be regarded as an alternative, faster source of information about inflation expectations. This study uses unstructured data from the Vkontakte social network to analyze upward and downward inflationary trends (using the Omsk region as an example). A sample of more than 8.5 million posts was collected between January 2010 and May 2022. The authors used BERT neural networks to solve the problem. These models demonstrated better results than the benchmarks (e.g., logistic regression, decision tree classifier, etc.). This makes it possible to identify pro-inflationary and disinflationary keywords in different contexts and to visualize them with the SHAP method. This analysis provides additional operational information about inflationary processes at the regional level. The proposed approach can be scaled to other regions. At the same time, a limitation of the work is the time and computing power required for the initial training of similar models for all regions of Russia. ...

February 14, 2024 · 2 min · Research Team