Methods for Acquiring and Incorporating Knowledge into Stock Price Prediction: A Survey
ArXiv ID: 2308.04947
Authors: Unknown
Abstract
Predicting stock prices presents a challenging research problem due to the inherent volatility and non-linear nature of the stock market. In recent years, knowledge-enhanced stock price prediction methods have shown groundbreaking results by utilizing external knowledge to understand the stock market. Despite the importance of these methods, there is a scarcity of scholarly works that systematically synthesize previous studies from the perspective of external knowledge types. Specifically, the external knowledge can be modeled in different data structures, which we group into non-graph-based formats and graph-based formats: 1) non-graph-based knowledge captures contextual information and multimedia descriptions specifically associated with an individual stock; 2) graph-based knowledge captures interconnected and interdependent information in the stock market. This survey paper aims to provide a systematic and comprehensive description of methods for acquiring external knowledge from various unstructured data sources and then incorporating it into stock price prediction models. We also explore fusion methods for combining external knowledge with historical price features. Moreover, this paper includes a compilation of relevant datasets and delves into potential future research directions in this domain.
Keywords: Stock Price Prediction, Graph Neural Networks (GNN), Knowledge Graphs, Data Fusion, Natural Language Processing (NLP), Equities
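The abstract's distinction between non-graph-based and graph-based knowledge can be illustrated with a small data-structure sketch. This is only an illustration under stated assumptions, not a method taken from the survey: the tickers, feature values, and relations are invented, and the helpers `build_adjacency` and `aggregate_neighbors` are hypothetical names. Non-graph knowledge is held as one feature vector per stock, while graph knowledge is held as an adjacency matrix over which a single round of mean neighbor aggregation (a simplified form of GNN message passing) is applied.

```python
import numpy as np

# Hypothetical toy universe of four tickers (names are illustrative only).
tickers = ["AAA", "BBB", "CCC", "DDD"]

# Non-graph-based knowledge: one feature vector per stock, e.g. a sentiment
# score and a news count derived from text about that stock alone.
# All values are made up for illustration.
node_features = np.array([
    [0.8, 3.0],
    [-0.2, 1.0],
    [0.1, 0.0],
    [-0.6, 5.0],
])

# Graph-based knowledge: pairwise relations between stocks, assumed here to be
# undirected (for example "same industry").
edges = [("AAA", "BBB"), ("BBB", "CCC")]


def build_adjacency(tickers, edges):
    """Return a symmetric adjacency matrix with self-loops."""
    index = {t: i for i, t in enumerate(tickers)}
    adj = np.eye(len(tickers))  # self-loops keep each stock's own features
    for a, b in edges:
        adj[index[a], index[b]] = 1.0
        adj[index[b], index[a]] = 1.0
    return adj


def aggregate_neighbors(adj, features):
    """One round of mean aggregation over each stock's neighborhood."""
    degree = adj.sum(axis=1, keepdims=True)
    return (adj / degree) @ features


adjacency = build_adjacency(tickers, edges)
relational_features = aggregate_neighbors(adjacency, node_features)
print(relational_features)
```

In this sketch a stock with no relations (`DDD`) keeps its own features unchanged, while connected stocks receive an average over themselves and their neighbors, which is the sense in which graph-based knowledge carries interdependent information.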
Complexity vs Empirical Score
- Math Complexity: 3.0/10
- Empirical Rigor: 2.0/10
- Quadrant: Philosophers
- Why: This is a survey paper that categorizes existing methods and discusses high-level concepts like knowledge types and fusion paradigms, rather than introducing novel mathematics or providing specific implementation details or backtests.
flowchart TD
    Start["Research Goal: Methods for Acquiring & Incorporating External Knowledge into Stock Price Prediction"] --> Inputs
    subgraph Inputs ["Data & Knowledge Sources"]
        Price["Historical Stock Prices"]
        News["News/Financial Reports"]
        Social["Social Media Data"]
    end
    Inputs --> Acquisition
    subgraph Acquisition ["Knowledge Acquisition Phase"]
        NonGraph["Non-Graph Knowledge<br/>(NLP, Text Processing)"]
        Graph["Graph Knowledge<br/>(Knowledge Graphs, GNNs)"]
    end
    Acquisition --> Fusion
    subgraph Fusion ["Knowledge Fusion Methods"]
        Concat["Concatenation"]
        Att["Attention Mechanism"]
        Gate["Gate-based Fusion"]
    end
    Fusion --> Prediction["Stock Price Prediction Model"]
    Prediction --> Outcomes
    subgraph Outcomes ["Key Findings & Future Directions"]
        Find1["Graph methods capture market dependencies better"]
        Find2["NLP vital for sentiment/context extraction"]
        Find3["Hybrid fusion yields best performance"]
        Future["Future: Multimodal, Real-time, Causal KGs"]
    end
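The three fusion families named in the flowchart (concatenation, attention, and gate-based fusion) can be sketched in a few lines of NumPy. This is a minimal sketch under assumptions, not the formulation of any specific paper covered by the survey: the feature dimension, the random inputs, the untrained gate parameters `W` and `b`, and the function names `concat_fusion`, `attention_fusion`, and `gate_fusion` are all hypothetical. Attention is shown as scaled dot-product attention with the price feature as the query, and the gate as an element-wise sigmoid gate.

```python
import numpy as np

rng = np.random.default_rng(0)


def concat_fusion(price_feat, knowledge_feat):
    """Concatenation: stack both feature vectors into one longer vector."""
    return np.concatenate([price_feat, knowledge_feat], axis=-1)


def softmax(x):
    e = np.exp(x - x.max())  # subtract the max for numerical stability
    return e / e.sum()


def attention_fusion(price_feat, knowledge_feats):
    """Attention: weight n knowledge vectors (n, d) by relevance to the price query (d,)."""
    d = price_feat.shape[-1]
    scores = knowledge_feats @ price_feat / np.sqrt(d)
    weights = softmax(scores)
    context = weights @ knowledge_feats  # weighted sum of knowledge vectors
    return context, weights


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def gate_fusion(price_feat, knowledge_feat, W, b):
    """Gate: an element-wise gate in (0, 1) mixes price and knowledge features."""
    gate = sigmoid(W @ np.concatenate([price_feat, knowledge_feat]) + b)
    return gate * price_feat + (1.0 - gate) * knowledge_feat


# Hypothetical inputs: a 4-dimensional price feature and three knowledge vectors.
d = 4
price = rng.normal(size=d)
knowledge_items = rng.normal(size=(3, d))
W = rng.normal(size=(d, 2 * d)) * 0.1  # untrained gate weights, illustration only
b = np.zeros(d)

fused_concat = concat_fusion(price, knowledge_items[0])
context, weights = attention_fusion(price, knowledge_items)
fused_gate = gate_fusion(price, context, W, b)
print(fused_concat.shape, weights, fused_gate.shape)
```

Concatenation leaves all weighting to the downstream predictor, attention decides which knowledge items matter for the current price state, and the gate decides per dimension how much to trust prices versus knowledge. In a trained model `W` and `b` would be learned jointly with the predictor rather than sampled at random.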