Methods for Acquiring and Incorporating Knowledge into Stock Price Prediction: A Survey

ArXiv ID: 2308.04947

Authors: Unknown

Abstract

Predicting stock prices presents a challenging research problem due to the inherent volatility and non-linear nature of the stock market. In recent years, knowledge-enhanced stock price prediction methods have shown groundbreaking results by utilizing external knowledge to understand the stock market. Despite the importance of these methods, there is a scarcity of scholarly works that systematically synthesize previous studies from the perspective of external knowledge types. Specifically, the external knowledge can be modeled in different data structures, which we group into non-graph-based formats and graph-based formats: 1) non-graph-based knowledge captures contextual information and multimedia descriptions specifically associated with an individual stock; 2) graph-based knowledge captures interconnected and interdependent information in the stock market. This survey paper aims to provide a systematic and comprehensive description of methods for acquiring external knowledge from various unstructured data sources and then incorporating it into stock price prediction models. We also explore fusion methods for combining external knowledge with historical price features. Moreover, this paper includes a compilation of relevant datasets and delves into potential future research directions in this domain.
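
The distinction between the two knowledge formats can be made concrete with a small sketch. It is illustrative only and is not taken from the survey: the tickers `AAA`, `BBB`, `CCC`, the sentiment values, the edge list, and the helper names `build_adjacency` and `aggregate_neighbors` are all hypothetical. Non-graph-based knowledge is stored as features attached to one stock at a time, while graph-based knowledge is stored as relations between stocks and is used here for a single mean-aggregation step, a much simplified stand-in for what a graph neural network layer does.

```python
from statistics import fmean

# Hypothetical tickers and made-up values, for illustration only.
# Non-graph-based knowledge: features attached to one stock at a time.
per_stock_knowledge = {
    "AAA": [0.6],   # e.g. a news-sentiment score in [-1, 1]
    "BBB": [-0.2],
    "CCC": [0.1],
}

# Graph-based knowledge: relations between stocks (undirected edges),
# e.g. "same industry" or "supplier of". These edges are assumptions.
edges = [("AAA", "BBB"), ("BBB", "CCC")]


def build_adjacency(edge_list):
    """Turn an undirected edge list into a map from stock to neighbor set."""
    adj = {}
    for u, v in edge_list:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def aggregate_neighbors(features, adj):
    """One aggregation step: average each stock's features with its neighbors'."""
    out = {}
    for node, own in features.items():
        group = [own] + [features[n] for n in sorted(adj.get(node, ()))]
        # zip(*group) walks the feature dimensions; fmean averages each one.
        out[node] = [fmean(column) for column in zip(*group)]
    return out


adjacency = build_adjacency(edges)
smoothed = aggregate_neighbors(per_stock_knowledge, adjacency)
```

After the aggregation step, each stock's feature reflects its neighbors as well as itself, which is the kind of interdependent information the graph-based format is meant to capture and the per-stock format cannot express on its own.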

Keywords: Stock Price Prediction, Graph Neural Networks (GNN), Knowledge Graphs, Data Fusion, Natural Language Processing (NLP), Equities

Complexity vs Empirical Score

  • Math Complexity: 3.0/10
  • Empirical Rigor: 2.0/10
  • Quadrant: Philosophers
  • Why: This is a survey paper that categorizes existing methods and discusses high-level concepts like knowledge types and fusion paradigms, rather than introducing novel mathematics or providing specific implementation details or backtests.

```mermaid
flowchart TD
  Start["Research Goal: Methods for Acquiring & Incorporating External Knowledge into Stock Price Prediction"] --> Inputs

  subgraph Inputs ["Data & Knowledge Sources"]
      Price["Historical Stock Prices"]
      News["News/Financial Reports"]
      Social["Social Media Data"]
  end

  Inputs --> Acquisition
  subgraph Acquisition ["Knowledge Acquisition Phase"]
      NonGraph["Non-Graph Knowledge<br/>(NLP, Text Processing)"]
      Graph["Graph Knowledge<br/>(Knowledge Graphs, GNNs)"]
  end

  Acquisition --> Fusion
  subgraph Fusion ["Knowledge Fusion Methods"]
      Concat["Concatenation"]
      Att["Attention Mechanism"]
      Gate["Gate-based Fusion"]
  end

  Fusion --> Prediction["Stock Price Prediction Model"]
  Prediction --> Outcomes

  subgraph Outcomes ["Key Findings & Future Directions"]
      Find1["Graph methods capture market dependencies better"]
      Find2["NLP vital for sentiment/context extraction"]
      Find3["Hybrid fusion yields best performance"]
      Future["Future: Multimodal, Real-time, Causal KGs"]
  end
```
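
The three fusion methods named in the flowchart (concatenation, attention, and gate-based fusion) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the formulation of any particular surveyed model: the function names are hypothetical, the vectors are plain Python lists, and the gate logits are passed in directly, whereas a trained model would typically compute them from learned weights.

```python
import math


def concat_fusion(price_vec, knowledge_vec):
    """Concatenation: place the two feature vectors side by side."""
    return list(price_vec) + list(knowledge_vec)


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fusion(price_vec, knowledge_vecs):
    """Scaled dot-product attention.

    The price features act as the query; each knowledge vector is both a
    key and a value. Returns the attention-weighted sum of the values.
    """
    scale = math.sqrt(len(price_vec))
    scores = [
        sum(q * k for q, k in zip(price_vec, kv)) / scale
        for kv in knowledge_vecs
    ]
    weights = softmax(scores)
    dim = len(knowledge_vecs[0])
    return [
        sum(w * kv[i] for w, kv in zip(weights, knowledge_vecs))
        for i in range(dim)
    ]


def gate_fusion(price_vec, knowledge_vec, gate_logits):
    """Element-wise gate: g * price + (1 - g) * knowledge, g = sigmoid(logit).

    Assumption: gate_logits are supplied by the caller rather than learned.
    """
    gates = [1.0 / (1.0 + math.exp(-z)) for z in gate_logits]
    return [
        g * p + (1.0 - g) * k
        for g, p, k in zip(gates, price_vec, knowledge_vec)
    ]


# Made-up inputs: a 2-dimensional price feature and a 2-dimensional news feature.
price = [1.0, 0.0]
news = [0.0, 1.0]

fused_concat = concat_fusion(price, news)
fused_attention = attention_fusion(price, [news, news])
fused_gate = gate_fusion(price, news, [0.0, 0.0])
```

Concatenation keeps both inputs intact and leaves their weighting to later layers, attention weights several knowledge sources by their relevance to the price features, and gating mixes the two inputs dimension by dimension. With zero logits, the gate in this sketch gives each input equal weight.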