FinDKG: Dynamic Knowledge Graphs with Large Language Models for Detecting Global Trends in Financial Markets

ArXiv ID: 2407.10909

Authors: Unknown

Abstract

Dynamic knowledge graphs (DKGs) are popular structures to express different types of connections between objects over time. They can also serve as an efficient mathematical tool to represent information extracted from complex unstructured data sources, such as text or images. Within financial applications, DKGs could be used to detect trends for strategic thematic investing, based on information obtained from financial news articles. In this work, we explore the properties of large language models (LLMs) as dynamic knowledge graph generators, proposing a novel open-source fine-tuned LLM for this purpose, called the Integrated Contextual Knowledge Graph Generator (ICKG). We use ICKG to produce a novel open-source DKG from a corpus of financial news articles, called FinDKG, and we propose an attention-based GNN architecture for analysing it, called KGTransformer. We test the performance of the proposed model on benchmark datasets and FinDKG, demonstrating superior performance on link prediction tasks. Additionally, we evaluate the performance of the KGTransformer on FinDKG for thematic investing, showing it can outperform existing thematic ETFs.
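
To make the central object concrete, a dynamic knowledge graph can be pictured as a collection of timestamped facts. The sketch below is a minimal illustration, assuming each fact is a (subject, relation, object, time) quadruple. The `Quad` type, the `snapshot` helper, and the sample facts are hypothetical and are not taken from FinDKG or ICKG.

```python
from datetime import date
from typing import NamedTuple


class Quad(NamedTuple):
    """One timestamped fact: (subject, relation, object, time)."""
    subject: str
    relation: str
    obj: str
    time: date


def snapshot(quads, start, end):
    """Return the facts whose timestamp falls in the half-open window [start, end)."""
    return [q for q in quads if start <= q.time < end]


# Made-up facts for illustration only; not drawn from FinDKG.
quads = [
    Quad("CentralBankA", "raises", "InterestRate", date(2023, 1, 10)),
    Quad("InterestRate", "impacts", "TechStocks", date(2023, 1, 12)),
    Quad("CompanyB", "invests_in", "AI", date(2023, 2, 3)),
]

# The January snapshot keeps only the facts dated within that month.
january = snapshot(quads, date(2023, 1, 1), date(2023, 2, 1))
```

Slicing the fact list by time window is one simple way to view the graph as a sequence of snapshots; a model that tracks trends would compare how entities and relations change from one window to the next.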

Keywords: Dynamic Knowledge Graphs, Large Language Models, Graph Neural Networks, Thematic Investing, Link Prediction, Equities

Complexity vs Empirical Score

  • Math Complexity: 7.5/10
  • Empirical Rigor: 7.0/10
  • Quadrant: Holy Grail
  • Why: The paper introduces novel mathematical architectures (KGTransformer based on GATs/HGTs/EvoKG) with formal definitions of dynamic knowledge graphs, while also providing concrete empirical evidence including a fine-tuned LLM pipeline, open-source datasets (FinDKG), and performance metrics on link prediction and thematic investing vs. ETFs.

  flowchart TD
    A["Research Goal:<br/>Detect Global Financial Trends<br/>using Dynamic Knowledge Graphs"] --> B["Data: Financial News Corpus"]
    B --> C{"Core Methodology:<br/>Generate & Analyze DKG"}
    C --> D["LLM Generator<br/>(ICKG):<br/>Build Dynamic Knowledge Graph"]
    C --> E["Graph Neural Network<br/>(KGTransformer):<br/>Attention-based Analysis"]
    D --> E
    E --> F["Computational Task:<br/>Link Prediction<br/>& Thematic Investing"]
    F --> G["Key Findings:<br/>1. SOTA Link Prediction<br/>2. Outperforms Thematic ETFs<br/>3. Validated FinDKG"]
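
Link prediction, the computational task named in the flowchart, is commonly scored with rank-based metrics. The sketch below assumes mean reciprocal rank (MRR) and Hits@k, which are a common convention for this task rather than something stated in this summary. The function names and the sample ranks are hypothetical.

```python
def mean_reciprocal_rank(ranks):
    """Average of 1/rank, where rank is the 1-based position of the true entity."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def hits_at_k(ranks, k):
    """Fraction of queries whose true entity is ranked within the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Hypothetical ranks of the correct object for four (subject, relation, ?, time) queries.
ranks = [1, 2, 4, 10]

mrr = mean_reciprocal_rank(ranks)  # (1 + 0.5 + 0.25 + 0.1) / 4
h3 = hits_at_k(ranks, 3)           # 2 of the 4 ranks are <= 3
```

Both metrics reward a model for placing the true entity near the top of its candidate list; MRR is sensitive to the exact position, while Hits@k only checks whether the cutoff is met.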