Trillion Dollar Words: A New Financial Dataset, Task & Market Analysis

ArXiv ID: 2305.07972

Authors: Unknown

Abstract

Monetary policy pronouncements by the Federal Open Market Committee (FOMC) are a major driver of financial market returns. We construct the largest tokenized and annotated dataset of FOMC speeches, meeting minutes, and press conference transcripts in order to understand how monetary policy influences financial markets. In this study, we develop a novel task of hawkish-dovish classification and benchmark various pre-trained language models on the proposed dataset. Using the best-performing model (RoBERTa-large), we construct a measure of monetary policy stance for the FOMC document release days. To evaluate the constructed measure, we study its impact on the treasury market, stock market, and macroeconomic indicators. Our dataset, models, and code are publicly available on Huggingface and GitHub under the CC BY-NC 4.0 license.

Keywords: Natural Language Processing (NLP), Monetary Policy, FOMC (Federal Open Market Committee), Hawkish-Dovish Classification, Market Impact Analysis, Multi-Asset (Bonds, Equities)

Complexity vs Empirical Score

  • Math Complexity: 3.5/10
  • Empirical Rigor: 8.0/10
  • Quadrant: Street Traders
  • Why: The paper focuses on building and benchmarking NLP models (RoBERTa-large, FinBERT) on a novel annotated dataset, with empirical validation on treasury and stock market impact, but lacks complex mathematical derivations or theoretical modeling.
  flowchart TD
    A["Research Goal:<br/>Monetary Policy & Market Impact"] --> B{"Key Methodology"}
    
    B --> C["Data Collection<br/>FOMC Speeches & Transcripts"]
    C --> D["Hawkish-Dovish<br/>Classification Task"]
    D --> E["Model Training &<br/>Benchmarking (RoBERTa-large)"]
    E --> F["Policy Stance<br/>Measurement"]
    
    F --> G{"Evaluation / Market Impact"}
    G --> H["Treasury Market<br/>Analysis"]
    G --> I["Stock Market<br/>Analysis"]
    G --> J["Macroeconomic<br/>Indicators"]
    
    H --> K["Key Findings:<br/>Policy Measure Predicts Asset Returns"]
    I --> K
    J --> K
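
The "Policy Stance Measurement" step in the flowchart turns sentence-level hawkish-dovish predictions into a single number per document. The summary above does not give the aggregation formula, so the following is only a minimal sketch under stated assumptions: each sentence is assumed to have already been labeled by a classifier, a third "neutral" label is assumed to exist, and the score is assumed to be the net share of hawkish sentences. The function name `stance_measure` and the label strings are hypothetical and are not taken from the paper's code.

```python
from collections import Counter
from typing import Iterable

# Hypothetical label names; the released models may use different ones.
HAWKISH, DOVISH, NEUTRAL = "hawkish", "dovish", "neutral"


def stance_measure(labels: Iterable[str]) -> float:
    """Net hawkishness of one document, in the range [-1, 1].

    Assumed aggregation (not confirmed by the summary above):
        (number of hawkish sentences - number of dovish sentences)
        / total number of classified sentences
    """
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("document has no classified sentences")
    return (counts[HAWKISH] - counts[DOVISH]) / total


# Toy example: two hawkish, one dovish, one neutral sentence.
doc_labels = [HAWKISH, NEUTRAL, HAWKISH, DOVISH]
print(stance_measure(doc_labels))  # 0.25
```

Under this assumed definition, a positive score indicates a document that leans hawkish and a negative score one that leans dovish, which makes the per-release-day values easy to line up against market returns.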