Words That Unite The World: A Unified Framework for Deciphering Central Bank Communications Globally

ArXiv ID: 2505.17048

Authors: Agam Shah, Siddhant Sukhani, Huzaifa Pardawala, Saketh Budideti, Riya Bhadani, Rudra Gopal, Siddhartha Somani, Rutwik Routu, Michael Galarnyk, Soungmin Lee, Arnav Hiray, Akshar Ravichandran, Eric Kim, Pranav Aluru, Joshua Zhang, Sebastian Jaskowski, Veer Guda, Meghaj Tarte, Liqin Ye, Spencer Gosden, Rachel Yuh, Sloka Chava, Sahasra Chava, Dylan Patrick Kelly, Aiden Chiang, Harsit Mittal, Sudheer Chava

Abstract

Central banks around the world play a crucial role in maintaining economic stability. Deciphering policy implications in their communications is essential, especially as misinterpretations can disproportionately impact vulnerable populations. To address this, we introduce the World Central Banks (WCB) dataset, the most comprehensive monetary policy corpus to date, comprising over 380k sentences from 25 central banks across diverse geographic regions, spanning 28 years of historical data. After uniformly sampling 1k sentences per bank (25k total) across all available years, we annotate and review each sentence using dual annotators, disagreement resolution, and secondary expert review. We define three tasks: Stance Detection, Temporal Classification, and Uncertainty Estimation, with each sentence annotated for all three. We benchmark seven Pretrained Language Models (PLMs) and nine Large Language Models (LLMs) (Zero-Shot, Few-Shot, and with annotation guide) on these tasks, running 15,075 benchmarking experiments. We find that a model trained on aggregated data across banks significantly surpasses a model trained on an individual bank’s data, confirming the principle “the whole is greater than the sum of its parts.” Additionally, rigorous human evaluations, error analyses, and predictive tasks validate our framework’s economic utility. Our artifacts are accessible through HuggingFace and GitHub under the CC-BY-NC-SA 4.0 license.
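The three tasks described above can be sketched as a per-sentence record schema, since each annotated sentence carries a label for all three. This is a minimal illustrative sketch: the label vocabularies (`STANCE_LABELS`, etc.) and field names are assumptions for illustration, not the paper's exact label sets.

```python
from dataclasses import dataclass

# Hypothetical label sets for the three tasks; the paper's exact
# vocabularies may differ.
STANCE_LABELS = {"hawkish", "dovish", "neutral", "irrelevant"}
TEMPORAL_LABELS = {"forward-looking", "not forward-looking"}
UNCERTAINTY_LABELS = {"certain", "uncertain"}

@dataclass
class WCBSentence:
    """One annotated sentence from the WCB corpus (illustrative schema)."""
    bank: str         # e.g. "Federal Reserve"
    year: int
    text: str
    stance: str       # Stance Detection label
    temporal: str     # Temporal Classification label
    uncertainty: str  # Uncertainty Estimation label

    def __post_init__(self):
        # Every sentence is labeled for all three tasks.
        assert self.stance in STANCE_LABELS
        assert self.temporal in TEMPORAL_LABELS
        assert self.uncertainty in UNCERTAINTY_LABELS

example = WCBSentence(
    bank="Federal Reserve",
    year=2022,
    text="The Committee anticipates that ongoing increases "
         "in the target range will be appropriate.",
    stance="hawkish",
    temporal="forward-looking",
    uncertainty="certain",
)
```

A record like this makes the benchmarking setup concrete: each of the three tasks is a separate sentence-level classification problem over the same 25k annotated sentences.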

Keywords: Central Bank Communication, Stance Detection, Large Language Models (LLMs), Natural Language Processing, Monetary Policy, Macro

Complexity vs Empirical Score

  • Math Complexity: 3.5/10
  • Empirical Rigor: 9.0/10
  • Quadrant: Street Traders
  • Why: The paper’s mathematical complexity is relatively low, relying primarily on standard statistical measures and NLP metrics without complex derivations, while its empirical rigor is extremely high, featuring a large curated dataset (25k annotated sentences), extensive benchmarking (15,075 experiments), and explicit release of models and code for reproducibility.
```mermaid
flowchart TD
  A["Research Goal<br>Decipher Central Bank<br>Communications Globally"] --> B["Data Collection<br>WCB Dataset: 380k sentences<br>25 banks, 28 years"]
  B --> C["Annotation & Labeling<br>25k sampled sentences<br>Stance, Temporal, Uncertainty"]
  C --> D["Computational Benchmarking<br>7 PLMs + 9 LLMs<br>15k+ experiments"]
  D --> E{"Key Findings & Outcomes"}
  E --> F["Model Performance<br>Training on aggregated data<br>surpasses individual banks"]
  E --> G["Framework Validation<br>Human eval & error analysis<br>confirm economic utility"]
  E --> H["Artifacts Released<br>HuggingFace & GitHub<br>CC-BY-NC-SA 4.0"]
```