The Memorization Problem: Can We Trust LLMs’ Economic Forecasts?
ArXiv ID: 2504.14765 (https://arxiv.org/abs/2504.14765)
Authors: Unknown
Abstract
Large language models (LLMs) cannot be trusted to produce economic forecasts for periods covered by their training data. Counterfactual forecasting ability is non-identified when the model has seen the realized values: any observed output is consistent with both genuine skill and memorization. Any evidence of memorization, in turn, is only a lower bound on encoded knowledge, since a model need not surface everything it has stored. We demonstrate that LLMs have memorized economic and financial data, recalling exact values from before their knowledge cutoff. Instructing models to respect historical boundaries fails to prevent recall-level accuracy, and masking fails because LLMs reconstruct entities and dates from minimal context. Post-cutoff, we observe no recall. The memorization extends to embeddings.
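To make the recall test concrete, the sketch below probes a single pre-cutoff value. The `query_llm()` helper, series name, date, and tolerance are all illustrative assumptions, not the paper's actual setup; the prompt embeds the kind of look-ahead restriction the abstract says is ineffective, so a near-exact answer signals recall rather than skill.

```python
# Minimal sketch of a pre-cutoff recall probe. query_llm() is a
# hypothetical stand-in for any chat-completions API; the series name,
# date, and realized value are illustrative placeholders, not paper data.

def query_llm(prompt: str) -> str:
    """Placeholder for a call to an LLM provider."""
    raise NotImplementedError("wire this to your model provider")

def recall_probe(series: str, date: str, realized: float,
                 tol: float = 1e-3) -> bool:
    """Ask the model to 'forecast' a value whose realization predates
    its knowledge cutoff, then check for recall-level accuracy."""
    prompt = (
        f"It is the day before {date}. Using only information available "
        f"up to that day, forecast {series} for {date}. "
        "Reply with a single number."
    )
    try:
        predicted = float(query_llm(prompt).strip())
    except ValueError:
        return False  # unparseable reply: no exact recall detected
    # Near-exact agreement with the realized value is the recall-level
    # accuracy the paper reports; genuine skill and recall are
    # observationally equivalent here, which is the non-identification problem.
    return abs(predicted - realized) <= tol

# Example call with placeholder arguments:
# recall_probe("US headline CPI, month-over-month %", "2019-06-01", 0.1)
```

Running the same probe on post-cutoff dates gives the control the paper uses: with no realized value in the training data, recall-level accuracy should disappear.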
Keywords: Large Language Models, Memorization, Counterfactual Forecasting, Economic Data, Data Privacy, Macroeconomics / General
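The abstract's final claim, that memorization extends to embeddings, can be probed with a nearest-neighbor check: embed a masked series description and see whether its closest candidate entity is the true one. The `embed()` helper and the candidate setup below are assumptions for illustration, not the paper's procedure.

```python
# Sketch of an embedding probe. embed() is a hypothetical stand-in for
# any text-embedding API; the masked description and candidate entities
# are illustrative placeholders, not the paper's materials.

import math

def embed(text: str) -> list[float]:
    """Placeholder for a call to a text-embedding endpoint."""
    raise NotImplementedError("wire this to your embedding provider")

def cosine(u: list[float], v: list[float]) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))

def nearest_entity(masked_description: str, candidates: list[str]) -> str:
    """If the masked description's embedding sits closest to the true
    entity's embedding, identity information survives the masking."""
    d = embed(masked_description)
    return max(candidates, key=lambda c: cosine(d, embed(c)))
```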
Complexity vs Empirical Score
- Math Complexity: 3.0/10
- Empirical Rigor: 7.5/10
- Quadrant: Street Traders
- Why: The theoretical component is primarily conceptual, a non-identification argument of moderate complexity, while the empirical side is heavily grounded in systematic testing: extensive data analysis, explicit evaluation metrics, and specific findings on recall accuracy and masking failures.
```mermaid
flowchart TD
    Start["Research Goal: Can LLMs be trusted for economic forecasts?"] --> Method["Methodology: Test recall vs. genuine skill"]
    Method --> Data["Data: Historical economic/financial series"]
    Data --> Proc["Computational Process: Prompt recall & counterfactual tests"]
    Proc --> Out1["Finding 1: Models memorize exact values pre-cutoff"]
    Proc --> Out2["Finding 2: Instructions/masking fail to prevent recall"]
    Proc --> Out3["Finding 3: Post-cutoff forecasts show no recall"]
    Out1 --> Result["Outcome: Forecast trust is impossible pre-cutoff"]
    Out2 --> Result
    Out3 --> Result
```
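A similar sketch illustrates the masking failure in Finding 2: describe a series with its entity and date masked, then check whether the model reconstructs them. The prompt and the `query_llm()` stub are again illustrative assumptions rather than the paper's materials.

```python
# Sketch of the masking test. query_llm() is the same hypothetical stub
# as in the recall probe; the masked prompt below is an invented
# placeholder, not one of the paper's test items.

def query_llm(prompt: str) -> str:
    """Placeholder for a call to an LLM provider."""
    raise NotImplementedError("wire this to your model provider")

def masking_test(masked_prompt: str, entity: str, year: str) -> bool:
    """Return True if the model re-identifies masked identifiers.

    Reconstruction defeats the purpose of masking: once the model knows
    which entity and date the series describes, it can fall back on
    memorized values exactly as in the unmasked case."""
    answer = query_llm(masked_prompt).lower()
    return entity.lower() in answer and year in answer

# Example with a placeholder item:
# masking_test(
#     "An equity index [INDEX] fell sharply over a few weeks in [YEAR], "
#     "then recovered within months. Name the index and the year.",
#     entity="S&P 500", year="2020",
# )
```

If the model names the entity and year, the masked prompt never left its memorized territory, so any subsequent "forecast" on the masked series can still be recall.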