Reinforcement Learning for Monetary Policy Under Macroeconomic Uncertainty: Analyzing Tabular and Function Approximation Methods

ArXiv ID: 2512.17929

Authors: Tony Wang, Kyle Feinstein, Sheryl Chen

Abstract

We study how a central bank should dynamically set short-term nominal interest rates to stabilize inflation and unemployment when macroeconomic relationships are uncertain and time-varying. We model monetary policy as a sequential decision-making problem in which the central bank observes macroeconomic conditions quarterly and chooses interest rate adjustments. Using publicly available historical data from the Federal Reserve Economic Data (FRED) database, we construct a linear-Gaussian transition model and implement a discrete-action Markov Decision Process with a quadratic loss reward function. We compare nine reinforcement learning approaches against Taylor Rule and naive baselines, including tabular Q-learning variants, SARSA, Actor-Critic, Deep Q-Networks, Bayesian Q-learning with uncertainty quantification, and POMDP formulations with partial observability. Notably, despite its simplicity, standard tabular Q-learning achieved the best performance (mean return of −615.13 ± 309.58), outperforming both the enhanced RL methods and the traditional policy rules. Our results suggest that while sophisticated RL techniques show promise for monetary policy applications, simpler approaches may be more robust in this domain, highlighting important challenges in applying modern RL to macroeconomic policy.
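To make the setup concrete, here is a minimal sketch (not the authors' code) of the two ingredients the abstract names: a quadratic-loss reward over inflation and unemployment gaps, and a tabular Q-learning update over discretized states and discrete interest-rate adjustments. The targets, loss weights, action grid, and learning parameters below are illustrative assumptions, not values reported in the paper.

```python
import numpy as np

# Assumed targets (%); the paper's exact loss parameters are not given here.
PI_STAR, U_STAR = 2.0, 4.0

def quadratic_loss_reward(inflation, unemployment, w_pi=1.0, w_u=1.0):
    """Negative quadratic loss: reward is highest when both gaps are zero."""
    return -(w_pi * (inflation - PI_STAR) ** 2 + w_u * (unemployment - U_STAR) ** 2)

# Hypothetical discrete actions: interest-rate changes in percentage points.
ACTIONS = [-0.50, -0.25, 0.0, 0.25, 0.50]
Q = {}  # maps (discretized_state, action_index) -> estimated value

def q_update(state, a_idx, reward, next_state, alpha=0.1, gamma=0.99):
    """One standard tabular Q-learning backup: Q <- Q + alpha * TD error."""
    best_next = max(Q.get((next_state, i), 0.0) for i in range(len(ACTIONS)))
    old = Q.get((state, a_idx), 0.0)
    Q[(state, a_idx)] = old + alpha * (reward + gamma * best_next - old)
```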

Keywords: Reinforcement Learning (RL), Monetary Policy, Markov Decision Process (MDP), Taylor Rule, Q-Learning, Macro

Complexity vs Empirical Score

  • Math Complexity: 6.5/10
  • Empirical Rigor: 7.2/10
  • Quadrant: Holy Grail
  • Why: The paper introduces advanced reinforcement learning concepts and formal MDP/POMDP models, indicating moderate-to-high mathematical density. However, it is backed by concrete empirical implementation using real-world FRED data, specific backtesting procedures, and detailed performance metrics with statistical analysis.
```mermaid
flowchart TD
  A["Research Goal: RL for Monetary Policy under Uncertainty"] --> B["Data & Inputs<br>FRED Macroeconomic Data"]
  B --> C["Model: MDP &<br>Linear-Gaussian Transition"]
  C --> D{"RL Methods Comparison"}
  D --> E["Tabular Methods<br>Q-Learning, SARSA"]
  D --> F["Deep Methods<br>DQN, Actor-Critic"]
  D --> G["Uncertainty Methods<br>Bayesian Q-Learning, POMDP"]
  E & F & G --> H["Key Finding: Simplicity Wins<br>Tabular Q-Learning (-615.13) Outperforms Complex Methods"]
```
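The "Linear-Gaussian Transition" node in the flowchart refers to the transition model fitted to FRED data. A minimal sketch under the usual linear-Gaussian form s' = A s + B a + ε, ε ~ N(0, Σ); the coefficient matrices and covariance below are placeholders, not estimates from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder dynamics: state = [inflation, unemployment]; action = rate change.
A = np.array([[0.90, 0.10],   # assumed persistence / cross-effects
              [0.05, 0.85]])
B = np.array([-0.2, 0.1])     # assumed response of each variable to a rate hike
SIGMA = np.diag([0.3, 0.2])   # assumed covariance of quarterly shocks

def step(state, rate_change):
    """One quarter of the linear-Gaussian transition: s' = A s + B a + eps."""
    noise = rng.multivariate_normal(np.zeros(2), SIGMA)
    return A @ state + B * rate_change + noise
```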