VERAFI: Verified Agentic Financial Intelligence through Neurosymbolic Policy Generation

ArXiv ID: 2512.14744

Authors: Adewale Akinfaderin, Shreyas Subramanian

Abstract

Financial AI systems suffer from a critical blind spot: while Retrieval-Augmented Generation (RAG) excels at finding relevant documents, language models still generate calculation errors and regulatory violations during reasoning, even with perfect retrieval. This paper introduces VERAFI (Verified Agentic Financial Intelligence), an agentic framework with neurosymbolic policy generation for verified financial intelligence. VERAFI combines state-of-the-art dense retrieval and cross-encoder reranking with financial tool-enabled agents and automated reasoning policies covering GAAP compliance, SEC requirements, and mathematical validation. Our comprehensive evaluation on FinanceBench demonstrates remarkable improvements: while traditional dense retrieval with reranking achieves only 52.4% factual correctness, VERAFI’s integrated approach reaches 94.7%, an 81% relative improvement. The neurosymbolic policy layer alone contributes a 4.3 percentage point gain over pure agentic processing, specifically targeting persistent mathematical and logical errors. By integrating financial domain expertise directly into the reasoning process, VERAFI offers a practical pathway toward trustworthy financial AI that meets the stringent accuracy demands of regulatory compliance, investment decisions, and risk management.
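
The abstract mixes two kinds of gain: a relative improvement (81%) and a percentage-point difference (4.3). The short calculation below reproduces the relative figure from the two quoted scores. The agentic-only score is an inference from the quoted 4.3-point gain, not a number stated in the abstract, and the variable names are illustrative.

```python
# Figures quoted in the abstract (percent factual correctness on FinanceBench).
baseline = 52.4        # dense retrieval + reranking
verafi = 94.7          # full VERAFI pipeline
policy_gain_pp = 4.3   # gain attributed to the neurosymbolic policy layer

absolute_gain_pp = verafi - baseline                # difference in percentage points
relative_gain = absolute_gain_pp / baseline * 100   # gain as a percent of the baseline

# Assumption: the 4.3-point gain is measured against the full-pipeline score,
# so the agentic-only score is inferred here rather than quoted.
implied_agentic_only = verafi - policy_gain_pp

print(f"absolute gain: {absolute_gain_pp:.1f} pp")
print(f"relative gain: {relative_gain:.1f}%")
print(f"implied agentic-only score: {implied_agentic_only:.1f}%")
```

The relative gain comes out at about 80.7%, which rounds to the 81% quoted in the abstract.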

Keywords: Retrieval-Augmented Generation (RAG), Neurosymbolic AI, Financial Compliance, Agentic Framework, SEC/GAAP

Complexity vs Empirical Score

  • Math Complexity: 6.5/10
  • Empirical Rigor: 7.0/10
  • Quadrant: Holy Grail
  • Why: The paper introduces a novel neurosymbolic policy layer using formal SMT-LIB specifications for regulatory and mathematical validation, which involves advanced mathematical formalization. It also provides comprehensive empirical evaluation on FinanceBench with specific metrics (94.7% factual correctness, 81% relative improvement), demonstrating backtest-ready rigor.
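
To make "mathematical validation" concrete, here is a minimal sketch of two checks a policy layer might apply to a model's answer before accepting it. It is a plain-Python stand-in, not the paper's SMT-LIB policies: the function names, tolerances, and input numbers are all hypothetical. The first check uses the standard accounting identity (assets = liabilities + equity); the second recomputes a stated percentage from its inputs.

```python
from decimal import Decimal

def check_balance_sheet(assets, liabilities, equity, tolerance=Decimal("0.5")):
    """Return (ok, gap); ok is True when assets = liabilities + equity within tolerance."""
    gap = Decimal(assets) - (Decimal(liabilities) + Decimal(equity))
    return abs(gap) <= tolerance, gap

def check_ratio_claim(numerator, denominator, claimed_pct, tolerance_pct=Decimal("0.05")):
    """Recompute a percentage ratio and compare it with the value a model stated."""
    if Decimal(denominator) == 0:
        return False, None  # a ratio over zero cannot be verified
    actual = Decimal(numerator) / Decimal(denominator) * 100
    return abs(actual - Decimal(claimed_pct)) <= tolerance_pct, actual

# Hypothetical figures, in millions; not taken from FinanceBench.
ok_bs, gap = check_balance_sheet("1500", "900", "600")
ok_ratio, actual = check_ratio_claim("250", "1000", "27.5")

print(ok_bs, gap)        # the identity holds, so the statement passes
print(ok_ratio, actual)  # the claimed 27.5% is rejected; the recomputed value is 25%
```

Decimal arithmetic is used so that rounding in binary floating point cannot cause a false rejection on exact figures.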

flowchart TD
    A["Research Goal: Verify Financial AI"] --> B["Data & Framework"]
    B --> C["Method: Neurosymbolic Policy"]
    C --> D["Process: Agentic Tools & RAG"]
    D --> E["Outcome: 94.7% Accuracy"]
    
    subgraph B ["Inputs"]
        B1["FinanceBench Dataset"]
        B2["GAAP/SEC Regulations"]
    end
    
    subgraph D ["Computation"]
        D1["Dense Retrieval"]
        D2["Cross-Encoder Reranking"]
        D3["Tool-Enabled Reasoning"]
    end
    
    subgraph E ["Results"]
        E1["81% Relative Improvement"]
        E2["Regulatory Compliance"]
        E3["Mathematical Verification"]
    end
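
The "Computation" stage in the flowchart pairs dense retrieval with cross-encoder reranking. The toy sketch below shows only the two-stage shape: a cheap vector similarity narrows the corpus, then a pairwise scorer reorders the survivors. Everything in it is an assumption for illustration: the passages and embeddings are invented, and `overlap_score` is a token-overlap stand-in for a real cross-encoder model, not the paper's components.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def retrieve(query_vec, corpus, k):
    """Stage 1: rank passages by cosine similarity of precomputed embeddings."""
    ranked = sorted(corpus, key=lambda p: cosine(query_vec, p["vec"]), reverse=True)
    return ranked[:k]

def rerank(query, candidates, score_pair):
    """Stage 2: rescore each (query, passage) pair jointly and reorder."""
    return sorted(candidates, key=lambda p: score_pair(query, p["text"]), reverse=True)

def overlap_score(query, text):
    """Hypothetical stand-in for a cross-encoder: share of query tokens found in the passage."""
    q = set(query.lower().split())
    return len(q & set(text.lower().split())) / len(q)

# Invented passages and 3-dimensional embeddings, for illustration only.
corpus = [
    {"text": "FY2022 revenue was 1000 million", "vec": [0.9, 0.1, 0.0]},
    {"text": "FY2022 capital expenditure was 120 million", "vec": [0.8, 0.3, 0.1]},
    {"text": "The board approved a new dividend policy", "vec": [0.0, 0.2, 0.9]},
]
query = "FY2022 capital expenditure"
query_vec = [0.85, 0.2, 0.05]

candidates = retrieve(query_vec, corpus, k=2)
best = rerank(query, candidates, overlap_score)[0]
print(best["text"])
```

In this toy data the revenue passage narrowly wins the first stage, and the pairwise scorer then promotes the capital expenditure passage. Correcting that kind of ordering is what a reranking stage is for.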