MoA is All You Need: Building LLM Research Team using Mixture of Agents

arXiv ID: 2409.07487

Authors: Unknown

Abstract

Research on Large Language Models (LLMs) in the financial domain is particularly complex due to the sheer number of approaches proposed in the literature. Retrieval-Augmented Generation (RAG) has emerged as one of the leading methods in the sector due to its inherent groundedness and data-source variability. In this work, we introduce a RAG framework called Mixture of Agents (MoA) and demonstrate its viability as a practical, customizable, and highly effective approach for scaling RAG applications. MoA is essentially a layered network of individually customized small language models (Hoffmann et al., 2022) collaborating to answer questions and extract information. While there are many theoretical proposals for such an architecture, and even a few libraries for applying the structure in practice, there are few documented studies evaluating the framework under real business constraints such as cost and speed. We find that the MoA framework, composed of small language models (Hoffmann et al., 2022), produces higher-quality and more grounded responses across various financial domains core to Vanguard's business while simultaneously maintaining low costs.
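
As a rough illustration of the layered MoA structure described above (a sketch, not the authors' implementation), the following Python snippet wires agent layers to a final aggregator. The `call_model` hook, the model-name strings, and the prompt templates are all hypothetical placeholders for whatever small-language-model API each agent wraps.

```python
from typing import Callable, List

# Hypothetical hook: (model_name, prompt) -> completion text.
# In practice this would wrap whatever SLM inference API is available.
CallModel = Callable[[str, str], str]

def moa_answer(question: str, layers: List[List[str]],
               aggregator: str, call_model: CallModel) -> str:
    """Layered Mixture-of-Agents sketch.

    Each inner list in `layers` names the agents in one layer. Every
    agent sees the question plus all answers from the previous layer;
    a final aggregator model synthesizes the last layer's outputs.
    """
    previous: List[str] = []
    for layer in layers:
        current: List[str] = []
        for model in layer:
            peer_context = "\n\n".join(previous) if previous else "(none)"
            prompt = (
                f"Question: {question}\n\n"
                f"Peer answers from the previous layer:\n{peer_context}\n\n"
                "Provide your best answer."
            )
            current.append(call_model(model, prompt))
        previous = current
    synthesis = (
        f"Question: {question}\n\n"
        "Candidate answers:\n" + "\n\n".join(previous) +
        "\n\nSynthesize a single, well-grounded answer."
    )
    return call_model(aggregator, synthesis)
```

Note that in this sketch each agent's prompt grows with the previous layer's outputs, which is exactly where the cost and speed constraints the paper evaluates come into play.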

Keywords: Large Language Models (LLM), Retrieval-Augmented Generation (RAG), Mixture of Agents, NLP, Financial Analytics

Complexity vs Empirical Score

  • Math Complexity: 2.0/10
  • Empirical Rigor: 3.0/10
  • Quadrant: Philosophers
  • Why: The paper focuses on system architecture (MoA) and RAG frameworks, with minimal mathematical derivation, relying instead on conceptual comparisons and descriptions of enterprise applications. While it claims cost and quality improvements, the excerpt provides no backtested datasets, statistical metrics, or implementation code, offering high-level assertions rather than rigorous empirical validation.

Paper Flow Diagram

```mermaid
flowchart TD
    A["Research Goal:<br>Develop Cost-Effective RAG<br>for Financial NLP"] --> B["Methodology:<br>Mixture of Agents (MoA)<br>Layered Architecture"]
    B --> C["Input Data:<br>Financial Documents &<br>Vanguard Business Queries"]
    C --> D["Process:<br>Collaborative Small LLMs<br>Chained Retrieval & Generation"]
    D --> E["Key Constraints:<br>Cost & Speed Evaluation"]
    E --> F["Results:<br>Higher-Quality Grounded Responses<br>Low Operational Cost"]
    F --> G["Conclusion:<br>Viable Framework for<br>Scaling RAG Applications"]
```