One More Question is Enough: Expert Question Decomposition (EQD) Model for Domain Quantitative Reasoning
arXiv ID: 2510.01526
Authors: Mengyu Wang, Sotirios Sabanis, Miguel de Carvalho, Shay B. Cohen, Tiejun Ma
Abstract
Domain-specific quantitative reasoning remains a major challenge for large language models (LLMs), especially in fields requiring expert knowledge and complex question answering (QA). In this work, we propose Expert Question Decomposition (EQD), an approach designed to balance the use of domain knowledge with computational efficiency. EQD is built on a two-step fine-tuning framework and guided by a reward function that measures the effectiveness of generated sub-questions in improving QA outcomes. It requires only a few thousand training examples and a single A100 GPU for fine-tuning, with inference time comparable to zero-shot prompting. Beyond its efficiency, EQD outperforms state-of-the-art domain-tuned models and advanced prompting strategies. We evaluate EQD in the financial domain, characterized by specialized knowledge and complex quantitative reasoning, across four benchmark datasets. Our method consistently improves QA performance by 0.6% to 10.5% across different LLMs. Our analysis reveals an important insight: in domain-specific QA, a single supporting question often provides greater benefit than detailed guidance steps.
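The reward idea described in the abstract can be sketched as a utility difference: a generated sub-question is rewarded only insofar as it improves the QA model's answer relative to answering the original question alone. The sketch below is a minimal illustration of that idea, not the authors' implementation; `answer_with_context` and `is_correct` are hypothetical stand-ins for an LLM call and an answer checker.

```python
# Hedged sketch of an EQD-style reward signal: a sub-question earns positive
# reward only if it flips the QA model's answer from wrong to right, and is
# penalized if it hurts. All names here are illustrative assumptions.

def eqd_reward(question, sub_question, gold_answer, answer_with_context, is_correct):
    """Reward = score(answer given sub-question) - score(answer alone)."""
    base = 1.0 if is_correct(answer_with_context(question, None), gold_answer) else 0.0
    aided = 1.0 if is_correct(answer_with_context(question, sub_question), gold_answer) else 0.0
    return aided - base  # one of -1.0, 0.0, 1.0


# Toy usage with stubbed components in place of a real LLM:
def fake_answer(question, sub_q):
    return "42" if sub_q else "41"  # pretends the sub-question helps

def exact_match(pred, gold):
    return pred == gold

r = eqd_reward("What is 6*7?", "What is 6 times 7?", "42", fake_answer, exact_match)
print(r)  # 1.0: the sub-question flipped a wrong answer to a correct one
```

A difference-based reward of this shape directly optimizes the decomposer for downstream QA benefit rather than for plausible-looking decompositions.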
Keywords: Large Language Models, Question Decomposition, Quantitative Reasoning, Fine-tuning, Reward Function, Multi-Asset (Financial QA)
Complexity vs Empirical Score
- Math Complexity: 1.5/10
- Empirical Rigor: 3.0/10
- Quadrant: Philosophers
- Why: The paper focuses on NLP techniques (prompting, fine-tuning) with minimal advanced mathematics, and while it reports performance metrics on benchmark datasets, it lacks code, backtests, or statistical rigor typical of empirical quant finance research.
flowchart TD
A["Research Goal<br>Enhance Domain-Specific Quantitative Reasoning<br>for LLMs Efficiently"] --> B["Methodology: Expert Question Decomposition EQD"]
B --> C{"Data & Inputs<br>Financial Domain QA Datasets<br>~3-5k Training Examples"}
C --> D["Two-Step Fine-Tuning Framework<br>Guided by Reward Function"]
D --> E["Computational Process<br>Train on A100 GPU<br>Inference Time ≈ Zero-Shot Prompting"]
E --> F["Outcomes & Findings<br>Consistent Performance Gain 0.6% to 10.5%<br>Key Insight: Single Supporting Question > Detailed Steps"]
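The inference path summarized in the flowchart (one supporting question prepended before answering, at roughly zero-shot cost) can be sketched as follows. The prompt template and the stub decomposer are illustrative assumptions, not the paper's actual prompts or model.

```python
# Hedged sketch of EQD-style inference: the fine-tuned decomposer produces a
# SINGLE supporting question, which is prepended to the original question for
# the QA model. Only one extra generation call is added over zero-shot.

def build_eqd_prompt(question, decompose):
    """Assemble a QA prompt from the question plus one supporting sub-question."""
    sub_q = decompose(question)  # one supporting question from the EQD model
    return (
        f"To answer the main question, first consider: {sub_q}\n"
        f"Main question: {question}\n"
        f"Answer:"
    )

# Toy usage with a lambda standing in for the fine-tuned EQD model:
prompt = build_eqd_prompt(
    "What was the firm's 2023 operating margin?",
    lambda q: "What were the firm's 2023 operating income and total revenue?",
)
print(prompt)
```

This reflects the paper's key finding: a single well-chosen supporting question, rather than a long chain of guidance steps, is what drives the gains.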