
Tracing Positional Bias in Financial Decision-Making: Mechanistic Insights from Qwen2.5

ArXiv ID: 2508.18427

Authors: Fabrizio Dimino, Krati Saxena, Bhaskarjit Sarmah, Stefano Pasquali

Abstract: The growing adoption of large language models (LLMs) in finance exposes high-stakes decision-making to subtle, underexamined positional biases. The complexity and opacity of modern model architectures compound this risk. We present the first unified framework and benchmark that not only detects and quantifies positional bias in binary financial decisions but also pinpoints its mechanistic origins within open-source Qwen2.5-instruct models (1.5B-14B). Our empirical analysis covers a novel, finance-authentic dataset revealing that positional bias is pervasive, scale-sensitive, and prone to resurfacing under nuanced prompt designs and investment scenarios, with recency and primacy effects revealing new vulnerabilities in risk-laden contexts. Through transparent mechanistic interpretability, we map how and where bias emerges and propagates within the models to deliver actionable, generalizable insights across prompt types and scales. By bridging domain-specific audit with model interpretability, our work provides a new methodological standard for both rigorous bias diagnosis and practical mitigation, establishing essential guidance for responsible and trustworthy deployment of LLMs in financial systems. ...
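One common way to quantify positional bias in a binary choice is to present each pair of options in both orders and see whether the answer follows the option or the position. The sketch below illustrates that idea only; it is not the paper's benchmark or metric. The function name `positional_bias_stats`, the `choose` callback (a stand-in for a model call), and the two reported rates are all assumptions made for illustration.

```python
from typing import Callable, Dict, Iterable, Tuple


def positional_bias_stats(
    pairs: Iterable[Tuple[str, str]],
    choose: Callable[[str, str], str],
) -> Dict[str, float]:
    """Query each pair in both orders and summarize order sensitivity.

    `choose(first, second)` is a hypothetical stand-in for a model call
    that returns whichever of the two options it prefers. Each pair is
    assumed to contain two distinct options.
    """
    pairs = list(pairs)
    if not pairs:
        raise ValueError("at least one option pair is required")

    first_picks = 0  # times the option shown first was chosen
    flips = 0        # pairs whose chosen option changed with the order
    for a, b in pairs:
        pick_ab = choose(a, b)  # a shown first
        pick_ba = choose(b, a)  # b shown first
        first_picks += int(pick_ab == a) + int(pick_ba == b)
        flips += int(pick_ab != pick_ba)

    n = len(pairs)
    return {
        # 0.5 means no preference for the first slot; 1.0 is pure primacy,
        # 0.0 is pure recency.
        "first_position_rate": first_picks / (2 * n),
        # Share of pairs where swapping the order changed the decision.
        "flip_rate": flips / n,
    }


if __name__ == "__main__":
    toy_pairs = [("AAPL", "MSFT"), ("bond fund", "equity fund")]
    # Toy chooser that always takes the first option (maximal primacy bias).
    print(positional_bias_stats(toy_pairs, lambda first, second: first))
```

Under these assumptions, a chooser that ignores order entirely yields a `first_position_rate` of 0.5 and a `flip_rate` of 0, so deviations from those values indicate order sensitivity.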

August 25, 2025 · 2 min · Research Team

INVESTORBENCH: A Benchmark for Financial Decision-Making Tasks with LLM-based Agent

ArXiv ID: 2412.18174

Authors: Unknown

Abstract: Recent advancements have underscored the potential of large language model (LLM)-based agents in financial decision-making. Despite this progress, the field currently encounters two main challenges: (1) the lack of a comprehensive LLM agent framework adaptable to a variety of financial tasks, and (2) the absence of standardized benchmarks and consistent datasets for assessing agent performance. To tackle these issues, we introduce InvestorBench, the first benchmark specifically designed for evaluating LLM-based agents in diverse financial decision-making contexts. InvestorBench enhances the versatility of LLM-enabled agents by providing a comprehensive suite of tasks applicable to different financial products, including single equities like stocks, cryptocurrencies and exchange-traded funds (ETFs). Additionally, we assess the reasoning and decision-making capabilities of our agent framework using thirteen different LLMs as backbone models, across various market environments and tasks. Furthermore, we have curated a diverse collection of open-source, multi-modal datasets and developed a comprehensive suite of environments for financial decision-making. This establishes a highly accessible platform for evaluating financial agents’ performance across various scenarios. ...

December 24, 2024 · 2 min · Research Team

FinMem: A Performance-Enhanced LLM Trading Agent with Layered Memory and Character Design

ArXiv ID: 2311.13743

Authors: Unknown

Abstract: Recent advancements in Large Language Models (LLMs) have exhibited notable efficacy in question-answering (QA) tasks across diverse domains. Their prowess in integrating extensive web knowledge has fueled interest in developing LLM-based autonomous agents. While LLMs are efficient in decoding human instructions and deriving solutions by holistically processing historical inputs, transitioning to purpose-driven agents requires a supplementary rational architecture to process multi-source information, establish reasoning chains, and prioritize critical tasks. Addressing this, we introduce FinMem, a novel LLM-based agent framework devised for financial decision-making. It encompasses three core modules: Profiling, to customize the agent’s characteristics; Memory, with layered message processing, to aid the agent in assimilating hierarchical financial data; and Decision-making, to convert insights gained from memories into investment decisions. Notably, FinMem’s memory module aligns closely with the cognitive structure of human traders, offering robust interpretability and real-time tuning. Its adjustable cognitive span allows for the retention of critical information beyond human perceptual limits, thereby enhancing trading outcomes. This framework enables the agent to self-evolve its professional knowledge, react agilely to new investment cues, and continuously refine trading decisions in the volatile financial environment. We first compare FinMem with various algorithmic agents on a scalable real-world financial dataset, underscoring its leading trading performance in stocks. We then fine-tuned the agent’s perceptual span and character setting to achieve a significantly enhanced trading performance. Collectively, FinMem presents a cutting-edge LLM agent framework for automated trading, boosting cumulative investment returns. ...
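To make the idea of a layered memory with an adjustable span more concrete, here is a minimal sketch of one way such a store could work. It is not FinMem's implementation. The class names, the three layer names, the per-layer horizons in days, and the exponential recency weighting are all assumptions chosen for illustration.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class MemoryItem:
    text: str          # the stored message, e.g. a news headline
    day: int           # day index on which the item was recorded
    importance: float  # hypothetical importance score, higher is better


class LayeredMemory:
    """Toy layered memory: each layer forgets at its own pace.

    The layer names and horizons (in days) are illustrative assumptions.
    """

    def __init__(self, horizons: Optional[Dict[str, float]] = None) -> None:
        self.horizons = horizons or {"short": 3.0, "mid": 30.0, "long": 365.0}
        self.layers: Dict[str, List[MemoryItem]] = {
            name: [] for name in self.horizons
        }

    def add(self, layer: str, item: MemoryItem) -> None:
        self.layers[layer].append(item)

    def top_k(self, today: int, k: int) -> List[str]:
        """Return the k item texts with the highest decayed scores."""
        scored = []
        for name, items in self.layers.items():
            horizon = self.horizons[name]
            for item in items:
                age = max(today - item.day, 0)
                # Exponential recency decay; a longer horizon decays slower.
                score = item.importance * math.exp(-age / horizon)
                scored.append((score, item.text))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for _, text in scored[:k]]


if __name__ == "__main__":
    memory = LayeredMemory()
    memory.add("short", MemoryItem("daily news", day=0, importance=1.0))
    memory.add("long", MemoryItem("annual report", day=0, importance=1.0))
    # Ten days later the long-layer item has decayed far less.
    print(memory.top_k(today=10, k=1))
```

In this sketch, changing the horizons passed to `LayeredMemory` is the knob that loosely corresponds to an adjustable span: larger values keep older items competitive in retrieval for longer.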

November 23, 2023 · 2 min · Research Team