Shai: A large language model for asset management

arXiv ID: 2312.14203

Authors: Unknown

Abstract

This paper introduces “Shai”, a 10B-level large language model specifically designed for the asset management industry, built upon an open-source foundational model. Through continuous pre-training and fine-tuning on a targeted corpus, Shai demonstrates enhanced performance on tasks relevant to its domain, outperforming baseline models. Our research includes the development of an innovative evaluation framework that integrates professional qualification exams, tailored tasks, open-ended question answering, and safety assessments to comprehensively assess Shai’s capabilities. Furthermore, we discuss the challenges and implications of using large language models such as GPT-4 for performance assessment in asset management, and suggest combining automated evaluation with human judgment. By showcasing the potential and versatility of a 10B-level model that achieves strong performance with modest computational requirements, Shai’s development aims to provide practical insights and methodologies that assist industry peers in similar endeavors.
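
As a rough illustration of the training recipe the abstract describes (continued pre-training of an open-source base model on a targeted domain corpus, followed by fine-tuning), a minimal sketch using the HuggingFace Transformers Trainer might look as follows. The checkpoint name, corpus file, and hyperparameters are placeholders for illustration, not details taken from the paper.

```python
# Hypothetical sketch of domain-adaptive continued pre-training, assuming a
# HuggingFace-style open-source base checkpoint and a plain-text finance corpus.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

BASE_MODEL = "open-source-10b-base"  # placeholder; the paper's base model is not named here

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
# Many causal-LM tokenizers ship without a pad token; reuse EOS for padding.
tokenizer.pad_token = tokenizer.pad_token or tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(BASE_MODEL)

# Targeted corpus: asset-management documents, one text sample per line.
corpus = load_dataset("text", data_files={"train": "asset_mgmt_corpus.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=2048)

tokenized = corpus["train"].map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="shai-cpt",
        per_device_train_batch_size=2,
        gradient_accumulation_steps=16,
        learning_rate=2e-5,
        num_train_epochs=1,
        bf16=True,
        logging_steps=50,
    ),
    train_dataset=tokenized,
    # Causal-LM collator: labels are the inputs, shifted inside the model.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

The same loop, pointed at instruction-formatted data instead of raw text, would cover the fine-tuning stage.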

Keywords: Large Language Models (LLM), Asset management, Fine-tuning, Evaluation framework, Natural Language Processing, General Financial Markets

Complexity vs Empirical Score

  • Math Complexity: 1.0/10
  • Empirical Rigor: 7.0/10
  • Quadrant: Street Traders
  • Why: The paper is a technical implementation report on building a domain-specific large language model (LLM): it focuses on data collection, fine-tuning, and a custom evaluation framework, and presents minimal theoretical math. It shows high empirical rigor through its descriptions of specific datasets, training processes, and comparative evaluations against baseline models on proprietary tasks.

Paper Overview (Mermaid flowchart)

  flowchart TD
    A["Research Goal: Develop & Evaluate a 10B-level LLM for Asset Management"] --> B["Methodology: Continuous Pre-training & Fine-tuning"]
    B --> C{"Data Inputs: Targeted Corpus & Professional Exams"}
    C --> D["Computational Process: 10B Parameter Model Training"]
    D --> E["Innovative Evaluation Framework"]
    E --> F["Key Finding: Outperforms Baselines"]
    E --> G["Key Finding: Strong Performance with Modest Compute"]
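
To make the evaluation side concrete, below is a hedged sketch of the hybrid scheme the abstract suggests: deterministic scoring of professional-qualification-exam questions, plus an LLM judge (e.g. GPT-4) whose verdicts on open-ended answers can be escalated to human reviewers. The data structures and callables (`ExamItem`, `llm_judge`, `needs_human_review`) are hypothetical, not the paper's implementation.

```python
# Illustrative harness combining automated evaluation with human judgment.
from dataclasses import dataclass
from typing import Callable

@dataclass
class ExamItem:
    question: str
    choices: dict[str, str]   # e.g. {"A": "...", "B": "..."}
    answer: str               # gold choice key, e.g. "B"

def exam_accuracy(items: list[ExamItem],
                  model_answer: Callable[[str], str]) -> float:
    """Deterministic scoring for exam-style multiple-choice questions."""
    correct = sum(
        model_answer(item.question).strip().upper().startswith(item.answer)
        for item in items
    )
    return correct / len(items)

def judge_open_ended(question: str, answer: str,
                     llm_judge: Callable[[str], str],
                     needs_human_review: Callable[[str], bool]) -> str:
    """LLM-as-judge with a human-in-the-loop escape hatch, as the paper advises."""
    verdict = llm_judge(
        f"Question: {question}\nAnswer: {answer}\n"
        "Rate this answer as GOOD or BAD with a one-line reason."
    )
    # Route low-confidence or disagreement-prone verdicts to a human reviewer.
    return "HUMAN_REVIEW" if needs_human_review(verdict) else verdict
```

Splitting the pipeline this way keeps exam scores fully reproducible while confining the subjective, GPT-4-dependent judgments to the open-ended track, where human spot checks can catch judge errors.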