DiffsFormer: A Diffusion Transformer on Stock Factor Augmentation

arXiv ID: 2402.06656

Authors: Unknown

Abstract

Machine learning models have demonstrated remarkable efficacy and efficiency in a wide range of stock forecasting tasks. However, the inherent challenges of data scarcity, including a low signal-to-noise ratio (SNR) and data homogeneity, pose significant obstacles to accurate forecasting. To address these issues, we propose a novel approach that leverages artificial-intelligence-generated samples (AIGS) to enhance training. Specifically, we introduce a Diffusion Model with a Transformer architecture (DiffsFormer) to generate stock factors. DiffsFormer is first trained on a large-scale source domain with conditional guidance to capture the global joint distribution. When presented with a specific downstream task, DiffsFormer augments training by editing existing samples; the strength of the editing process controls how far the generated data deviates from the target domain. To evaluate the effectiveness of DiffsFormer-augmented training, we conduct experiments on the CSI300 and CSI800 datasets with eight commonly used machine learning models. The proposed method achieves relative improvements of 7.2% and 27.8% in annualized return ratio on the respective datasets. Furthermore, we perform extensive experiments to probe the functionality of DiffsFormer and its constituent components, elucidating how they address data scarcity and enhance overall model performance. Our results demonstrate the efficacy of leveraging AIGS and the DiffsFormer architecture to mitigate data scarcity in stock forecasting tasks.

Keywords: Diffusion Model, Stock Forecasting, Data Augmentation, Transformer Architecture, AIGS (AI-Generated Samples)
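The abstract describes augmenting training by "editing" existing samples, with an editing strength that controls how far generated data drifts from the target domain. A minimal sketch of that idea, assuming a standard DDPM linear noise schedule and a deterministic DDIM-style reverse pass; the names `edit_sample` and `eps_model`, and the zero-noise predictor used in the demo, are illustrative placeholders rather than the paper's actual network:

```python
import numpy as np

def make_schedule(T=1000, beta_start=1e-4, beta_end=0.02):
    """Linear beta schedule and cumulative alpha-bar, as in DDPM."""
    betas = np.linspace(beta_start, beta_end, T)
    return np.cumprod(1.0 - betas)

def edit_sample(x0, t_edit, alphas_bar, eps_model, seed=0):
    """Diffuse real stock factors x0 forward to step t_edit, then denoise
    back to step 0. Smaller t_edit keeps the edit closer to the original."""
    rng = np.random.default_rng(seed)
    a_t = alphas_bar[t_edit - 1]
    noise = rng.standard_normal(x0.shape)
    # Forward (partial) noising: sample from q(x_t | x_0).
    x_t = np.sqrt(a_t) * x0 + np.sqrt(1.0 - a_t) * noise
    # Deterministic DDIM-style reverse pass from t_edit down to 0.
    for t in range(t_edit, 0, -1):
        a_cur = alphas_bar[t - 1]
        a_prev = alphas_bar[t - 2] if t > 1 else 1.0
        eps_hat = eps_model(x_t, t)                     # predicted noise
        x0_hat = (x_t - np.sqrt(1.0 - a_cur) * eps_hat) / np.sqrt(a_cur)
        x_t = np.sqrt(a_prev) * x0_hat + np.sqrt(1.0 - a_prev) * eps_hat
    return x_t
```

Passing a smaller `t_edit` injects less noise, so the edited factors stay close to the real sample; larger values let the generator deviate further from the target domain, matching the editing-strength knob the abstract describes.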

Complexity vs Empirical Score

  • Math Complexity: 8.0/10
  • Empirical Rigor: 7.0/10
  • Quadrant: Holy Grail
  • Why: The paper employs advanced mathematical concepts from diffusion models and transformer architectures, involving concepts like Markov chains, noise prediction, and conditional guidance, alongside detailed algorithmic descriptions. It demonstrates high empirical rigor with extensive experiments on real-world datasets (CSI300/CSI800), quantitative performance metrics (7.2% and 27.8% improvements in annualized return), and ablation studies on model components.
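The "conditional guidance" noted above is commonly realized as classifier-free guidance, which blends conditional and unconditional noise predictions. Whether the paper uses exactly this form is an assumption; the combination rule below is the standard one, and `guided_eps` is an illustrative name:

```python
import numpy as np

def guided_eps(eps_cond, eps_uncond, w):
    """Classifier-free guidance: w = 0 ignores the condition entirely,
    w = 1 recovers the purely conditional prediction, and w > 1
    extrapolates toward the condition for stronger guidance."""
    return eps_uncond + w * (eps_cond - eps_uncond)
```

At each denoising step, the diffusion model is queried twice (with and without the conditioning signal, e.g. an industry or stock label) and the two noise estimates are combined before the reverse update.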
```mermaid
flowchart TD
  A["Research Goal<br>Address Data Scarcity in Stock Forecasting<br>Low SNR & Data Homogeneity"] --> B{"Key Methodology<br>Diffusion Model on Stock Factors<br>DiffsFormer Architecture"}

  B --> C["Inputs: Source Domain<br>CSI300 & CSI800 Datasets"]
  C --> D["Training Phase<br>Conditional Guidance<br>Captures Global Joint Distribution"]
  D --> E["Augmentation Phase<br>Conditional Editing of Samples<br>Controls Deviation Strength"]
  E --> F["Computational Process<br>8 ML Models + AIGS<br>Augmented Training"]

  F --> G["Outcomes & Findings<br>CSI300: +7.2% Annualized Return<br>CSI800: +27.8% Annualized Return<br>Data Scarcity Mitigated"]
```