NewsNet-SDF: Stochastic Discount Factor Estimation with Pretrained Language Model News Embeddings via Adversarial Networks
ArXiv ID: 2505.06864
Authors: Shunyao Wang, Ming Cheng, Christina Dan Wang
Abstract
Stochastic Discount Factor (SDF) models provide a unified framework for asset pricing and risk assessment, yet traditional formulations struggle to incorporate unstructured textual information. We introduce NewsNet-SDF, a novel deep learning framework that seamlessly integrates pretrained language model embeddings with financial time series through adversarial networks. Our multimodal architecture processes financial news using GTE-multilingual models, extracts temporal patterns from macroeconomic data via LSTM networks, and normalizes firm characteristics, fusing these heterogeneous information sources through an innovative adversarial training mechanism. Our dataset encompasses approximately 2.5 million news articles and 10,000 unique securities, addressing the computational challenges of processing and aligning text data with financial time series. Empirical evaluations on U.S. equity data (1980-2022) demonstrate NewsNet-SDF substantially outperforms alternatives with a Sharpe ratio of 2.80. The model shows a 471% improvement over CAPM, over 200% improvement versus traditional SDF implementations, and a 74% reduction in pricing errors compared to the Fama-French five-factor model. In comprehensive comparisons, our deep learning approach consistently outperforms traditional, modern, and other neural asset pricing models across all key metrics. Ablation studies confirm that text embeddings contribute significantly more to model performance than macroeconomic features, with news-derived principal components ranking among the most influential determinants of SDF dynamics. These results validate the effectiveness of our multimodal deep learning approach in integrating unstructured text with traditional financial data for more accurate asset pricing, providing new insights for digital intelligent decision-making in financial technology.
Keywords: Stochastic Discount Factor (SDF), Deep Learning, Adversarial Networks, LSTM Networks, Multimodal Architecture, Equities
Complexity vs Empirical Score
- Math Complexity: 8.0/10
- Empirical Rigor: 9.0/10
- Quadrant: Holy Grail
- Why: The paper presents dense mathematical formulations including the stochastic discount factor framework, moment conditions, and adversarial training mechanisms, indicating high complexity. It demonstrates exceptional empirical rigor with a massive dataset (2.5M news articles, 10K securities), 42-year backtest, and detailed performance metrics including Sharpe ratios, pricing error reductions, and ablation studies.
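The moment conditions and adversarial training mechanism named above are not reproduced in this summary. As a hedged sketch only, the generic no-arbitrage formulation commonly used in adversarial SDF estimation is shown below. The notation is an assumption, not taken from the paper: $M_{t+1}$ is the SDF, $R^e_{t+1,i}$ the excess return of asset $i$, $\omega_{t,i}$ the SDF portfolio weights, $g$ the adversary's instrument function, and $I_t$, $I_{t,i}$ the macro-level and firm-level (including news) information sets.

```latex
% No-arbitrage moment condition for every asset i
\mathbb{E}_t\!\left[ M_{t+1}\, R^e_{t+1,i} \right] = 0

% SDF written as one minus the return of a traded portfolio
M_{t+1} = 1 - \sum_{i=1}^{N} \omega_{t,i}\, R^e_{t+1,i}

% Adversarial (minimax) objective: the SDF network picks \omega to minimize
% the pricing errors that the adversary makes as large as possible via g
\min_{\omega} \max_{g} \; \frac{1}{N} \sum_{i=1}^{N}
\left\| \mathbb{E}\!\left[ M_{t+1}\, R^e_{t+1,i}\, g(I_t, I_{t,i}) \right] \right\|^2
```

Under this reading, the news embeddings, LSTM macro states, and normalized firm characteristics would all enter through $I_t$ and $I_{t,i}$; how NewsNet-SDF fuses them in detail is specified in the paper itself.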
```mermaid
flowchart TD
A["Research Goal: Integrate unstructured text with financial data for SDF estimation"] --> B["Data Preparation: 2.5M News Articles & 10K Securities"]
B --> C["Methodology: Multimodal Deep Learning Architecture"]
C --> D["Text Processing: GTE-multilingual Embeddings"]
C --> E["Time Series: LSTM Networks"]
C --> F["Normalization: Firm Characteristics"]
D & E & F --> G["Adversarial Network Fusion"]
G --> H["Outcomes: 2.80 Sharpe Ratio, 471% Improvement over CAPM"]
G --> I["Findings: News embeddings > Macro features, 74% Pricing Error Reduction"]
```
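The pipeline above ends in an SDF whose quality is judged by pricing errors and the Sharpe ratio. A minimal NumPy sketch of those three quantities follows. It is illustrative only and is not the paper's implementation: the function names, the equal-weight portfolio, the constant instruments, and the simulated returns are all hypothetical stand-ins for the outputs of the SDF and adversary networks.

```python
import numpy as np


def sdf_from_weights(weights, excess_returns):
    """SDF per period: M_{t+1} = 1 - sum_i w_{t,i} * R^e_{t+1,i}.

    Both inputs have shape (T, N): T periods, N assets.
    """
    return 1.0 - np.sum(weights * excess_returns, axis=1)


def moment_loss(sdf, excess_returns, instruments):
    """Average squared sample moment E[M * R^e_i * g_i] across assets.

    `instruments` (shape (T, N)) stands in for the adversary's output g.
    """
    moments = np.mean(sdf[:, None] * excess_returns * instruments, axis=0)
    return float(np.mean(moments ** 2))


def sharpe_ratio(portfolio_returns, periods_per_year=1):
    """Mean over sample standard deviation, optionally annualized.

    `periods_per_year` is an assumption: 1 leaves the ratio unscaled,
    12 would annualize monthly returns.
    """
    mean = portfolio_returns.mean()
    std = portfolio_returns.std(ddof=1)
    return float(np.sqrt(periods_per_year) * mean / std)


# Simulated data (hypothetical): 240 periods, 50 assets.
rng = np.random.default_rng(0)
T, N = 240, 50
excess_returns = rng.normal(0.005, 0.05, size=(T, N))

# Placeholders for network outputs: equal weights, constant instruments.
weights = np.full((T, N), 1.0 / N)
instruments = np.ones((T, N))

sdf = sdf_from_weights(weights, excess_returns)
loss = moment_loss(sdf, excess_returns, instruments)
portfolio = np.sum(weights * excess_returns, axis=1)

print("moment loss:", loss)
print("Sharpe ratio:", sharpe_ratio(portfolio))
```

In an adversarial setup, one network would update `weights` to lower `moment_loss` while a second network updates `instruments` to raise it. The sketch evaluates the loss once for fixed placeholders and does no training.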