Stock Market Sentiment Classification and Backtesting via Fine-tuned BERT
ArXiv ID: 2309.11979
Authors: Unknown
Abstract
With the rapid development of big data and computing devices, low-latency automated trading platforms based on real-time information acquisition have become a main component of the stock trading market, and quantitative trading has received widespread attention. In markets that are not strongly efficient, human emotions and expectations always dominate market trends and trading decisions. Starting from the theory of emotion, this paper takes East Money as an example, crawls user comment titles from its corresponding stock bar, and cleans the data. A BERT natural language processing model is then constructed and fine-tuned on existing annotated datasets. The experimental results show that the fine-tuned model improves, to varying degrees, on both the original model and the baseline model. Based on this model, the crawled user comments are labeled with emotional polarity, and the resulting labels are combined with the Alpha191 model in a regression, which yields significant results. The regression model is then used to predict the average price change over the next five days, and this prediction serves as a signal to guide automated trading. The experimental results show that incorporating emotional factors increased the return rate by 73.8% compared to the baseline during the trading period, and by 32.41% compared to the original Alpha191 model. Finally, we discuss the advantages and disadvantages of incorporating emotional factors into quantitative trading and suggest possible directions for future research.
Keywords: Sentiment Analysis, BERT (Natural Language Processing), Quantitative Trading, Alpha191 Model, Market Sentiment, Equities
Complexity vs Empirical Score
- Math Complexity: 3.0/10
- Empirical Rigor: 7.0/10
- Quadrant: Street Traders
- Why: The paper applies fine-tuned BERT and regression models to stock sentiment data with a reported backtest showing a 73.8% return increase, indicating strong empirical implementation, but the mathematics is primarily application of existing models with minimal novel theoretical derivation.
flowchart TD
A["Research Goal:<br>Quantify market sentiment impact on trading"] --> B["Data Acquisition & Cleaning<br>East Money user comments"]
B --> C["Method: BERT Fine-tuning<br>vs Baseline Models"]
C --> D["Process: Sentiment Polarity Labeling<br>for comment data"]
D --> E["Integration: Combining Sentiment<br>with Alpha191 Model"]
E --> F["Outcome: Regression &<br>5-Day Price Prediction"]
F --> G["Findings:<br>73.8% return increase vs baseline<br>32.41% vs Alpha191 alone"]
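The last stages of the flow above (regress on sentiment plus Alpha191 factors, predict the five-day average price change, trade on the prediction) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the summary gives neither the regression specification, the way labels are aggregated into a daily sentiment factor, nor the trading rule. The ordinary least-squares fit, the threshold rule, the synthetic data, and all function names (`forward_mean_change`, `fit_ols`, `predict`, `to_signal`) are hypothetical.

```python
import numpy as np


def forward_mean_change(close, horizon=5):
    """Relative change from today's close to the mean close of the next `horizon` days.

    The last `horizon` entries are NaN because their future window is incomplete.
    """
    close = np.asarray(close, dtype=float)
    out = np.full(close.shape, np.nan)
    for t in range(len(close) - horizon):
        out[t] = close[t + 1 : t + 1 + horizon].mean() / close[t] - 1.0
    return out


def fit_ols(factors, target):
    """Least-squares fit with an intercept; returns [intercept, coef_1, ..., coef_k]."""
    X = np.column_stack([np.ones(len(factors)), factors])
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    return beta


def predict(beta, factors):
    """Apply fitted coefficients to a (days x factors) matrix."""
    X = np.column_stack([np.ones(len(factors)), factors])
    return X @ beta


def to_signal(pred, threshold=0.0):
    """Assumed rule: +1 (buy) above threshold, -1 (sell) below -threshold, else 0 (hold)."""
    pred = np.asarray(pred, dtype=float)
    return np.where(pred > threshold, 1, np.where(pred < -threshold, -1, 0))


# Synthetic stand-ins for real inputs (all values are made up for illustration).
rng = np.random.default_rng(0)
n_days = 60
alpha_factors = rng.normal(size=(n_days, 3))   # placeholder for Alpha191 factor values
sentiment = rng.uniform(-1.0, 1.0, size=n_days)  # placeholder daily sentiment score
close = 10.0 * np.cumprod(1.0 + rng.normal(0.0, 0.01, size=n_days))

target = forward_mean_change(close, horizon=5)
factors = np.column_stack([alpha_factors, sentiment])
known = ~np.isnan(target)                      # days with a complete 5-day future window

beta = fit_ols(factors[known], target[known])
signal = to_signal(predict(beta, factors), threshold=0.001)
print(beta.shape, signal[:10])
```

The fit here is in-sample for brevity. A backtest would need to fit only on data available before each trading day, otherwise the signal leaks future information.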