SARF: Enhancing Stock Market Prediction with Sentiment-Augmented Random Forest
ArXiv ID: 2410.07143
Authors: Unknown
Abstract
Stock trend forecasting, a challenging problem in the financial domain, involves extensive data and related indicators. Relying solely on empirical analysis often yields unsustainable and ineffective results. Machine learning researchers have demonstrated that applying the random forest algorithm can enhance predictions in this context, playing a crucial auxiliary role in forecasting stock trends. This study introduces a new approach to stock market prediction by integrating sentiment analysis from the FinGPT generative AI model with the traditional Random Forest model. The proposed technique aims to improve the accuracy of stock price forecasts by leveraging the nuanced understanding of financial sentiment provided by FinGPT. We present a new methodology called “Sentiment-Augmented Random Forest” (SARF), which incorporates sentiment features into the Random Forest framework. Our experiments demonstrate that SARF outperforms conventional Random Forest and LSTM models in predicting stock market movements, with an average accuracy improvement of 9.23% and lower prediction errors.
Keywords: Sentiment Analysis, FinGPT, Random Forest, Stock Trend Forecasting, Generative AI
Complexity vs Empirical Score
- Math Complexity: 2.5/10
- Empirical Rigor: 6.5/10
- Quadrant: Street Traders
- Why: The paper uses established machine learning methods (Random Forest, FinGPT) without novel mathematical derivations, keeping math complexity low, but includes detailed empirical work with specific datasets, APIs, and comparative metrics.
flowchart TD
Start(["Research Goal: Enhance Stock Trend Prediction"]) --> Inputs
subgraph Inputs ["Data & Models"]
direction LR
I1["(Historical Market Data)"]
I2["(Financial News via FinGPT)"]
end
Inputs --> Method
subgraph Method ["Key Methodology: SARF"]
direction TB
M1["Sentiment Feature Extraction"] --> M2["Feature Fusion<br>Sentiment + Technical Data"]
M2 --> M3["Random Forest Model Training"]
end
Method --> Outcome
subgraph Outcome ["Findings & Outcomes"]
direction LR
O1["9.23% Accuracy Improvement"]
O2["Lower Prediction Errors"]
O3["Superior to LSTM/RF"]
end
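
The "Feature Fusion" step in the flowchart can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `fuse_features`, the sentiment scale of -1 to 1, the neutral fallback for days without news, and the choice of a single technical feature (the daily return) are all assumptions made for illustration, and the price and sentiment values are toy data.

```python
from datetime import date


def fuse_features(prices, sentiment, neutral=0.0):
    """Join per-day technical features with per-day sentiment scores.

    prices:    list of (date, close) tuples, sorted by date.
    sentiment: dict mapping date -> score (hypothetical scale: -1 to 1),
               e.g. produced by a sentiment model run over that day's news.
    neutral:   assumed fallback score for days with no news.

    Returns (rows, labels). Each row is [daily_return, sentiment_score];
    each label is 1 if the next day's close is higher, otherwise 0.
    """
    rows, labels = [], []
    # Skip the first day (no previous close) and the last day (no next close).
    for i in range(1, len(prices) - 1):
        day, close = prices[i]
        prev_close = prices[i - 1][1]
        next_close = prices[i + 1][1]
        daily_return = (close - prev_close) / prev_close
        rows.append([daily_return, sentiment.get(day, neutral)])
        labels.append(1 if next_close > close else 0)
    return rows, labels


# Toy data, invented purely for illustration.
prices = [
    (date(2024, 1, 2), 100.0),
    (date(2024, 1, 3), 102.0),
    (date(2024, 1, 4), 101.0),
    (date(2024, 1, 5), 103.0),
]
sentiment = {date(2024, 1, 3): 0.6}  # no news on Jan 4, so it falls back to neutral

rows, labels = fuse_features(prices, sentiment)
for row, label in zip(rows, labels):
    print(row, label)
```

The resulting `rows` and `labels` have the shape a random forest classifier expects as a feature matrix and target vector (for example, scikit-learn's `RandomForestClassifier.fit(rows, labels)`, though the summary above does not say which library the authors used). Dropping the sentiment column from each row gives the plain Random Forest baseline that SARF is compared against.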