Effects of Daily News Sentiment on Stock Price Forecasting

ArXiv ID: 2308.08549

Authors: Unknown

Abstract

Predicting the future price of a stock is an arduous task. However, incorporating additional elements can significantly improve our predictions, compared with relying solely on a stock’s historical price data. Studies have demonstrated that investor sentiment, which is shaped by daily news about the company, can have a significant impact on stock price swings. There are numerous sources from which we can get this information, but they are cluttered with noise, making it difficult to extract sentiment from them accurately. Hence the focus of our research is to design an efficient system to capture the sentiment of news about the NIFTY50 stocks and to investigate how much the financial news sentiment of these stocks affects their prices over a period of time. This paper presents a robust data collection and preprocessing framework to create a news database covering a timeline of around 3.7 years and consisting of almost half a million news articles. We also capture the stock price information for this timeline and create multiple time series datasets that include the sentiment scores from various sections of each article, calculated using different sentiment libraries. Based on this, we fit several LSTM models to forecast the stock prices, with and without the sentiment scores as features, and compare their performance.
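
The step of turning scored articles into a per-day series aligned with prices can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the toy word lexicon stands in for the sentiment libraries the paper mentions (which are not named in the abstract), the ticker `ABC`, the headlines, and the prices are invented, and filling news-free days with a neutral 0.0 is one possible choice rather than the authors' documented method.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Hypothetical toy lexicon; a real pipeline would call a sentiment library instead.
POSITIVE = {"gain", "profit", "surge", "upgrade"}
NEGATIVE = {"loss", "fraud", "fall", "downgrade"}


def toy_score(text: str) -> float:
    """Return (pos - neg) / (pos + neg) in [-1, 1], or 0.0 if no lexicon word occurs."""
    words = text.lower().split()
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return (pos - neg) / total if total else 0.0


def daily_sentiment(articles):
    """Average the headline scores for each (ticker, day) pair."""
    buckets = defaultdict(list)
    for ticker, day, headline in articles:
        buckets[(ticker, day)].append(toy_score(headline))
    return {key: mean(scores) for key, scores in buckets.items()}


def build_rows(prices, sentiment):
    """Join closing prices with sentiment; days without news get a neutral 0.0 (assumption)."""
    return [
        (ticker, day, close, sentiment.get((ticker, day), 0.0))
        for (ticker, day), close in sorted(prices.items())
    ]


# Invented example data for a hypothetical ticker.
articles = [
    ("ABC", date(2023, 1, 2), "ABC posts profit surge"),
    ("ABC", date(2023, 1, 2), "Analysts flag fraud risk"),
    ("ABC", date(2023, 1, 3), "Shares fall after downgrade"),
]
prices = {
    ("ABC", date(2023, 1, 2)): 1500.0,
    ("ABC", date(2023, 1, 3)): 1480.0,
    ("ABC", date(2023, 1, 4)): 1490.0,
}

rows = build_rows(prices, daily_sentiment(articles))
for row in rows:
    print(row)
```

Each row pairs a trading day's close with that day's mean sentiment, which is the shape of time series the abstract describes feeding to the forecasting models. Scoring other sections of an article (for example, the body) would add further columns in the same way.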

Keywords: Sentiment Analysis, Long Short-Term Memory (LSTM), Natural Language Processing, Time Series Forecasting, Feature Engineering, Equities

Complexity vs Empirical Score

  • Math Complexity: 6.5/10
  • Empirical Rigor: 7.5/10
  • Quadrant: Street Traders
  • Why: The paper uses advanced deep learning (LSTM) and multiple sentiment scoring formulas, but focuses heavily on a concrete, data-heavy implementation framework with scraping, preprocessing, and model validation on a large dataset.
  flowchart TD
    A["Research Goal<br>Does news sentiment improve stock price<br>forecasting beyond historical data alone?"] --> B["Data Collection & Preprocessing"]
    
    B --> C["Data Inputs<br>1. ~500k Financial News Articles<br>2. Nifty50 Stock Prices<br>3. 3.7-Year Timeline"]
    
    C --> D["Computational Processes<br>1. Extract sentiment scores from articles<br>2. Generate time series data<br>3. Fit LSTM models<br>4. Compare performance"]
    
    D --> E["Model Variants<br>1. LSTM without sentiment features<br>2. LSTM with sentiment features"]
    
    E --> F["Key Findings<br>1. Sentiment significantly impacts price volatility<br>2. Multi-feature models outperform historical-only models<br>3. Efficient framework for sentiment extraction established"]
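
The two model variants in the flowchart differ only in their input features, which can be sketched as a windowing step. This is a hedged illustration, not the paper's code: the function name `make_windows`, the lookback of 3 days, the next-day-close target, and the sample values are all assumptions made for the example.

```python
def make_windows(rows, lookback, use_sentiment):
    """Build supervised samples from per-day (close, sentiment) rows.

    X[i] holds `lookback` consecutive days of features and y[i] is the
    closing price of the day that follows the window.
    """
    X, y = [], []
    for end in range(lookback, len(rows)):
        window = rows[end - lookback:end]
        if use_sentiment:
            # Variant 2: price and sentiment as features.
            X.append([[close, sent] for close, sent in window])
        else:
            # Variant 1: historical price only.
            X.append([[close] for close, _ in window])
        y.append(rows[end][0])
    return X, y


# Invented (close, daily sentiment) pairs for illustration.
series = [(100.0, 0.2), (101.0, -0.1), (99.5, -0.6), (100.5, 0.4), (102.0, 0.7)]

X_price, y_price = make_windows(series, lookback=3, use_sentiment=False)
X_both, y_both = make_windows(series, lookback=3, use_sentiment=True)

print(len(X_price), len(X_price[0]), len(X_price[0][0]))
print(len(X_both), len(X_both[0]), len(X_both[0][0]))
```

The nested lists follow a (samples, timesteps, features) layout, which is the 3-D input shape that common LSTM layers accept once the lists are converted to arrays. Keeping the targets identical across both variants means any difference in forecast error can be attributed to the added sentiment feature.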