Tweet Influence on Market Trends: Analyzing the Impact of Social Media Sentiment on Biotech Stocks

ArXiv ID: 2402.03353

Authors: Unknown

Abstract

This study investigates the relationship between tweet sentiment across diverse categories (news, company opinions, CEO opinions, and competitor opinions) and stock market behavior in the biotechnology sector, with a focus on understanding the impact of social media discourse on investor sentiment and decision-making processes. We analyzed historical stock market data for ten of the largest and most influential pharmaceutical companies alongside Twitter data related to COVID-19, vaccines, the companies, and their respective CEOs. Using VADER sentiment analysis, we examined the sentiment scores of tweets and assessed their relationships with stock market performance. We employed ARIMA (AutoRegressive Integrated Moving Average) and VAR (Vector AutoRegression) models to forecast stock market performance, incorporating sentiment covariates to improve predictions. Our findings revealed a complex interplay between tweet sentiment, news, biotech companies, their CEOs, and stock market performance, emphasizing the importance of considering diverse factors when modeling and predicting stock prices. This study provides valuable insights into the influence of social media on the financial sector and lays a foundation for future research aimed at refining stock price prediction models.
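
To make the sentiment step concrete, here is a minimal sketch of how tweet-level scores might be turned into a daily series and lined up with stock returns. It assumes each tweet has already been scored with VADER's compound score, which lies in [-1, 1] (for example via `SentimentIntensityAnalyzer().polarity_scores(text)["compound"]` from the `vaderSentiment` package). The function names, the category labels, and all numbers are hypothetical and for illustration only. The abstract does not say how the authors aggregated scores, so the per-day mean is an assumption.

```python
from collections import defaultdict
from math import log
from statistics import mean


def daily_sentiment(scored_tweets):
    """Average compound scores per (date, category).

    scored_tweets: iterable of (date_str, category, compound) tuples, where
    compound is a VADER-style score in [-1, 1].
    """
    buckets = defaultdict(list)
    for date, category, compound in scored_tweets:
        buckets[(date, category)].append(compound)
    return {key: mean(values) for key, values in buckets.items()}


def log_returns(closes):
    """Daily log returns from a date-ordered list of (date_str, close)."""
    return {
        day: log(price / prev_price)
        for (_, prev_price), (day, price) in zip(closes, closes[1:])
    }


# Toy, made-up inputs (NOT data from the paper).
tweets = [
    ("2021-03-01", "news", 0.6),
    ("2021-03-01", "news", 0.2),
    ("2021-03-01", "ceo", -0.4),
    ("2021-03-02", "news", -0.1),
]
closes = [("2021-02-26", 100.0), ("2021-03-01", 102.0), ("2021-03-02", 101.0)]

sentiment = daily_sentiment(tweets)
returns = log_returns(closes)

# Join on date: one row per trading day with that day's news sentiment.
rows = [
    (day, sentiment.get((day, "news")), ret)
    for day, ret in sorted(returns.items())
]
```

Days with no tweets in a category yield `None` in the joined rows; how to fill such gaps (carry forward, zero, or drop) is a modeling choice that this sketch leaves open.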

Keywords: sentiment analysis (VADER), time series forecasting (ARIMA, VAR), social media analytics, stock market prediction, biotechnology sector, Equities

Complexity vs Empirical Score

  • Math Complexity: 4.5/10
  • Empirical Rigor: 7.0/10
  • Quadrant: Street Traders
  • Why: The paper applies established time-series models (ARIMA, VAR) and sentiment analysis (VADER) to real financial and social media data, demonstrating clear empirical implementation. However, the math relies on standard statistical techniques without heavy theoretical derivations or novel algorithmic complexity.
  flowchart TD
    A["Research Goal<br>How does tweet sentiment affect biotech stock trends?"] --> B["Data Collection"]
    
    B --> C1["Twitter Data<br>COVID-19, Vaccines, Companies, CEOs"]
    B --> C2["Market Data<br>10 Major Pharma Stocks"]
    
    C1 --> D["Sentiment Analysis<br>VADER Model"]
    C2 --> E["Preprocessing &<br>Stationarity Check"]
    
    D --> F["Modeling & Forecasting"]
    E --> F
    
    F --> G1["ARIMA Model<br>Univariate Analysis"]
    F --> G2["VAR Model<br>Multivariate with Sentiment Covariates"]
    
    G1 --> H["Key Findings<br>Complex interplay between sentiment categories & stock performance"]
    G2 --> H
    
    H --> I["Outcome<br>Foundation for refined stock prediction models"]
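
The modeling step in the flowchart can be illustrated with a small VAR(1) fitted equation by equation with ordinary least squares, using only the Python standard library. This is a sketch under stated assumptions, not the authors' specification: the lag order of 1, the two-variable system (returns and one sentiment series), and every name and number below are hypothetical. Stationarity checks and differencing (the "I" in ARIMA) are omitted. The toy returns are generated without noise from known coefficients, so the fit should recover them.

```python
def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(X, y):
    """Least-squares coefficients via the normal equations X'X b = X'y."""
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * target for row, target in zip(X, y)) for i in range(k)]
    return solve_linear(xtx, xty)


def fit_var1(series):
    """Fit a VAR(1): regress each variable on an intercept and lag 1 of all.

    series: dict of name -> equal-length list of floats.
    Returns dict of name -> [intercept, coefficient per lagged variable].
    """
    names = list(series)
    n = len(series[names[0]])
    X = [[1.0] + [series[v][t - 1] for v in names] for t in range(1, n)]
    return {v: ols(X, series[v][1:]) for v in names}


def forecast_var1(coeffs, last_values):
    """One-step-ahead forecast from the most recent observation."""
    lagged = [1.0] + [last_values[v] for v in coeffs]
    return {v: sum(b * x for b, x in zip(coeffs[v], lagged)) for v in coeffs}


# Toy daily sentiment (made up) and returns simulated from known coefficients:
# r[t] = 0.001 + 0.3 * r[t-1] + 0.02 * s[t-1]
sentiment = [0.5, -0.2, 0.1, 0.7, -0.6, 0.3, -0.1, 0.4]
returns = [0.01]
for s_prev in sentiment[:-1]:
    returns.append(0.001 + 0.3 * returns[-1] + 0.02 * s_prev)

model = fit_var1({"returns": returns, "sentiment": sentiment})
next_step = forecast_var1(
    model, {"returns": returns[-1], "sentiment": sentiment[-1]}
)
```

Here `model["returns"]` holds the intercept followed by the coefficients on lagged returns and lagged sentiment. In practice, a library such as `statsmodels` provides ARIMA (with exogenous regressors) and VAR estimators along with lag selection and diagnostics, which this hand-rolled version omits.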