Detection of Temporality at Discourse Level on Financial News by Combining Natural Language Processing and Machine Learning
ArXiv ID: 2404.01337
Authors: Unknown
Abstract
Finance-related news outlets such as Bloomberg News, CNN Business and Forbes are valuable sources of real data for market screening systems. In news articles, an expert shares opinions that go beyond plain technical analysis and include context such as political, sociological and cultural factors. In the same text, the expert often discusses the performance of different assets. Some key statements are mere descriptions of past events, while others are predictions. Understanding the temporality of the key statements in a text is therefore essential to separate context information from valuable predictions. We propose a novel system to detect the temporality of finance-related news at discourse level that combines Natural Language Processing and Machine Learning techniques and exploits sophisticated features such as syntactic and semantic dependencies. More specifically, we seek to extract the dominant tenses of the main statements, which may be either explicit or implicit. We have tested our system on a labelled dataset of finance-related news annotated by researchers with knowledge in the field. Experimental results reveal a high detection precision compared to an alternative rule-based baseline approach. Ultimately, this research contributes to the state of the art in market screening by identifying predictive knowledge for financial decision making.
Keywords: temporality detection, syntactic dependencies, semantic dependencies, market screening, discourse analysis, General (News Analysis)
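To make "dominant tense at discourse level" concrete, the following is a minimal sketch of a rule-based detector in the spirit of the baseline the abstract mentions. It is not the authors' system or their baseline: the cue lists, the function names (`sentence_tense`, `dominant_tense`) and the majority-vote aggregation are all assumptions made for illustration, and the sketch uses surface cues only, with none of the syntactic or semantic dependency features the paper relies on.

```python
import re
from collections import Counter

# Hypothetical cue lists, chosen only for illustration; they are tiny and
# are NOT taken from the paper.
FUTURE_CUES = re.compile(
    r"\b(will|shall|won't|going to|(?:is|are) (?:expected|likely|set) to)\b",
    re.IGNORECASE,
)
PAST_CUES = re.compile(
    r"\b(was|were|had|did|rose|fell|gained|lost|closed|reported|dropped|climbed)\b",
    re.IGNORECASE,
)


def split_sentences(text: str) -> list[str]:
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def sentence_tense(sentence: str) -> str:
    """Label one sentence as 'future', 'past' or 'present' from surface cues.

    Future cues are checked first so that a phrase such as
    'is expected to' is treated as a prediction.
    """
    if FUTURE_CUES.search(sentence):
        return "future"
    if PAST_CUES.search(sentence):
        return "past"
    return "present"


def dominant_tense(text: str) -> str:
    """Return the most frequent sentence-level label in the text (majority vote).

    Ties are resolved in favour of the label that appears first in the text.
    """
    sentences = split_sentences(text)
    if not sentences:
        return "unknown"
    counts = Counter(sentence_tense(s) for s in sentences)
    return counts.most_common(1)[0][0]


if __name__ == "__main__":
    news = (
        "Shares rose 3% last week. "
        "Analysts say the stock will climb further. "
        "The company is expected to raise guidance."
    )
    for s in split_sentences(news):
        print(sentence_tense(s), "|", s)
    print("dominant:", dominant_tense(news))
```

A cue-based detector like this only catches explicit temporality. Implicit cases, which the abstract says the proposed system also targets, are the reason a learned model with richer linguistic features is needed.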
Complexity vs Empirical Score
- Math Complexity: 3.0/10
- Empirical Rigor: 5.0/10
- Quadrant: Street Traders
- Why: The paper uses NLP/ML methods but relies on a manually labelled dataset without statistical metrics or backtesting, which places it in the Street Traders quadrant: practical implementation outweighs theoretical math.
flowchart TD
A["Research Goal:<br>Detect temporality in financial news<br>to separate context from predictions"] --> B["Methodology:<br>Combine NLP & Machine Learning<br>with syntactic & semantic features"]
B --> C["Input Data:<br>Labelled dataset of<br>financial news articles"]
C --> D["Computational Process:<br>Extract & analyze dominant tenses<br>of main statements at discourse level"]
D --> E["Outcome:<br>High detection precision vs baseline<br>Identifies predictive knowledge<br>for market screening systems"]