Assets Forecasting with Feature Engineering and Transformation Methods for LightGBM
ArXiv ID: 2501.07580
Authors: Unknown
Abstract
Fluctuations in the stock market rapidly shape the economic world and consumer markets, impacting millions of individuals. Hence, accurately forecasting them is essential for mitigating risks, including those associated with inactivity. Although research shows that hybrid models of Deep Learning (DL) and Machine Learning (ML) yield promising results, their computational requirements often exceed the capabilities of average personal computers, rendering them inaccessible to many. To address this challenge, in this paper we optimize LightGBM (an efficient implementation of gradient-boosted decision trees (GBDT)) for maximum performance while maintaining low computational requirements. We introduce novel feature engineering techniques, including indicator-price slope ratios and differences of close and open prices divided by the corresponding 14-period Exponential Moving Average (EMA), designed to capture market dynamics and enhance predictive accuracy. Additionally, we test seven different feature and target variable transformation methods, including returns, logarithmic returns, EMA ratios and their standardized counterparts, as well as EMA difference ratios, so as to identify the most effective ones, weighing both efficiency and accuracy. The results demonstrate that Log Returns, Returns, and EMA Difference Ratio constitute the best target variable transformation methods, with EMA ratios yielding a lower percentage of correct directional forecasts and standardized versions of the target variable transformations requiring significantly more training time. Moreover, the introduced features demonstrate high feature importance across all target variable transformation methods. This study highlights an accessible, computationally efficient approach to stock market forecasting using LightGBM, making advanced forecasting techniques more widely attainable.
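The transformation families named in the abstract (returns, log returns, EMA ratios, EMA difference ratios) can be sketched as follows. This is a minimal illustration using standard textbook definitions and pandas' exponentially weighted mean; the paper's exact formulas may differ, and the function names here are our own.

```python
import numpy as np
import pandas as pd

def ema(series: pd.Series, span: int = 14) -> pd.Series:
    """14-period Exponential Moving Average (pandas' standard EWM)."""
    return series.ewm(span=span, adjust=False).mean()

def target_transforms(close: pd.Series) -> pd.DataFrame:
    """Assumed definitions of the target transformations named in the abstract.

    The standardized variants mentioned in the paper (which it reports as
    markedly slower to train) would additionally z-score each column.
    """
    e = ema(close)
    return pd.DataFrame({
        "returns": close.pct_change(),        # (P_t - P_{t-1}) / P_{t-1}
        "log_returns": np.log(close).diff(),  # ln(P_t / P_{t-1})
        "ema_ratio": close / e,               # price relative to its EMA
        "ema_diff_ratio": (close - e) / e,    # EMA-normalized deviation
    })
```

Note that the EMA difference ratio is simply the EMA ratio shifted by one, so any accuracy gap the paper reports between the two would stem from how each interacts with the model's loss and splits rather than from the information they carry.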
Keywords: Stock market forecasting, LightGBM, Gradient-boosted decision trees, Feature engineering, Computational efficiency
Complexity vs Empirical Score
- Math Complexity: 4.0/10
- Empirical Rigor: 7.5/10
- Quadrant: Street Traders
- Why: The paper focuses on practical feature engineering and LightGBM optimization for stock forecasting, with detailed empirical setup, dataset description, and performance metrics, but lacks advanced mathematical derivations or heavy theoretical formalism.
```mermaid
flowchart TD
Goal["Research Goal: Achieve accurate<br>stock forecasting with low<br>computational requirements"] --> Methodology["Methodology: Optimize LightGBM<br>with Feature Engineering &<br>Transformation Methods"]
Methodology --> Inputs["Inputs: Stock Market Data<br>+ Engineered Features<br>(Slope Ratios, EMA Divisions)"]
Inputs --> Process["Computational Process:<br>Test 7 Feature/Target Transformations<br>(Returns, Log Returns, EMA Ratios, etc.)"]
Process --> Outcomes["Key Outcomes:<br>1. Log Returns/Returns/EMA Diff Ratio = Best Transformations<br>2. Engineered Features = High Importance<br>3. Standardized Versions = Slower Training<br>4. EMA Ratios = Lower Accuracy"]
Outcomes --> Conclusion["Conclusion: Accessible &<br>Efficient LightGBM Approach<br>Validated for Stock Forecasting"]
```
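The engineered features in the "Inputs" step (indicator-price slope ratios and EMA-normalized close-open differences) can be sketched as below. The precise slope definition and choice of indicator are assumptions; here a first difference stands in for the slope and the 14-period EMA itself serves as an example indicator.

```python
import pandas as pd

def engineered_features(df: pd.DataFrame, span: int = 14) -> pd.DataFrame:
    """Sketch of the two feature families from the abstract (assumed forms):
    - slope_ratio: slope of an indicator divided by the slope of price
    - co_over_ema: (close - open) normalized by the 14-period EMA of close
    """
    e = df["close"].ewm(span=span, adjust=False).mean()
    price_slope = df["close"].diff()  # first difference as a slope proxy
    ind_slope = e.diff()              # EMA used as the example indicator
    return pd.DataFrame({
        "slope_ratio": ind_slope / price_slope,
        "co_over_ema": (df["close"] - df["open"]) / e,
    })
```

Normalizing by the EMA, rather than the raw price, makes the feature scale-free across assets, which is consistent with the paper's emphasis on transformations that generalize well.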