
XGBoost Forecasting of NEPSE Index Log Returns with Walk Forward Validation

ArXiv ID: 2601.08896. Authors: Sahaj Raj Malla, Shreeyash Kayastha, Rumi Suwal, Harish Chandra Bhandari, Rajendra Adhikari. Abstract: This study develops a robust machine learning framework for one-step-ahead forecasting of daily log-returns in the Nepal Stock Exchange (NEPSE) Index using the XGBoost regressor. A comprehensive feature set is engineered, including lagged log-returns (up to 30 days) and established technical indicators such as short- and medium-term rolling volatility measures and the 14-period Relative Strength Index. Hyperparameter optimization is performed using Optuna with time-series cross-validation on the initial training segment. Out-of-sample performance is rigorously assessed via walk-forward validation under both expanding and fixed-length rolling window schemes across multiple lag configurations, simulating real-world deployment and avoiding lookahead bias. Predictive accuracy is evaluated using root mean squared error, mean absolute error, coefficient of determination (R-squared), and directional accuracy on both log-returns and reconstructed closing prices. Empirical results show that the optimal configuration, an expanding window with 20 lags, outperforms tuned ARIMA and Ridge regression benchmarks, achieving the lowest log-return RMSE (0.013450) and MAE (0.009814) alongside a directional accuracy of 65.15%. While the R-squared remains modest, consistent with the noisy nature of financial returns, primary emphasis is placed on relative error reduction and directional prediction. Feature importance analysis and visual inspection further enhance interpretability. These findings demonstrate the effectiveness of gradient boosting ensembles in modeling nonlinear dynamics in volatile emerging market time series and establish a reproducible benchmark for NEPSE Index forecasting. ...
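The walk-forward scheme described in the abstract can be illustrated with a small sketch. This is not the authors' code: the helper names (`log_returns`, `walk_forward_splits`, `directional_accuracy`), the toy prices, and the window sizes are all assumptions chosen for illustration. It shows how expanding and fixed-length rolling windows differ in which past observations are used to predict each next step, and one common way to score directional accuracy.

```python
import math

def log_returns(prices):
    """Daily log-returns r_t = ln(P_t / P_{t-1})."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]

def walk_forward_splits(n_obs, initial_train, window=None):
    """Yield (train_indices, test_index) pairs for one-step-ahead evaluation.

    window=None -> expanding window (training always starts at index 0).
    window=k    -> fixed-length rolling window over the last k observations.
    Only indices strictly before the test index are used, so no future
    data leaks into training.
    """
    for t in range(initial_train, n_obs):
        start = 0 if window is None else max(0, t - window)
        yield range(start, t), t

def directional_accuracy(actual, predicted):
    """Share of steps where predicted and actual returns have the same sign."""
    hits = sum((a > 0) == (p > 0) for a, p in zip(actual, predicted))
    return hits / len(actual)

# Toy closing prices (hypothetical values, not NEPSE data).
prices = [100.0, 101.0, 100.5, 102.0, 101.0, 103.0]
returns = log_returns(prices)

expanding = [(list(tr), te) for tr, te in walk_forward_splits(len(returns), 3)]
rolling = [(list(tr), te) for tr, te in walk_forward_splits(len(returns), 3, window=2)]

print(expanding)  # training set grows by one observation per step
print(rolling)    # training set always holds the 2 most recent observations
```

In a full pipeline, a model would be refit (or updated) on each training slice and asked for a single prediction at the test index. Because log-returns are additive, summing them recovers the log of the overall price ratio, which is what makes reconstructing closing prices from predicted returns possible.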

January 13, 2026 · 3 min · Research Team

Predictive Performance of LSTM Networks on Sectoral Stocks in an Emerging Market: A Case Study of the Pakistan Stock Exchange

ArXiv ID: 2509.14401. Authors: Ahad Yaqoob, Syed M. Abdullah. Abstract: The application of deep learning models for stock price forecasting in emerging markets remains underexplored despite their potential to capture complex temporal dependencies. This study develops and evaluates a Long Short-Term Memory (LSTM) network model for predicting the closing prices of ten major stocks across diverse sectors of the Pakistan Stock Exchange (PSX). Utilizing historical OHLCV data and an extensive set of engineered technical indicators, we trained and validated the model on a multi-year dataset. Our results demonstrate strong predictive performance ($R^2 > 0.87$) for stocks in stable, high-liquidity sectors such as power generation, cement, and fertilizers. Conversely, stocks characterized by high volatility, low liquidity, or sensitivity to external shocks (e.g., global oil prices) presented significant forecasting challenges. The study provides a replicable framework for LSTM-based forecasting in data-scarce emerging markets and discusses implications for investors and future research. ...

September 17, 2025 · 2 min · Research Team

Equity Premium Prediction: Taking into Account the Role of Long, even Asymmetric, Swings in Stock Market Behavior

ArXiv ID: 2509.10483. Authors: Kuok Sin Un, Marcel Ausloos. Abstract: Through a novel approach, this paper shows that substantial change in stock market behavior has a statistically and economically significant impact on equity risk premium predictability in both in-sample and out-of-sample cases. In line with Auer’s “Bullish ratio”, a “Bullish index” is introduced to measure the changes in stock market behavior, which we describe through a “fluctuation detrending moving average analysis” (FDMAA) for returns. We consider 28 indicators. We find that a “positive shock” of the Bullish Index is closely related to strong equity risk premium predictability for forecasts based on macroeconomic variables for up to six months. In contrast, a “negative shock” is associated with strong equity risk premium predictability with adequate forecasts for up to nine months when based on technical indicators. ...

August 29, 2025 · 2 min · Research Team

QTMRL: An Agent for Quantitative Trading Decision-Making Based on Multi-Indicator Guided Reinforcement Learning

ArXiv ID: 2508.20467. Authors: Xiangdong Liu, Jiahao Chen. Abstract: In the highly volatile and uncertain global financial markets, traditional quantitative trading models relying on statistical modeling or empirical rules often fail to adapt to dynamic market changes and black swan events due to rigid assumptions and limited generalization. To address these issues, this paper proposes QTMRL (Quantitative Trading Multi-Indicator Reinforcement Learning), an intelligent trading agent combining multi-dimensional technical indicators with reinforcement learning (RL) for adaptive and stable portfolio management. We first construct a comprehensive multi-indicator dataset using 23 years of S&P 500 daily OHLCV data (2000-2022) for 16 representative stocks across 5 sectors, enriching raw data with trend, volatility, and momentum indicators to capture holistic market dynamics. Then we design a lightweight RL framework based on the Advantage Actor-Critic (A2C) algorithm, including data processing, A2C algorithm, and trading agent modules to support policy learning and actionable trading decisions. Extensive experiments compare QTMRL with 9 baselines (e.g., ARIMA, LSTM, moving average strategies) across diverse market regimes, verifying its superiority in profitability, risk adjustment, and downside risk control. The code of QTMRL is publicly available at https://github.com/ChenJiahaoJNU/QTMRL.git ...

August 28, 2025 · 2 min · Research Team

Enhancing Trading Performance Through Sentiment Analysis with Large Language Models: Evidence from the S&P 500

ArXiv ID: 2507.09739. Authors: Haojie Liu, Zihan Lin, Randall R. Rojas. Abstract: This study integrates real-time sentiment analysis of financial news, using GPT-2 and FinBERT, with technical indicators and time-series models like ARIMA and ETS to optimize S&P 500 trading strategies. Strategies that merge sentiment data with momentum and trend-based metrics, alongside a benchmark buy-and-hold and a sentiment-based approach, are evaluated through asset values and returns. Results show that combining sentiment-driven insights with traditional models improves trading performance, offering a more dynamic approach to stock trading that adapts to market changes in volatile environments. ...

July 13, 2025 · 2 min · Research Team

Explainable-AI powered stock price prediction using time series transformers: A Case Study on BIST100

ArXiv ID: 2506.06345. Authors: Sukru Selim Calik, Andac Akyuz, Zeynep Hilal Kilimci, Kerem Colak. Abstract: Financial literacy is increasingly dependent on the ability to interpret complex financial data and utilize advanced forecasting tools. In this context, this study proposes a novel approach that combines transformer-based time series models with explainable artificial intelligence (XAI) to enhance the interpretability and accuracy of stock price predictions. The analysis focuses on the daily stock prices of the five highest-volume banks listed in the BIST100 index, along with XBANK and XU100 indices, covering the period from January 2015 to March 2025. Models including DLinear, LTSNet, Vanilla Transformer, and Time Series Transformer are employed, with input features enriched by technical indicators. SHAP and LIME techniques are used to provide transparency into the influence of individual features on model outputs. The results demonstrate the strong predictive capabilities of transformer models and highlight the potential of interpretable machine learning to empower individuals in making informed investment decisions and actively engaging in financial markets. ...

June 1, 2025 · 2 min · Research Team

An Advanced Ensemble Deep Learning Framework for Stock Price Prediction Using VAE, Transformer, and LSTM Model

ArXiv ID: 2503.22192. Authors: Unknown. Abstract: This research proposes a cutting-edge ensemble deep learning framework for stock price prediction by combining three advanced neural network architectures: Variational Autoencoder (VAE), Transformer, and Long Short-Term Memory (LSTM) networks. The framework aims to exploit the advantages of each model in order to identify both linear and non-linear relations in stock price movements. To improve the accuracy of its predictions, it uses a rich set of technical indicators and scales its predictors based on the current market situation. When the framework is tested on several stock data sets and benchmarked against single models and conventional forecasting, the ensemble method exhibits consistently high accuracy and reliability. The VAE is able to learn a linear representation of high-dimensional data, while the Transformer excels at recognizing long-term patterns in the stock price data. The LSTM, as a model designed to handle sequences, brings additional improvements to the framework, especially regarding temporal dynamics and fluctuations. Combined, these components provide exceptional directional performance and a very small disparity in the predicted results. The solution offers a plausible approach to the inherent problem of stock price prediction with high reliability and scalability. Compared with individual neural network models as well as classical methods, the proposed ensemble framework demonstrates the advantages of combining different architectures. It has important applications in algorithmic trading, risk analysis, and control and decision-making for finance professionals and scholars. ...

March 28, 2025 · 2 min · Research Team

An End-To-End LLM Enhanced Trading System

ArXiv ID: 2502.01574. Authors: Unknown. Abstract: This project introduces an end-to-end trading system that leverages Large Language Models (LLMs) for real-time market sentiment analysis. By synthesizing data from financial news and social media, the system integrates sentiment-driven insights with technical indicators to generate actionable trading signals. FinGPT serves as the primary model for sentiment analysis, ensuring domain-specific accuracy, while Kubernetes is used for scalable and efficient deployment. ...

February 3, 2025 · 1 min · Research Team

Forecasting S&P 500 Using LSTM Models

ArXiv ID: 2501.17366. Authors: Unknown. Abstract: With the volatile and complex nature of financial data influenced by external factors, forecasting the stock market is challenging. Traditional models such as ARIMA and GARCH perform well with linear data but struggle with non-linear dependencies. Machine learning and deep learning models, particularly Long Short-Term Memory (LSTM) networks, address these challenges by capturing intricate patterns and long-term dependencies. This report compares ARIMA and LSTM models in predicting the S&P 500 index, a major financial benchmark. Using historical price data and technical indicators, we evaluated these models using Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE). The ARIMA model showed reasonable performance with an MAE of 462.1, RMSE of 614, and 89.8 percent accuracy, effectively capturing short-term trends but limited by its linear assumptions. The LSTM model, leveraging sequential processing capabilities, outperformed ARIMA with an MAE of 369.32, RMSE of 412.84, and 92.46 percent accuracy, capturing both short- and long-term dependencies. Notably, the LSTM model without additional features performed best, achieving an MAE of 175.9, RMSE of 207.34, and 96.41 percent accuracy, showcasing its ability to handle market data efficiently. Accurately predicting stock movements is crucial for investment strategies, risk assessments, and market stability. Our findings confirm the potential of deep learning models in handling volatile financial data compared to traditional ones. The results highlight the effectiveness of LSTM and suggest avenues for further improvements. This study provides insights into financial forecasting, offering a comparative analysis of ARIMA and LSTM while outlining their strengths and limitations. ...
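For readers comparing the MAE and RMSE figures quoted above, here is a minimal sketch of how the two metrics are computed. The function names and the toy index levels are assumptions for illustration, not values from the paper. Both metrics are in the same units as the target (index points here), and RMSE is never smaller than MAE because squaring weights large misses more heavily.

```python
import math

def mae(actual, predicted):
    """Mean Absolute Error: average size of the forecast miss."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root Mean Squared Error: like MAE, but penalizes large misses more."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))

# Hypothetical index levels and forecasts (not S&P 500 data).
actual = [100.0, 102.0, 104.0, 103.0]
predicted = [101.0, 101.0, 107.0, 103.0]

print(mae(actual, predicted))   # 1.25
print(rmse(actual, predicted))  # about 1.658, pulled up by the 3-point miss
```

A large gap between RMSE and MAE suggests that a few large errors dominate, which is worth checking before ranking models on either metric alone.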

January 29, 2025 · 2 min · Research Team

Risk-Adjusted Performance of Random Forest Models in High-Frequency Trading

ArXiv ID: 2412.15448. Authors: Unknown. Abstract: Because of the theoretical challenges posed by the Efficient Market Hypothesis to technical analysis, the effectiveness of technical indicators in high-frequency trading remains inadequately explored, particularly at the minute-level frequency, where effects of the microstructure of the market dominate. This study evaluates the integration of traditional technical indicators with random forest regression models using minute-level SPY data, analyzing 13 distinct model configurations. Our empirical results reveal a stark contrast between in-sample and out-of-sample performance, with $R^2$ values deteriorating from 0.749–0.812 during training to negative values in testing. A feature importance analysis demonstrates that primary price-based features dominate the predictions made by the model, accounting for over 60% of the importance, while established technical indicators, such as RSI and Bollinger Bands, account for only 14%–15%. Although the indicator-enhanced models achieved superior risk-adjusted metrics, with Rachev ratios between 0.919 and 0.961, they consistently underperformed a simple buy-and-hold strategy, generating returns ranging from -2.4% to -3.9%. These findings challenge conventional assumptions about the usefulness of technical indicators in algorithmic trading, suggesting that in high-frequency contexts, they may be more relevant to risk management than to predicting returns. For practitioners and researchers, our findings indicate that successful high-frequency trading strategies should focus on adaptive feature selection and regime-specific modeling rather than relying on traditional technical indicators, and they underscore the critical importance of robust out-of-sample testing in model development. ...
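The negative out-of-sample $R^2$ reported above can look surprising, so a small sketch may help. Assuming the standard definition $R^2 = 1 - SS_{res}/SS_{tot}$ (the abstract does not state which variant the authors use), the score drops below zero whenever the model's squared error on the test set exceeds that of simply predicting the test-set mean. The function name and numbers below are hypothetical.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

# Hypothetical held-out targets and three sets of forecasts.
test_actual = [1.0, 2.0, 3.0]
close_fit = [1.1, 1.9, 3.2]      # tracks the targets closely
mean_only = [2.0, 2.0, 2.0]      # always predicts the test-set mean
overfit_miss = [3.0, 1.0, 0.0]   # worse than predicting the mean

print(r_squared(test_actual, close_fit))     # close to 1
print(r_squared(test_actual, mean_only))     # 0.0
print(r_squared(test_actual, overfit_miss))  # -6.0
```

Read this way, a negative test-set $R^2$ alongside a high training $R^2$ is a typical signature of overfitting: the model has memorized training noise that does not carry over to unseen data.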

December 19, 2024 · 2 min · Research Team