Cash Flow Underwriting with Bank Transaction Data: Advancing MSME Financial Inclusion in Malaysia

ArXiv ID: 2510.16066
Authors: Chun Chet Ng, Wei Zeng Low, Jia Yu Lim, Yin Yin Boon

Abstract: Despite accounting for 96.1% of all businesses in Malaysia, access to financing remains one of the most persistent challenges faced by Micro, Small, and Medium Enterprises (MSMEs). Newly established businesses are often excluded from formal credit markets as traditional underwriting approaches rely heavily on credit bureau data. This study investigates the potential of bank statement data as an alternative data source for credit assessment to promote financial inclusion in emerging markets. First, we propose a cash flow-based underwriting pipeline where we utilise bank statement data for end-to-end data extraction and machine learning credit scoring. Second, we introduce a novel dataset of 611 loan applicants from a Malaysian lending institution. Third, we develop and evaluate credit scoring models based on application information and bank transaction-derived features. Empirical results show that the use of such data boosts the performance of all models on our dataset, which can improve credit scoring for new-to-lending MSMEs. Finally, we will release the anonymised bank transaction dataset to facilitate further research on MSME financial inclusion within Malaysia’s emerging economy. ...

October 17, 2025 · 2 min · Research Team

Enhancing OHLC Data with Timing Features: A Machine Learning Evaluation

ArXiv ID: 2509.16137
Authors: Ruslan Tepelyan

Abstract: OHLC bar data is a widely used format for representing financial asset prices over time due to its balance of simplicity and informativeness. Bloomberg has recently introduced a new bar data product that includes additional timing information, specifically the timestamps of the open, high, low, and close prices within each bar. In this paper, we investigate the impact of incorporating this timing data into machine learning models for predicting volume-weighted average price (VWAP). Our experiments show that including these features consistently improves predictive performance across multiple ML architectures. We observe gains across several key metrics, including log-likelihood, mean squared error (MSE), $R^2$, conditional variance estimation, and directional accuracy. ...

September 19, 2025 · 2 min · Research Team
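
For readers unfamiliar with the prediction target in the entry above, VWAP is the volume-weighted mean of trade prices over an interval. The following is a minimal Python sketch of that definition only. The helper name `vwap` and the sample numbers are illustrative assumptions, not taken from the paper or from any Bloomberg product.

```python
def vwap(prices, volumes):
    """Volume-weighted average price: sum(p_i * v_i) / sum(v_i).

    `prices` and `volumes` are parallel sequences for the trades in one bar.
    The function name and interface are hypothetical, for illustration only.
    """
    if len(prices) != len(volumes):
        raise ValueError("prices and volumes must have the same length")
    total_volume = sum(volumes)
    if total_volume <= 0:
        raise ValueError("total volume must be positive")
    return sum(p * v for p, v in zip(prices, volumes)) / total_volume


# Made-up trades: the heavily traded middle price dominates the average.
example = vwap([100.0, 101.0, 102.0], [10, 20, 10])
print(example)  # 101.0
```

Because the weights are volumes, a bar's VWAP can sit far from the midpoint of its high and low when most trading happens near one extreme, which is plausibly why knowing when the high and low occurred could carry information.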

An Enhanced Focal Loss Function to Mitigate Class Imbalance in Auto Insurance Fraud Detection with Explainable AI

ArXiv ID: 2508.02283
Authors: Francis Boabang, Samuel Asante Gyamerah

Abstract: In insurance fraud prediction, handling class imbalance remains a critical challenge. This paper presents a novel multistage focal loss function designed to enhance the performance of machine learning models in such imbalanced settings by helping to escape local minima and converge to a good solution. Building upon the foundation of the standard focal loss, our proposed approach introduces a dynamic, multi-stage convex and nonconvex mechanism that progressively adjusts the focus on hard-to-classify samples across training epochs. This strategic refinement facilitates more stable learning and improved discrimination between fraudulent and legitimate cases. Through extensive experimentation on a real-world insurance dataset, our method achieved better performance than the traditional focal loss, as measured by accuracy, precision, F1-score, recall and Area Under the Curve (AUC) metrics on the auto insurance dataset. These results demonstrate the efficacy of the multistage focal loss in boosting model robustness and predictive accuracy in highly skewed classification tasks, offering significant implications for fraud detection systems in the insurance industry. An explainable model is included to interpret the results. ...

August 4, 2025 · 2 min · Research Team
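
The entry above builds on the standard focal loss, which down-weights easy examples by a factor of $(1 - p_t)^\gamma$. As a point of reference, here is a minimal Python sketch of the standard binary focal loss only; it is not the paper's multistage variant, whose schedule is not described in the excerpt. The function name and the default values of `gamma` and `alpha` are illustrative assumptions.

```python
import math


def binary_focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-12):
    """Standard binary focal loss for one example (not the multistage variant).

    p     : predicted probability of the positive class (e.g. fraud)
    y     : true label, 1 or 0
    gamma : focusing parameter; gamma = 0 recovers weighted cross-entropy
    alpha : weight on the positive class (assumed default, for illustration)
    """
    p = min(max(p, eps), 1.0 - eps)           # clamp to avoid log(0)
    p_t = p if y == 1 else 1.0 - p            # probability given to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# A confidently correct prediction contributes far less than a badly wrong one.
easy = binary_focal_loss(0.9, 1)
hard = binary_focal_loss(0.1, 1)
print(easy < hard)  # True
```

With `gamma = 0` the modulating factor is 1 and the expression reduces to class-weighted cross-entropy, which is a convenient sanity check when experimenting with variants.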

Isotonic Quantile Regression Averaging for uncertainty quantification of electricity price forecasts

ArXiv ID: 2507.15079
Authors: Arkadiusz Lipiecki, Bartosz Uniejewski

Abstract: Quantifying the uncertainty of forecasting models is essential to assess and mitigate the risks associated with data-driven decisions, especially in volatile domains such as electricity markets. Machine learning methods can provide highly accurate electricity price forecasts, critical for informing the decisions of market participants. However, these models often lack uncertainty estimates, which limits the ability of decision makers to avoid unnecessary risks. In this paper, we propose a novel method for generating probabilistic forecasts from ensembles of point forecasts, called Isotonic Quantile Regression Averaging (iQRA). Building on the established framework of Quantile Regression Averaging (QRA), we introduce stochastic order constraints to improve forecast accuracy, reliability, and computational costs. In an extensive forecasting study of the German day-ahead electricity market, we show that iQRA consistently outperforms state-of-the-art postprocessing methods in terms of both reliability and sharpness. It produces well-calibrated prediction intervals across multiple confidence levels, providing superior reliability to all benchmark methods, particularly coverage-based conformal prediction. In addition, isotonic regularization decreases the complexity of the quantile regression problem and offers a hyperparameter-free approach to variable selection. ...

July 20, 2025 · 2 min · Research Team

Predicting Realized Variance Out of Sample: Can Anything Beat The Benchmark?

ArXiv ID: 2506.07928
Authors: Austin Pollok

Abstract: The discrepancy between realized volatility and the market’s view of volatility has been known to predict individual equity options at the monthly horizon. It is not clear how this predictability depends on a forecast’s ability to predict firm-level volatility. We consider this phenomenon at the daily frequency using high-dimensional machine learning models, as well as low-dimensional factor models. We find that marginal improvements to standard forecast error measurements can lead to economically significant gains in portfolio performance. This makes a case for re-imagining the way we train models that are used to construct portfolios. ...

June 9, 2025 · 2 min · Research Team

Comparative analysis of financial data differentiation techniques using LSTM neural network

ArXiv ID: 2505.19243
Authors: Dominik Stempień, Janusz Gajda

Abstract: We compare the traditional approach of computing logarithmic returns with the fractional differencing method and its tempered extension as methods of data preparation before their usage in advanced machine learning models. Differencing parameters are estimated using multiple techniques. The empirical investigation is conducted on data from four major stock indices covering the most recent 10-year period. The set of explanatory variables is additionally extended with technical indicators. The effectiveness of the differencing methods is evaluated using both forecast error metrics and risk-adjusted return trading performance metrics. The findings suggest that fractional differentiation methods provide a suitable data transformation technique, improving the predictive model forecasting performance. Furthermore, the generated predictions appeared to be effective in constructing profitable trading strategies for both individual assets and a portfolio of stock indices. These results underline the importance of appropriate data transformation techniques in financial time series forecasting, supporting the application of memory-preserving techniques. ...

May 25, 2025 · 2 min · Research Team
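
Fractional differencing, mentioned in the entry above, applies the operator $(1 - B)^d$ with a non-integer order $d$, so the series is differenced only partially and retains some memory. The sketch below uses the usual binomial-expansion weights with a fixed truncation window. The function names, the window-based truncation, and the sample series are illustrative assumptions; the paper's parameter estimation and its tempered extension are not reproduced here.

```python
def frac_diff_weights(d, n_weights):
    """First n_weights coefficients of (1 - B)^d.

    Recurrence: w_0 = 1, w_k = -w_{k-1} * (d - k + 1) / k.
    """
    weights = [1.0]
    for k in range(1, n_weights):
        weights.append(-weights[-1] * (d - k + 1) / k)
    return weights


def frac_diff(series, d, window):
    """Fixed-window fractional difference (a common truncation, assumed here).

    Each output value is sum_k w_k * x_{t-k} over the last `window` points,
    so the first window - 1 observations produce no output.
    """
    w = frac_diff_weights(d, window)
    return [
        sum(w[k] * series[t - k] for k in range(window))
        for t in range(window - 1, len(series))
    ]


# d = 1 recovers the ordinary first difference.
print(frac_diff([1.0, 2.0, 4.0, 7.0], d=1.0, window=2))  # [1.0, 2.0, 3.0]
# A fractional d gives slowly decaying weights instead of a hard cut-off.
print(frac_diff_weights(0.5, 4))  # [1.0, -0.5, -0.125, -0.0625]
```

Applying the same function to log prices with `d=1.0` yields ordinary log returns, which makes the two preparation methods compared in the paper directly comparable within one code path.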

Forecasting Intraday Volume in Equity Markets with Machine Learning

ArXiv ID: 2505.08180
Authors: Mihai Cucuringu, Kang Li, Chao Zhang

Abstract: This study focuses on forecasting intraday trading volumes, a crucial component for portfolio implementation, especially in high-frequency (HF) trading environments. Given the current scarcity of flexible methods in this area, we employ a suite of machine learning (ML) models enriched with numerous HF predictors to enhance the predictability of intraday trading volumes. Our findings reveal that intraday stock trading volume is highly predictable, especially with ML and considering commonality. Additionally, we assess the economic benefits of accurate volume forecasting through Volume Weighted Average Price (VWAP) strategies. The results demonstrate that precise intraday forecasting offers substantial advantages, providing valuable insights for traders to optimize their strategies. ...

May 13, 2025 · 2 min · Research Team

On Bitcoin Price Prediction

ArXiv ID: 2504.18982
Authors: Grégory Bournassenko

Abstract: In recent years, cryptocurrencies have attracted growing attention from both private investors and institutions. Among them, Bitcoin stands out for its impressive volatility and widespread influence. This paper explores the predictability of Bitcoin’s price movements, drawing a parallel with traditional financial markets. We examine whether the cryptocurrency market operates under the efficient market hypothesis (EMH) or if inefficiencies still allow opportunities for arbitrage. Our methodology combines theoretical reviews, empirical analyses, machine learning approaches, and time series modeling to assess the extent to which Bitcoin’s price can be predicted. We find that while, in general, the Bitcoin market tends toward efficiency, specific conditions, including information asymmetries and behavioral anomalies, occasionally create exploitable inefficiencies. However, these opportunities remain difficult to systematically identify and leverage. Our findings have implications for both investors and policymakers, particularly regarding the regulation of cryptocurrency brokers and derivatives markets. ...

April 26, 2025 · 2 min · Research Team

The impact of economic policies on housing prices. Approximations and predictions in the UK, the US, France, and Switzerland from the 1980s to today

ArXiv ID: 2505.09620
Authors: Unknown

Abstract: I show that house prices can be modeled using machine learning (kNN and tree-bagging) and a small dataset composed of macro-economic factors (MEF), including an inflation metric (CPI), US treasury rates (10-yr), Gross Domestic Product (GDP), and portfolio size of central banks (ECB, FED). This set of parameters covers all the parties involved in a transaction (buyer, seller, and financing facility) while ignoring the intrinsic properties of each asset and encompassing local (inflation) and liquidity issues that may impede each transaction composing a market. The model here takes the point of view of a real estate trader who is interested in both the financing and the price of the transaction. Machine Learning allows for the discrimination of two periods within the dataset. Unconventional policies of central banks may have allowed some institutional investors to arbitrage between real estate returns and other bond markets (sovereign and corporate). Finally, to assess the models’ relative performances, I performed various sensitivity tests, which tend to constrain the possibilities of each approach for each need. I also show that some models can predict the evolution of prices over the next 4 quarters with uncertainties that outperform existing index uncertainties. ...

April 10, 2025 · 2 min · Research Team

Deep Hedging of Green PPAs in Electricity Markets

ArXiv ID: 2503.13056
Authors: Unknown

Abstract: In power markets, Green Power Purchase Agreements have become an important contractual tool of the energy transition from fossil fuels to renewable sources such as wind or solar radiation. Trading Green PPAs exposes agents to price risks and weather risks. Also, developed electricity markets feature the so-called cannibalisation effect: large infeeds induce low prices and vice versa. As weather is a non-tradable entity, the question arises how to hedge and risk-manage in this highly incomplete setting. We propose a “deep hedging” framework utilising machine learning methods to construct hedging strategies. The resulting strategies outperform static and dynamic benchmark strategies with respect to different risk measures. ...

March 17, 2025 · 2 min · Research Team