Meta-Learning

Temporal-Aligned Meta-Learning for Risk Management: A Stacking Approach for Multi-Source Credit Scoring

ArXiv ID: 2601.07588
Authors: O. Didkovskyi, A. Vidali, N. Jean, G. Le Pera

Abstract: This paper presents a meta-learning framework for credit risk assessment of Italian Small and Medium Enterprises (SMEs) that explicitly addresses the temporal misalignment of credit scoring models. The approach aligns financial statement reference dates with evaluation dates, mitigating bias arising from publication delays and asynchronous data sources. It is based on a two-step temporal decomposition that first estimates annual probabilities of default (PDs) anchored to balance-sheet reference dates (December 31st) through a static model. Then it models the monthly evolution of PDs using higher-frequency behavioral data. Finally, we employ a stacking-based architecture to aggregate multiple scoring systems, each capturing complementary aspects of default risk, into a unified predictive model. In this way, first-level model outputs are treated as learned representations that encode non-linear relationships in financial and behavioral indicators, allowing integration of new expert-based features without retraining base models. This design provides a coherent and interpretable solution to challenges typical of low-default environments, including heterogeneous default definitions and reporting delays. Empirical validation shows that the framework effectively captures credit risk evolution over time, improving temporal consistency and predictive stability relative to standard ensemble methods. ...
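The stacking step described in this abstract can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the data are synthetic, `pd_financial` and `pd_behavioral` stand in for the outputs of two frozen base scoring models, `expert_flag` stands in for a newly added expert feature, and the meta-learner is a plain logistic regression fitted by gradient descent. It is not the paper's model. In practice the base-model outputs fed to the meta-learner would usually be out-of-sample predictions to avoid leakage.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logit(p, eps=1e-6):
    p = np.clip(p, eps, 1 - eps)
    return np.log(p / (1 - p))

def fit_meta_logistic(X, y, lr=0.1, n_iter=5000):
    """Fit a logistic-regression meta-learner by plain gradient descent."""
    Xb = np.column_stack([np.ones(len(X)), X])  # prepend intercept column
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = sigmoid(Xb @ w)
        w -= lr * Xb.T @ (p - y) / len(y)       # gradient of the mean log-loss
    return w

def predict_meta(w, X):
    Xb = np.column_stack([np.ones(len(X)), X])
    return sigmoid(Xb @ w)

# Synthetic data (assumption): a latent credit-quality factor drives defaults.
rng = np.random.default_rng(0)
n = 500
latent = rng.normal(size=n)
y = (rng.random(n) < sigmoid(-2.0 + 1.5 * latent)).astype(float)

# Hypothetical outputs of two frozen base models (noisy views of the latent factor).
pd_financial = sigmoid(-2.0 + 1.0 * latent + rng.normal(scale=0.8, size=n))
pd_behavioral = sigmoid(-2.0 + 1.2 * latent + rng.normal(scale=0.6, size=n))

# Hypothetical expert feature added at the meta level only; base models are untouched.
expert_flag = (latent + rng.normal(size=n) > 1.0).astype(float)

# Meta-features: base-model PDs on the logit scale plus the new expert feature.
X_meta = np.column_stack([logit(pd_financial), logit(pd_behavioral), expert_flag])
w = fit_meta_logistic(X_meta, y)
stacked_pd = predict_meta(w, X_meta)
```

Because only the small meta-learner is refitted, adding or dropping a column in `X_meta` does not require touching the base models, which is the property the abstract highlights.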

Bayesian Portfolio Optimization by Predictive Synthesis

ArXiv ID: 2510.07180
Authors: Masahiro Kato, Kentaro Baba, Hibiki Kaibuchi, Ryo Inokuchi

Abstract: Portfolio optimization is a critical task in investment. Most existing portfolio optimization methods require information on the distribution of returns of the assets that make up the portfolio. However, such distribution information is usually unknown to investors. Various methods have been proposed to estimate distribution information, but their accuracy greatly depends on the uncertainty of the financial markets. Due to this uncertainty, a model that predicts the distribution information well at one point in time may be less accurate than another model at a different time. To solve this problem, we investigate a method for portfolio optimization based on Bayesian predictive synthesis (BPS), one of the Bayesian ensemble methods for meta-learning. We assume that investors have access to multiple asset return prediction models. By using BPS with dynamic linear models to combine these predictions, we can obtain a Bayesian predictive posterior about the mean rewards of assets that accommodates the uncertainty of the financial markets. In this study, we examine how to construct mean-variance portfolios and quantile-based portfolios based on the predicted distribution information. ...
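As a rough illustration of the last step, the sketch below turns combined forecasts into a fully invested mean-variance portfolio. It is a deliberate simplification: `synthesize_mean` uses fixed combination weights as a static stand-in for BPS (which would update the synthesis dynamically through a dynamic linear model), and all numbers, function names, and the diagonal covariance are assumptions for illustration.

```python
import numpy as np

def synthesize_mean(agent_means, weights):
    """Fixed-weight combination of agent forecasts (a static stand-in for BPS)."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    return weights @ np.asarray(agent_means, dtype=float)

def mean_variance_weights(mu, cov):
    """Fully invested mean-variance weights, proportional to inv(cov) @ mu.

    Assumes the entries of inv(cov) @ mu sum to a positive number.
    """
    raw = np.linalg.solve(cov, mu)
    return raw / raw.sum()

# Two hypothetical prediction models, three assets (illustrative numbers only).
agent_means = [[0.020, 0.010, 0.015],
               [0.010, 0.020, 0.012]]
mu = synthesize_mean(agent_means, weights=[0.6, 0.4])
cov = np.diag([0.04, 0.02, 0.03])  # assumed predictive covariance
portfolio = mean_variance_weights(mu, cov)
```

In a full BPS treatment both `mu` and `cov` would come from the synthesized predictive posterior at each date, so the portfolio would be recomputed as that posterior is updated.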

Meta-Learning Neural Process for Implied Volatility Surfaces with SABR-induced Priors

ArXiv ID: 2509.11928
Authors: Jirong Zhuang, Xuan Wu

Abstract: We treat implied volatility surface (IVS) reconstruction as a learning problem guided by two principles. First, we adopt a meta-learning view that trains across trading days to learn a procedure that maps sparse option quotes to a full IVS via conditional prediction, avoiding per-day calibration at test time. Second, we impose a structural prior via transfer learning: pre-train on a SABR-generated dataset to encode a geometric prior, then fine-tune on a historical market dataset to align with empirical patterns. We implement both principles in a single attention-based Neural Process (Volatility Neural Process, VolNP) that produces a complete IVS from a sparse context set in one forward pass. On SPX options, the VolNP outperforms SABR, SSVI, and a Gaussian process. Relative to an ablation trained only on market data, the SABR-induced prior reduces RMSE by about 40% and suppresses large errors, with pronounced gains at long maturities where quotes are sparse. The resulting model is fast (single pass), stable (no daily recalibration), and practical for deployment at scale. ...
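To make "conditional prediction from a sparse context set in one forward pass" concrete, here is a minimal sketch of a fixed-kernel attention readout (a Nadaraya-Watson smoother). It is not VolNP: nothing is learned, and the function name, the `length_scale` value, and the quotes are hypothetical. It only shows the shape of the computation, where each target point attends over the observed quotes.

```python
import numpy as np

def attention_readout(ctx_x, ctx_y, tgt_x, length_scale=0.25):
    """Predict values at target points as attention-weighted averages of context values.

    ctx_x: (n_ctx, 2) context inputs, here (log-moneyness, maturity in years)
    ctx_y: (n_ctx,)   observed implied volatilities
    tgt_x: (n_tgt, 2) points where the surface is requested
    """
    # Squared distance between every target and every context point.
    d2 = ((tgt_x[:, None, :] - ctx_x[None, :, :]) ** 2).sum(axis=-1)
    scores = -d2 / (2.0 * length_scale ** 2)
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability for softmax
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)       # each row sums to 1
    return attn @ ctx_y

# Hypothetical sparse quotes for one trading day.
ctx_x = np.array([[-0.2, 0.25], [0.0, 0.25], [0.2, 0.25], [0.0, 1.0]])
ctx_y = np.array([0.24, 0.20, 0.22, 0.21])
tgt_x = np.array([[0.0, 0.25], [0.1, 0.6]])
preds = attention_readout(ctx_x, ctx_y, tgt_x)
```

A trained neural process would replace the fixed distance-based scores with learned query, key, and value encoders, so the mapping from quotes to surface is shared across trading days rather than recalibrated each day.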

FinFlowRL: An Imitation-Reinforcement Learning Framework for Adaptive Stochastic Control in Finance

ArXiv ID: 2510.15883
Authors: Yang Li, Zhi Chen

Abstract: Traditional stochastic control methods in finance struggle in real-world markets due to their reliance on simplifying assumptions and stylized frameworks. Such methods typically perform well in specific, well-defined environments but yield suboptimal results in changed, non-stationary ones. We introduce FinFlowRL, a novel framework for financial optimal stochastic control. The framework pretrains an adaptive meta-policy by learning from multiple expert strategies, then fine-tunes through reinforcement learning in the noise space to optimize the generative process. By employing action chunking, generating action sequences rather than single decisions, it addresses the non-Markovian nature of markets. FinFlowRL consistently outperforms individually optimized experts across diverse market conditions. ...
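Action chunking can be sketched as a control loop in which the policy is queried once per chunk and the whole action sequence is executed before the next query. The sketch below is a toy under stated assumptions: the names `run_with_action_chunks`, `equal_slice_policy`, and `sell` are hypothetical, and the "environment" is a deterministic inventory liquidation, not a market simulator or the FinFlowRL policy.

```python
from typing import Callable, List, Tuple

Policy = Callable[[float, int], List[float]]  # (state, chunk_size) -> action sequence
Step = Callable[[float, float], float]        # (state, action) -> next state

def run_with_action_chunks(policy: Policy, step: Step, state: float,
                           n_steps: int, chunk_size: int) -> Tuple[float, int]:
    """Execute n_steps, querying the policy only once per chunk of actions."""
    policy_calls = 0
    t = 0
    while t < n_steps:
        chunk = policy(state, chunk_size)
        if not chunk:
            raise ValueError("policy returned an empty action chunk")
        policy_calls += 1
        for action in chunk[: n_steps - t]:   # do not overshoot the horizon
            state = step(state, action)
            t += 1
    return state, policy_calls

def equal_slice_policy(inventory: float, chunk_size: int) -> List[float]:
    """Toy policy: plan to sell the remaining inventory in equal slices."""
    return [inventory / chunk_size] * chunk_size

def sell(inventory: float, quantity: float) -> float:
    """Toy transition: inventory decreases by the quantity sold."""
    return inventory - quantity

final_inventory, calls = run_with_action_chunks(
    equal_slice_policy, sell, state=9.0, n_steps=6, chunk_size=3
)
```

With six steps and a chunk size of three, the policy is consulted twice rather than six times; each chunk commits to a short plan instead of reacting to every single state.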

Reinforcement-Learning Portfolio Allocation with Dynamic Embedding of Market Information

ArXiv ID: 2501.17992
Authors: Unknown

Abstract: We develop a portfolio allocation framework that leverages deep learning techniques to address challenges arising from high-dimensional, non-stationary, and low-signal-to-noise market information. Our approach includes a dynamic embedding method that reduces the non-stationary, high-dimensional state space into a lower-dimensional representation. We design a reinforcement learning (RL) framework that integrates generative autoencoders and online meta-learning to dynamically embed market information, enabling the RL agent to focus on the most impactful parts of the state space for portfolio allocation decisions. Empirical analysis based on the top 500 U.S. stocks demonstrates that our framework outperforms common portfolio benchmarks and the predict-then-optimize (PTO) approach using machine learning, particularly during periods of market stress. Traditional factor models do not fully explain this superior performance. The framework’s ability to time volatility reduces its market exposure during turbulent times. Ablation studies confirm the robustness of this performance across various reinforcement learning algorithms. Additionally, the embedding and meta-learning techniques effectively manage the complexities of high-dimensional, noisy, and non-stationary financial data, enhancing both portfolio performance and risk management. ...

Designing Time-Series Models With Hypernetworks & Adversarial Portfolios

ArXiv ID: 2407.20352
Authors: Unknown

Abstract: This article describes the methods that achieved 4th and 6th place in the forecasting and investment challenges, respectively, of the M6 competition, ultimately securing 1st place in the overall duathlon ranking. In the forecasting challenge, we tested a novel meta-learning model that utilizes hypernetworks to design a parametric model tailored to a specific family of forecasting tasks. This approach allowed us to leverage similarities observed across individual forecasting tasks while also acknowledging potential heterogeneity in their data generating processes. The model’s training can be performed directly with backpropagation, eliminating the reliance on higher-order derivatives, and is equivalent to a simultaneous search over the space of parametric functions and their optimal parameter values. The proposed model’s capabilities extend beyond M6, demonstrating superiority over state-of-the-art meta-learning methods in the sinusoidal regression task and outperforming conventional parametric models on time-series from the M4 competition. In the investment challenge, we adjusted portfolio weights to induce greater or smaller correlation between our submission and that of other participants, depending on the current ranking, aiming to maximize the probability of achieving a good rank. ...
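The hypernetwork idea can be illustrated with a toy in which a shared linear map generates the coefficients of a per-task forecaster from a learnable task embedding, and everything is trained with first-order gradients only. All choices below are assumptions for illustration (synthetic AR(2)-style tasks, a linear hypernetwork, hand-derived gradients, the sizes and learning rate); this is not the architecture from the article.

```python
import numpy as np

rng = np.random.default_rng(0)
n_tasks, n_lags, emb_dim = 4, 2, 3

# Synthetic tasks (assumption): each task has its own linear coefficients on lagged inputs.
true_theta = rng.uniform(-0.5, 0.5, size=(n_tasks, n_lags))
X = rng.normal(size=(n_tasks, 200, n_lags))      # lagged inputs per task
Y = np.einsum("tnp,tp->tn", X, true_theta)       # noiseless targets

W = rng.normal(scale=0.1, size=(n_lags, emb_dim))  # shared hypernetwork weights
b = np.zeros(n_lags)                               # shared hypernetwork bias
Z = rng.normal(scale=0.1, size=(n_tasks, emb_dim)) # learnable task embeddings

def generated_params():
    """Hypernetwork forward pass: one coefficient vector per task."""
    return Z @ W.T + b

def mse():
    pred = np.einsum("tnp,tp->tn", X, generated_params())
    return float(np.mean((pred - Y) ** 2))

initial_loss = mse()
lr = 0.1
for _ in range(2000):
    theta = generated_params()
    err = np.einsum("tnp,tp->tn", X, theta) - Y
    # First-order gradients only: chain rule through theta = Z @ W.T + b.
    g_theta = 2.0 * np.einsum("tn,tnp->tp", err, X) / err.size
    g_W = g_theta.T @ Z
    g_b = g_theta.sum(axis=0)
    g_Z = g_theta @ W
    W -= lr * g_W
    b -= lr * g_b
    Z -= lr * g_Z
final_loss = mse()
```

The shared `W` and `b` capture what the tasks have in common, while each row of `Z` lets a task deviate from it; no inner-loop optimization or second-order derivative is involved.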

Generative Meta-Learning Robust Quality-Diversity Portfolio

ArXiv ID: 2307.07811
Authors: Unknown

Abstract: This paper proposes a novel meta-learning approach to optimize a robust portfolio ensemble. The method uses a deep generative model to generate diverse and high-quality sub-portfolios that are combined to form the ensemble portfolio. The generative model consists of a convolutional layer, a stateful LSTM module, and a dense network. During training, the model takes a randomly sampled batch of Gaussian noise and outputs a population of solutions, which are then evaluated using the objective function of the problem. The weights of the model are updated using a gradient-based optimizer. The convolutional layer transforms the noise into a desired distribution in latent space, while the LSTM module adds dependence between generations. The dense network decodes the population of solutions. The proposed method balances maximizing the performance of the sub-portfolios with minimizing their maximum correlation, resulting in a robust ensemble portfolio against systematic shocks. The approach was effective in experiments where stochastic rewards were present. Moreover, the results (Fig. 1) demonstrated that the ensemble portfolio obtained by taking the average of the generated sub-portfolio weights was robust and generalized well. The proposed method can be applied to problems where diversity is desired among co-optimized solutions for a robust ensemble. The source code and the dataset are in the supplementary material. ...
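The trade-off between sub-portfolio performance and maximum pairwise correlation can be written as a single score. The sketch below is one possible reading under stated assumptions: performance is measured by a Sharpe-like ratio, the `penalty` weight and function names are hypothetical, the returns are simulated, and the population is sampled at random rather than produced by the generative model described in the abstract.

```python
import numpy as np

def quality_diversity_score(weights, returns, penalty=1.0):
    """Mean sub-portfolio quality minus a penalty on the maximum pairwise correlation.

    weights: (n_sub, n_assets) sub-portfolio weights, each row summing to 1
    returns: (n_periods, n_assets) asset returns
    """
    sub_returns = returns @ weights.T                              # (n_periods, n_sub)
    quality = sub_returns.mean(axis=0) / sub_returns.std(axis=0)   # Sharpe-like ratio
    corr = np.corrcoef(sub_returns, rowvar=False)
    off_diagonal = corr[~np.eye(len(corr), dtype=bool)]
    return float(quality.mean() - penalty * off_diagonal.max())

def ensemble_weights(weights):
    """Ensemble portfolio: the average of the sub-portfolio weights."""
    return weights.mean(axis=0)

rng = np.random.default_rng(0)
returns = rng.normal(loc=0.001, scale=0.02, size=(250, 5))  # simulated daily returns
population = rng.dirichlet(np.ones(5), size=8)              # 8 random long-only sub-portfolios
score = quality_diversity_score(population, returns)
ensemble = ensemble_weights(population)
```

Averaging rows that each sum to one yields an ensemble that also sums to one, so the ensemble is itself a valid fully invested portfolio.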

DoubleAdapt: A Meta-learning Approach to Incremental Learning for Stock Trend Forecasting

ArXiv ID: 2306.09862
Authors: Unknown

Abstract: Stock trend forecasting is a fundamental task of quantitative investment where precise predictions of price trends are indispensable. In an online service, stock data arrive continuously over time. It is practical and efficient to incrementally update the forecast model with the latest data, which may reveal new patterns that recur in the future stock market. However, incremental learning for stock trend forecasting remains under-explored due to the challenge of distribution shifts (a.k.a. concept drifts). With the stock market dynamically evolving, the distribution of future data can slightly or significantly differ from incremental data, hindering the effectiveness of incremental updates. To address this challenge, we propose DoubleAdapt, an end-to-end framework with two adapters, which can effectively adapt the data and the model to mitigate the effects of distribution shifts. Our key insight is to automatically learn how to adapt stock data into a locally stationary distribution in favor of profitable updates. Complemented by data adaptation, we can confidently adapt the model parameters under mitigated distribution shifts. We cast each incremental learning task as a meta-learning task and automatically optimize the adapters for desirable data adaptation and parameter initialization. Experiments on real-world stock datasets demonstrate that DoubleAdapt achieves state-of-the-art predictive performance and shows considerable efficiency. ...