The Memorization Problem: Can We Trust LLMs' Economic Forecasts?

ArXiv ID: 2504.14765. Authors: Unknown. Abstract: Large language models (LLMs) cannot be trusted for economic forecasts during periods covered by their training data. Counterfactual forecasting ability is non-identified when the model has seen the realized values: any observed output is consistent with both genuine skill and memorization. Any evidence of memorization represents only a lower bound on encoded knowledge. We demonstrate LLMs have memorized economic and financial data, recalling exact values before their knowledge cutoff. Instructions to respect historical boundaries fail to prevent recall-level accuracy, and masking fails as LLMs reconstruct entities and dates from minimal context. Post-cutoff, we observe no recall. Memorization extends to embeddings. ...

April 20, 2025 · 2 min · Research Team

Federated Diffusion Modeling with Differential Privacy for Tabular Data Synthesis

ArXiv ID: 2412.16083. Authors: Unknown. Abstract: The increasing demand for privacy-preserving data analytics in various domains necessitates solutions for synthetic data generation that rigorously uphold privacy standards. We introduce the DP-FedTabDiff framework, a novel integration of Differential Privacy, Federated Learning and Denoising Diffusion Probabilistic Models designed to generate high-fidelity synthetic tabular data. This framework ensures compliance with privacy regulations while maintaining data utility. We demonstrate the effectiveness of DP-FedTabDiff on multiple real-world mixed-type tabular datasets, achieving significant improvements in privacy guarantees without compromising data quality. Our empirical evaluations reveal the optimal trade-offs between privacy budgets, client configurations, and federated optimization strategies. The results affirm the potential of DP-FedTabDiff to enable secure data sharing and analytics in highly regulated domains, paving the way for further advances in federated learning and privacy-preserving data synthesis. ...

December 20, 2024 · 2 min · Research Team

The Role of AI in Financial Forecasting: ChatGPT's Potential and Challenges

ArXiv ID: 2411.13562. Authors: Unknown. Abstract: The outlook for the future of artificial intelligence (AI) in the financial sector, especially in financial forecasting, the challenges and implications. The dynamics of AI technology, including deep learning, reinforcement learning, and integration with blockchain and the Internet of Things, also highlight the continued improvement in data processing capabilities. Explore how AI is reshaping financial services with precisely tailored services that can more precisely meet the diverse needs of individual investors. The integration of AI challenges regulatory and ethical issues in the financial sector, as well as the implications for data privacy protection. Analyze the limitations of current AI technology in financial forecasting and its potential impact on the future financial industry landscape, including changes in the job market, the emergence of new financial institutions, and user interface innovations. Emphasizing the importance of increasing investor understanding and awareness of AI and looking ahead to future trends in AI tools for user experience to drive wider adoption of AI in financial decision making. The huge potential, challenges, and future directions of AI in the financial sector highlight the critical role of AI technology in driving transformation and innovation in the financial sector ...

November 7, 2024 · 2 min · Research Team

Six Levels of Privacy: A Framework for Financial Synthetic Data

ArXiv ID: 2403.14724. Authors: Unknown. Abstract: Synthetic Data is increasingly important in financial applications. In addition to the benefits it provides, such as improved financial modeling and better testing procedures, it poses privacy risks as well. Such data may arise from client information, business information, or other proprietary sources that must be protected. Even though the process by which Synthetic Data is generated serves to obscure the original data to some degree, the extent to which privacy is preserved is hard to assess. Accordingly, we introduce a hierarchy of “levels” of privacy that are useful for categorizing Synthetic Data generation methods and the progressively improved protections they offer. While the six levels were devised in the context of financial applications, they may also be appropriate for other industries. Our paper includes: A brief overview of Financial Synthetic Data, how it can be used, how its value can be assessed, privacy risks, and privacy attacks. We close with details of the “Six Levels” that include defenses against those attacks. ...

March 20, 2024 · 2 min · Research Team

Transformers with Attentive Federated Aggregation for Time Series Stock Forecasting

ArXiv ID: 2402.06638. Authors: Unknown. Abstract: Recent innovations in transformers have shown their superior performance in natural language processing (NLP) and computer vision (CV). The ability to capture long-range dependencies and interactions in sequential data has also triggered a great interest in time series modeling, leading to the widespread use of transformers in many time series applications. However, being the most common and crucial application, the adaptation of transformers to time series forecasting has remained limited, with both promising and inconsistent results. In contrast to the challenges in NLP and CV, time series problems not only add the complexity of order or temporal dependence among input sequences but also consider trend, level, and seasonality information that much of this data is valuable for decision making. The conventional training scheme has shown deficiencies regarding model overfitting, data scarcity, and privacy issues when working with transformers for a forecasting task. In this work, we propose attentive federated transformers for time series stock forecasting with better performance while preserving the privacy of participating enterprises. Empirical results on various stock data from the Yahoo! Finance website indicate the superiority of our proposed scheme in dealing with the above challenges and data heterogeneity in federated learning. ...

January 22, 2024 · 2 min · Research Team