Credit Risk Estimation with Non-Financial Features: Evidence from a Synthetic Istanbul Dataset

ArXiv ID: 2512.12783
Authors: Atalay Denknalbant, Emre Sezdi, Zeki Furkan Kutlu, Polat Goktas
Abstract: Financial exclusion constrains entrepreneurship, increases income volatility, and widens wealth gaps. Underbanked consumers in Istanbul often have no bureau file because their earnings and payments flow through informal channels. To study how such borrowers can be evaluated, we create a synthetic dataset of one hundred thousand Istanbul residents that reproduces first-quarter 2025 TÜİK census marginals and telecom usage patterns. Retrieval-augmented generation feeds these public statistics into the OpenAI o3 model, which synthesises realistic yet private records. Each profile contains seven socio-demographic variables and nine alternative attributes that describe phone specifications, online shopping rhythm, subscription spend, car ownership, monthly rent, and a credit card flag. To test the impact of the alternative financial data, CatBoost, LightGBM, and XGBoost are each trained in two versions. Demo models use only the socio-demographic variables; Full models include both socio-demographic and alternative attributes. Across five-fold stratified validation, the alternative block raises area under the curve by about 1.3 percentage points and lifts balanced \(F_{1}\) from roughly 0.84 to 0.95, a fourteen percent gain. We contribute an open Istanbul 2025 Q1 synthetic dataset, a fully reproducible modeling pipeline, and empirical evidence that a concise set of behavioural attributes can approach bureau-level discrimination power while serving borrowers who lack formal credit records. These findings give lenders and regulators a transparent blueprint for extending fair and safe credit access to the underbanked. ...
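
The Demo-versus-Full comparison under stratified five-fold validation can be illustrated with a small standard-library sketch. Everything here is an assumption made for illustration: the toy data, the helper names `stratified_folds` and `auc`, and the fixed linear scores that stand in for the outputs of trained CatBoost, LightGBM, or XGBoost models. No model is trained, so this shows only the evaluation protocol, not the paper's pipeline.

```python
import random

def stratified_folds(labels, k, seed=0):
    """Assign every index to one of k folds, keeping class ratios roughly equal."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)  # round-robin within each class
    return folds

def auc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy data: one "demographic" signal and one "alternative" signal.
rng = random.Random(42)
rows = []
for _ in range(500):
    demo, alt = rng.gauss(0, 1), rng.gauss(0, 1)
    label = int(demo + alt + rng.gauss(0, 1) > 0)
    rows.append((demo, alt, label))

labels = [r[2] for r in rows]
demo_scores = [r[0] for r in rows]          # stand-in for a Demo model's output
full_scores = [r[0] + r[1] for r in rows]   # stand-in for a Full model's output

folds = stratified_folds(labels, k=5)

def mean_fold_auc(scores):
    """Average the per-fold AUC over all folds."""
    per_fold = [auc([labels[i] for i in f], [scores[i] for i in f]) for f in folds]
    return sum(per_fold) / len(per_fold)

mean_demo = mean_fold_auc(demo_scores)
mean_full = mean_fold_auc(full_scores)
print(f"Demo AUC: {mean_demo:.3f}  Full AUC: {mean_full:.3f}")
```

In a real replication each fold would also be used to fit the model on the remaining four folds; the sketch skips that step to stay short.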

December 14, 2025 · 2 min · Research Team

Cash Flow Underwriting with Bank Transaction Data: Advancing MSME Financial Inclusion in Malaysia

ArXiv ID: 2510.16066
Authors: Chun Chet Ng, Wei Zeng Low, Jia Yu Lim, Yin Yin Boon
Abstract: Although Micro, Small, and Medium Enterprises (MSMEs) account for 96.1% of all businesses in Malaysia, access to financing remains one of their most persistent challenges. Newly established businesses are often excluded from formal credit markets because traditional underwriting approaches rely heavily on credit bureau data. This study investigates the potential of bank statement data as an alternative data source for credit assessment to promote financial inclusion in emerging markets. First, we propose a cash flow-based underwriting pipeline in which we utilise bank statement data for end-to-end data extraction and machine learning credit scoring. Second, we introduce a novel dataset of 611 loan applicants from a Malaysian lending institution. Third, we develop and evaluate credit scoring models based on application information and bank transaction-derived features. Empirical results show that the use of such data boosts the performance of all models on our dataset, which can improve credit scoring for new-to-lending MSMEs. Finally, we will release the anonymised bank transaction dataset to facilitate further research on MSME financial inclusion within Malaysia’s emerging economy. ...
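
As a rough illustration of what "bank transaction-derived features" might look like, the sketch below aggregates a list of dated transactions into a few monthly cash-flow statistics. The abstract does not list the paper's actual features, so the function name `cash_flow_features`, the feature names, and the input format (ISO date string plus signed amount) are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

def cash_flow_features(transactions):
    """Summarise (iso_date, amount) pairs; positive amounts are credits, negative are debits."""
    inflow = defaultdict(float)
    outflow = defaultdict(float)
    for date, amount in transactions:
        month = date[:7]  # "YYYY-MM"
        if amount >= 0:
            inflow[month] += amount
        else:
            outflow[month] += -amount
    months = sorted(set(inflow) | set(outflow))
    if not months:
        raise ValueError("no transactions supplied")
    net = [inflow[m] - outflow[m] for m in months]
    return {
        "n_months": len(months),
        "avg_monthly_inflow": mean(inflow[m] for m in months),
        "avg_monthly_outflow": mean(outflow[m] for m in months),
        "avg_monthly_net": mean(net),
        "share_negative_months": sum(n < 0 for n in net) / len(months),
    }

# Hypothetical statement covering two months.
txns = [
    ("2025-01-05", 1000.0),
    ("2025-01-20", -400.0),
    ("2025-02-03", 500.0),
    ("2025-02-15", -800.0),
]
features = cash_flow_features(txns)
print(features)
```

Features of this kind could be joined to application information before model training; which ones actually carry signal is an empirical question the sketch does not address.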

October 17, 2025 · 2 min · Research Team

Credit Scores: Performance and Equity

ArXiv ID: 2409.00296
Authors: Unknown
Abstract: Credit scores are critical for allocating consumer debt in the United States, yet little evidence is available on their performance. We benchmark a widely used credit score against a machine learning model of consumer default and find significant misclassification of borrowers, especially those with low scores. Our model improves predictive accuracy for young, low-income, and minority groups due to its superior performance with low-quality data, resulting in a gain in standing for these populations. Our findings suggest that improving credit scoring performance could lead to more equitable access to credit. ...

August 30, 2024 · 2 min · Research Team

Peer-induced Fairness: A Causal Approach for Algorithmic Fairness Auditing

ArXiv ID: 2408.02558
Authors: Unknown
Abstract: With the European Union’s Artificial Intelligence Act taking effect on 1 August 2024, high-risk AI applications must adhere to stringent transparency and fairness standards. This paper addresses a crucial question: how can we scientifically audit algorithmic fairness? Current methods typically remain at the basic detection stage of auditing, without accounting for more complex scenarios. We propose a novel framework, “peer-induced fairness”, which combines the strengths of counterfactual fairness and a peer comparison strategy, creating a reliable and robust tool for auditing algorithmic fairness. Our framework is universal, adaptable to various domains, and capable of handling different levels of data quality, including skewed distributions. Moreover, it can distinguish whether adverse decisions result from algorithmic discrimination or inherent limitations of the subjects, thereby enhancing transparency. This framework can serve as both a self-assessment tool for AI developers and an external assessment tool for auditors to ensure compliance with the EU AI Act. We demonstrate its utility in small and medium-sized enterprises’ access to finance, uncovering significant unfairness: 41.51% of micro-firms face discrimination compared to non-micro firms. These findings highlight the framework’s potential for broader applications in ensuring equitable AI-driven decision-making. ...

August 5, 2024 · 2 min · Research Team

Explainable Automated Machine Learning for Credit Decisions: Enhancing Human Artificial Intelligence Collaboration in Financial Engineering

ArXiv ID: 2402.03806
Authors: Unknown
Abstract: This paper explores the integration of Explainable Automated Machine Learning (AutoML) in the realm of financial engineering, specifically focusing on its application in credit decision-making. The rapid evolution of Artificial Intelligence (AI) in finance has necessitated a balance between sophisticated algorithmic decision-making and the need for transparency in these systems. The focus is on how AutoML can streamline the development of robust machine learning models for credit scoring, while Explainable AI (XAI) methods, particularly SHapley Additive exPlanations (SHAP), provide insights into the models’ decision-making processes. This study demonstrates how the combination of AutoML and XAI not only enhances the efficiency and accuracy of credit decisions but also fosters trust and collaboration between humans and AI systems. The findings underscore the potential of explainable AutoML in improving the transparency and accountability of AI-driven financial decisions, aligning with regulatory requirements and ethical considerations. ...

February 6, 2024 · 2 min · Research Team

Investigate The ESG Score Methodology

ArXiv ID: 2312.00202
Authors: Unknown
Abstract: Does Refinitiv provide a reliable and trusted methodology for aggregating 10 category scores into an overall score?
Keywords: Credit Scoring, Methodology Validation, Refinitiv, Data Aggregation, Financial Ratings, Fixed Income / Credit

Complexity vs Empirical Score
Math Complexity: 1.0/10
Empirical Rigor: 2.0/10
Quadrant: Philosophers
Why: The paper focuses on conceptual critique and literature review of ESG methodologies without advanced mathematical derivations, and it lacks code, backtests, or datasets, indicating low empirical rigor.

flowchart TD
A["Research Goal:<br>Validate Refinitiv ESG Score Methodology"] --> B{"Key Methodology Steps"}
B --> C["Data Input:<br>10 Category ESG Metrics"]
B --> D["Data Input:<br>Industry-Specific Weightings"]
C --> E["Computational Process:<br>Normalization & Scoring"]
D --> E
E --> F["Computational Process:<br>Weighted Aggregation"]
F --> G["Outcome:<br>Overall ESG Score (0-100)"]
F --> H["Outcome:<br>Score Reliability & Methodology Assessment"]
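
The weighted-aggregation step in the flowchart can be sketched as a normalised weighted average of category scores on a 0-100 scale. This is a generic illustration, not Refinitiv's published procedure: the function name `aggregate_score`, the placeholder category names, and the weights are all hypothetical.

```python
def aggregate_score(category_scores, weights):
    """Weighted average of category scores; weights are normalised to sum to 1."""
    if set(category_scores) != set(weights):
        raise ValueError("scores and weights must cover the same categories")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(category_scores[c] * weights[c] for c in category_scores)
    return weighted_sum / total_weight

# Hypothetical example: ten placeholder categories, each scored 0-100.
scores = {f"category_{i}": float(10 * i) for i in range(1, 11)}   # 10, 20, ..., 100
equal_weights = {name: 1.0 for name in scores}

overall_equal = aggregate_score(scores, equal_weights)

# Doubling one category's weight pulls the overall score toward that category.
tilted_weights = dict(equal_weights, category_10=2.0)
overall_tilted = aggregate_score(scores, tilted_weights)

print(overall_equal, overall_tilted)
```

Because the weights are normalised inside the function, industry-specific weightings could be supplied on any scale, and the result stays within the range of the input scores.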

November 30, 2023 · 1 min · Research Team