
How to Choose a Threshold for an Evaluation Metric for Large Language Models

ArXiv ID: 2412.12148 · Authors: Unknown

Abstract: To ensure and monitor large language models (LLMs) reliably, various evaluation metrics have been proposed in the literature. However, there is little research on prescribing a methodology to identify a robust threshold on these metrics even though there are many serious implications of an incorrect choice of the thresholds during deployment of the LLMs. Translating the traditional model risk management (MRM) guidelines within regulated industries such as the financial industry, we propose a step-by-step recipe for picking a threshold for a given LLM evaluation metric. We emphasize that such a methodology should start with identifying the risks of the LLM application under consideration and risk tolerance of the stakeholders. We then propose concrete and statistically rigorous procedures to determine a threshold for the given LLM evaluation metric using available ground-truth data. As a concrete example to demonstrate the proposed methodology at work, we employ it on the Faithfulness metric, as implemented in various publicly available libraries, using the publicly available HaluBench dataset. We also lay a foundation for creating systematic approaches to select thresholds, not only for LLMs but for any GenAI applications. ...

December 10, 2024 · 2 min · Research Team

Implementing portfolio risk management and hedging in practice

ArXiv ID: 2309.15767 · Authors: Unknown

Abstract: In academic literature portfolio risk management and hedging are often versed in the language of stochastic control and Hamilton–Jacobi–Bellman (HJB) equations in continuous time. In practice the continuous-time framework of stochastic control may be undesirable for various business reasons. In this work we present a straightforward approach for thinking of cross-asset portfolio risk management and hedging, providing some implementation details, while rarely venturing outside the convex optimisation setting of (approximate) quadratic programming (QP). We pay particular attention to the correspondence between the economic concepts and their mathematical representations; the abstractions enabling us to handle multiple asset classes and risk models at once; the dimensional analysis of the resulting equations; and the assumptions inherent in our derivations. We demonstrate how to solve the resulting QPs with CVXOPT. ...

September 27, 2023 · 2 min · Research Team

Digital Finance & The COVID-19 Crisis

ArXiv ID: ssrn-3558889 · Authors: Unknown

Abstract: The COVID-19 coronavirus crisis is putting unprecedented strain on markets, governments, businesses and individuals. The human, economic and financial costs are ...

Keywords: COVID-19, Market Volatility, Systemic Risk, Economic Impact, Cross-Asset

Complexity vs Empirical Score

Math Complexity: 1.0/10 · Empirical Rigor: 1.0/10 · Quadrant: Philosophers

Why: The paper is a high-level policy and regulatory analysis with no mathematical models or empirical backtesting, focusing on conceptual strategies and qualitative recommendations.

flowchart TD
    A["Research Goal: Impact of COVID-19<br>on Digital Finance Markets"] --> B["Data Collection"]
    B --> C["Methodology: Cross-Asset Analysis"]
    C --> D["Computational Process:<br>Volatility & Risk Modeling"]
    D --> E["Key Findings"]
    subgraph B ["Data/Inputs"]
        B1["Market Volatility Data"]
        B2["Systemic Risk Indicators"]
        B3["Economic Impact Metrics"]
    end
    subgraph E ["Outcomes"]
        E1["Increased Market Volatility"]
        E2["Systemic Risk Transmission"]
        E3["Cross-Asset Correlation Spike"]
    end

March 26, 2020 · 1 min · Research Team

International Law on Pandemic Response: A First Stocktaking in Light of the Coronavirus Crisis

ArXiv ID: ssrn-3561650 · Authors: Unknown

Abstract: The coronavirus (SARS-CoV-2) pandemic is currently raging throughout the world. The ensuing crisis has acquired a multidimensional nature, affecting all levels ...

Keywords: Pandemic, Crisis Management, Market Liquidity, Economic Recovery, Cross-Asset

Complexity vs Empirical Score

Math Complexity: 0.0/10 · Empirical Rigor: 0.0/10 · Quadrant: Philosophers

Why: The paper is a legal analysis of international health regulations and human rights law with no mathematical formulas or empirical backtesting, focusing on theoretical and normative assessments of pandemic response frameworks.

flowchart TD
    Goal["Research Goal: Assess International Law's adequacy in pandemic response based on coronavirus crisis."]
    Method["Methodology: Qualitative legal analysis of international instruments and case studies."]
    Inputs["Data/Inputs: WHO IHR 2005, International Health Regulations, pandemic response measures."]
    Process["Computational Process: Comparative analysis of legal frameworks vs. crisis management realities."]
    Outcome["Key Findings: Existing laws have gaps in enforcement; need for revised global governance."]

March 26, 2020 · 1 min · Research Team