Semantic Faithfulness and Entropy Production Measures to Tame Your LLM Demons and Manage Hallucinations

ArXiv ID: 2512.05156 · Authors: Igor Halperin

Abstract: Evaluating faithfulness of Large Language Models (LLMs) to a given task is a complex challenge. We propose two new unsupervised metrics for faithfulness evaluation using insights from information theory and thermodynamics. Our approach treats an LLM as a bipartite information engine where hidden layers act as a Maxwell demon controlling transformations of context $C$ into answer $A$ via prompt $Q$. We model Question-Context-Answer (QCA) triplets as probability distributions over shared topics. Topic transformations from $C$ to $Q$ and $A$ are modeled as transition matrices $\mathbf{Q}$ and $\mathbf{A}$ encoding the query goal and actual result, respectively. Our semantic faithfulness (SF) metric quantifies faithfulness for any given QCA triplet by the Kullback-Leibler (KL) divergence between these matrices. Both matrices are inferred simultaneously via convex optimization of this KL divergence, and the final SF metric is obtained by mapping the minimal divergence onto the unit interval $[0,1]$, where higher scores indicate greater faithfulness. Furthermore, we propose a thermodynamics-based semantic entropy production (SEP) metric in answer generation, and show that high faithfulness generally implies low entropy production. The SF and SEP metrics can be used jointly or separately for LLM evaluation and hallucination control. We demonstrate our framework on LLM summarization of corporate SEC 10-K filings. ...
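To make the KL-based score concrete, here is a minimal sketch, not the paper's procedure. It assumes the two row-stochastic topic-transition matrices are already given (the paper infers them jointly by convex optimization, which is skipped here). It also assumes the KL direction `KL(A || Q)`, a weighting of rows by the context topic distribution, and the mapping `exp(-KL)` onto the unit interval. The function names and all three choices are hypothetical, since the abstract does not specify them.

```python
import math

def weighted_row_kl(A, Q, context_weights):
    """Sum over rows i of w_i * KL(A[i] || Q[i]).

    A and Q are row-stochastic matrices (lists of rows); Q must be
    positive wherever A is positive. The direction KL(A || Q) and the
    row weighting are assumptions of this sketch.
    """
    total = 0.0
    for w, a_row, q_row in zip(context_weights, A, Q):
        for a, q in zip(a_row, q_row):
            if a > 0.0:  # terms with a == 0 contribute nothing
                total += w * a * math.log(a / q)
    return total

def faithfulness_score(A, Q, context_weights):
    """Map the divergence onto (0, 1]; exp(-KL) is a hypothetical choice."""
    return math.exp(-weighted_row_kl(A, Q, context_weights))

# Toy example with two topics (made-up numbers).
goal = [[0.9, 0.1], [0.2, 0.8]]     # plays the role of the matrix Q
actual = [[0.6, 0.4], [0.5, 0.5]]   # plays the role of the matrix A
weights = [0.5, 0.5]                # context topic distribution

score_same = faithfulness_score(goal, goal, weights)
score_diff = faithfulness_score(actual, goal, weights)
```

When the answer matrix matches the goal matrix the divergence is zero and the score is 1; any mismatch pushes the score below 1.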

December 4, 2025 · 2 min · Research Team

Hidden Order in Trades Predicts the Size of Price Moves

ArXiv ID: 2512.15720 · Authors: Mainak Singha

Abstract: Financial markets exhibit an apparent paradox: while directional price movements remain largely unpredictable (consistent with weak-form efficiency), the magnitude of price changes displays systematic structure. Here we demonstrate that real-time order-flow entropy, computed from a 15-state Markov transition matrix at second resolution, predicts the magnitude of intraday returns without providing directional information. Analysis of 38.5 million SPY trades over 36 trading days reveals that conditioning on entropy below the 5th percentile increases subsequent 5-minute absolute returns by a factor of 2.89 (t = 12.41, p < 0.0001), while directional accuracy remains at 45.0%, statistically indistinguishable from chance (p = 0.12). This decoupling arises from a fundamental symmetry: entropy is invariant under sign permutation, detecting the presence of informed trading without revealing its direction. Walk-forward validation across five non-overlapping test periods confirms out-of-sample predictability, and label-permutation placebo tests yield z = 14.4 against the null. These findings suggest that information-theoretic measures may serve as volatility state variables in market microstructure, though the limited sample (36 days, single instrument) requires extended validation. ...
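As an illustration of an entropy computed from a Markov transition matrix, the sketch below estimates transition counts from a sequence of discrete states and returns the conditional entropy of the next state given the current one, in bits. This is one plausible definition, assumed here; the abstract does not give the paper's exact formula, its 15-state encoding, or its windowing, and the function name is hypothetical.

```python
import math

def transition_entropy(states, n_states):
    """Conditional entropy H(next | current) in bits, estimated from a
    sequence of integer state labels in range(n_states).

    Each row of the empirical transition matrix is weighted by how often
    its source state occurs (an assumption of this sketch).
    """
    counts = [[0] * n_states for _ in range(n_states)]
    for current, nxt in zip(states, states[1:]):
        counts[current][nxt] += 1

    total = sum(sum(row) for row in counts)
    if total == 0:
        return 0.0

    entropy = 0.0
    for row in counts:
        row_total = sum(row)
        if row_total == 0:
            continue  # state never observed as a source
        weight = row_total / total
        for c in row:
            if c:
                p = c / row_total
                entropy -= weight * p * math.log2(p)
    return entropy

# A deterministic cycle is perfectly ordered: zero entropy.
ordered = transition_entropy([0, 1, 2, 0, 1, 2, 0], 3)

# Relabelling the states (here swapping 0 and 1) leaves the value unchanged.
h_original = transition_entropy([0, 0, 1, 0, 0, 1, 0], 2)
h_swapped = transition_entropy([1, 1, 0, 1, 1, 0, 1], 2)
```

The last two lines mirror the symmetry mentioned in the abstract: the measure depends on how concentrated the transitions are, not on which label is attached to each state.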

December 2, 2025 · 2 min · Research Team

Mechanisms of information communication and market price movements. The case of SP 500 market

ArXiv ID: 2505.09625 · Authors: Inga Ivanova, Grzegorz Rzadkowski

Abstract: In this paper we analyze how market prices change in response to information processing among the market participants and how non-linear information dynamics drive market price movement. We analyze historical data of the SP 500 market for the period 1950-2025 using the logistic Continuous Wavelet Transformation method. This approach allows us to identify various patterns in market dynamics. These patterns are conceptualized using a new theory of reflexive communication of information in a market consisting of heterogeneous agents who assign meaning to information from different perspectives. This allows us to describe market dynamics and make forecasts of its development using the most general mechanisms of information circulation within the content-free approach. ...

April 28, 2025 · 2 min · Research Team

The lexical ratio: A new perspective on portfolio diversification

ArXiv ID: 2411.06080 · Authors: Unknown

Abstract: Portfolio diversification, traditionally measured through asset correlations and volatility-based metrics, is fundamental to managing financial risk. However, existing diversification metrics often overlook non-numerical relationships between assets that can impact portfolio stability, particularly during market stresses. This paper introduces the lexical ratio (LR), a novel metric that leverages textual data to capture diversification dimensions absent in standard approaches. By treating each asset as a unique document composed of sector-specific and financial keywords, the LR evaluates portfolio diversification by distributing these terms across assets, incorporating entropy-based insights from information theory. We thoroughly analyze LR’s properties, including scale invariance, concavity, and maximality, demonstrating its theoretical robustness and ability to enhance risk-adjusted portfolio returns. Using empirical tests on S&P 500 portfolios, we compare LR’s performance to established metrics such as Markowitz’s volatility-based measures and diversification ratios. Our tests reveal LR’s superiority in optimizing portfolio returns, especially under varied market conditions. Our findings show that LR aligns with conventional metrics and captures unique diversification aspects, suggesting it is a viable tool for portfolio managers. ...

November 9, 2024 · 2 min · Research Team

Model-based and empirical analyses of stochastic fluctuations in economy and finance

ArXiv ID: 2408.16010 · Authors: Unknown

Abstract: The objective of this work is the investigation of complexity, asymmetry, stochasticity and non-linearity of the financial and economic systems by using the tools of statistical mechanics and information theory. More precisely, this thesis concerns statistical-based modeling and empirical analyses with applications in finance, forecasting, production processes and game theory. In these areas the time dependence of probability distributions is of prime interest and can be measured or exactly calculated for model systems. The correlation coefficients and moments are among the useful quantities to describe the dynamics and the correlations between random variables. However, the full investigation can only be achieved if the probability distribution function of the variable is known; its derivation is one of the main focuses of the present work. ...

August 14, 2024 · 2 min · Research Team

A quantum double-or-nothing game: The Kelly Criterion for Spins

ArXiv ID: 2308.01305 · Authors: Unknown

Abstract: A sequence of spin-1/2 particles polarised in one of two possible directions is presented to an experimenter, who can wager in a double-or-nothing game on the outcomes of measurements in freely chosen polarisation directions. Wealth is accrued through astute betting. As information is gained from the stream of particles, the measurement directions are progressively adjusted, and the portfolio growth rate is raised. The optimal quantum strategy is determined numerically and shown to differ from the classical strategy, which is associated with the Kelly criterion. The paper contributes to the development of quantum finance, as aspects of portfolio optimisation are extended to the quantum realm. ...
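For reference, the classical baseline mentioned in the abstract can be sketched in a few lines. For an even-money (double-or-nothing) bet won with known probability p, the Kelly fraction of wealth to stake is 2p - 1 when p > 1/2 and zero otherwise. The sketch covers only this textbook classical case with a fixed, known p; it says nothing about the paper's quantum strategy, and the function names are illustrative.

```python
import math

def kelly_fraction(p):
    """Kelly stake for an even-money bet won with probability p."""
    return max(0.0, 2.0 * p - 1.0)

def growth_rate(p, f):
    """Expected log-growth per bet when staking a fraction f of wealth.

    Requires 0 <= f < 1 whenever p < 1 (staking everything on an
    uncertain bet would risk ruin and an undefined logarithm).
    """
    g = 0.0
    if p > 0.0:
        g += p * math.log(1.0 + f)        # win: wealth grows by factor 1 + f
    if p < 1.0:
        g += (1.0 - p) * math.log(1.0 - f)  # loss: wealth shrinks by 1 - f
    return g

p = 0.6
f_star = kelly_fraction(p)
best = growth_rate(p, f_star)
```

Staking more or less than `f_star` lowers the expected log-growth, which is what makes the Kelly fraction the natural classical benchmark.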

August 2, 2023 · 2 min · Research Team

Complexity measure, kernel density estimation, bandwidth selection, and the efficient market hypothesis

ArXiv ID: 2305.13123 · Authors: Unknown

Abstract: We are interested in the nonparametric estimation of the probability density of price returns, using the kernel approach. The output of the method heavily relies on the selection of a bandwidth parameter. Many selection methods have been proposed in the statistical literature. We put forward an alternative selection method based on a criterion coming from information theory and from the physics of complex systems: the bandwidth to be selected maximizes a new measure of complexity, with the aim of avoiding both overfitting and underfitting. We review existing methods of bandwidth selection and show that they lead to contradictory conclusions regarding the complexity of the probability distribution of price returns. This has also some striking consequences in the evaluation of the relevance of the efficient market hypothesis. We apply these methods to real financial data, focusing on the Bitcoin. ...
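To show where the bandwidth enters, here is a minimal Gaussian kernel density estimator using only the standard library. The bandwidth rule is the common normal-reference rule of thumb, h = 1.06 · σ · n^(-1/5), chosen here purely for illustration; it is not the complexity-maximising criterion proposed in the paper, and the abstract does not say which existing rules the authors review. The data values are made up.

```python
import math
import statistics

def normal_reference_bandwidth(data):
    """Rule-of-thumb bandwidth: 1.06 * sample std * n^(-1/5)."""
    return 1.06 * statistics.stdev(data) * len(data) ** (-0.2)

def gaussian_kde(data, h):
    """Return a function x -> estimated density with Gaussian kernels of width h."""
    norm = 1.0 / (len(data) * h * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data)

    return density

returns = [-1.0, 0.0, 0.5, 2.0]            # toy sample, not real price data
h = normal_reference_bandwidth(returns)
pdf = gaussian_kde(returns, h)

# Crude Riemann-sum check that the estimate integrates to about 1.
step = 0.01
area = sum(pdf(-20.0 + i * step) for i in range(4001)) * step
```

A smaller `h` produces a spikier estimate that follows individual observations, while a larger `h` smooths detail away; this is the overfitting versus underfitting trade-off the abstract refers to.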

May 22, 2023 · 2 min · Research Team