Anomaly Detection

A Comparative Analysis of Statistical and Machine Learning Models for Outlier Detection in Bitcoin Limit Order Books

ArXiv ID: 2507.14960. Authors: Ivan Letteri.

Abstract: The detection of outliers within cryptocurrency limit order books (LOBs) is of paramount importance for comprehending market dynamics, particularly in highly volatile and nascent regulatory environments. This study conducts a comprehensive comparative analysis of robust statistical methods and advanced machine learning techniques for real-time anomaly identification in cryptocurrency LOBs. Within a unified testing environment, named AITA Order Book Signal (AITA-OBS), we evaluate the efficacy of thirteen diverse models to identify which approaches are most suitable for detecting potentially manipulative trading behaviours. An empirical evaluation, conducted via backtesting on a dataset of 26,204 records from a major exchange, demonstrates that the top-performing model, Empirical Covariance (EC), achieves a 6.70% gain, significantly outperforming a standard Buy-and-Hold benchmark. These findings underscore the effectiveness of outlier-driven strategies and provide insights into the trade-offs between model complexity, trade frequency, and performance. This study contributes to the growing corpus of research on cryptocurrency market microstructure by furnishing a rigorous benchmark of anomaly detection models and highlighting their potential for augmenting algorithmic trading and risk management. ...
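
To make the idea of covariance-based outlier detection concrete, here is a minimal sketch, not the paper's AITA-OBS implementation. It assumes outliers are flagged by squared Mahalanobis distance under a maximum-likelihood covariance estimate. The function name `mahalanobis_outliers`, the 0.99 quantile cut-off, and the synthetic three-column feature matrix are all hypothetical.

```python
import numpy as np

def mahalanobis_outliers(X, quantile=0.99):
    """Return squared Mahalanobis distances and a boolean outlier mask.

    The covariance is the plain maximum-likelihood (empirical) estimate;
    the quantile cut-off is an illustrative assumption.
    """
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    cov = centered.T @ centered / X.shape[0]      # empirical covariance
    precision = np.linalg.pinv(cov)               # pseudo-inverse for stability
    # d2[i] = centered[i] @ precision @ centered[i]
    d2 = np.einsum("ij,jk,ik->i", centered, precision, centered)
    return d2, d2 > np.quantile(d2, quantile)

# Synthetic stand-in for order book features (hypothetical data).
rng = np.random.default_rng(0)
features = rng.normal(size=(500, 3))
features[42] = [8.0, -8.0, 8.0]                   # injected anomaly

d2, flags = mahalanobis_outliers(features)
print(int(np.argmax(d2)), int(flags.sum()))
```

In a backtest, rows flagged this way would feed a trading rule. How the paper maps outliers to signals is not stated in the abstract, so that step is left out.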

Machine learning approach to stock price crash risk

ArXiv ID: 2505.16287. Authors: Abdullah Karasan, Ozge Sezgin Alp, Gerhard-Wilhelm Weber.

Abstract: In this study, we propose a novel machine-learning-based measure for stock price crash risk, utilizing the minimum covariance determinant methodology. Employing this newly introduced dependent variable, we predict stock price crash risk through cross-sectional regression analysis. The findings confirm that the proposed method effectively captures stock price crash risk, with the model demonstrating strong performance in terms of both statistical significance and economic relevance. Furthermore, leveraging a newly developed firm-specific investor sentiment index, the analysis identifies a positive correlation between stock price crash risk and firm-specific investor sentiment. Specifically, higher levels of sentiment are associated with an increased likelihood of stock price crash risk. This relationship remains robust across different firm sizes and when using the detoned version of the firm-specific investor sentiment index, further validating the reliability of the proposed approach. ...

Detecting Crypto Pump-and-Dump Schemes: A Thresholding-Based Approach to Handling Market Noise

ArXiv ID: 2503.08692. Authors: Unknown.

Abstract: We propose a simple yet robust unsupervised model to detect pump-and-dump events on tokens listed on the Poloniex Exchange platform. By combining threshold-based criteria with exponentially weighted moving averages (EWMA) and volatility measures, our approach effectively distinguishes genuine anomalies from minor trading fluctuations, even for tokens with low liquidity and prolonged inactivity. These characteristics present a unique challenge, as standard anomaly-detection methods often over-flag negligible volume spikes. Our framework overcomes this issue by tailoring both price and volume thresholds to the specific trading patterns observed, resulting in a model that balances high true-positive detection with minimal noise. ...
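
The combination of EWMA baselines with price and volume thresholds can be sketched as follows. This is an illustrative simplification, not the authors' model: the multipliers, the smoothing factor `alpha`, and the `min_volume` floor (a hypothetical guard against flagging negligible spikes on illiquid tokens) are assumptions, and the volatility measures mentioned in the abstract are omitted.

```python
def flag_pumps(prices, volumes, alpha=0.2, price_mult=1.10,
               volume_mult=3.0, min_volume=50.0):
    """Return indices where price AND volume jump above their EWMA baselines.

    Each observation is compared with the EWMA of strictly earlier data,
    so a spike cannot inflate its own baseline.
    """
    flagged = []
    p_level, v_level = prices[0], volumes[0]      # initialise EWMA levels
    for t in range(1, len(prices)):
        price_jump = prices[t] > price_mult * p_level
        volume_jump = volumes[t] > volume_mult * v_level
        if price_jump and volume_jump and volumes[t] >= min_volume:
            flagged.append(t)
        # Update the baselines only after the comparison.
        p_level = alpha * prices[t] + (1 - alpha) * p_level
        v_level = alpha * volumes[t] + (1 - alpha) * v_level
    return flagged

# Hypothetical candles: a single coordinated price and volume spike at index 4.
prices = [1.00, 1.01, 0.99, 1.00, 1.30, 1.05, 1.00]
volumes = [100, 110, 90, 100, 900, 200, 100]
flagged = flag_pumps(prices, volumes)
print(flagged)  # → [4]
```

Requiring both conditions at once is what keeps isolated volume blips from being reported, which is the over-flagging problem the abstract describes.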

High-Frequency Market Manipulation Detection with a Markov-modulated Hawkes process

ArXiv ID: 2502.04027. Authors: Unknown.

Abstract: This work focuses on a self-exciting point process defined by a Hawkes-like intensity and a switching mechanism based on a hidden Markov chain. Previous works in such a setting assume constant intensities between consecutive events. We extend the model to general Hawkes excitation kernels that are piecewise constant between events. We develop an expectation-maximization algorithm for the statistical inference of the Hawkes intensities parameters as well as the state transition probabilities. The numerical convergence of the estimators is extensively tested on simulated data. Using high-frequency cryptocurrency data on a top centralized exchange, we apply the model to the detection of anomalous bursts of trades. We benchmark the goodness-of-fit of the model with the Markov-modulated Poisson process and demonstrate the relevance of the model in detecting suspicious activities. ...
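
For readers unfamiliar with self-exciting processes, the sketch below evaluates a Hawkes-type intensity whose baseline depends on a regime label. It is only an illustration of the concept. It assumes a textbook exponential kernel rather than the paper's piecewise-constant kernels, and it takes the regime as given instead of inferring a hidden Markov chain with EM. The names `hawkes_intensity` and `mu_by_state` and all parameter values are hypothetical.

```python
import math

def hawkes_intensity(t, event_times, mu, alpha, beta):
    """Intensity lambda(t) = mu + sum over t_i < t of alpha * exp(-beta * (t - t_i)).

    Only events strictly before t contribute, so each trade raises the
    intensity for subsequent times and the effect decays at rate beta.
    """
    excitation = sum(alpha * math.exp(-beta * (t - ti))
                     for ti in event_times if ti < t)
    return mu + excitation

# Hypothetical regime-dependent baselines (the "Markov-modulated" part,
# with the regime assumed known here rather than hidden).
mu_by_state = {"calm": 0.5, "burst": 2.0}

trades = [1.0, 1.1, 1.15, 1.2]                    # a tight cluster of trades
quiet = hawkes_intensity(0.5, trades, mu_by_state["calm"], alpha=0.8, beta=1.0)
after_cluster = hawkes_intensity(1.25, trades, mu_by_state["calm"], alpha=0.8, beta=1.0)
print(quiet, after_cluster)
```

Before any trade the intensity equals the baseline. Right after the cluster it is several times higher, which is the kind of burst the model is meant to characterise.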

Detecting and Triaging Spoofing using Temporal Convolutional Networks

ArXiv ID: 2403.13429. Authors: Unknown.

Abstract: As algorithmic trading and electronic markets continue to transform the landscape of financial markets, detecting and deterring rogue agents to maintain a fair and efficient marketplace is crucial. The explosion of large datasets and the continually changing tricks of the trade make it difficult to adapt to new market conditions and detect bad actors. To that end, we propose a framework that can be adapted easily to various problems in the space of detecting market manipulation. Our approach entails initially employing a labelling algorithm which we use to create a training set to learn a weakly supervised model to identify potentially suspicious sequences of order book states. The main goal here is to learn a representation of the order book that can be used to easily compare future events. Subsequently, we posit the incorporation of expert assessment to scrutinize specific flagged order book states. In the event of an expert’s unavailability, recourse is taken to the application of a more complex algorithm on the identified suspicious order book states. We then conduct a similarity search between any new representation of the order book against the expert labelled representations to rank the results of the weak learner. We show some preliminary results that are promising to explore further in this direction ...

Dimensionality reduction techniques to support insider trading detection

ArXiv ID: 2403.00707. Authors: Unknown.

Abstract: Identification of market abuse is an extremely complicated activity that requires the analysis of large and complex datasets. We propose an unsupervised machine learning method for contextual anomaly detection, which allows to support market surveillance aimed at identifying potential insider trading activities. This method lies in the reconstruction-based paradigm and employs principal component analysis and autoencoders as dimensionality reduction techniques. The only input of this method is the trading position of each investor active on the asset for which we have a price sensitive event (PSE). After determining reconstruction errors related to the trading profiles, several conditions are imposed in order to identify investors whose behavior could be suspicious of insider trading related to the PSE. As a case study, we apply our method to investor resolved data of Italian stocks around takeover bids. ...
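
The reconstruction-based paradigm can be illustrated with PCA alone. The sketch below projects each profile onto the leading principal components and scores it by squared reconstruction error. It assumes synthetic data in place of investor trading positions, covers only the PCA half of the method (no autoencoder), and omits the additional conditions the authors impose after scoring. The name `pca_reconstruction_error` is hypothetical.

```python
import numpy as np

def pca_reconstruction_error(X, n_components):
    """Squared error per row after projecting onto the top principal components."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    reconstructed = centered @ components.T @ components
    return np.sum((centered - reconstructed) ** 2, axis=1)

# Synthetic profiles that mostly follow one common pattern (hypothetical data).
rng = np.random.default_rng(1)
latent = rng.normal(size=(200, 1))
pattern = np.array([[1.0, 2.0, -1.0, 0.5]])
profiles = latent @ pattern + 0.05 * rng.normal(size=(200, 4))
profiles[7] += np.array([3.0, -3.0, 3.0, 3.0])    # one profile breaks the pattern

errors = pca_reconstruction_error(profiles, n_components=1)
suspect = int(np.argmax(errors))
print(suspect)  # → 7
```

Profiles that the low-dimensional model reconstructs poorly are the ones that deviate from typical behaviour, which is why a large error is treated as a candidate anomaly rather than a verdict.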

CaT-GNN: Enhancing Credit Card Fraud Detection via Causal Temporal Graph Neural Networks

ArXiv ID: 2402.14708. Authors: Unknown.

Abstract: Credit card fraud poses a significant threat to the economy. While Graph Neural Network (GNN)-based fraud detection methods perform well, they often overlook the causal effect of a node’s local structure on predictions. This paper introduces a novel method for credit card fraud detection, the Causal Temporal Graph Neural Network (CaT-GNN), which leverages causal invariant learning to reveal inherent correlations within transaction data. By decomposing the problem into discovery and intervention phases, CaT-GNN identifies causal nodes within the transaction graph and applies a causal mixup strategy to enhance the model’s robustness and interpretability. CaT-GNN consists of two key components: Causal-Inspector and Causal-Intervener. The Causal-Inspector utilizes attention weights in the temporal attention mechanism to identify causal and environment nodes without introducing additional parameters. Subsequently, the Causal-Intervener performs a causal mixup enhancement on environment nodes based on the set of nodes. Evaluated on three datasets, including a private financial dataset and two public datasets, CaT-GNN demonstrates superior performance over existing state-of-the-art methods. Our findings highlight the potential of integrating causal reasoning with graph neural networks to improve fraud detection capabilities in financial transactions. ...

Detecting Anomalous Events in Object-centric Business Processes via Graph Neural Networks

ArXiv ID: 2403.00775. Authors: Unknown.

Abstract: Detecting anomalies is important for identifying inefficiencies, errors, or fraud in business processes. Traditional process mining approaches focus on analyzing ‘flattened’, sequential, event logs based on a single case notion. However, many real-world process executions exhibit a graph-like structure, where events can be associated with multiple cases. Flattening event logs requires selecting a single case identifier which creates a gap with the real event data and artificially introduces anomalies in the event logs. Object-centric process mining avoids these limitations by allowing events to be related to different cases. This study proposes a novel framework for anomaly detection in business processes that exploits graph neural networks and the enhanced information offered by object-centric process mining. We first reconstruct and represent the process dependencies of the object-centric event logs as attributed graphs and then employ a graph convolutional autoencoder architecture to detect anomalous events. Our results show that our approach provides promising performance in detecting anomalies at the activity type and attributes level, although it struggles to detect anomalies in the temporal order of events. ...

Comparative Evaluation of Anomaly Detection Methods for Fraud Detection in Online Credit Card Payments

ArXiv ID: 2312.13896. Authors: Unknown.

Abstract: This study explores the application of anomaly detection (AD) methods in imbalanced learning tasks, focusing on fraud detection using real online credit card payment data. We assess the performance of several recent AD methods and compare their effectiveness against standard supervised learning methods. Offering evidence of distribution shift within our dataset, we analyze its impact on the tested models’ performances. Our findings reveal that LightGBM exhibits significantly superior performance across all evaluated metrics but suffers more from distribution shifts than AD methods. Furthermore, our investigation reveals that LightGBM also captures the majority of frauds detected by AD methods. This observation challenges the potential benefits of ensemble methods to combine supervised, and AD approaches to enhance performance. In summary, this research provides practical insights into the utility of these techniques in real-world scenarios, showing LightGBM’s superiority in fraud detection while highlighting challenges related to distribution shifts. ...

Detecting Financial Market Manipulation with Statistical Physics Tools

ArXiv ID: 2308.08683. Authors: Unknown.

Abstract: We take inspiration from statistical physics to develop a novel conceptual framework for the analysis of financial markets. We model the order book dynamics as a motion of particles and define the momentum measure of the system as a way to summarise and assess the state of the market. Our approach proves useful in capturing salient financial market phenomena: in particular, it helps detect the market manipulation activities called spoofing and layering. We apply our method to identify pathological order book behaviours during the flash crash of the LUNA cryptocurrency, uncovering widespread instances of spoofing and layering in the market. Furthermore, we establish that our technique outperforms the conventional Z-score-based anomaly detection method in identifying market manipulations across both LUNA and Bitcoin cryptocurrency markets. ...
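
For context, the conventional Z-score baseline that the abstract compares against can be sketched in a few lines. This shows only the baseline, not the authors' momentum measure. The threshold of 3 standard deviations, the function name `zscore_flags`, and the sample series are assumptions for illustration.

```python
from statistics import mean, pstdev

def zscore_flags(series, threshold=3.0):
    """Return indices whose absolute Z-score exceeds the threshold."""
    mu = mean(series)
    sigma = pstdev(series)                        # population standard deviation
    if sigma == 0:
        return []                                 # constant series: nothing to flag
    return [i for i, x in enumerate(series)
            if abs((x - mu) / sigma) > threshold]

# Hypothetical order book statistic with one abnormal reading at index 10.
imbalance = [10, 11, 9, 10, 11, 9, 10, 11, 9, 10,
             40, 11, 9, 10, 11, 9, 10, 11, 9, 10]
flagged = zscore_flags(imbalance)
print(flagged)  # → [10]
```

A known weakness of this baseline is that extreme values inflate the very mean and standard deviation used to judge them, so clustered manipulations can mask one another.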