Machine learning-based similarity measure to forecast M&A from patent data
ArXiv ID: 2404.07179
Authors: Unknown
Abstract
Defining and finalizing Mergers and Acquisitions (M&A) requires complex human skills, which makes it very hard to automatically find the best partner or predict which firms will make a deal. In this work, we propose the MASS algorithm, a specifically designed measure of similarity between companies, and we apply it to patenting activity data to forecast M&A deals. MASS is based on an extreme simplification of tree-based machine learning algorithms and naturally incorporates intuitive criteria for deals; as such, it is fully interpretable and explainable. By applying MASS to the Zephyr and Crunchbase datasets, we show that it outperforms LightGCN, a “black box” graph convolutional network algorithm. By contrast, when similar companies have disjoint patenting activities, LightGCN turns out to be the most effective algorithm. This study provides a simple and powerful tool to model and predict M&A deals, offering valuable insights to managers and practitioners for informed decision-making.
Keywords: Mergers and Acquisitions, Machine Learning, Patent Analysis, Graph Convolutional Networks, Equities
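The abstract's contrast between overlapping and disjoint patenting activity can be illustrated with a generic overlap-based similarity. The sketch below is not the MASS algorithm, whose definition is not given in this summary. It uses plain Jaccard similarity on sets of technology codes as a stand-in, and the firm names and codes are hypothetical. It shows why a measure built purely on shared codes scores firms with disjoint portfolios at zero, which is the regime where the abstract reports LightGCN as most effective.

```python
# Illustrative only: Jaccard similarity is a stand-in, NOT the paper's MASS measure.
# Firm names and technology codes below are hypothetical.

def jaccard(a: set[str], b: set[str]) -> float:
    """Share of technology codes that two firms have in common."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Each firm is represented by the set of technology codes it has patented in.
portfolios = {
    "firm_a": {"H04L", "G06F", "G06N"},
    "firm_b": {"G06F", "G06N", "H04W"},
    "firm_c": {"A61K", "C07D"},
}

def rank_partners(target: str, portfolios: dict[str, set[str]]) -> list[tuple[str, float]]:
    """Rank every other firm by similarity to the target, most similar first."""
    scores = [
        (name, jaccard(portfolios[target], codes))
        for name, codes in portfolios.items()
        if name != target
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

ranking = rank_partners("firm_a", portfolios)
print(ranking)
```

Here `firm_c` shares no codes with `firm_a`, so an overlap-only score cannot distinguish it from any other unrelated firm; a graph-based model can still relate the two through indirect connections.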
Complexity vs Empirical Score
- Math Complexity: 6.5/10
- Empirical Rigor: 8.5/10
- Quadrant: Holy Grail
- Why: The paper presents a novel, interpretable similarity metric (MASS) and modifies existing economic complexity/network metrics, which calls for a moderately advanced mathematical formulation. Empirical rigor is high due to the use of real-world datasets (Zephyr, Crunchbase), out-of-sample forecasting, and comparison against a complex baseline (LightGCN) with explicit performance metrics.
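As a rough illustration of what out-of-sample evaluation of a deal forecast can look like, the sketch below scores candidate firm pairs and checks how many of the top-ranked pairs later became deals. Precision@k is an assumed metric chosen for simplicity; this summary does not state which metrics the paper reports, and all scores, firm names, and deals are hypothetical.

```python
# Illustrative only: precision@k is an assumed metric; this summary does not
# state which performance metrics the paper reports. All data are hypothetical.

def precision_at_k(scored_pairs, realized_deals, k):
    """Fraction of the k highest-scored firm pairs that became actual deals."""
    if k <= 0:
        raise ValueError("k must be positive")
    # Sort pairs by score, highest first, and keep the top k.
    top_k = sorted(scored_pairs, key=scored_pairs.get, reverse=True)[:k]
    hits = sum(1 for pair in top_k if pair in realized_deals)
    return hits / k

# Similarity scores computed only from data before a cutoff date.
scores = {
    frozenset({"firm_a", "firm_b"}): 0.9,
    frozenset({"firm_a", "firm_c"}): 0.7,
    frozenset({"firm_b", "firm_d"}): 0.4,
    frozenset({"firm_c", "firm_d"}): 0.1,
}
# Deals observed after the cutoff, i.e. out of sample.
future_deals = {
    frozenset({"firm_a", "firm_b"}),
    frozenset({"firm_c", "firm_d"}),
}

p_at_2 = precision_at_k(scores, future_deals, k=2)
print(p_at_2)
```

Pairs are stored as `frozenset` objects so that the order of the two firms does not matter when matching predictions against realized deals. The same function can score any two methods on the same candidate pairs, which is how a similarity measure and a baseline could be compared side by side.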
flowchart TD
A["Research Goal: Forecast M&A deals<br>using patent data"] --> B["Methodology: MASS Algorithm<br>Simplicity & Interpretability"]
B --> C["Data Inputs:<br>Patent Activity, Zephyr & Crunchbase"]
C --> D["Computational Process:<br>Tree-based Similarity Measure"]
D --> E{"Evaluate Performance<br>vs LightGCN"}
E -->|MASS Similarity| F["Outcome: MASS Outperforms<br>LightGCN"]
E -->|Disjoint Patent Activity| G["Outcome: LightGCN More Effective"]
F --> H["Key Insight: Simple, interpretable<br>tool for M&A prediction"]
G --> H