Machine learning-based similarity measure to forecast M&A from patent data

ArXiv ID: 2404.07179

Authors: Unknown

Abstract

Defining and finalizing Mergers and Acquisitions (M&A) requires complex human skills, which makes it very hard to automatically find the best partner or predict which firms will make a deal. In this work, we propose the MASS algorithm, a purpose-built measure of similarity between companies, and we apply it to patenting activity data to forecast M&A deals. MASS is based on an extreme simplification of tree-based machine learning algorithms and naturally incorporates intuitive criteria for deals; as such, it is fully interpretable and explainable. By applying MASS to the Zephyr and Crunchbase datasets, we show that it outperforms LightGCN, a “black box” graph convolutional network algorithm. Conversely, when similar companies have disjoint patenting activities, LightGCN turns out to be the most effective algorithm. This study provides a simple and powerful tool to model and predict M&A deals, offering valuable insights to managers and practitioners for informed decision-making.
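
To make the idea of a patent-based company similarity concrete, here is a minimal Python sketch. It is not the MASS algorithm, whose definition is not given in this summary; it swaps in plain cosine similarity over sets of technology codes as a generic stand-in. The company names, the technology codes, and the helper functions `cosine_similarity` and `rank_partners` are all hypothetical and chosen only for illustration.

```python
from math import sqrt

# Hypothetical toy data (not from Zephyr or Crunchbase): each company maps
# to the set of technology codes in which it has patented.
portfolios = {
    "AcmeBio":   {"A61K", "C07D", "C12N"},
    "HelixLabs": {"A61K", "C12N"},
    "VoltWorks": {"H01M", "H02J"},
    "GridCell":  {"H01M", "H02J", "B60L"},
}


def cosine_similarity(codes_a, codes_b):
    """Cosine similarity between two binary technology-code vectors.

    Assumption: this is a generic stand-in, NOT the MASS measure.
    """
    if not codes_a or not codes_b:
        return 0.0
    shared = len(codes_a & codes_b)
    return shared / sqrt(len(codes_a) * len(codes_b))


def rank_partners(target, portfolios):
    """Return (company, score) pairs sorted by decreasing similarity to target."""
    scores = [
        (other, cosine_similarity(portfolios[target], codes))
        for other, codes in portfolios.items()
        if other != target
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


ranking = rank_partners("AcmeBio", portfolios)
for company, score in ranking:
    print(f"{company}: {score:.3f}")
```

In this toy data, HelixLabs ranks first for AcmeBio because the two share two technology codes, while companies with no shared codes score exactly zero. Any purely overlap-based score behaves this way, which is the disjoint-patenting situation the abstract singles out.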

Keywords: Mergers and Acquisitions, Machine Learning, Patent Analysis, Graph Convolutional Networks, Equities

Complexity vs Empirical Score

  • Math Complexity: 6.5/10
  • Empirical Rigor: 8.5/10
  • Quadrant: Holy Grail
  • Why: The paper presents a novel, interpretable similarity metric (MASS) whose formulation adapts existing economic complexity and network metrics, which places it at a moderate-to-advanced level of math. Empirical rigor is high because the paper uses real-world datasets (Zephyr, Crunchbase), out-of-sample forecasting, and a comparison against a complex baseline (LightGCN) with explicit performance metrics.
  flowchart TD
    A["Research Goal: Forecast M&A deals<br>using patent data"] --> B["Methodology: MASS Algorithm<br>Simplicity & Interpretability"]
    B --> C["Data Inputs:<br>Patent Activity, Zephyr & Crunchbase"]
    C --> D["Computational Process:<br>Tree-based Similarity Measure"]
    D --> E{"Evaluate Performance<br>vs LightGCN"}
    E -->|MASS Similarity| F["Outcome: MASS Outperforms<br>LightGCN"]
    E -->|Disjoint Patent Activity| G["Outcome: LightGCN More Effective"]
    F --> H["Key Insight: Simple, interpretable<br>tool for M&A prediction"]
    G --> H