Explainable artificial intelligence model for identifying Market Value in Professional Soccer Players

ArXiv ID: 2311.04599

Authors: Unknown

Abstract

This study introduces a machine learning method for predicting soccer players’ market values that combines ensemble models with Shapley Additive Explanations (SHAP) for interpretability. The data cover about 12,000 players from Sofifa, and the Boruta algorithm was used to streamline feature selection. The Gradient Boosting Decision Tree (GBDT) model excelled in predictive accuracy, with an R-squared of 0.901 and a Root Mean Squared Error (RMSE) of 3,221,632.175. Player attributes related to skills, fitness, and cognition significantly influenced market value. These insights can help sports industry stakeholders value players. The study has limitations, including underestimating superstar players’ values and the need for larger datasets. Future research directions include broadening the model’s applicability and exploring value prediction in other contexts.
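
The abstract credits Boruta with feature selection but does not spell out the procedure. The core idea is to compare each real feature against shuffled "shadow" copies that carry no signal by construction. The sketch below illustrates only that idea, under stated assumptions: absolute Pearson correlation stands in for the random-forest importance that Boruta actually uses, a fixed hit-rate threshold replaces its statistical test, and the toy data, the feature names, and the `shadow_select` helper are hypothetical rather than taken from the paper.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation; returns 0.0 if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def shadow_select(features, target, rounds=50, min_hit_rate=0.9, seed=0):
    """Keep features that beat the best shadow feature in most rounds.

    features: dict mapping a feature name to a list of values.
    Importance here is |Pearson r| (an assumption; Boruta uses
    random-forest importances and a statistical test on the hit counts).
    """
    rng = random.Random(seed)
    hits = {name: 0 for name in features}
    for _ in range(rounds):
        # Shadows are shuffled copies, so any link to the target is chance.
        shadow_scores = []
        for values in features.values():
            shuffled = list(values)
            rng.shuffle(shuffled)
            shadow_scores.append(abs(pearson(shuffled, target)))
        best_shadow = max(shadow_scores)
        # A real feature scores a "hit" when it beats the best shadow.
        for name, values in features.items():
            if abs(pearson(values, target)) > best_shadow:
                hits[name] += 1
    return [name for name, count in hits.items() if count / rounds >= min_hit_rate]


# Hypothetical toy data: value depends on "skill" but not on "shirt_parity".
rng = random.Random(42)
skill = [40 + 0.25 * i for i in range(200)]
shirt_parity = [1 if i % 2 == 0 else -1 for i in range(200)]
value = [100_000 * s + rng.gauss(0, 200_000) for s in skill]

selected = shadow_select({"skill": skill, "shirt_parity": shirt_parity}, value)
print(selected)
```

Repeating the comparison over many rounds is what makes the decision robust: a feature that beats the shadows only occasionally does so by chance, while an informative feature beats them almost every time.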

Keywords: Gradient Boosting Decision Tree (GBDT), Shapley Additive Explanations (SHAP), Sports analytics, Player market value prediction, Ensemble models

Complexity vs Empirical Score

  • Math Complexity: 2.5/10
  • Empirical Rigor: 7.0/10
  • Quadrant: Street Traders
  • Why: The paper primarily uses standard machine learning models (GBDT) with accessible explanations (SHAP) and relies on a large, real-world dataset with performance metrics like R-squared and RMSE, indicating high empirical readiness for sports analytics but limited advanced mathematical theory.
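
The two metrics named above have short definitions. RMSE is the square root of the mean squared residual and is expressed in the same units as the target. R-squared is one minus the ratio of the residual sum of squares to the total sum of squares. A minimal sketch on made-up numbers (the values below are hypothetical and are not the paper's data):

```python
import math


def rmse(actual, predicted):
    """Root mean squared error, in the same units as the target."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical market values, in millions, for four players.
actual = [1.0, 2.0, 3.0, 10.0]
predicted = [1.5, 2.0, 3.5, 8.0]

print(rmse(actual, predicted))       # sqrt(4.5 / 4)
print(r_squared(actual, predicted))  # 1 - 4.5 / 50
```

In this toy example the single miss on the most valuable player contributes 4.0 of the 4.5 total squared error, which shows how squared-error metrics can be dominated by a few high-value players.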
  flowchart TD
    A["Research Goal:<br>Predict & Explain Soccer Player Market Value"] --> B{"Data & Feature Engineering"};
    B --> C["Feature Selection<br>Boruta Algorithm"];
    B --> D["Dataset: ~12,000 players (Sofifa)"];
    C --> E["Model Training<br>GBDT Ensemble"];
    D --> E;
    E --> F{"Model Evaluation"};
    F --> G["Findings<br>High Accuracy: R²=0.901, RMSE=3.2M"];
    F --> H["Interpretability<br>SHAP Analysis: Skills, Fitness & Cognitive Attributes are Key Drivers"];
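
The SHAP step at the end of the pipeline attributes a prediction to individual features using Shapley values. For a model with only a few features, these can be computed exactly by averaging each feature's marginal contribution over all feature orderings. A minimal sketch, assuming a hypothetical value function `toy_model` and made-up player attributes; practical SHAP implementations use faster algorithms for tree ensembles rather than this brute-force enumeration.

```python
from itertools import permutations


def toy_model(x):
    """Hypothetical value function with an interaction term (not the paper's GBDT)."""
    return 2.0 * x["skill"] + 1.0 * x["fitness"] + 0.5 * x["skill"] * x["cognition"]


def shapley_values(model, instance, baseline):
    """Exact Shapley values: average marginal contribution over all orderings.

    Features that have not been "switched on" yet keep their baseline value.
    """
    names = list(instance)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)
        previous_output = model(current)
        for name in order:
            # Switch this feature from its baseline to the instance value.
            current[name] = instance[name]
            output = model(current)
            totals[name] += output - previous_output
            previous_output = output
    return {name: total / len(orderings) for name, total in totals.items()}


# Hypothetical player and baseline (for example, dataset averages), arbitrary scales.
player = {"skill": 4.0, "fitness": 3.0, "cognition": 2.0}
baseline = {"skill": 1.0, "fitness": 1.0, "cognition": 0.0}

phi = shapley_values(toy_model, player, baseline)
print(phi)
```

The attributions sum to the gap between the model's output for the player and for the baseline, which is the additivity property that gives SHAP its name. The interaction term is split between `skill` and `cognition` rather than assigned to either one alone.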