Quick Summary
This study used SHAP-based interpretable machine learning to predict injury risk among 800 university football players. The best of the compared models, a Support Vector Machine, reached 95.6% accuracy. SHAP analysis ranked psychological stress and sleep duration among the most important factors, which points to the role of lifestyle in injury prevention.
Key Details
- Dataset: 800 Chinese university football players
- Features used: 18 features across four dimensions: basic information, training factors, physical fitness, and lifestyle habits
- Technology: Support Vector Machine (SVM) and SHAP for interpretability
- Performance: SVM: 95.6% accuracy, 95.7% F1-score, 99.2% ROC-AUC
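For readers less familiar with the three metrics listed above, here is a minimal pure-Python sketch of how accuracy, F1-score, and ROC-AUC are computed. The labels and risk scores are made-up toy values, not data from the study, and the 0.5 decision threshold is an assumption for illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores, positive=1):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical): 1 = injured, 0 = not injured
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]  # model risk scores
y_pred = [1 if s >= 0.5 else 0 for s in scores]    # assumed 0.5 threshold

acc = accuracy(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"accuracy={acc:.4f}  f1={f1:.4f}  roc_auc={auc:.4f}")
```

Accuracy and F1 depend on the chosen threshold, while ROC-AUC is computed from the raw scores and measures how well the model ranks injured players above uninjured ones.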
Key Takeaways
- Injury prediction is crucial for the health of university football players.
- SHAP analysis revealed key injury risk factors, including stress and sleep.
- Lifestyle factors ranked higher in importance than traditional physical fitness indicators.
- Psychological stress showed a positive correlation with injury risk.
- Adequate sleep and balance ability were identified as protective factors.
- Limitations include the study’s single-dataset design and lack of external validation.
- Prospective validation is needed before clinical application.
- This study lays the groundwork for evidence-based injury prevention strategies.

Background
Sports injuries pose a significant challenge, particularly in university football, where players are often at risk due to intense training and competition. While much research has focused on professional athletes, there is a notable gap in studies addressing the unique needs of university players. This study aims to bridge that gap by employing machine learning techniques to predict injury risks, thereby enhancing player health and safety.
Study
Conducted using the Kaggle “University Football Injury Prediction Dataset”, this research involved a comprehensive analysis of 800 university football players. The study developed an 18-feature evaluation system that encompassed various dimensions, including basic information, training factors, physical fitness, and lifestyle habits. By systematically comparing ten different machine learning algorithms, the researchers sought to identify the most effective method for predicting injury risks.
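To make the comparison step concrete, here is a rough sketch of how several classifiers can be benchmarked under the same cross-validation splits with scikit-learn. It is not the authors' pipeline: the data is a synthetic stand-in generated to match only the dataset's shape (800 samples, 18 features), only three candidate models are shown, and the model choices, hyperparameters, 5-fold scheme, and F1 scoring are all assumptions, since this summary does not list them.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in with the same shape as the dataset (NOT the real data)
X, y = make_classification(
    n_samples=800, n_features=18, n_informative=6, random_state=0
)

# Hypothetical candidate set; the study compared ten algorithms
candidates = {
    "svm_rbf": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Use identical stratified folds for every model so the scores are comparable
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

results = {}
for name, model in candidates.items():
    fold_scores = cross_val_score(model, X, y, cv=cv, scoring="f1")
    results[name] = fold_scores.mean()

# Report models from best to worst mean F1
for name, mean_f1 in sorted(results.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: mean F1 = {mean_f1:.3f}")
```

Wrapping the scale-sensitive models in a pipeline with `StandardScaler` keeps the scaling inside each training fold, which avoids leaking information from the held-out fold.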
Results
The Support Vector Machine (SVM) was the top performer among the ten algorithms, achieving 95.6% accuracy, a 95.7% F1-score, and 99.2% ROC-AUC. The SHAP interpretability analysis ranked stress level (importance: 0.10), sleep duration (0.09), and balance ability (0.08) as the most important factors influencing injury risk. These findings underscore the role of psychological and lifestyle factors in injury prevention.
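SHAP attributes a prediction to individual features using Shapley values. The sketch below computes exact Shapley values by enumerating feature coalitions for a tiny made-up risk model. It is not the `shap` library and not the study's SVM: the three feature names, the linear coefficients, and the player and baseline values are all hypothetical, chosen only to show the mechanics.

```python
from itertools import combinations
from math import factorial

FEATURES = ["stress_level", "sleep_hours", "balance_score"]


def risk_model(x):
    """Hypothetical linear risk score; coefficients are illustrative only."""
    return 0.5 * x["stress_level"] - 0.3 * x["sleep_hours"] - 0.2 * x["balance_score"]


def shapley_values(model, instance, baseline):
    """Exact Shapley values: features outside a coalition are set to the baseline."""
    n = len(FEATURES)

    def value(coalition):
        x = {f: (instance[f] if f in coalition else baseline[f]) for f in FEATURES}
        return model(x)

    phi = {}
    for f in FEATURES:
        others = [g for g in FEATURES if g != f]
        total = 0.0
        for k in range(len(others) + 1):
            # Shapley weight for coalitions of size k that exclude f
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


# Hypothetical player versus a hypothetical "average" baseline player
player = {"stress_level": 8, "sleep_hours": 5, "balance_score": 6}
baseline = {"stress_level": 5, "sleep_hours": 7, "balance_score": 7}

phi = shapley_values(risk_model, player, baseline)
for name, contribution in phi.items():
    print(f"{name}: {contribution:+.2f}")
```

In this toy case, high stress, short sleep, and below-baseline balance each push the risk score up, and the contributions sum to the difference between the player's score and the baseline score. Global importances such as those reported above are typically obtained by averaging the absolute per-player contributions over the whole dataset.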
Impact and Implications
This study demonstrates the feasibility of using interpretable machine learning for injury risk prediction in university athletes. Because the model's predictions can be traced back to specific risk factors, coaches and trainers can design targeted prevention strategies, for example around stress management and sleep. The results come from a single dataset without external validation, so they are best read as a starting point for future data-driven work on athlete safety.
Conclusion
This study highlights the potential of interpretable machine learning in predicting injury risks among university football players. By focusing on both psychological and lifestyle factors, we can create more effective injury prevention strategies. As we move forward, further research and validation are essential to ensure these findings can be applied in real-world settings, ultimately improving the health and performance of athletes.
SHAP-based interpretable machine learning for injury risk prediction in university football players: a multi-dimensional data analysis approach.
Abstract
Sports injury prediction is crucial for university football player health, yet existing research predominantly focuses on professional athletes and lacks interpretability. Using the Kaggle “University Football Injury Prediction Dataset” (800 Chinese university players), we constructed a comprehensive 18-feature evaluation system across four dimensions: basic information, training factors, physical fitness, and lifestyle habits. We systematically compared 10 machine learning algorithms. The Support Vector Machine achieved optimal performance (95.6% accuracy, 95.7% F1-score, 99.2% ROC-AUC). SHAP interpretability analysis identified stress level (importance: 0.10), sleep duration (0.09), and balance ability (0.08) as key injury risk factors, with psychological stress showing positive correlation and adequate sleep/balance showing protective effects. Notably, lifestyle factors outweighed traditional physical fitness indicators in importance. Despite promising results, this study’s single-dataset design and lack of external validation limit generalizability. Prospective validation is essential before clinical deployment. This work demonstrates the feasibility of interpretable injury risk prediction for university athletes, providing a foundation for evidence-based prevention strategies.
Authors: Ma J, Liu S, Pei Y
Journal: Sci Rep
Citation: Ma J, et al. SHAP-based interpretable machine learning for injury risk prediction in university football players: a multi-dimensional data analysis approach. Sci Rep. 2025; 15:40252. doi: 10.1038/s41598-025-24144-y