Using machine learning to improve our understanding of injury risk and prediction in elite male youth football players

Oliver, Jon L, Ayala, Francisco, De Ste Croix, Mark B ORCID: 0000-0001-9911-4355, Lloyd, Rhodri S, Myer, Gregory D and Read, Paul J (2020) Using machine learning to improve our understanding of injury risk and prediction in elite male youth football players. Journal of Science and Medicine in Sport, 23 (11). pp. 1044-1048. doi:10.1016/j.jsams.2020.04.021

Text (Peer-reviewed version)
8341-de-Ste-Croix-(2020)-Using-machine-learning-to-improve.pdf - Accepted Version
Available under License Creative Commons Attribution Non-commercial No Derivatives 4.0.


Objectives: The purpose of this study was to examine whether the use of machine learning improved the ability of a neuromuscular screen to identify injury risk factors in elite male youth football players.

Methods: 355 elite youth football players aged 10 to 18 years completed a prospective pre-season neuromuscular screen that included anthropometric measures of size, as well as single leg countermovement jump (SLCMJ), single leg hop for distance (SLHD), 75% hop distance and stick (75%Hop), Y-balance anterior reach and tuck jump assessment. Injury incidence was monitored over one competitive season. Risk profiling was assessed using traditional regression analyses and compared to supervised machine learning algorithms constructed using decision trees.

Results: Using continuous data, multivariate logistic analysis identified SLCMJ asymmetry as the sole significant predictor of injury (OR 0.94, 0.92-0.97, p<0.001), with a specificity of 97.7% and sensitivity of 15.2%, giving an AUC of 0.661. The best performing decision tree model provided a specificity of 74.2% and sensitivity of 55.6%, with an AUC of 0.663. All variables contributed to the final machine learning model, with asymmetry in the SLCMJ, 75%Hop and Y-balance, plus tuck jump knee valgus and anthropometrics, being the most frequent contributors.

Conclusions: Although both statistical methods reported similar accuracy, logistic regression provided very low sensitivity and identified only a single neuromuscular injury risk factor. The machine learning model provided much improved sensitivity to predict injury and identified interactions of asymmetry, knee valgus angle and body size as contributing factors to an injurious profile in youth football players.
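The abstract reports two models with nearly identical AUC but very different sensitivity and specificity. The following is a minimal Python sketch of how that can happen, not the authors' analysis: the labels, scores, thresholds, and function names are all hypothetical and chosen only for illustration. It computes sensitivity and specificity at two decision thresholds and shows that AUC, which depends only on how the scores rank injured against uninjured players, is unaffected by the threshold.

```python
# Illustrative only: the data below are made up and are NOT from the study.

def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = injured)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

def auc(y_true, scores):
    """AUC as the probability that a random injured player scores higher
    than a random uninjured player (ties count as half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def classify(scores, threshold):
    """Label a player as at risk (1) when the score meets the threshold."""
    return [1 if s >= threshold else 0 for s in scores]

# Hypothetical injury outcomes and model risk scores for ten players.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.4, 0.35, 0.2, 0.6, 0.3, 0.25, 0.15, 0.1, 0.05]

# A strict threshold flags few players: high specificity, low sensitivity.
strict = sensitivity_specificity(y_true, classify(scores, 0.7))
# A lenient threshold flags more players: sensitivity rises, specificity falls.
lenient = sensitivity_specificity(y_true, classify(scores, 0.3))

print("strict  (sens, spec):", strict)
print("lenient (sens, spec):", lenient)
print("AUC (threshold-independent):", auc(y_true, scores))
```

In this toy example the same set of scores yields one AUC but two very different operating points, which is why the abstract reports sensitivity and specificity alongside AUC when comparing the logistic regression and decision tree models.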

Item Type: Article
Article Type: Article
Uncontrolled Keywords: Neuromuscular; Screen; Prospective; Binary logistic regression; REF2021
Subjects: G Geography. Anthropology. Recreation > GV Recreation Leisure > GV557 Sports > GV0711 Coaching
G Geography. Anthropology. Recreation > GV Recreation Leisure > GV557 Sports > GV861 Ball games: Baseball, football, golf, etc.
H Social Sciences > HA Statistics
Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Q Science > QA Mathematics > QA76 Computer software
Q Science > QP Physiology > QP301.H75 Physiology. Sport
Divisions: Schools and Research Institutes > School of Education and Science
Research Priority Areas: Health, Life Sciences, Sport and Wellbeing
Depositing User: Rhiannon Goodland
Date Deposited: 06 May 2020 10:30
Last Modified: 31 Aug 2023 09:07


University Of Gloucestershire
