Published by Cornelius Walters. Modified over 9 years ago.
Functional Data Approach to Longitudinal Modeling in the National Hockey League
Matthew J. Valente, David P. MacKinnon, and Hye Won Suk
Arizona State University

Matthew J. Valente is the contact author for this research. E-mail: m.valente@asu.edu. Presented at the Annual Meeting of the New England Symposium on Statistics in Sports, 2015.

Method

The data for this analysis project were retrieved from www.NHL.com and consisted of 13 seasons of season-level data for all 30 teams in the NHL, that is, the end-of-season totals of each team's performance statistics. The seasons ranged from 2000–2001 to 2014–2015. The 2004–2005 season was not included because of the lockout, and the 2012–2013 season was excluded because it was not a full season (i.e., not 82 games).

Fenwick scores were computed as (Shots for + Missed shots), and PDO scores were computed as (Save percentage + Shooting percentage) (Stats.HockeyAnalysis.com).

B-spline basis functions of order four, with knots placed at each season, were used to smooth total points, Fenwick scores, and PDO scores. The squared second-order derivative was used as the roughness penalty, and the weight on this penalty was chosen by generalized cross-validation (GCV).

After the data were smoothed, three models were estimated: a concurrent functional linear model, a scalar response model, and a functional ANOVA. For each model, leave-one-out cross-validation was used to pick the penalty parameter that minimized the sum of squared error (SSE).
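The smoothing step described above can be illustrated with a small sketch. It is not the poster's implementation: instead of an order-four B-spline basis with a squared second-derivative penalty, it uses the discrete analogue (a penalty on squared second differences of the fitted values), which keeps the example dependency-free. The point totals are made up, and the function names (`penalty_matrix`, `invert`, `smooth`, `choose_lambda`) and the grid of candidate weights are assumptions for illustration only. The GCV score is computed as n · SSE / (n − df)², where df is the trace of the hat matrix.

```python
def penalty_matrix(n):
    """Return D^T D for the (n-2) x n second-difference matrix D."""
    d = [[0.0] * n for _ in range(n - 2)]
    for i in range(n - 2):
        d[i][i], d[i][i + 1], d[i][i + 2] = 1.0, -2.0, 1.0
    return [[sum(row[i] * row[j] for row in d) for j in range(n)]
            for i in range(n)]


def invert(a):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def smooth(y, lam):
    """Return (fitted values, GCV score) for a smoothing weight lam > 0."""
    n = len(y)
    pen = penalty_matrix(n)
    system = [[(1.0 if i == j else 0.0) + lam * pen[i][j] for j in range(n)]
              for i in range(n)]
    hat = invert(system)  # hat matrix H, so that fitted = H y
    fitted = [sum(h * v for h, v in zip(row, y)) for row in hat]
    sse = sum((v - f) ** 2 for v, f in zip(y, fitted))
    df = sum(hat[i][i] for i in range(n))  # effective degrees of freedom, trace(H)
    gcv = n * sse / (n - df) ** 2
    return fitted, gcv


def choose_lambda(y, grid):
    """Pick the smoothing weight in grid with the smallest GCV score."""
    return min(grid, key=lambda lam: smooth(y, lam)[1])


# Made-up end-of-season point totals for one hypothetical team, 13 seasons.
points = [88, 92, 95, 90, 101, 104, 99, 97, 102, 93, 89, 94, 98]
grid = [0.1, 1.0, 10.0, 100.0]  # assumed candidate weights
best_lam = choose_lambda(points, grid)
smoothed, gcv_score = smooth(points, best_lam)
```

A larger `lam` pulls the fitted curve toward a straight line, while a value near zero reproduces the raw data; GCV trades these off without refitting the model once per left-out season.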
Introduction

The purpose of this data analysis project was to apply functional data analysis (FDA) techniques to analyze trends and relations in season-level performance statistics of all 30 teams in the National Hockey League (NHL) across 13 regular seasons. Functional data are data generated from a continuous underlying process along some continuum, usually time (Levitin et al., 2007). The trends in points and performance statistics for each team across many seasons can be thought of as generated from smooth underlying curves, because teams remain relatively stable from one season to the next (except in some rare cases). Three FDA techniques were applied to total points at the end of the regular season, Fenwick scores, PDO scores, and number of playoff appearances.

Concurrent functional linear model. Concurrent functional linear models allow both the outcome and the predictors to be continuous. The continuous outcome was total points at the end of the regular season, and the continuous predictors were Fenwick and PDO scores.

Scalar response model. Scalar response models allow for a single scalar outcome and a continuous predictor. Number of playoff appearances was the scalar outcome, and the continuous predictor was Fenwick scores.

Functional analysis of variance (ANOVA). Functional ANOVA models allow for a continuous outcome and a categorical predictor (for an example, see Park et al., 2013). The continuous outcome was Fenwick scores, and the categorical predictor was NHL conference (Western Conference coded as 0, Eastern Conference coded as 1). The Detroit Red Wings, Columbus Blue Jackets, and Winnipeg Jets were excluded from this analysis because each of these teams switched conferences during the time span investigated.

Conclusions/Future Directions

Fenwick scores and PDO scores were significant predictors of total points at the end of the regular season and accounted for a large amount of variance in total points.
Fenwick scores were a significant predictor of number of playoff appearances, although they did not explain a large amount of the variance in number of playoff appearances. Finally, there was a range of seasons for which the Western and Eastern Conferences differed significantly in their Fenwick scores.

FDA techniques can be applied to a wide range of longitudinal questions for NHL data. In these examples, FDA techniques were applied to team-level regular-season NHL data, but the same techniques can easily be applied to player-level major or minor league data. Because FDA allows for modeling intensive longitudinal data (i.e., many time points) (Levitin et al., 2007; Park et al., 2013), FDA techniques might be particularly well suited for modeling relations between in-game player and puck possession statistics. This is interesting because the NHL is currently experimenting with collecting in-game player and puck possession statistics, which would result in large amounts of data depending on the time scale on which the data are collected. For example, player and puck possession data are collected every shift and tracked continuously during each shift. If a player averages 25–30 shifts per game and plays 30–60 seconds each shift, and we are interested in player and puck possession statistics for every second on the ice, there would be 750–1,800 data points (25 × 30 to 30 × 60) for a single player in a single game. FDA is a promising analytic technique for the unique types and amounts of data that can be obtained from NHL teams and players, as well as from minor league or junior league teams and players.

References

Levitin, D. J., Nuzzo, R. L., Vines, B. W., & Ramsay, J. O. (2007). Introduction to functional data analysis. Canadian Psychology/Psychologie canadienne, 48(3), 135.

Park, K. K., Suk, H. W., Hwang, H., & Lee, J. H. (2013). A functional analysis of deception detection of a mock crime using infrared thermal imaging and the concealed information test. Frontiers in Human Neuroscience, 7.
Results

[Figure panels: Raw Data; Concurrent Linear Model; Scalar Response Model; Functional ANOVA]
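For readers unfamiliar with the three models named in the results, the following is a sketch of how they are conventionally written in the FDA literature. The notation is an assumption and is not taken from the poster: $i$ indexes teams, $t$ indexes time across seasons, $T$ is the span of seasons, $\beta(\cdot)$, $\mu(\cdot)$, and $\alpha(\cdot)$ are coefficient functions, and $\varepsilon$ denotes error.

```latex
% Concurrent functional linear model: functional outcome, functional predictors
\mathrm{Points}_i(t) = \beta_0(t) + \beta_1(t)\,\mathrm{Fenwick}_i(t)
                     + \beta_2(t)\,\mathrm{PDO}_i(t) + \varepsilon_i(t)

% Scalar response model: scalar outcome, functional predictor
\mathrm{Playoffs}_i = \alpha + \int_T \mathrm{Fenwick}_i(t)\,\beta(t)\,dt + \varepsilon_i

% Functional ANOVA: functional outcome, categorical predictor
\mathrm{Fenwick}_i(t) = \mu(t) + \alpha(t)\,\mathrm{Conf}_i + \varepsilon_i(t),
\qquad \mathrm{Conf}_i \in \{0, 1\}
```

With the coding described in the poster (Western Conference = 0, Eastern Conference = 1), $\mu(t)$ in the last model is the Western Conference mean curve and $\alpha(t)$ is the Eastern-minus-Western difference at time $t$.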