NHL Player Development Modelling

NHL Player Development Modelling: Means and Methods to Projecting Player Contributions. S. Burtch, 2017

Player Career Trajectory Description: Useful for Generalization. Player Production vs Age. Cumulative Points (Krzywicki, 2008).

Player Career Trajectory Description: Useful for Generalization. Points per GP / per 60 (Desjardins, 2010; Tulsky, 2013, 2014). Shot rates (Tulsky, 2014).

Player Career Trajectory Description: Useful for Generalization. Save percentage (Burtch, 2013; Tulsky, 2013).

Projection as a Statistical Exercise: Marcel Projections
“The Marcel the Monkey Forecasting System (or the Marcels for short) is the most advanced forecasting system ever conceived.” “Not.” “Actually it is the most basic forecasting system you can have, that uses as little intelligence as possible.” (Tom Tango, 2012; Marcels first introduced in 2004)
Projecting Goaltender Performance: Marcels (Garik16, 2014; Tulsky, 2014)
Projecting Skater Performance: Marcels (Galimini, 2015)
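A Marcel-style projection can be sketched in a few lines. The sketch below is only illustrative: it assumes per-season rates (for example, points per game) listed most recent first, uses the 5/4/3 recency weights commonly described for the baseball Marcels, and regresses toward a league-average rate. The function name, the `league_rate` value, and the `regression_weight` value are all hypothetical choices, not figures taken from any of the cited hockey projections.

```python
def marcel_projection(rates, weights=(5, 4, 3), league_rate=0.0, regression_weight=2.0):
    """Weighted average of recent seasons, regressed toward a league-average rate.

    rates: per-season rates, most recent season first (up to len(weights) seasons are used).
    weights: recency weights; 5/4/3 is an assumption borrowed from the baseball Marcels.
    league_rate, regression_weight: hypothetical regression-to-the-mean settings.
    """
    if not rates:
        raise ValueError("at least one season of data is required")
    used = list(zip(rates, weights))  # zip drops seasons beyond the available weights
    weighted_sum = sum(rate * weight for rate, weight in used)
    total_weight = sum(weight for _, weight in used)
    # Adding regression_weight "seasons" of league-average play pulls the estimate to the mean.
    return (weighted_sum + league_rate * regression_weight) / (total_weight + regression_weight)


# Three seasons of points per game, most recent first (made-up numbers).
projection = marcel_projection([0.80, 0.60, 0.70], league_rate=0.50, regression_weight=3.0)
print(round(projection, 3))
```

A player with a single season of data is pulled more strongly toward the league rate, because the regression weight is a larger share of the total weight.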

More Detailed Projections in Other Sports
PECOTA / Steamer / ZiPS (baseball), CARMELO (basketball), APROPOS (soccer)
All employ a similar logic: try to project player development by exploring the trajectory of similar players.
This raises 3 key questions…

1. How Do We Identify Similar Players?
Player Similarity Scores (Hockey/Basketball/Baseball/Football-Reference.com, 2008): uses overall career “score” and “shape” similarity; does not account for variation by age.
Euclidean Distance: most recently applied in hockey statistics by
Corsica.Hockey Statistically Similar Players (Perry, 2016)
Identifying Playing Styles With Clustering (Stimson, 2017)
Examining Which Players are Most Similar To Current Red Wings (Iyer, 2017)
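For reference, Euclidean distance between two players is the square root of the summed squared differences across their metrics; smaller values mean more similar players. A minimal sketch follows, with hypothetical metric vectors that are not drawn from any of the cited work:

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length metric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of metrics")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical two-metric profiles for two players.
player_a = [3.0, 4.0]
player_b = [0.0, 0.0]
print(euclidean_distance(player_a, player_b))
```

Because every metric contributes on its raw scale, metrics with larger ranges dominate the distance unless they are scaled or weighted first.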

1. How Do We Identify Similar Players?
Using K-Nearest Neighbours (KNN) algorithms to identify similar player types/styles.
We seek to identify similarity by age/season across a variety of GAR (Goals Above Replacement) metrics.
KNN matrix data built from GAR data (courtesy of Dawson Sprigings, 2008-2017):
Weight each GAR metric according to its year-over-year auto-correlation.
Scale each GAR metric to account for players with limited GP, and thus maximize the size of the sample. This was accomplished by modeling the underlying relationships between GP and the individual GAR components.
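One way the auto-correlation weighting could work is sketched below. It assumes each metric's weight is the Pearson correlation between a skater's value in one season and the same skater's value in the next, floored at zero; the slides do not state the exact formula, so this is an assumption, as are the metric names and numbers. The GP scaling step is not sketched because the underlying model is not described.

```python
import math


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either series has no variance."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def autocorrelation_weights(pairs_by_metric):
    """pairs_by_metric: {metric: [(value_in_year_n, value_in_year_n_plus_1), ...]}."""
    weights = {}
    for metric, pairs in pairs_by_metric.items():
        this_year = [pair[0] for pair in pairs]
        next_year = [pair[1] for pair in pairs]
        # Assumption: negative correlations are floored at zero (metric ignored).
        weights[metric] = max(0.0, pearson(this_year, next_year))
    return weights


def apply_weights(metric_values, weights):
    """Multiply each metric by its weight before computing distances."""
    return {metric: value * weights[metric] for metric, value in metric_values.items()}


# Hypothetical metrics: "metric_a" repeats well year over year, "metric_b" does not.
weights = autocorrelation_weights({
    "metric_a": [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)],
    "metric_b": [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0)],
})
print(apply_weights({"metric_a": 2.0, "metric_b": 5.0}, weights))
```

The effect is that metrics which repeat from season to season count for more when measuring similarity than metrics that are mostly noise.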

1. How Do We Identify Similar Players?
We thus obtain our KNN Age Matrix:
1479 individual skaters
6 GAR metrics per year of age
Spanning ages 18 to 44 (Jaromir Jagr will boost the upper bound to 45 this year)
1479 x 6 x 27 = 239,598 data points of fun for comparison
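The size of the matrix can be checked directly: ages 18 to 44 inclusive give 27 age slots. The sketch below also shows one possible layout for a single skater's slice of the matrix; the layout and names are assumptions for illustration, not the structure actually used.

```python
N_SKATERS = 1479
N_METRICS = 6
AGES = range(18, 45)  # ages 18 through 44 inclusive


def empty_age_matrix(n_metrics=N_METRICS, ages=AGES):
    """One skater's slice: a row of metric slots per age, None where no season exists."""
    return {age: [None] * n_metrics for age in ages}


total_cells = N_SKATERS * N_METRICS * len(AGES)
print(len(AGES), total_cells)
```

In a layout like this, many cells stay empty, since no skater has a season at every age from 18 to 44.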

1. How Do We Identify Similar Players?
Each individual skater is compared to all other skaters at the same position (F or D), using their GAR data from age 18 up to the skater's age last season (2016-17).
This is done using a KNN algorithm in R, identifying the 5 most similar skaters.
Issues with this method as applied: some skaters have peers of the same current age within their 5 most similar players, and those peers have no following season yet to inform a projection. For example, William Nylander’s 2 most similar skaters were Christian Dvorak of the Arizona Coyotes and Brayden Point of the Tampa Bay Lightning. All 3 skaters are in the midst of their age 21 season.
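The comparison step was done in R; the following is a rough Python sketch of the same idea, not the original implementation. It assumes each skater is a dictionary with a name, a position, a current age, and GAR values keyed by age, and that candidates must have data at every age the target has. The `exclude_same_age` flag is a hypothetical way of handling the same-age issue noted above.

```python
import math


def nearest_skaters(target, skaters, k=5, exclude_same_age=False):
    """Return the names of the k skaters closest to target by Euclidean distance."""
    ages = sorted(target["gar_by_age"])
    target_vec = [v for age in ages for v in target["gar_by_age"][age]]
    candidates = []
    for skater in skaters:
        if skater["name"] == target["name"]:
            continue  # never compare a skater with himself
        if skater["position"] != target["position"]:
            continue  # forwards are compared to forwards, defencemen to defencemen
        if exclude_same_age and skater["current_age"] == target["current_age"]:
            continue  # hypothetical fix: same-age peers have no next season to project from
        if not all(age in skater["gar_by_age"] for age in ages):
            continue  # assumption: candidates need data at every age the target has
        vec = [v for age in ages for v in skater["gar_by_age"][age]]
        candidates.append((math.dist(target_vec, vec), skater["name"]))  # Python 3.8+
    candidates.sort()
    return [name for _, name in candidates[:k]]


# Made-up skaters with two GAR metrics per age.
target = {"name": "T", "position": "F", "current_age": 21,
          "gar_by_age": {19: [1.0, 0.0], 20: [2.0, 1.0]}}
pool = [
    {"name": "A", "position": "F", "current_age": 30,
     "gar_by_age": {19: [1.0, 0.0], 20: [2.0, 1.0], 21: [3.0, 1.0]}},
    {"name": "B", "position": "F", "current_age": 21,
     "gar_by_age": {19: [1.0, 0.5], 20: [2.0, 1.0]}},
    {"name": "C", "position": "D", "current_age": 30,
     "gar_by_age": {19: [1.0, 0.0], 20: [2.0, 1.0]}},
]
print(nearest_skaters(target, pool, k=2))
```

Filtering out same-age peers trades a slightly less similar comparison group for one in which every member has a following season to learn from.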

2. How Do We Project Into The Future?
Two obvious options exist:
1. Forecast using the average outcomes of the most similar skaters in the following season.
2. Forecast using the average proportional change for the most similar skaters in the following season.
A third possibility is a blend of the two aforementioned methods.
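The three options can be sketched as follows. This is an illustration under assumptions: each comparable skater is reduced to a pair of GAR values (at the matching age and in the following season), the blend is a simple weighted average, and the 50/50 default is arbitrary. None of these details are specified in the slides.

```python
def project_next_season(current_gar, comparables, blend=0.5):
    """Return (average-outcome, proportional-change, blended) projections.

    comparables: list of (gar_at_matching_age, gar_in_following_season) pairs.
    blend: hypothetical share given to the average-outcome method (0 to 1).
    """
    if not comparables:
        raise ValueError("at least one comparable skater is required")
    # Option 1: average of what the comparables actually did the next season.
    avg_outcome = sum(nxt for _, nxt in comparables) / len(comparables)
    # Option 2: average proportional change, applied to the target's current GAR.
    # Zero baselines are skipped; negative baselines would make ratios hard to interpret.
    ratios = [nxt / base for base, nxt in comparables if base != 0]
    avg_ratio = sum(ratios) / len(ratios) if ratios else 1.0
    proportional = current_gar * avg_ratio
    # Option 3: a weighted blend of the two.
    blended = blend * avg_outcome + (1 - blend) * proportional
    return avg_outcome, proportional, blended


# Made-up example: the target has 10 GAR; two comparables went 10 -> 12 and 20 -> 22.
print(project_next_season(10.0, [(10.0, 12.0), (20.0, 22.0)]))
```

The first option pulls the target toward the level of the comparison group, while the second keeps the target's own level and borrows only the direction and size of the change.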

Projection of GAR by Age

What Do We Call It? GARPENLOV: Goals Above Replacement Projection Employing Newly Leveraged Observations & Variance.

Thank You