Predictive Analytics at the NHL

Presentation transcript:

Predictive Analytics at the NHL Eric Blabac, Director of Decision Science – Membership Analytics, Sam’s Club May 6th, 2017

Agenda: Introductions; SAP Partnership with the NHL; Playoff Predictions; Probability of Making Playoffs; Q&A

Introductions

Eric Blabac, Director of Decision Science – Membership Analytics, Sam’s Club

Eric is a global analytics expert and data science evangelist, and is currently the Director of Decision Science, Membership Analytics at Sam’s Club. Prior to Sam’s Club, Eric held the role of Principal Data Scientist at SAP. His background is in advanced analytics, with substantial experience in statistical modeling, predictive analytics, data mining, forecasting and management consulting. He has worked across a variety of industries, including retail, financial services, consumer product goods, healthcare, and sports and entertainment.

He is also the author of The Encyclopedia of Baseball Statistics - From A to ZR, a complete reference of all modern baseball statistics: what they really mean, how to calculate them and how to use them. Eric holds two Master’s degrees (MS), in Statistics and in Applied Mathematics, from Iowa State University; a Master of Business Administration (MBA) from Grand Canyon University; and a Bachelor’s degree (BSc) in Mathematics from Iowa State University.

SAP Partnership with the NHL
*5-year sponsorship agreement

Phase 1a (Enhanced Stats): Oct 2014 – Feb 2015
Phase 1b (Playoff Predictions): Jan – April 2015
Phases 2 and 3 (UX Revamp + Additional Stats): June – Aug 2015
  Advanced Game Level Filtering
  Statistic Charting/Player Comparison
  Stats by Context (Faceoffs By Zone, Shots By Type, etc.)
  Stats by Strength (e.g. 3on3 Goals)
  Team Power Index
Phase 4 (Enhancements): Nov 2015 – Jan/Feb 2016
  Probability of Making Playoffs
  Line Analysis
  In-Game Win Expectancy

Project Team: PM, 3 consultants (ETL/HANA modeling, Data Scientist, Solution Architect)

Playoff Predictions

NHL Playoff Predictions Overview

GOAL: Predict the Stanley Cup winner

THINGS TO CONSIDER (aka Requirements):

BUSINESS
  Need to predict every game AND series leading up to the finals.
  What does the output need to look like?
  The model needs to incorporate “Enhanced Statistics” (SAP Marketing).
  The model needs to be EASY to explain/interpret (for the NHL).
  The output needs to be EASY to understand (for fans).
  The factors used need to be EASY to understand, but compelling (for the NHL, fans, media).

ANALYTICAL
  What data should I use? Do I need to calculate additional variables?
  Define “prediction” (e.g. explicit win/loss, win probability).
  Which statistical model should I use?
  How do I implement the model?
  How do I “simulate” the Stanley Cup playoffs? Predictions for game x need to account for the results of game x-1 and of previous series (Bracket).

TIME
  Began in early January, with a deadline of mid-March (three weeks before the playoffs start).

NHL Playoff Predictions Solution Overview

A logistic regression model was developed to calculate the probability that a team would win a specific playoff game. This model incorporated various factors, including:
  Standard regular season stats: Penalty Kill %, Goals Against Per Game, etc.
  Enhanced and Advanced regular season stats: Shot Attempts % Behind, Save % on High Quality Shots, Shooting Efficiency %, etc.
  Game Context factors: Home vs. Away, Time Zones Travelled, Opponent Strength, etc.

[Flow diagram. Labels: Regular Season Results; Team Level Stats; Simulating Remaining Bracket (Game Level); Game Level Win Probabilities; Game Context; Playoff Game Results; Series Level Win Probabilities; Streak and Strength Factors]
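To make the logistic regression step concrete, here is a minimal sketch of how a fitted model turns game-level factors into a win probability. The feature names and coefficient values are hypothetical and chosen only for illustration; the slides do not publish the fitted model.

```python
import math

def win_probability(features, coefficients, intercept):
    """Logistic regression: P(win) = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    z = intercept + sum(coefficients[name] * value
                        for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients (not taken from the presentation).
COEFFICIENTS = {
    "home_ice": 0.30,               # 1 if the team plays at home, else 0
    "penalty_kill_pct_diff": 0.04,  # team PK% minus opponent PK%
    "goals_against_diff": -0.50,    # team goals against per game minus opponent's
}
INTERCEPT = 0.0

# One hypothetical game: home team, slightly better on both stats.
game = {"home_ice": 1, "penalty_kill_pct_diff": 2.0, "goals_against_diff": -0.2}
p = win_probability(game, COEFFICIENTS, INTERCEPT)
print(f"Estimated win probability: {p:.3f}")
```

Because each coefficient acts on the log-odds, its sign and size can be read directly, which fits the requirement that the model be easy to explain.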

NHL Playoff Predictions Solution Development

DATA PREPARATION
  Created an exhaustive list of all factors that we thought might be predictive (NHL/SAP).
  We came up with 78 different factors; these factors generated 241 variables.
  E.g. Factor: Winning Percentage → Variables: Current Winning Percentage, Opponent Winning Percentage, Winning Percentage Last X Games, Winning Percentage League Rank, etc.
  The data was prepared in a HANA (database) stored procedure using over 20 different source tables in the NHL’s data landscape (over 1500 lines of code).

MODELING
  Chose a model that was appropriate for the problem (classification) and met the NHL requirements (e.g. “EASY” to develop, interpret and explain) → Logistic Regression.
  I initially grouped the 241 variables into eight (8) subgroups based on the type of variable (e.g. Possession, Special Teams and Goalie, etc.).
  Models were run on each subgroup to determine the factors with high predictive power.
  The selected variables were then combined into one final model, yielding the final 37 variables.

SIMULATION and IMPLEMENTATION
  I then developed code to simulate the remaining bracket given the current state of the playoffs: loop through each game, series and round, and predict each future game in the bracket format.
  Predictions were generated every morning and were available on the HANA cloud for fans to access over any platform on NHL.com.
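The simulation step can be sketched for a single series as follows. This is a Monte Carlo illustration under simplifying assumptions, not the code used in the project: it takes just two fixed game-level probabilities (the higher seed's chance of winning at home and away), whereas the model described above re-predicts each game from updated factors. The function names and parameters are hypothetical. The home/away pattern is the 2-2-1-1-1 format used for NHL best-of-seven series.

```python
import random

# True means the higher seed is at home (2-2-1-1-1 format).
HOME_SCHEDULE = [True, True, False, False, True, False, True]

def simulate_series(p_home, p_away, wins=0, losses=0, rng=random):
    """Play out the rest of one series; return True if the higher seed wins.

    wins/losses describe the current state of the series from the
    higher seed's point of view.
    """
    while wins < 4 and losses < 4:
        game_index = wins + losses
        p = p_home if HOME_SCHEDULE[game_index] else p_away
        if rng.random() < p:
            wins += 1
        else:
            losses += 1
    return wins == 4

def series_win_probability(p_home, p_away, wins=0, losses=0,
                           n_sims=20000, seed=0):
    """Estimate the series win probability by repeated simulation."""
    rng = random.Random(seed)  # seeded so that results are reproducible
    won = sum(simulate_series(p_home, p_away, wins, losses, rng)
              for _ in range(n_sims))
    return won / n_sims

# Hypothetical inputs: 58% at home, 50% away, series tied 1-1.
print(series_win_probability(0.58, 0.50, wins=1, losses=1))
```

A full bracket simulation would repeat this for every series in a round, advance the simulated winners, and continue through the final.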

NHL Playoff Predictions Implementation NHL.com Series Preview SAP Match-up Analysis (Bracket Challenge) Do you notice anything missing?

NHL Playoff Predictions Day “0” (Before the Playoffs Began) Bracket Predictions and Results

NHL Playoff Predictions Initial Results

The initial results were mixed but positive overall, as the model successfully predicted the Chicago Blackhawks to win the Cup on “Day 0” (!!)

However, the game level model had some issues:

Stubbornness and predicting “too many” sweeps
  In many cases, the model stuck with its initial series prediction regardless of in-series performance (e.g. team lost first 2 games, team down 3 games to 1).

Picked “too many” big upsets
  While some upsets turned out to be predicted correctly, “too many” big upsets simply didn’t look right (both examples below were upsets of Presidents’ Trophy winners).
  E.g. PIT over NYR in 2014-2015
  E.g. PHI over WSH in 2015-2016

NHL Playoff Predictions Second Version

The second version of the playoff predictions was structured in “phases”. The first phase is a series level prediction that uses all available historical playoff data (back to the 1987-88 season). This series level prediction can be used on its own (e.g. Bracket Challenge), but it is also an input into a new game level model.

The new game level model takes into account game context factors (home vs. away, days between games, etc.) as well as historical playoff performance, for more “realistic” game predictions.

[Flow diagram. Labels: Regular Season Results; Team Level Stats; Series Level Win Probabilities; Simulated Remaining Bracket (Series Level)*; Simulated Remaining Bracket (Game Level)*; Playoff Game Results; Game Context; Game Level Win Probabilities; Historical Playoff Performance]
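One way to picture the phased design is a game level model that takes the series level probability as a prior and adjusts it in log-odds space using game context and the current state of the series. The sketch below is an assumption about the general shape only: the weights, the choice of inputs, and the function names are hypothetical and are not the fitted model from the presentation.

```python
import math

def logit(p):
    """Convert a probability to log-odds."""
    return math.log(p / (1.0 - p))

def sigmoid(z):
    """Convert log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical weights, for illustration only.
W_SERIES = 0.5   # weight on the series level prior (in log-odds)
W_HOME = 0.25    # home-ice adjustment (added at home, subtracted away)
W_LEAD = 0.15    # adjustment per game of series lead (wins minus losses)

def game_win_probability(series_prob, at_home, series_lead):
    """Blend the series level prior with game context and series state."""
    z = (W_SERIES * logit(series_prob)
         + (W_HOME if at_home else -W_HOME)
         + W_LEAD * series_lead)
    return sigmoid(z)

# A slight series underdog (45%) at home: before the series, and up 2-0.
print(game_win_probability(0.45, at_home=True, series_lead=0))
print(game_win_probability(0.45, at_home=True, series_lead=2))
```

With inputs of this kind, a team that starts as a slight series underdog can become the game level favorite once it leads the series, which is the behavior the first version lacked.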

NHL Playoff Predictions Results Comparison – Series Level (2015-2016)

NHL Playoff Predictions Results Comparison – Game Level (2014-2015)

CGY vs VAN, 1st round
Note: Even though the initial series level model showed a slight edge to VAN, the game level model used the current series performance to give the eventual edge to CGY late in the series.

NYR vs WSH, 2nd round

NHL Playoff Predictions Results Comparison Notes

Better overall success at the series level (v1 vs. v2)
  2014-2015: 11/15 (73%) vs. 9/15 (60%)
  2015-2016: 12/15 (80%) vs. 8/15 (53%)

Both new series and game level predictions are much more “conservative”
  In fact, with the new model using 28 seasons of playoff data (420 series), only 39 series had more than an 80% series win probability.

Better “eye test” success (NYR and WSH being Presidents’ Trophy winners)
  2014-2015: NYR vs PIT
    Old: NYR (16.01%) vs. PIT (83.99%)
    New: NYR (63.81%) vs. PIT (36.19%)
  2015-2016: WSH vs PHI
    Old: WSH (43.25%) vs. PHI (56.75%)
    New: WSH (62.69%) vs. PHI (37.31%)
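The figures on this slide can be re-derived with a few lines of arithmetic. The sketch below only recomputes numbers already stated above; the one added fact is that a 16-team bracket produces 15 series per postseason (8 + 4 + 2 + 1).

```python
def pct(correct, total):
    """Share of correct picks as a whole-number percentage."""
    return round(100 * correct / total)

# Series level results in the order listed on the slide.
results = {
    "2014-2015": ((11, 15), (9, 15)),
    "2015-2016": ((12, 15), (8, 15)),
}
for season, (first, second) in results.items():
    print(season, f"{pct(*first)}% vs. {pct(*second)}%")

SERIES_PER_POSTSEASON = 8 + 4 + 2 + 1   # 16-team bracket
total_series = 28 * SERIES_PER_POSTSEASON
share_over_80 = round(100 * 39 / total_series, 1)
print(total_series, "series;", share_over_80, "% above an 80% win probability")
```

On these numbers, fewer than one series in ten received a win probability above 80%, which is what “conservative” means here.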

Questions? Thank you!