1
Kurt Routley, Oliver Schulte, Tim Schwartz, Zeyu Zhao, Sajjad Gholami
2
North American sports: a $485 billion industry. Sports Analytics:
◦ growing in industry: $72.5M investment in Hudl.
◦ growing in academia: #sports analytics papers 2008-2015 = 7x #applied operations research papers.
AI:
◦ modelling and learning game strategies.
◦ multi-agent systems.
◦ structured data.
Cochran, J. J., 'The emergence of sports analytics', Analytics, 2010, 36-39. Coleman, B. J., 'Identifying the players in sports analytics research', Interfaces, 2012, 42, 109-118.
3
Reinforcement Learning meets Sports Analytics. (For reinforcement learning, see the on-line introductory text by Sutton and Barto.)
5
Sports Analytics:
◦ Evaluate player/team performance
◦ Predict match outcomes
◦ Identify strengths, weaknesses
◦ Advise on drafts, trades
6
Evaluate Player/Team Performance: two approaches.
◦ Action value counts.
◦ Latent strength models: Chess: Elo rating; Gaming: Microsoft TrueSkill.
Issues with latent strength models: they entail transitivity, are hard to interpret, and consider final results only.
7
Olympics 2010 Golden Goal.
Issues for action values:
◦ a common scale for all actions
◦ context-awareness
◦ lookahead
8
Sabermetrics in baseball. +/- score in ice hockey (nhl.com Advanced Stats).
9
Search
10
Many areas of AI and optimization involve lookahead. In AI this is called search. Example: GPS route planning.
11
Backgammon. AlphaGo! Chess (http://mygames.chessbase.com/js/apps/MyGames/).
12
Markov Chain Demo. Our NHL model has > 1M nodes. Solving a Markov Decision Process: ◦ Value Iteration Demo.
13
How much does the action change the expected reward at the current state? Example: how much does the action change the chance of winning at the current state?
Impact of an action = (expected reward after the action) - (expected reward before the action).
15
A Markov game is a transition graph with 5 parts:
◦ Players/Agents P
◦ States S
◦ Actions A
◦ Transition Probabilities T
◦ Rewards R
Transitions and rewards depend on the state and a tuple of actions, one for each agent.
Littman, M. L. (1994), 'Markov games as a framework for multi-agent reinforcement learning', in ICML, pp. 157-163.
16
Legend: GD = Goal Differential, MP = ManPower, PR = Period, CV = chance that the home team scores the next goal.
29
Players in our Markov game = {Home, Away}. This models the average or a random player.
30
Context Features:
◦ Goal Differential GD
◦ Manpower Differential MD
◦ Period PR
31
13 action types. Action parameters: team, location.
◦ faceoff(Home,Neutral)
◦ shot(Home,Offensive)
◦ hit(Away,Defensive)
Action types: Blocked Shot, Faceoff, Giveaway, Goal, Hit, Missed Shot, Shot, Takeaway, ...
32
Use action description notation (Levesque et al., 1998).
◦ Actions are written in the form a(T,L): action a, team T, location/zone L.
◦ faceoff(Home,Neutral)
◦ shot(Home,Offensive)
◦ hit(Away,Defensive)
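To make the notation concrete, here is a minimal Python sketch of parameterized actions a(T,L) and of states that pair the context features above with a play's action sequence; the type names and fields are illustrative assumptions, not the authors' implementation.

```python
from collections import namedtuple

# Illustrative sketch only: these names are our assumptions, not the authors' code.
Action = namedtuple("Action", ["name", "team", "zone"])   # the a(T, L) notation
Context = namedtuple("Context", ["gd", "md", "pr"])       # goal diff., manpower diff., period

faceoff = Action("faceoff", "Home", "Neutral")
shot = Action("shot", "Home", "Offensive")
hit = Action("hit", "Away", "Defensive")

# A state couples the context features with the action sequence of the current play.
state = (Context(gd=0, md=1, pr=2), (faceoff, shot))
print(state)
```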
33
Transition probabilities are estimated from observed occurrences in play-by-play data.
◦ Record occurrences of state s as Occ(s).
◦ Record occurrences of the transition s → s' as Occ(s,s').
◦ Parameter learning: transition probabilities T are estimated as Occ(s,s') / Occ(s).
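A minimal sketch of this counting estimator, assuming the play-by-play data has already been converted into ordered sequences of hashable states; the function name and the toy sequences are ours.

```python
from collections import Counter, defaultdict

def estimate_transition_probs(state_sequences):
    """Maximum-likelihood estimate T(s' | s) = Occ(s, s') / Occ(s)."""
    occ = Counter()        # Occ(s): how often state s is left
    occ_pair = Counter()   # Occ(s, s'): how often s is followed by s'
    for seq in state_sequences:
        for s, s_next in zip(seq, seq[1:]):
            occ[s] += 1
            occ_pair[(s, s_next)] += 1
    T = defaultdict(dict)
    for (s, s_next), n in occ_pair.items():
        T[s][s_next] = n / occ[s]
    return T

# Toy example with string-valued states:
T = estimate_transition_probs([["faceoff", "shot", "goal"],
                               ["faceoff", "hit", "shot"]])
print(T["faceoff"])   # {'shot': 0.5, 'hit': 0.5}
```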
34
Goals
◦ R(s) = 1 if s corresponds to goal(Home,*)
◦ R(s) = -1 if s corresponds to goal(Away,*)
◦ R(s) = 0 otherwise
Penalties
◦ R(s) = 1 if s corresponds to penalty(Home,*)
◦ R(s) = -1 if s corresponds to penalty(Away,*)
◦ R(s) = 0 otherwise
Wins
◦ R(s) = 1 if s corresponds to Win(Home)
◦ R(s) = -1 if s corresponds to Win(Away)
◦ R(s) = 0 otherwise
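The reward functions above can be written directly as code. This sketch assumes each state exposes the event that produced it as a (type, team) pair, which is an interface we made up for illustration.

```python
def reward(state, objective="goals"):
    """Reward of a state under the three objectives listed on the slide."""
    event_type, team = state            # e.g. ("goal", "Home"); assumed format
    sign = {"Home": 1, "Away": -1}
    if objective == "goals" and event_type == "goal":
        return sign[team]
    if objective == "penalties" and event_type == "penalty":
        return sign[team]
    if objective == "wins" and event_type == "win":
        return sign[team]
    return 0

print(reward(("goal", "Home"), "goals"))         # 1
print(reward(("penalty", "Away"), "penalties"))  # -1
print(reward(("shot", "Home"), "goals"))         # 0
```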
35
Basketball Demo (open in Chrome).
36
The Data
37
Complete Tracking: which player is where, when, plus the ball/puck. ★ Box Score: action counts. Play-By-Play: action/event sequence.
38
Basketball example from SportVU. Coming to the NHL?
39
Oilers vs. Canucks
40
Successive Play Sequences
41
Source: NHL.com, 2007-2015 (no locations):
◦ Teams: 32
◦ Players: 1,951
◦ Games: 9,220
◦ Events: 2,827,467
Source: SportLogiq, 2015 (action locations):
◦ Teams: 32
◦ Players: 2,233
◦ Games: 446
◦ Events: 1,048,576
42
Basic question: what difference does an action make? Quantify the effect of an action on the outcome (goal) = action value. Player contribution = sum of the scores of the player's actions.
◦ Schuckers and Curro (2013); McHale and Scarf (2005; soccer).
Example: +/- score in ice hockey (nhl.com Advanced Stats).
Schuckers, M. & Curro, J. (2013), 'Total Hockey Rating (THoR): A comprehensive statistical rating of National Hockey League forwards and defensemen based upon all on-ice events', in 7th Annual MIT Sloan Sports Analytics Conference.
43
Computation
44
V(s) = expected reward starting in state s.

Reward    | Absorbing States      | Q(s) represents
Win       | Game End              | Win Probability Differential
Goals     | Game End              | Expected Goal Differential
Goals     | Game End + Goals      | Next Goal Probability Differential
Penalties | Game End              | Expected Penalty Differential
Penalties | Game End + Penalties  | Next Penalty Probability Differential
45
Iterative value function computation (on policy) for i = 1, ..., h steps, where h is the lookahead horizon:
V_i(s) = R(s) + Σ_a P(a|s) Σ_{s'} T(s'|s,a) V_{i-1}(s')
i.e. immediate reward + (prob. of action) × (expected future reward given action and state).
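A sketch of this on-policy update in Python, using dictionary-of-dictionaries containers for the learned action probabilities P(a|s) and transition probabilities T(s'|s,a); this reflects our reading of the slide's equation, not the authors' implementation.

```python
def value_iteration(states, actions, P, T, R, h):
    """On-policy value iteration up to lookahead horizon h.

    P[s][a]     : probability of action a in state s (learned from data)
    T[s][a][s2] : probability of reaching s2 after action a in state s
    R[s]        : immediate reward of state s
    Returns V[s], the expected reward within h steps of state s."""
    V = {s: 0.0 for s in states}
    for _ in range(h):
        V_new = {}
        for s in states:
            future = 0.0
            for a in actions:
                p_a = P.get(s, {}).get(a, 0.0)
                successors = T.get(s, {}).get(a, {})
                future += p_a * sum(prob * V.get(s2, 0.0)
                                    for s2, prob in successors.items())
            V_new[s] = R[s] + future
        V = V_new
    return V
```

The horizon h plays the role described on the slide; absorbing states can be handled by giving them empty successor sets so their value stays at their immediate reward.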
46
Cervone, D.; D'Amour, A.; Bornn, L. & Goldsberry, K. (2014), 'POINTWISE: Predicting points and valuing decisions in real time with NBA optical tracking data', in MIT Sloan Sports Analytics Conference.
47
Examples
48
Action value Q(s,a) = immediate reward + expected future reward given the action and state.
49
We discretize locations by clustering the points at which a given action occurs.
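One simple way to implement this location discretization is k-means over the (x, y) coordinates of a given action type; the cluster count, rink coordinates, and random data below are illustrative assumptions, not the authors' setup.

```python
import numpy as np
from sklearn.cluster import KMeans

# Toy stand-in for the (x, y) locations of one action type, e.g. shots.
rng = np.random.default_rng(0)
shot_locations = rng.uniform(low=[0.0, -42.5], high=[100.0, 42.5], size=(500, 2))

# Cluster the points, then map any new event of that type to its nearest centre.
kmeans = KMeans(n_clusters=8, n_init=10, random_state=0).fit(shot_locations)
zone_of_new_shot = kmeans.predict(np.array([[85.0, 10.0]]))[0]
print(zone_of_new_shot)
```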
51
Average values of actions at each location, over all states and both teams. Figure: action = shot; chance of scoring the next goal; lookahead = 1.
52
Figures: chance of scoring the next goal (lookahead = 1) vs. chance of scoring the next goal after a shot (lookahead = 14).
53
Which is better? (Figure by Shaun Kreider, Kreider Designs.)
54
Figures: chance of scoring the next goal after a carry vs. after a dump-in.
56
Impact of an action = expected reward after the action - expected reward before the action.
57
Players:
1. Apply the impact of an action to the player performing the action.
2. Sum the impact of his actions over a game to get his net game impact.
3. Sum the net game impact of a player over a single season to get his net season impact.
Teams: sum the impact of all players.
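A sketch of this aggregation, assuming each action's impact is the change in the value function across the action (as defined earlier) and that play-by-play events carry the acting player; the event format is our assumption.

```python
from collections import defaultdict

def action_impact(V, s_before, s_after):
    """Impact = expected reward after the action minus expected reward before it."""
    return V[s_after] - V[s_before]

def total_impacts(events, V):
    """events: iterable of (player, s_before, s_after) tuples (assumed format).
    Summing over a game's events gives net game impact; summing over a season's
    events gives net season impact."""
    totals = defaultdict(float)
    for player, s_before, s_after in events:
        totals[player] += action_impact(V, s_before, s_after)
    return totals

def team_impact(player_totals, roster):
    """Team impact = sum of the impacts of the team's players."""
    return sum(player_totals.get(p, 0.0) for p in roster)
```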
58
Compare the average impact of a team in a game (our model) with the average goal ratio of the team in that game (an independent metric; 2-1 = 4-2 = 6-3). Correlation = 0.7!
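The check itself is a one-line correlation once both per-game quantities are in hand; the arrays below are placeholders for illustration, not the study's data.

```python
import numpy as np

avg_impact_per_game = np.array([0.12, -0.05, 0.30, 0.08, -0.20])  # model output (placeholder)
goal_ratio_per_game = np.array([0.66, 0.40, 0.75, 0.55, 0.25])    # e.g. a 2-1 game -> 2/3

r = np.corrcoef(avg_impact_per_game, goal_ratio_per_game)[0, 1]
print(f"Pearson correlation: {r:.2f}")
```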
59
Commonly used (e.g. by the Financial Times). Correlation only 0.09.
60
2014-15, no location data.
61
Jason Spezza: high goal impact, low +/-. Plays very well on a poor team (Ottawa Senators). Requested a trade for the 2014-2015 season.
62
Correlation coefficient = 0.703. Follows Pettigrew (2015).
Pettigrew, S. (2015), 'Assessing the offensive productivity of NHL players using in-game win probabilities', in 9th Annual MIT Sloan Sports Analytics Conference.
63
2014-15, no location data.
64
We built a state-space model of NHL dynamics. The action-value function from reinforcement learning is just what we need. It incorporates:
◦ context
◦ lookahead
Familiar in AI, revolutionary in sports analytics!
65
State-space Markov game model for ice hockey dynamics in the NHL. A new context-aware method for evaluating locations, all actions, and players.
"We assert that most questions that coaches, players, and fans have about basketball, particularly those that involve the offense, can be phrased and answered in terms of EPV [i.e. the value function]." (Cervone, Bornn et al. 2014)
66
Thank you – any questions? 66/68
67
Cervone, D.; D'Amour, A.; Bornn, L. & Goldsberry, K. (2014), 'POINTWISE: Predicting points and valuing decisions in real time with NBA optical tracking data', in MIT Sloan Sports Analytics Conference.
Routley, K. & Schulte, O. (2015), 'A Markov Game Model for Valuing Player Actions in Ice Hockey', in Uncertainty in Artificial Intelligence (UAI), pp. 782-791.
70
No ground truth.
◦ Relate to predicting something (?)
◦ Break down into strong and weak contexts?
Compare apples to apples.
◦ Cluster players by position.
◦ Learn player clusters.
◦ Interesting ideas in Cervone et al. (2014).
Cervone, D.; D'Amour, A.; Bornn, L. & Goldsberry, K. (2014), 'A Multiresolution Stochastic Process Model for Predicting Basketball Possession Outcomes', ArXiv e-prints.
71
Extract patterns about which actions have the most impact, and when.
72
Fit parameters for each player (cricket, baseball, basketball). Smooth towards similar players when a player visits a state rarely. Combine reinforcement learning with clustering agents?
73
Game clock. Penalty clock. Player and puck location (eventually). Can we take existing RL off the shelf?
◦ E.g. continuous finite-time horizon?
◦ Spatial planning?
◦ RL with both continuous time and space?