1
Kurt Routley, Oliver Schulte, Tim Schwartz, Zeyu Zhao, Sajjad Gholami
2
North American sports: a $485 billion industry. Sports Analytics:
◦ growing in industry: $72.5M investment in Hudl.
◦ growing in academia: #sports analytics papers 2008-2015 = 7x #applied operations research papers.
AI:
◦ modelling and learning game strategies.
◦ multi-agent systems.
◦ structured data.
Cochran, J. J., 'The emergence of sports analytics', Analytics, 2010, 36-39. Coleman, B. J., 'Identifying the players in sports analytics research', Interfaces, 2012, 42, 109-118.
3
Reinforcement Learning meets Sports Analytics. (For reinforcement learning, see the on-line introductory text by Sutton and Barto.)
5
Sports Analytics:
◦ Evaluate player/team performance
◦ Predict match outcomes
◦ Identify strengths, weaknesses
◦ Advise on drafts, trades
6
Evaluate Player/Team Performance: two approaches.
◦ Action value counts.
◦ Latent strength models: Chess: Elo rating; Gaming: Microsoft TrueSkill.
Issues with latent strength models: they entail transitivity, are hard to interpret, and consider final results only.
7
Olympics 2010 Golden Goal.
Issues for action values:
◦ a common scale for all actions
◦ context-awareness
◦ lookahead
8
Sabermetrics in baseball. +/- score in ice hockey (nhl.com Advanced Stats).
9
Search
10
Many areas of AI and optimization involve lookahead. In AI this is called search. Example: GPS route planning.
11
Backgammon. AlphaGo! Chess (http://mygames.chessbase.com/js/apps/MyGames/).
12
Markov Chain Demo. Our NHL model has > 1M nodes. Solving a Markov Decision Process: ◦ Value Iteration Demo.
13
How much does the action change the expected reward at the current state? Example: how much does the action change the chance of winning at the current state?
Impact of an action = (expected reward after the action) - (expected reward before the action).
15
A Markov game is a transition graph with 5 parts:
◦ Players/Agents P
◦ States S
◦ Actions A
◦ Transition Probabilities T
◦ Rewards R
Transitions and rewards depend on the state and a tuple of actions, one for each agent.
Littman, M. L. (1994), 'Markov games as a framework for multi-agent reinforcement learning', in ICML, pp. 157-163.
16
Legend: GD = Goal Differential, MP = ManPower, PR = Period, CV = chance that the home team scores the next goal.
29
Players in our Markov game = {Home, Away}. This models the average or a random player.
30
Context Features:
◦ Goal Differential GD
◦ Manpower Differential MD
◦ Period PR
31
13 action types. Action parameters: team, location.
◦ faceoff(Home,Neutral)
◦ shot(Home,Offensive)
◦ hit(Away,Defensive)
Action types: Blocked Shot, Faceoff, Giveaway, Goal, Hit, Missed Shot, Shot, Takeaway, ...
32
Use action description notation (Levesque et al., 1998).
◦ Actions are written in the form a(T,L): action a, team T, location/zone L.
◦ faceoff(Home,Neutral)
◦ shot(Home,Offensive)
◦ hit(Away,Defensive)
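To make the notation concrete, here is a minimal Python sketch of parameterized actions a(T,L) and of states that pair the context features above with a play's action sequence; the type names and fields are illustrative assumptions, not the authors' implementation.

```python
from collections import namedtuple

# Illustrative sketch only: these names are our assumptions, not the authors' code.
Action = namedtuple("Action", ["name", "team", "zone"])   # the a(T, L) notation
Context = namedtuple("Context", ["gd", "md", "pr"])       # goal diff., manpower diff., period

faceoff = Action("faceoff", "Home", "Neutral")
shot = Action("shot", "Home", "Offensive")
hit = Action("hit", "Away", "Defensive")

# A state couples the context features with the action sequence of the current play.
state = (Context(gd=0, md=1, pr=2), (faceoff, shot))
print(state)
```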
33
Transition probabilities are estimated from observed occurrences in play-by-play data.
◦ Record occurrences of state s as Occ(s).
◦ Record occurrences of the transition s → s' as Occ(s,s').
◦ Parameter learning: transition probabilities T are estimated as Occ(s,s') / Occ(s).
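A minimal sketch of this counting estimator, assuming the play-by-play data has already been converted into ordered sequences of hashable states; the function name and the toy sequences are ours.

```python
from collections import Counter, defaultdict

def estimate_transition_probs(state_sequences):
    """Maximum-likelihood estimate T(s' | s) = Occ(s, s') / Occ(s)."""
    occ = Counter()        # Occ(s): how often state s is left
    occ_pair = Counter()   # Occ(s, s'): how often s is followed by s'
    for seq in state_sequences:
        for s, s_next in zip(seq, seq[1:]):
            occ[s] += 1
            occ_pair[(s, s_next)] += 1
    T = defaultdict(dict)
    for (s, s_next), n in occ_pair.items():
        T[s][s_next] = n / occ[s]
    return T

# Toy example with string-valued states:
T = estimate_transition_probs([["faceoff", "shot", "goal"],
                               ["faceoff", "hit", "shot"]])
print(T["faceoff"])   # {'shot': 0.5, 'hit': 0.5}
```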
34
Goals
◦ R(s) = 1 if s corresponds to goal(Home,*)
◦ R(s) = -1 if s corresponds to goal(Away,*)
◦ R(s) = 0 otherwise
Penalties
◦ R(s) = 1 if s corresponds to penalty(Home,*)
◦ R(s) = -1 if s corresponds to penalty(Away,*)
◦ R(s) = 0 otherwise
Wins
◦ R(s) = 1 if s corresponds to Win(Home)
◦ R(s) = -1 if s corresponds to Win(Away)
◦ R(s) = 0 otherwise
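The reward functions above can be written directly as code. This sketch assumes each state exposes the event that produced it as a (type, team) pair, which is an interface we made up for illustration.

```python
def reward(state, objective="goals"):
    """Reward of a state under the three objectives listed on the slide."""
    event_type, team = state            # e.g. ("goal", "Home"); assumed format
    sign = {"Home": 1, "Away": -1}
    if objective == "goals" and event_type == "goal":
        return sign[team]
    if objective == "penalties" and event_type == "penalty":
        return sign[team]
    if objective == "wins" and event_type == "win":
        return sign[team]
    return 0

print(reward(("goal", "Home"), "goals"))         # 1
print(reward(("penalty", "Away"), "penalties"))  # -1
print(reward(("shot", "Home"), "goals"))         # 0
```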
35
Basketball Demo (open in Chrome).
36
The Data
37
Complete Tracking: which player is where, when, plus the ball/puck. ★ Box Score: action counts. Play-By-Play: action/event sequence.
38
Basketball example from SportVU. Coming to the NHL?
39
Oilers vs. Canucks
40
Successive Play Sequences
41
Source: NHL.com, 2007-2015 (no locations):
◦ Teams: 32
◦ Players: 1,951
◦ Games: 9,220
◦ Events: 2,827,467
Source: SportLogiq, 2015 (action locations):
◦ Teams: 32
◦ Players: 2,233
◦ Games: 446
◦ Events: 1,048,576
42
Basic question: what difference does an action make? Quantify the effect of an action on the outcome (goal) = action value. Player contribution = sum of the scores of the player's actions.
◦ Schuckers and Curro (2013); McHale and Scarf (2005; soccer).
Example: +/- score in ice hockey (nhl.com Advanced Stats).
Schuckers, M. & Curro, J. (2013), 'Total Hockey Rating (THoR): A comprehensive statistical rating of National Hockey League forwards and defensemen based upon all on-ice events', in 7th Annual MIT Sloan Sports Analytics Conference.
43
Computation
44
V(s) = expected reward starting in state s.

Reward    | Absorbing States      | Q(s) represents
Win       | Game End              | Win Probability Differential
Goals     | Game End              | Expected Goal Differential
Goals     | Game End + Goals      | Next Goal Probability Differential
Penalties | Game End              | Expected Penalty Differential
Penalties | Game End + Penalties  | Next Penalty Probability Differential
45
Iterative value function computation (on policy) for i = 1, ..., h steps, where h is the lookahead horizon:
V_i(s) = R(s) + Σ_a P(a|s) Σ_{s'} T(s'|s,a) V_{i-1}(s')
i.e. immediate reward + (prob. of action) × (expected future reward given action and state).
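A sketch of this on-policy update in Python, using dictionary-of-dictionaries containers for the learned action probabilities P(a|s) and transition probabilities T(s'|s,a); this reflects our reading of the slide's equation, not the authors' implementation.

```python
def value_iteration(states, actions, P, T, R, h):
    """On-policy value iteration up to lookahead horizon h.

    P[s][a]     : probability of action a in state s (learned from data)
    T[s][a][s2] : probability of reaching s2 after action a in state s
    R[s]        : immediate reward of state s
    Returns V[s], the expected reward within h steps of state s."""
    V = {s: 0.0 for s in states}
    for _ in range(h):
        V_new = {}
        for s in states:
            future = 0.0
            for a in actions:
                p_a = P.get(s, {}).get(a, 0.0)
                successors = T.get(s, {}).get(a, {})
                future += p_a * sum(prob * V.get(s2, 0.0)
                                    for s2, prob in successors.items())
            V_new[s] = R[s] + future
        V = V_new
    return V
```

The horizon h plays the role described on the slide; absorbing states can be handled by giving them empty successor sets so their value stays at their immediate reward.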
46
Cervone, D.; D'Amour, A.; Bornn, L. & Goldsberry, K. (2014), 'POINTWISE: Predicting points and valuing decisions in real time with NBA optical tracking data', in MIT Sloan Sports Analytics Conference.
47
Examples
48
Action value Q(s,a) = immediate reward + expected future reward given the action and state.
49
We discretize locations by clustering the points at which a given action occurs.
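One simple way to implement this location discretization is k-means over the (x, y) coordinates of a given action type; the cluster count, rink coordinates, and random data below are illustrative assumptions, not the authors' setup.

```python
import numpy as np
from sklearn.cluster import KMeans

# Toy stand-in for the (x, y) locations of one action type, e.g. shots.
rng = np.random.default_rng(0)
shot_locations = rng.uniform(low=[0.0, -42.5], high=[100.0, 42.5], size=(500, 2))

# Cluster the points, then map any new event of that type to its nearest centre.
kmeans = KMeans(n_clusters=8, n_init=10, random_state=0).fit(shot_locations)
zone_of_new_shot = kmeans.predict(np.array([[85.0, 10.0]]))[0]
print(zone_of_new_shot)
```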
51
Average values of actions at each location, over all states and both teams. Figure: action = shot; chance of scoring the next goal; lookahead = 1.
52
Figures: chance of scoring the next goal (lookahead = 1) vs. chance of scoring the next goal after a shot (lookahead = 14).
53
Which is better? (Figure by Shaun Kreider, Kreider Designs.)
54
Figures: chance of scoring the next goal after a carry vs. after a dump-in.
56
Impact of an action = expected reward after the action - expected reward before the action.
57
Players:
1. Apply the impact of an action to the player performing the action.
2. Sum the impact of his actions over a game to get his net game impact.
3. Sum the net game impact of a player over a single season to get his net season impact.
Teams: sum the impact of all players.
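A sketch of this aggregation, assuming each action's impact is the change in the value function across the action (as defined earlier) and that play-by-play events carry the acting player; the event format is our assumption.

```python
from collections import defaultdict

def action_impact(V, s_before, s_after):
    """Impact = expected reward after the action minus expected reward before it."""
    return V[s_after] - V[s_before]

def total_impacts(events, V):
    """events: iterable of (player, s_before, s_after) tuples (assumed format).
    Summing over a game's events gives net game impact; summing over a season's
    events gives net season impact."""
    totals = defaultdict(float)
    for player, s_before, s_after in events:
        totals[player] += action_impact(V, s_before, s_after)
    return totals

def team_impact(player_totals, roster):
    """Team impact = sum of the impacts of the team's players."""
    return sum(player_totals.get(p, 0.0) for p in roster)
```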
58
Compare the average impact of a team in a game (our model) with the average goal ratio of the team in that game (an independent metric; 2-1 = 4-2 = 6-3). Correlation = 0.7!
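The check itself is a one-line correlation once both per-game quantities are in hand; the arrays below are placeholders for illustration, not the study's data.

```python
import numpy as np

avg_impact_per_game = np.array([0.12, -0.05, 0.30, 0.08, -0.20])  # model output (placeholder)
goal_ratio_per_game = np.array([0.66, 0.40, 0.75, 0.55, 0.25])    # e.g. a 2-1 game -> 2/3

r = np.corrcoef(avg_impact_per_game, goal_ratio_per_game)[0, 1]
print(f"Pearson correlation: {r:.2f}")
```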
59
Commonly used (e.g. by the Financial Times). Correlation only 0.09.
60
2014-15, no location data.
61
Jason Spezza: high goal impact, low +/-. Plays very well on a poor team (Ottawa Senators). Requested a trade for the 2014-2015 season.
62
Correlation coefficient = 0.703. Follows Pettigrew (2015).
Pettigrew, S. (2015), 'Assessing the offensive productivity of NHL players using in-game win probabilities', in 9th Annual MIT Sloan Sports Analytics Conference.
63
2014-15, no location data.
64
We built a state-space model of NHL dynamics. The action-value function from reinforcement learning is just what we need. It incorporates:
◦ context
◦ lookahead
Familiar in AI, revolutionary in sports analytics!
65
State-space Markov game model for ice hockey dynamics in the NHL. A new context-aware method for evaluating locations, all actions, and players.
"We assert that most questions that coaches, players, and fans have about basketball, particularly those that involve the offense, can be phrased and answered in terms of EPV [i.e. the value function]." (Cervone, Bornn et al. 2014)
66
Thank you – any questions? 66/68
67
Cervone, D.; D'Amour, A.; Bornn, L. & Goldsberry, K. (2014), 'POINTWISE: Predicting points and valuing decisions in real time with NBA optical tracking data', in MIT Sloan Sports Analytics Conference.
Routley, K. & Schulte, O. (2015), 'A Markov Game Model for Valuing Player Actions in Ice Hockey', in Uncertainty in Artificial Intelligence (UAI), pp. 782-791.
70
No ground truth.
◦ Relate to predicting something (?)
◦ Break down into strong and weak contexts?
Compare apples to apples.
◦ Cluster players by position.
◦ Learn player clusters.
◦ Interesting ideas in Cervone et al. (2014).
Cervone, D.; D'Amour, A.; Bornn, L. & Goldsberry, K. (2014), 'A Multiresolution Stochastic Process Model for Predicting Basketball Possession Outcomes', ArXiv e-prints.
71
Extract patterns about which actions have the most impact, and when.
72
Fit parameters for each player (cricket, baseball, basketball). Smooth towards similar players when a player visits a state rarely. Combine reinforcement learning with clustering agents?
73
Game clock. Penalty clock. Player and puck location (eventually). Can we take existing RL off the shelf?
◦ E.g. continuous finite-time horizon?
◦ Spatial planning?
◦ RL with both continuous time and space?