Published by Robert St-Amour. Modified over 6 years ago.
1
Bayesian Ranking using Expectation Propagation and Factor Graphs
Dumitru Erhan Université de Montréal 19/11/2018 Dumitru Erhan - Bayesian Ranking
2
Preface
Not my work (at all)
EP: Tom Minka
TrueSkill™: Ralf Herbrich & Thore MSR Cambridge (UK)
“TrueChess”: Pierre INRIA Rhône-Alpes (France)
Slides, plots, and results taken with permission
3
Outline
Problem setting
Xbox Live
Factor Graphs
Exact inference in Factor Graphs
Approximate inference using EP
Loopy schedules and chess ratings
Results
4
The Ranking Problem
Vaguely speaking:
  Input: ordered subsets of data
  Output: a ranking function
For example:
  Chess
  Online games
  Movie ratings
  Internet search
5
Modelling Ranking
Ordinal regression: [Figure: the f(x) axis divided into intervals labelled Rank 1 to Rank 5, giving rank(x)]
Order learning: [Figure: the f(x) axis with the points f(c), f(b), f(a)]
6
Xbox Live
7
Modelling the Bayesian Way I
Track belief distributions: $P(s_i) = \mathcal{N}(s_i; \mu_i, \sigma_i^2)$, $P(s_j) = \mathcal{N}(s_j; \mu_j, \sigma_j^2)$
Allow performance variations: $P(x_i \mid s_i) = \mathcal{N}(x_i; s_i, \beta^2)$, $P(x_j \mid s_j) = \mathcal{N}(x_j; s_j, \beta^2)$
Model game outcome: $P(\text{player } i \text{ wins}) = \mathbb{I}(x_i > x_j)$
8
Modelling the Bayesian Way II
This leads to a probit-based likelihood: $P(\text{player } i \text{ wins} \mid s_i, s_j; \beta) = \Phi\left(\frac{s_i - s_j}{\sqrt{2}\,\beta}\right)$
Posterior is not Gaussian!
  Implications for inference, tracking, etc.
What if we could obtain a nice visualization of the model, stay in the Gaussian/exponential family, and perform the approximations efficiently?
Factor Graphs + Expectation Propagation!
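The probit likelihood can be checked against the generative story (Gaussian skills, Gaussian performance noise, higher performance wins). The sketch below is illustrative only: the function names, the skill values, and the value of beta are assumptions, not taken from the slides.

```python
import math
import random


def normal_cdf(z: float) -> float:
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def win_probability(s_i: float, s_j: float, beta: float) -> float:
    """Closed-form probit likelihood: Phi((s_i - s_j) / (sqrt(2) * beta))."""
    return normal_cdf((s_i - s_j) / (math.sqrt(2.0) * beta))


def simulated_win_rate(s_i, s_j, beta, games=20000, seed=0):
    """Sample performances x ~ N(s, beta^2) and count how often x_i > x_j."""
    rng = random.Random(seed)
    wins = sum(rng.gauss(s_i, beta) > rng.gauss(s_j, beta) for _ in range(games))
    return wins / games


if __name__ == "__main__":
    # Skill and beta values below are illustrative assumptions.
    print(win_probability(27.0, 25.0, 4.0))
    print(simulated_win_rate(27.0, 25.0, 4.0))
```

The factor sqrt(2) appears because the difference of two independent performances has variance 2 * beta^2.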
9
Factor Graphs mini intro
A bipartite graph that represents the factorization of a mathematical function
Nodes: factors and variables
Function = product of all factors
Edges: dependencies of factors on variables
[Figure: example factor graph over the variables x, y, z]
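As a rough illustration of this definition, a factor graph can be stored as a mapping from factor names to the variables each factor depends on. The factor names `f_a`, `f_b` and their tables are hypothetical, chosen only to make the example runnable.

```python
from math import prod

# Each factor lists the variables it touches and a function of those variables.
# The factor names and values are hypothetical.
FACTORS = {
    "f_a": (("x", "y"), lambda x, y: 1.0 if x == y else 0.5),
    "f_b": (("y", "z"), lambda y, z: 2.0 if y != z else 1.0),
}


def edges(factors):
    """Edges of the bipartite graph: one per (factor, variable) dependency."""
    return [(name, var) for name, (variables, _) in factors.items() for var in variables]


def evaluate(factors, assignment):
    """Value of the represented function: the product of all factors."""
    return prod(fn(*(assignment[v] for v in variables)) for variables, fn in factors.values())


if __name__ == "__main__":
    print(edges(FACTORS))
    print(evaluate(FACTORS, {"x": 0, "y": 0, "z": 1}))
```

Edges only ever connect a factor to a variable, which is what makes the graph bipartite.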
10
Factor Graphs continued
Used for modelling joint PDFs
Interested in marginals of the type P(hidden | observed)
Use the sum-product algorithm/belief propagation to compute them
11
Sum-Product Algorithm I
[Figure: factor graph with variables v, w, x, y, z and factors f1(v,w), f2(w,x), f3(x,y), f4(x,z)]
Observation: The sum of products becomes a product of sums of all messages from neighboring factors to a variable!
12
Sum-Product Algorithm II
[Figure: factor graph with variables w, x, y, z and factors f2(w,x), f3(x,y), f4(x,z)]
Observation: Factors only need to sum out all their local variables!
13
Sum-Product Algorithm III
[Figure: factor graph with variables x, y, z and factors f2(w,x), f3(x,y), f4(x,z)]
Observation: Variables pass on the product of all incoming messages!
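The three observations can be tried out on the graph f1(v,w) f2(w,x) f3(x,y) f4(x,z) from these slides. The sketch below assumes binary variables and made-up factor tables; it computes the marginal of x once by message passing and once by brute-force summation so the two can be compared.

```python
from itertools import product

STATES = (0, 1)  # every variable is binary in this toy example

# Hypothetical positive factor tables for the graph f1(v,w) f2(w,x) f3(x,y) f4(x,z).
f1 = {(v, w): 1.0 + v + 2 * w for v, w in product(STATES, repeat=2)}
f2 = {(w, x): 1.0 + 3 * (w == x) for w, x in product(STATES, repeat=2)}
f3 = {(x, y): 2.0 + x * y for x, y in product(STATES, repeat=2)}
f4 = {(x, z): 1.0 + (x != z) for x, z in product(STATES, repeat=2)}


def normalize(table):
    """Scale a table of non-negative values so that it sums to one."""
    total = sum(table.values())
    return {key: value / total for key, value in table.items()}


def marginal_x_by_messages():
    # Factor f1 sums out its local variable v and sends a message to w.
    m_f1_to_w = {w: sum(f1[v, w] for v in STATES) for w in STATES}
    # Variable w passes on the product of its other incoming messages (only one here).
    m_w_to_f2 = m_f1_to_w
    # Factor f2 multiplies in the message from w, then sums w out.
    m_f2_to_x = {x: sum(f2[w, x] * m_w_to_f2[w] for w in STATES) for x in STATES}
    # Leaves y and z send the constant message 1, so f3 and f4 simply sum them out.
    m_f3_to_x = {x: sum(f3[x, y] for y in STATES) for x in STATES}
    m_f4_to_x = {x: sum(f4[x, z] for z in STATES) for x in STATES}
    # The belief at x is the product of all incoming messages.
    return normalize({x: m_f2_to_x[x] * m_f3_to_x[x] * m_f4_to_x[x] for x in STATES})


def marginal_x_brute_force():
    # Sum the full product over every joint assignment (2^5 terms).
    table = {x: 0.0 for x in STATES}
    for v, w, x, y, z in product(STATES, repeat=5):
        table[x] += f1[v, w] * f2[w, x] * f3[x, y] * f4[x, z]
    return normalize(table)


if __name__ == "__main__":
    print(marginal_x_by_messages())
    print(marginal_x_brute_force())
```

Message passing touches each factor once, whereas the brute-force sum grows exponentially with the number of variables.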
14
Belief Propagation
Concept of a message from node X to node Y: X tells Y what state Y should be in
First “propagate” observed data
Then nodes exchange messages (start with leaves)
Messages + priors + conditional probabilities → updates of beliefs
Belief(x) = product of incoming messages
  Basically, unnormalized marginals
Pass messages until convergence
  If graph is tree – guaranteed
  If not…
15
Approximate message passing
Problem: The exact messages from factors to variables may not be closed under products
  TrueSkill™: Gaussian × step function ≠ Gaussian
Solution: Approximate the marginal as well as possible in the sense of minimal KL divergence
Expectation Propagation: Approximate the marginal by so-called “moment-matching”
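For the Gaussian × step-function case, moment matching has a closed form: the matched Gaussian takes the mean and variance of the truncated Gaussian. The sketch below is a minimal illustration under that assumption; the function name `match_moments_above` and the threshold name `eps` are hypothetical.

```python
import math


def normal_pdf(z: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z: float) -> float:
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def match_moments_above(mean: float, std: float, eps: float):
    """Mean and variance of N(mean, std^2) restricted to values greater than eps.

    These are the parameters of the Gaussian closest in KL divergence
    to the (non-Gaussian) product Gaussian x step function.
    """
    t = (mean - eps) / std
    v = normal_pdf(t) / normal_cdf(t)  # shifts the mean towards the allowed region
    w = v * (v + t)                    # fraction by which the variance shrinks
    return mean + std * v, std * std * (1.0 - w)


if __name__ == "__main__":
    print(match_moments_above(0.0, 1.0, 0.0))
```

For strongly negative `t` the ratio `normal_pdf(t) / normal_cdf(t)` becomes numerically unstable, so a production implementation would need a more careful evaluation.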
16
Expectation Propagation
[Figure: two rows, Exact and Approx, each showing message * old marginal = new marginal]
17
Tom Minka’s thesis in two lines
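Approximate $p(x) = \prod_{i=1}^{n} f_i(x)$
By $\hat{p}(x) = \prod_{i=1}^{n} \hat{f}_i(x)$
Iterate:
  Pick a factor $f_i$
  Remove its influence: divide the current approximation $\hat{p}^{k}(x)$ by $\hat{f}_i^{k}(x)$
  Project: $\hat{p}^{k+1}(x) = \arg\min_{q \in \text{family}} \mathrm{KL}\left( \hat{p}^{k}(x)\, f_i(x) / \hat{f}_i^{k}(x) \,\middle\|\, q(x) \right)$
  Refine: $\hat{f}_i^{k+1}(x) = \hat{f}_i^{k}(x)\, \hat{p}^{k+1}(x) / \hat{p}^{k}(x)$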
18
Formal Problem Setting
$k$ teams with $n_1, \dots, n_k$ players
The outcome is a ranking among the teams (including draws)
Questions:
  Skill $s_i$ of each player, such that the higher the skill, the more likely the win
  Global ranking among all players
  High quality of match among $k$ teams
19
TrueSkill™ Factor Graph
Player 1 wins over Player draws with Player 4
[Figure: factor graph with individual skills s1, s2, s3, s4; team performances t1, t2, t3; performance differences d1, d2]
20
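TrueSkill™ Model Details
Priors: $P(s_i) = \mathcal{N}(s_i; \mu_i, \sigma_i^2)$
Hidden variables
  Performance: $P(x_i \mid s_i) = \mathcal{N}(x_i; s_i, \beta^2)$
  Team performance: $t_j = \sum_{i \in \text{team } j} x_i$
Likelihood:
  Win: $P(\text{team 1 wins} \mid t_1, t_2; \varepsilon) = \mathbb{I}(t_1 - t_2 > \varepsilon)$
  Draw: $P(\text{team 1 and 2 draw} \mid t_1, t_2; \varepsilon) = \mathbb{I}(|t_1 - t_2| \le \varepsilon)$
Skill evolution: $P(s_i^{t} \mid s_i^{t-1}) = \mathcal{N}(s_i^{t}; s_i^{t-1}, \gamma^2)$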
21
More details and assumptions
Specifies an order on the real line
  OK if we agree that 1-d is good enough
  Draws + transitivity = not good
    Assume $|t_1 - t_2| \le \varepsilon$ and $|t_2 - t_3| \le \varepsilon$; then $|t_1 - t_3|$ can still exceed $\varepsilon$, so a draw is not transitive
A “mini-FG” is generated each time!
EP updates can be done efficiently
  Moments of a truncated Gaussian
Information flows forward only
  No updates in the light of future data
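For a single two-player game without a draw, the truncated-Gaussian moments give the skill update in closed form, because the performance difference is Gaussian before the outcome is observed. The sketch below assumes that setting; the function name, the `(mu, sigma)` tuple layout, and the numeric values in the usage lines are illustrative assumptions, not values from the slides.

```python
import math


def normal_pdf(z: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z: float) -> float:
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def two_player_win_update(winner, loser, beta, eps=0.0):
    """Moment-matched skill update after `winner` beats `loser` by more than eps.

    winner, loser: (mu, sigma) of each player's Gaussian skill belief.
    beta: standard deviation of the performance noise.
    """
    (mu_w, sigma_w), (mu_l, sigma_l) = winner, loser
    # Prior variance of the performance difference x_w - x_l.
    c2 = 2.0 * beta ** 2 + sigma_w ** 2 + sigma_l ** 2
    c = math.sqrt(c2)
    t = (mu_w - mu_l - eps) / c
    v = normal_pdf(t) / normal_cdf(t)  # mean correction of the truncated Gaussian
    w = v * (v + t)                    # variance correction of the truncated Gaussian
    new_winner = (mu_w + sigma_w ** 2 / c * v,
                  sigma_w * math.sqrt(1.0 - sigma_w ** 2 / c2 * w))
    new_loser = (mu_l - sigma_l ** 2 / c * v,
                 sigma_l * math.sqrt(1.0 - sigma_l ** 2 / c2 * w))
    return new_winner, new_loser


if __name__ == "__main__":
    # Prior mean 25, sigma 25/3, and beta 25/6 are assumed example values.
    prior = (25.0, 25.0 / 3.0)
    print(two_player_win_update(prior, prior, beta=25.0 / 6.0))
```

Each update depends on the player's own uncertainty: a player with a large sigma moves further than a player whose skill is already well known. Games with teams, several teams, or draws need the full factor graph rather than this shortcut.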
22
The Alternative – ELO
Quite similar:
  Performances distributed around fixed skills
  Win probability: $P(\text{player 1 wins}) = \Phi\left(\frac{s_1 - s_2}{\sqrt{2}\,\beta}\right)$
  Skill updates: $s_1 \leftarrow s_1 + y\Delta$, $s_2 \leftarrow s_2 - y\Delta$, with $y = +1$ if player 1 wins and $y = -1$ if player 2 wins
  Linear update: $\Delta \propto 1 - P(\text{observed outcome})$
Differences:
  No uncertainty tracking
  Linearized updates
  No notion of teams, multiple players/teams, etc.
  Not a generative model
TrueSkill™ is a generalization of ELO
23
Data: Halo 2 Multiplayer Beta
Publicly available
  Real one is much larger
Number of Games: 60022
Number of Players: 5943
Parameters in all experiments:
  Performance variation factor: 60%
  Draw Probability: 5%
  Dynamics variation factor: 2%
24
Convergence properties
[Figure: Level (y-axis, ticks 5 to 40) against an x-axis with ticks 100 to 400, with curves for Player 1 and Player 2 under TrueSkill and under ELO]
25
Win probability
26
Other results
TrueSkill™ better at predicting tight matches
The “additive team performance” assumption does not hold in some cases (Capture-the-Flag)
There are some feedback loop issues
27
TrueSkill™ conclusions
Every Xbox 360 Live game uses TrueSkill™
  Service launched in November 2005
Distinguishing properties:
  is a generalization of ELO
  tracks a belief distribution
  can deal with multiple teams/players/draws
First “real-world” implementation of EP
However:
  Draws are handled somewhat strangely (hack)
  Information “flows” only forward in time
28
What if…
we created a schedule that passes messages “back in time”?
Effectively, this means that future information is used for updating the current beliefs!
However, the FG is not a tree now
  Loopy message passing schedule
Too much data in case of Xbox Live
Let’s do chess instead!
  Makes sense: the “game graph” is not very connected in time
  Hard to have a fair comparison between players
29
Chess Factor Graph
[Figure: factor graph for games in 1857, with skills S1, S2; performance noise; performances P1, P2; difference D = P1 - P2; outcome factor D > eps. Outcomes shown: Morphy > Paulsen, Morphy = Paulsen, Morphy > Paulsen]
30
Chess dataset
Characteristics:
  88 players
  games
  300 games per player
  hidden variables
  edges
Priors set to match ELO:
  Mean = 2704
  Stddev = 100
31
Results
32
Chess results
Inflation over time?
Who’s the best player of all time?
  Kasparov? Fischer? Morphy?
Data set limitations:
  No individual game results, only tournaments
  Runs up to 1991
  88 best players only
33
Final words
TrueSkill™ for Xbox Live: mature tech
“TrueChess”: quite experimental
  Inference in loopy graphs is hard
Other applications:
  Ranking Go moves (ICML ’06, Snowbird)
  Social matchmaking (Future Best NIPS paper)
Oral NIPS this year
34
That’s it
Thank you!