Forecasting a Tennis Match at the Australian Open
Tristan Barnett, Stephen Clarke, Alan Brown
Introduction
Match Predictions
  Markov Chain Model
  Collecting Data
  Exponential Smoothing
  Combining Player Statistics
Real Time Predictions
  Combining Sheets from Markov Chain Model
  Bayesian Updating Rule
Excel Computer Demonstration
Markov Chain Model
Modelling a game of tennis

Recurrence formula:
$$P(a,b) = p\,P(a+1,b) + (1-p)\,P(a,b+1)$$

Boundary conditions:
$$P(a,b) = 1 \quad \text{if } a = 4,\ b \le 2$$
$$P(a,b) = 0 \quad \text{if } b = 4,\ a \le 2$$

where, for player A:
$p$ = probability of winning a point on serve
$P(a,b)$ = conditional probability of winning the game when the score is $(a,b)$
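The recurrence above can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' spreadsheet: the function name `game_win_probabilities` is hypothetical, and the value used at deuce, $P(3,3) = p^2 / (p^2 + (1-p)^2)$, is the standard closed form and is an addition, since the slide lists only the two boundary conditions.

```python
from functools import lru_cache


def game_win_probabilities(p):
    """Return {(a, b): P(a, b)} for a server who wins each point with probability p.

    Scores are counted in points (0, 1, 2, 3 = love, 15, 30, 40).
    """
    # Assumption (not on the slide): closed form for the deuce state (3, 3).
    deuce = p * p / (p * p + (1 - p) * (1 - p))

    @lru_cache(maxsize=None)
    def P(a, b):
        if a == 4 and b <= 2:      # boundary: server has won the game
            return 1.0
        if b == 4 and a <= 2:      # boundary: receiver has won the game
            return 0.0
        if a == 3 and b == 3:      # deuce
            return deuce
        # recurrence: win the next point with p, lose it with 1 - p
        return p * P(a + 1, b) + (1 - p) * P(a, b + 1)

    return {(a, b): P(a, b) for a in range(4) for b in range(4)}


table = game_win_probabilities(0.6)
print(round(table[(0, 0)], 4))  # probability of holding serve from 0-0
```

Treating (3,3) as a single closed-form state keeps the table finite; advantage scores map back onto (3,2), (2,3) and (3,3).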
Markov Chain Model
Table 1: The conditional probabilities of player A winning the game from various score lines for p = 0.6

Similarly:
  sheet for player B serving
  sheets for a set (from sheets of a game)
  sheet for a match (from sheets of a set)
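The same recurrence idea carries up to the match sheet. The sketch below makes a simplifying assumption that is not in the slides: player A wins every set with one fixed probability `s`, whereas the spreadsheet derives set probabilities from the game sheets. The function name and the `sets_to_win` parameter are hypothetical.

```python
from functools import lru_cache


def match_win_probabilities(s, sets_to_win=3):
    """Return {(e, f): probability A wins the match from e sets to f}.

    Assumes a constant set-win probability s for player A (illustrative only).
    sets_to_win=3 corresponds to a best-of-5 set match.
    """
    @lru_cache(maxsize=None)
    def M(e, f):
        if e == sets_to_win:       # A has won the match
            return 1.0
        if f == sets_to_win:       # B has won the match
            return 0.0
        return s * M(e + 1, f) + (1 - s) * M(e, f + 1)

    return {(e, f): M(e, f)
            for e in range(sets_to_win) for f in range(sets_to_win)}


sheet = match_win_probabilities(0.6)
print(round(sheet[(0, 0)], 5))
```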
Collecting Data
The ATP tour matchfacts: http://www.atptennis.com/en/media/rankings/matchfacts.pdf
Collecting Data

$$f_i = a_i b_i + (1 - a_i)\,c_i$$
$$g_i = a_{av} d_i + (1 - a_{av})\,e_i$$

where the percentage for player $i$:
$f_i$ = points won on serve
$g_i$ = points won on return
$a_i$ = 1st serves in play
$b_i$ = points won on 1st serve
$c_i$ = points won on 2nd serve
$d_i$ = points won on return of 1st serve
$e_i$ = points won on return of 2nd serve

where the percentage for the average player on the ATP tour:
$a_{av}$ = 1st serves in play = 58.7%
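As a small illustration of the two formulas, the sketch below computes $f_i$ and $g_i$ from match-facts percentages expressed as fractions. The function names and the input figures are hypothetical; only the 58.7% tour average comes from the slide.

```python
ATP_AVG_FIRST_SERVE_IN = 0.587  # a_av from the slide


def serve_points_won(a_i, b_i, c_i):
    """f_i: fraction of points won on serve.

    a_i = 1st serves in play, b_i = points won on 1st serve,
    c_i = points won on 2nd serve (all as fractions in [0, 1]).
    """
    return a_i * b_i + (1 - a_i) * c_i


def return_points_won(d_i, e_i, a_av=ATP_AVG_FIRST_SERVE_IN):
    """g_i: fraction of points won on return, weighted by the tour-average
    first-serve percentage a_av.

    d_i = points won on return of 1st serve,
    e_i = points won on return of 2nd serve.
    """
    return a_av * d_i + (1 - a_av) * e_i


# Hypothetical player: 60% first serves in, wins 75% / 50% behind them.
print(serve_points_won(0.60, 0.75, 0.50))
print(return_points_won(0.30, 0.50))
```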
Exponential Smoothing

$$F_i^t = F_i^{t-1} + \left[1 - (1-\alpha)^n\right]\left[f_i^t - F_i^{t-1}\right]$$
$$G_i^t = G_i^{t-1} + \left[1 - (1-\alpha)^n\right]\left[g_i^t - G_i^{t-1}\right]$$

where, for player $i$ at period $t$:
$F_i^t$ = smoothed average of the percentage of points won on serve after observing $f_i^t$
$G_i^t$ = smoothed average of the percentage of points won on return of serve after observing $g_i^t$

Initialised for the average ATP tour player:
$F_i^0$ = the ATP average of percentage of points won on serve
$G_i^0$ = the ATP average of percentage of points won on return of serve

$n$ = number of matches played since period $t-1$
$\alpha$ = smoothing constant

When $n = 1$, $[1-(1-\alpha)^n] = \alpha$, as expected
When $n$ becomes large, $[1-(1-\alpha)^n] \to 1$, as expected
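The update is a one-line function. In this sketch the function name `smooth` and the sample numbers are hypothetical; the weight $1-(1-\alpha)^n$ is taken directly from the slide.

```python
def smooth(previous, observed, alpha, n=1):
    """One exponential-smoothing step with an n-match adjustment.

    previous = smoothed value from period t-1 (F or G),
    observed = new observation at period t (f or g),
    alpha    = smoothing constant,
    n        = number of matches played since period t-1.
    """
    weight = 1 - (1 - alpha) ** n
    return previous + weight * (observed - previous)


# Hypothetical figures: prior 0.62, new observation 0.70, alpha = 0.2.
print(smooth(0.62, 0.70, alpha=0.2, n=1))  # weight equals alpha
print(smooth(0.62, 0.70, alpha=0.2, n=2))  # weight equals 1 - 0.8**2
```

The same function serves both the serve statistic $F$ and the return statistic $G$.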
Combining Player Statistics

$$f_{ij} = f_t + (f_i - f_{av}) - (g_j - g_{av})$$
$$g_{ji} = g_t + (g_j - g_{av}) - (f_i - f_{av})$$

where, for the combined player statistics:
$f_{ij}$ = percentage of points won on serve for player $i$ against player $j$
$g_{ji}$ = percentage of points won on return for player $j$ against player $i$

For the tournament averages:
$f_t$ = percentage of points won on serve
$g_t$ = percentage of points won on return of serve

For the ATP tour averages:
$f_{av}$ = percentage of points won on serve
$g_{av}$ = percentage of points won on return of serve

Since $f_t + g_t = 1$, $f_{ij} + g_{ji} = 1$ for all $i, j$, as required
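A short sketch of the combination step, which also checks that the two results sum to 1. The function name `combine` and all numeric inputs are hypothetical illustrations, not data from the presentation.

```python
def combine(f_i, g_j, f_t, g_t, f_av, g_av):
    """Return (f_ij, g_ji) for server i against receiver j.

    f_i, g_j   = player statistics (serve for i, return for j),
    f_t, g_t   = tournament averages,
    f_av, g_av = ATP tour averages.
    """
    f_ij = f_t + (f_i - f_av) - (g_j - g_av)
    g_ji = g_t + (g_j - g_av) - (f_i - f_av)
    return f_ij, g_ji


# Hypothetical inputs with f_t + g_t = 1.
f_ij, g_ji = combine(f_i=0.68, g_j=0.40, f_t=0.63, g_t=0.37,
                     f_av=0.62, g_av=0.38)
print(f_ij, g_ji, f_ij + g_ji)
```

The player-specific deviations cancel when the two results are added, so the sum always equals $f_t + g_t$.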
Combining Sheets
The equation for the probability of player A winning a best-of-5 set match from $(e,f)$ in sets, $(c,d)$ in games, $(a,b)$ in points, with player A serving:

$$
\begin{aligned}
P''(a,b:c,d:e,f) ={}& P(a,b)\,P'_B(c+1,d)\,P''(e+1,f) \\
&+ P(a,b)\,[1-P'_B(c+1,d)]\,P''(e,f+1) \\
&+ [1-P(a,b)]\,P'_B(c,d+1)\,P''(e+1,f) \\
&+ [1-P(a,b)]\,[1-P'_B(c,d+1)]\,P''(e,f+1)
\end{aligned}
$$

where, for player A:
$P''(a,b:c,d:e,f)$ = probability of winning the match from $(a,b:c,d:e,f)$
$P'_B(c,d)$ = probability of winning the set from $(c,d)$ when player B is serving
$P''(e,f)$ = probability of winning the match from $(e,f)$
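The four-term sum can be written out directly once the five sheet values are looked up. In this sketch the function and argument names are hypothetical, and the inputs are passed in as plain numbers rather than read from the game, set and match sheets.

```python
def match_prob_from_point(p_game, set_if_hold, set_if_broken,
                          match_if_set_won, match_if_set_lost):
    """Probability that A wins the match from the current point, A serving.

    p_game            = P(a, b): A wins the current game
    set_if_hold       = P'_B(c+1, d): A wins the set after holding serve
    set_if_broken     = P'_B(c, d+1): A wins the set after losing serve
    match_if_set_won  = P''(e+1, f): A wins the match after winning the set
    match_if_set_lost = P''(e, f+1): A wins the match after losing the set
    """
    return (p_game * set_if_hold * match_if_set_won
            + p_game * (1 - set_if_hold) * match_if_set_lost
            + (1 - p_game) * set_if_broken * match_if_set_won
            + (1 - p_game) * (1 - set_if_broken) * match_if_set_lost)


# Hypothetical sheet values.
print(match_prob_from_point(0.5, 0.8, 0.4, 0.9, 0.3))
```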
Bayesian Updating Rule

where:
$\theta_i^t$ = updated percentage of points won on serve at time $t$ for player $i$
$\mu_i$ = initial percentage of points won on serve for player $i$
$\varphi_i^t$ = actual percentage of points won on serve at time $t$ for player $i$
$n$ = number of points played
$M$ = expected points to be played

When $n = 0$, $\theta_i^0 = \mu_i$, as expected
When $M \to 0$, $\theta_i^t \to \varphi_i^t$
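The updating equation itself does not appear in the text of this slide. One form that satisfies both stated limits is a weighted average of the prior and the observed percentage, $\theta = (M\mu + n\varphi)/(M + n)$. The sketch below uses that form purely as an assumption; it may differ from the rule the authors presented, and the function name and sample numbers are hypothetical.

```python
def bayes_update(mu, phi, n, M):
    """Assumed weighted-average update: (M * mu + n * phi) / (M + n).

    mu  = initial percentage of points won on serve,
    phi = actual percentage won on serve so far in the match,
    n   = number of points played,
    M   = expected points to be played (acts as the weight on the prior).
    """
    if n == 0:
        # No points played yet: the estimate is the prior.
        return mu
    return (M * mu + n * phi) / (M + n)


# Hypothetical values: prior 0.60, 0.70 observed over 30 points, M = 90.
print(bayes_update(0.60, 0.70, n=30, M=90))
```

With this form, a large $M$ keeps the estimate close to the pre-match value, while a small $M$ lets the in-match percentage dominate quickly.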
Computer Demonstration
ISF3.XLS
2003 Australian Open Quarter Final: El Aynaoui versus Roddick
Computer Demonstration
ISF4.XLS
End of 1st set
Chart legend: game to El Aynaoui; game to Roddick; set to El Aynaoui
Computer Demonstration
End of match
Chart legend: game to El Aynaoui by breaking serve; game to Roddick by breaking serve; set to El Aynaoui; set to Roddick