February 19, 2010 How to Catch a Tiger: Understanding Putting Performance on the PGA TOUR Jason Acimovic MIT Operations Research Center,

Presentation transcript:

How to Catch a Tiger: Understanding Putting Performance on the PGA TOUR
February 19, 2010
Jason Acimovic, MIT Operations Research Center
Douglas Fearing, MIT Operations Research Center
Professor Stephen Graves, MIT Sloan School of Management

Agenda
Introduction
– Project Question
– Applications
– Approach and contribution
Golf and data overview
Putting model
Off-green model
Situational analysis

Project Question
How well do people perform on tasks?
– Tasks differ from each other
– Not everyone performs every task
– Even the same task can be different from person to person

Applications
Evaluating employees in a distribution center
– Pickers in a warehouse vary in skill (picks per hour)
– Pick zones vary in difficulty (books vs. electronics)
– Difficulty also varies by hour of day and day of week
– Pickers shift around, but not enough to ensure perfect mixing
– How do you compensate the best employees and identify underperformers?
Golf putting
– Different golfers play different tournaments
– Greens vary in their difficulty
– Different golfers start on the green from different distances
– How do we identify the best putters?

Project approach and contribution
Develop statistical models to predict strokes-to-go
Correct for player skill and course difficulty
Evaluate the incremental value of each shot taken relative to the expectation for the field
– Compare predicted strokes-to-go before and after the shot
Aggregate shot value across players, shot types, etc. to better understand player performance
Compare our model to current metrics, namely Putting Average
Paper: (or us)

Agenda
Introduction
Golf and data overview
– Strokes-to-go example
– ShotLink data
Putting model
Off-green model
Situational analysis

Quick golf primer
The goal is to get from the tee to the pin in the fewest strokes
18 holes in a round of golf
Typically 4 rounds in a tournament
Lowest total score wins
[Figure: hole layout labeled Tee, Fairway, Green]

Strokes-to-go example
[Table: Shot Location, Strokes-To-Go]
Shot gain = strokes-to-go before the shot – strokes-to-go after the shot – 1
In the example: … – 3.0 – 1 = 0.4
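The shot-gain arithmetic on this slide can be sketched in a few lines of Python. This is only an illustration: the function name is made up here, and the starting strokes-to-go of 4.4 is a hypothetical value chosen so that the slide's visible numbers (3.0 after the shot, a gain of 0.4) are reproduced.

```python
def shot_gain(strokes_to_go_before: float, strokes_to_go_after: float) -> float:
    """Value of a single shot relative to the field's expectation."""
    # One stroke was spent, so it is subtracted from the improvement
    # in expected strokes-to-go.
    return strokes_to_go_before - strokes_to_go_after - 1.0


# 4.4 is a hypothetical "before" value; 3.0 is the "after" value on the slide.
gain = shot_gain(4.4, 3.0)
print(round(gain, 2))
```

A positive gain means the shot left the golfer better off than the field would expect after spending one stroke; a negative gain means the opposite.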

ShotLink Data
Every tournament, 250 volunteers gather data on every shot
– Lasers pinpoint the ball location to within an inch
– Field volunteers gather qualitative characteristics
Data is used for both real-time reporting and detailed analyses
5 million shot data points
2 million putt data points

Visual explanation of ShotLink™ dataset
Fields: Course, Year, Round Number, Hole Number, Tee Location, Ball Location, Pin Location, Player, Shot Number, Location Type, Ball Lie, Hole Par, Stimp Reading, Green Length
Locations are recorded as X Coordinate, Y Coordinate, Z Coordinate
[Figure: 16th Hole on Colonial]

Data for the 14th hole at Quail Hollow – 1 day

Agenda
Introduction
Golf and data overview
Putting model
– Empirical data
– Two-stage model: holing out submodel, distance-to-go submodel
– Markov chain
– Correct for hole difficulty and player skill
– Putts-gained per round and results
Off-green model
Situational analysis

Empirical mean and std. dev. of putts-to-go
[Figure panels: Mean; Std. Dev.]

Two-stage model to predict putts-to-go
First stage sub-model
– From anywhere on the green, the first model predicts the probability of sinking the putt
[Figure annotation: probability of 0.1 of making this putt]

Second stage sub-model
– If the golfer misses the putt, the second model calculates the distribution of the distance-to-go for the green
– If I miss, I have a probability of being in this blue area (calculate this for the entire green)
Second stage finds conditional distance-to-go

We can calculate the putts-to-go distribution from anywhere on the green
Combine and …
Consider only distance in our model

Empirical probabilities of holing out
[Figure: empirical probability of holing out vs. distance]

Normal regression is inappropriate
With Ordinary Least Squares regression, one might predict the probability of making a putt based on starting distance. But:
– We want to predict a probability, which must lie between 0 and 1
– Errors are not normal

One-putt logistic regression model
Y – putts-to-go
d – initial distance to the pin
Fitted model parameters:
Probability:
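As a rough sketch of what a one-putt logistic model computes, the Python below maps starting distance to a holing-out probability. The fitted parameters and the exact predictors are not preserved in this transcript, so the linear-in-distance predictor and the coefficients `B0` and `B1` are purely illustrative assumptions.

```python
import math

# Hypothetical coefficients, NOT the fitted values from the presentation.
B0 = 2.5     # intercept (assumed)
B1 = -0.25   # change in log-odds per foot of starting distance (assumed)


def prob_hole_out(distance_ft: float, b0: float = B0, b1: float = B1) -> float:
    """P(Y = 1 | d): probability of sinking the putt from distance d."""
    linear = b0 + b1 * distance_ft
    # The logistic function keeps the prediction strictly between 0 and 1,
    # which an ordinary least squares fit would not guarantee.
    return 1.0 / (1.0 + math.exp(-linear))


for d in (3, 10, 30):
    print(d, round(prob_hole_out(d), 3))
```

With a negative slope the predicted probability falls as distance grows, which is the qualitative shape the empirical holing-out curve suggests.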

Model holing out as a logistic regression
[Figure: model probability of holing out vs. distance]

2nd-stage problem: determining distance-to-go
What happens if we miss the first putt?
[Figure: remaining distance z after a miss]

Empirical mean and std. dev. of distance-to-go
[Figure panels: Mean; Std. Dev.]

Empirical distributions of distance-to-go
[Figure panels: From 10 ft.; From 30 ft.]

Distance-to-go gamma regression model
d – initial distance to the pin
z – distance-to-go (assuming a miss)
Fitted model parameters:
Mean:
Density:
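A gamma regression of this kind can be sketched as a gamma density whose mean depends on the starting distance. The slide's fitted mean and density formulas are missing from this transcript, so the linear mean function, the constant shape, and every number below are assumptions made only for illustration.

```python
import math

# Illustrative assumptions, not the presentation's fitted parameters.
SHAPE = 2.0          # gamma shape parameter (assumed constant)
M0, M1 = 1.0, 0.08   # assumed mean distance-to-go in feet: M0 + M1 * d


def mean_distance_to_go(d: float) -> float:
    """Assumed mean of z given a miss from starting distance d."""
    return M0 + M1 * d


def distance_to_go_density(z: float, d: float) -> float:
    """f(z | d): gamma density of distance-to-go z after missing from d."""
    if z <= 0.0:
        return 0.0
    scale = mean_distance_to_go(d) / SHAPE  # gamma mean = shape * scale
    log_pdf = ((SHAPE - 1.0) * math.log(z) - z / scale
               - math.lgamma(SHAPE) - SHAPE * math.log(scale))
    return math.exp(log_pdf)


print(round(distance_to_go_density(2.0, 10.0), 4))
print(round(distance_to_go_density(2.0, 30.0), 4))
```

A gamma density is supported only on positive distances and is right-skewed, which fits a quantity such as remaining distance better than a normal error model would.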

Distance-to-go model: mean and std. dev.
[Figure panels: Mean; Std. Dev.]

Distance-to-go model distributions
[Figure panels: From 10 ft.; From 30 ft.]

Putts-to-go as Markov chain
[Diagram: states are distances d and z on the green, plus the hole H as an absorbing state]
– H stays in H: $p = 1$
– d to H: $p = [1 + \exp(\ldots)]^{-1}$
– d to z: $g(z \mid d) = \left(1 - [1 + \exp(\ldots)]^{-1}\right) \times f(z \mid d)$
Where
– $g(z \mid d)$: probability density of ending up at z conditioned on starting at d
– $f(z \mid d)$: probability density of ending up at z conditioned on missing and starting at d (from the distance-to-go gamma regression model)
The probability of holing out in n putts is the probability of reaching the absorbing state in n transitions
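The chain above can be approximated numerically by discretizing distance into a grid. The sketch below does so under loudly stated assumptions: the logistic and gamma parameters are made-up illustrative values, distances are capped at 60 ft, and the miss density is renormalized over the grid, so none of the numbers it produces are the presentation's results.

```python
import math

# All constants are illustrative assumptions, not fitted values from the slides.
B0, B1 = 2.5, -0.25              # logistic holing-out coefficients (assumed)
SHAPE, M0, M1 = 2.0, 1.0, 0.08   # gamma shape; mean distance-to-go = M0 + M1 * d
STEP = 0.5                       # grid spacing in feet
GRID = [STEP * (i + 0.5) for i in range(120)]  # bin centers, 0.25 ft to 59.75 ft


def prob_hole_out(d):
    return 1.0 / (1.0 + math.exp(-(B0 + B1 * d)))


def miss_density(z, d):
    scale = (M0 + M1 * d) / SHAPE
    return math.exp((SHAPE - 1.0) * math.log(z) - z / scale
                    - math.lgamma(SHAPE) - SHAPE * math.log(scale))


def transition_row(d):
    """Return (P(d -> H), [P(d -> z) for each grid distance z])."""
    p = prob_hole_out(d)
    weights = [miss_density(z, d) for z in GRID]
    total = sum(weights)  # renormalize the density over the truncated grid
    return p, [(1.0 - p) * w / total for w in weights]


ROWS = [transition_row(d) for d in GRID]


def prob_within_n_putts(start_index, n):
    """Probability of having reached the absorbing state H after n transitions."""
    dist = [0.0] * len(GRID)
    dist[start_index] = 1.0
    holed = 0.0
    for _ in range(n):
        new_dist = [0.0] * len(GRID)
        for i, mass in enumerate(dist):
            if mass == 0.0:
                continue
            p, row = ROWS[i]
            holed += mass * p               # mass absorbed at the hole
            for j, q in enumerate(row):
                new_dist[j] += mass * q     # mass that missed and moved to z
        dist = new_dist
    return holed


start = 39  # grid index whose bin center is 19.75 ft
for n in (1, 2, 3):
    print(n, round(prob_within_n_putts(start, n), 3))
```

Tracking the absorbed mass separately avoids building the full transition matrix with an explicit H state, while still computing the same n-step absorption probability.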

Making it within n putts (model prediction)
Over 90% of golfers 2-putt or better within 35 ft.
Only a 1.6% chance of 4-putting or worse at 100 ft.
[Figure: Two-Stage Model, Within N Putts]

Two-stage model mean and std. dev.
[Figure panels: Mean; Std. Dev.]

Comparing putt quality
Greens vary in difficulty
– Fast vs. slow greens
– Type and length of grass
A good putt on a hard green should be valued more than the same putt on an easy green
Adjust the parameters of the logistic and gamma regression models for each hole

Revised logistic and gamma regressions
Every player p and hole h have their own dummy variables and specific holing-out probabilities *
– I_p is the indicator variable: it equals 1 if observation i contains player p and 0 otherwise
– Instead of a regression with 6 parameters, we now have thousands of parameters
– E.g., there is a β_0h parameter for every hole
* The actual analysis accounts for the number of observations per player and per hole, so that the model is more complex for players about whom we know more. The gamma regression is adjusted similarly.
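One way to picture the indicator-variable adjustment is as player-specific and hole-specific offsets added to the logistic model's linear predictor. The exact specification is not preserved in this transcript, so the additive form, the names, and all numbers in this Python sketch are hypothetical.

```python
import math

# Every number and label below is an illustrative assumption.
BASE_INTERCEPT = 2.5
DISTANCE_SLOPE = -0.25
HOLE_OFFSET = {"easy green": 0.3, "hard green": -0.3}        # hypothetical per-hole terms
PLAYER_OFFSET = {"strong putter": 0.2, "weak putter": -0.2}  # hypothetical per-player terms


def prob_hole_out(distance_ft, player=None, hole=None):
    """Holing-out probability with player and hole adjustments."""
    # An indicator equal to 1 for the observation's own player (or hole) and 0
    # for all others reduces to a lookup: only the matching offset is added.
    linear = (BASE_INTERCEPT + DISTANCE_SLOPE * distance_ft
              + HOLE_OFFSET.get(hole, 0.0)
              + PLAYER_OFFSET.get(player, 0.0))
    return 1.0 / (1.0 + math.exp(-linear))


print(round(prob_hole_out(10.0), 3))
print(round(prob_hole_out(10.0, "strong putter", "easy green"), 3))
print(round(prob_hole_out(10.0, "weak putter", "hard green"), 3))
```

Calling the function with no player or hole gives a baseline curve; in this toy setup, passing either label shifts the whole curve up or down.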

Visualizing player skill level differences
[Figure: comparison of an above-average putter (Brent Geiberger), a below-average putter (John Huston), and the field-average putter on an average green]

Visualizing green difficulty differences
[Figure: comparison of an easy green (Bay Hill #9), a difficult green (Sawgrass #1), and an average green for a field-average golfer]

Calculating putts gained per round
Calculate the gain associated with each putt
– Relative to the putts-to-go for each specific hole
– Example: golfer starts at 12 ft. and takes 2 putts to sink the ball
  Expected putts-to-go: 1.71
  Actual number of putts: 2
  Relative gain: 1.71 – 2 = –0.29
Sum the relative gains for each player
Divide by the number of rounds played
[Figure annotation: 12 feet, 1.71 putts to go]
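The three steps on this slide (gain per green, sum per player, divide by rounds) can be sketched as follows. The record layout, the function name, and the second record are assumptions for illustration; only the 12 ft. example (1.71 expected, 2 actual) comes from the slide.

```python
from collections import defaultdict


def putts_gained_per_round(records, rounds_played):
    """records: iterable of (player, expected_putts_to_go, actual_putts)."""
    totals = defaultdict(float)
    for player, expected, actual in records:
        # Positive gain: the golfer needed fewer putts than expected.
        totals[player] += expected - actual
    return {player: total / rounds_played[player]
            for player, total in totals.items()}


records = [
    ("Golfer A", 1.71, 2),  # the slide's 12 ft. example: gain of -0.29
    ("Golfer A", 1.05, 1),  # hypothetical short putt holed in one
]
result = putts_gained_per_round(records, {"Golfer A": 1})
print(round(result["Golfer A"], 2))
```

Because each gain is measured against the expectation from the actual starting distance on the actual hole, a golfer is not rewarded merely for starting close to the pin.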

Top 10 putts gained per round
Columns: Rank, Golfer, Putts Gained / Round, Number of Rounds, Putts Gained / Round Stdev (numeric values not preserved in this transcript)
1 Tiger Woods
2 David Frost
3 Fredrik Jacobson
4 Nathan Green
5 Aaron Baddeley
6 Jesper Parnevik
7 Stewart Cink
8 Darren Clarke
9 Ben Crane
10 Willie Wood

Putting average is the most popular metric today
Putting Average
– Average number of putts per green *
When a golfer reaches a green
– Count the putts it takes to get the ball in the hole
– Average this over all his green appearances
– Regardless of how close he starts on the green
* Actually, a green in regulation, which means the green was reached in no more than (par – 2) strokes

Comparing with putting average
Columns: Golfer, Putts Gained / Round, PG/R Rank, Putting Average, PA Rank (numeric values not preserved in this transcript)
Tiger Woods
David Frost
Fredrik Jacobson
Nathan Green
Aaron Baddeley
Jesper Parnevik
Stewart Cink
Darren Clarke
Ben Crane
Willie Wood

Understanding the discrepancies
Insert first-putt distance histograms for most severe outlier.
Columns: PG/R Percentile, Golfer, Putts Gained / Round, Putting Average, PA Percentile (remaining values not preserved in this transcript)
9th Stephen Leaney
88th Ernie Els
Percentage of 1st putts 20 ft. or closer
– 54% for all players
– 51% for Stephen Leaney
– 60% for Ernie Els
On average Els starts closer to the hole, so his putting average is inflated by his excellent approach shots

Agenda
Introduction
Golf and data overview
Putting model
Off-green model
Situational analysis

Evaluating off-green performance
For each hole, calculate “field par”
– Empirical average number of strokes corrected for player skill and hole difficulty
Calculate total strokes gained per round for each player
Calculate off-green strokes gained per round
(Off-green strokes gained = Total strokes gained – putts gained)

Top 10 golfers (on and off green performance)
Columns: Rank, Golfer, Putts Gained / Round, Off-Green Gain / Round, Total (numeric values not preserved in this transcript)
1 Tiger Woods
2 Vijay Singh
3 Jim Furyk
4 Phil Mickelson
5 Ernie Els
6 Adam Scott
7 Sergio Garcia
8 David Toms
9 Retief Goosen
10 Stewart Cink

Agenda
Introduction
Golf and data overview
Putting model
Off-green model
Situational analysis
– Player-specific putts
– Fourth round pressure
– Tiger Woods’ fourth round performance

Situational putting performance
Above, we used the general putting model to evaluate putting relative to the field of professionals
We can also evaluate a golfer’s putting relative to his own expected performance
For instance, even if Tiger Woods usually putts better than the field, we can also determine when he putts worse than himself
– Does he putt better or worse after the cut?
– Does he putt better or worse for birdie vs. for par?

Player-specific putts gained – example
On the 10th green at Quail Hollow, 9 feet from the pin:
– Tiger Woods’ personal expected putts-to-go is 1.54
– Vijay Singh’s personal expected putts-to-go is 1.59
– If they each sink it, Tiger gains only 0.54 strokes whereas Vijay gains 0.59 strokes
[Figure annotation: 9 ft; Tiger: E[putts] = 1.54, Vijay: E[putts] = 1.59]
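The example's arithmetic can be checked with a short Python sketch. The function name and dictionary layout are illustrative choices; the expected putts-to-go values are the ones quoted on the slide.

```python
def putts_gained(expected_putts_to_go: float, actual_putts: int) -> float:
    """Gain relative to a baseline expectation (here, the player's own)."""
    return expected_putts_to_go - actual_putts


# Personal expectations from 9 feet on the 10th green at Quail Hollow (from the slide).
personal_expectation = {"Tiger Woods": 1.54, "Vijay Singh": 1.59}

# Both golfers sink the putt in one stroke.
gains = {name: putts_gained(expected, 1)
         for name, expected in personal_expectation.items()}
for name, gain in gains.items():
    print(name, round(gain, 2))
```

Because the baseline is each golfer's own expectation, the better putter earns less credit for the same holed putt, which is what lets the metric isolate situational effects from overall skill.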

Advantages of player-specific putts gained
Easy to test various hypotheses
– After calculating the shot value for every putt, we need only filter and aggregate the results
Describes the magnitude in terms of score impact
Suggests areas for further investigation
– Standard deviation of putts gained provides the relative significance of the effect

Fourth round pressure
Putting does not seem to be affected by the pressures of being in the fourth round

Tiger Woods’ fourth round performance
A common perception is that Tiger has the ability to kick it up a notch during the final round
Looking at his putts gained suggests otherwise

Conclusion
Developed a model for putting
– Corrected for player skill and hole difficulty
– Intuitive model that describes how putts occur
Demonstrated the differences between our metric and current putting statistics
Developed a “field par” which corrects for hole difficulty and quality of field
Compared on- and off-green performance
Examined situational putting performance