Slide 1: The Ratings Game: Scoring Washington Reds
Christopher Bitter, University of Washington
Slide 2: Introduction

Motivation
- U.S. consumers are "buying based on points"; ratings have a huge impact on wine sales. Is this a viable strategy? How relevant are ratings?

Data
- 1,293 Washington State red wines rated by Wine Advocate, Wine Enthusiast, and Wine Spectator (3,879 total ratings)
- Vintages; 11 varietals; 8 AVAs; $11 to $150 (median $45); average score: 90.7 points

Questions
- Do the publications agree with one another?
- Are the differences in scoring systematic? In other words, can they be explained by subjective preferences?

Simplicity
- We all know that a single number can't capture the nuances in wine or the circumstances surrounding its enjoyment. But can it help us choose higher-quality wines that we will enjoy more?
Slide 3: Prior Work

U.S. Wine Competitions
- Hodgson (2008; 2009); Ashton (2012); Cao (2014); etc.
- Low correlations in scoring across judges: a lack of consensus
- Judges also lack reliability: they are unable to replicate their scores in subsequent tastings of the same wine

Bordeaux en Primeur Tastings
- Moderate degree of consensus (Ashton 2013, etc.)
- Differences are systematic, indicative of subjectivity (Masset et al. 2015; Cardebat & Vivat 2016)

Both are unique settings, not entirely relevant to the typical U.S. wine drinker, so the ability to generalize the results is uncertain. Stuen et al. (2015) study CA and WA wines.
Slide 4: Agreement? Scoring Distributions

- Wine Enthusiast gives the highest scores and Wine Spectator the lowest (bias)
- Wine Spectator uses a narrower scoring range: 98% of its scores fall within a 9-point span (it discriminates less)
- Do the publications use the 100-point scale in a consistent manner?
Slide 5: Agreement? Correlations

- A low-to-moderate degree of consensus regarding wine quality
- The correlations fall between those found in the wine-competition and Bordeaux en primeur settings (a sketch of the computation follows)
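A minimal sketch of how this kind of pairwise agreement can be computed, assuming a pandas DataFrame with one row per wine and one score column per publication; the column names and values here are hypothetical, not the study's data:

```python
import pandas as pd

# Hypothetical scores: one row per wine, one column per publication.
scores = pd.DataFrame({
    "advocate":   [92, 88, 90, 91, 87, 93],
    "enthusiast": [93, 90, 92, 91, 89, 95],
    "spectator":  [90, 88, 89, 90, 88, 91],
})

# Pairwise Pearson correlations between the publications' scores.
print(scores.corr())
```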
Slide 6: Agreement? Variation in Scores

- The mean standard deviation across the three ratings is 1.40 points for the 1,293 wines
- The range of scores is 4 points or more 40% of the time; one may focus on the range alone here (the sketch continues below)
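Continuing the hypothetical `scores` DataFrame above, per-wine dispersion statistics like those on this slide might be computed as:

```python
# Standard deviation and range of the three ratings for each wine.
sd = scores.std(axis=1)
score_range = scores.max(axis=1) - scores.min(axis=1)

print("mean SD across wines:", round(sd.mean(), 2))
print("share of wines with range >= 4:", (score_range >= 4).mean())
```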
Slide 7: Disagreement

Potential causes of disagreement in scoring:
- Lack of accuracy / reliability
- Subjective preferences

Testing for subjectivity:
- If preferences play a role, scoring differences should be systematically related to wine attributes
- The difference in score between two publications is modelled as a function of price, vintage, varietal, appellation, and winery
- Estimated by ordinary least squares (a sketch follows below)
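A minimal sketch of this subjectivity test, with synthetic data standing in for the actual 1,293-wine sample; the variable names are assumptions, and the winery fixed effects are omitted for brevity:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in data: wine attributes and the Advocate-minus-
# Enthusiast score gap per wine. None of this is the study's actual data.
rng = np.random.default_rng(0)
n = 200
wines = pd.DataFrame({
    "price":       rng.uniform(11, 150, n),
    "vintage":     rng.choice(["2010", "2011", "2012"], n),
    "varietal":    rng.choice(["Cabernet", "Merlot", "Syrah"], n),
    "appellation": rng.choice(["Columbia Valley", "Walla Walla"], n),
    "diff_ae":     rng.normal(0, 1.4, n),
})

# OLS: if preferences are systematic, attributes should predict the gap.
model = smf.ols(
    "diff_ae ~ price + C(vintage) + C(varietal) + C(appellation)",
    data=wines,
).fit()
print(model.rsquared)  # share of the scoring difference explained
```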
Slide 8: Regression Results

Price and label attributes explain:
- 33% of the difference between Advocate & Enthusiast
- 43% of the difference between Advocate & Spectator
- 21% of the difference between Enthusiast & Spectator
Slide 9: Implications

Consumers
- A single score is not always representative of consensus opinion, which limits its relevance; it is better to consider multiple scores
- 63% of all wines in the $15-$25 range achieved a maximum score of 90 or above, yet only 9% had a "consensus" score of 90
- Subjectivity is not necessarily negative, but it implies that some ratings may be more relevant than others
- Ratings are relevant, but a blunt instrument

Producers
- Good producers should be rewarded in the end, but variability in scoring favors those with better access to the review system
- The probability of getting a 90-point score in the $15-$25 category improves from 28% with one rating to 63% with three ("superscoring"; see the check below)
- An opportunity to exploit knowledge of scoring differences and preferences in order to improve ratings and sales?
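The jump from 28% to 63% is roughly what one would expect if each rating were an independent draw; a quick back-of-the-envelope check (independence is an assumption here, since the ratings are in fact positively correlated):

```python
# Chance that the best of n independent ratings reaches 90 points,
# when each single rating has probability p of doing so.
p = 0.28
for n in (1, 2, 3):
    print(n, "rating(s):", round(1 - (1 - p) ** n, 2))
# 1 rating(s): 0.28 / 2 rating(s): 0.48 / 3 rating(s): 0.63
```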
Slide 10: The End

For a copy of the paper or more information, contact the author.
Slide 11: Regression Coefficients: Raw Score Models