ELECTION 2008: BOILING DOWN THE HUNDREDS OF POLLS INTO GRASPABLE ESTIMATES OF WHO'S LIKELY TO WIN THE U.S. PRESIDENCY

Alan Reifman, Ph.D., Professor, Human Development & Family Studies

For this talk, I put on my methodologist-statistician and political-observer hats. The Electoral College system of U.S. presidential elections means that, in reality, we have to watch 50 separate state elections (plus D.C.), rather than a single federal election. Further, there are approximately 20 polling outfits (often working on behalf of a newspaper or television station) that release pre-election surveys, with great frequency. In a close election year (as this November's contest appears to be), therefore, even highly motivated citizens may have a difficult time aggregating the large volume of polls into a graspable estimate of who is likely to win the election.

Fortunately, a number of polling-analysis websites have sprung up, each of which applies some type of statistical analysis to distill the collection of polls into probability estimates of each candidate winning. Other sites present graphical representations of trends, which also serve to simplify the information. Such approaches -- which I will discuss -- include converting poll results into win probabilities for each candidate; computer simulations; and local (loess or lowess) regression.

Department of Mathematics & Statistics, Texas Tech University, September 10, 2008
The Electoral College System

Each state's EVs (electoral votes) = number of U.S. House seats (based on population) + 2 U.S. Senate seats
All states (except NE & ME) are winner-take-all; even if a candidate narrowly wins a state's popular vote, he or she still gets 100% of the state's EVs
270 electoral votes are needed to win the presidency
Many states lean overwhelmingly D or R and thus are not contested (see dark blue and red below), but the remaining states are competitive to varying degrees
In Each of the “Swing” States, Many Polls Are Taken, Requiring Some Type of Within-State Aggregation (“Meta-Analysis”)

Simple Averaging (Arithmetic Mean):
Weighted Average:
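To make the two aggregation options concrete, here is a minimal sketch in Python. The poll numbers are made up for illustration, and weighting by sample size is only one assumed scheme; the sites discussed in the talk may weight by recency, pollster track record, or something else.

```python
def simple_average(polls):
    """Arithmetic mean of a candidate's share across polls."""
    return sum(p["share"] for p in polls) / len(polls)


def weighted_average(polls):
    """Mean of shares weighted by sample size (an assumed weighting scheme)."""
    total_n = sum(p["n"] for p in polls)
    return sum(p["share"] * p["n"] for p in polls) / total_n


# Hypothetical polls for one swing state: candidate share (%) and sample size.
polls = [
    {"share": 52.0, "n": 500},
    {"share": 48.0, "n": 1500},
    {"share": 50.0, "n": 1000},
]

print(simple_average(polls))    # every poll counts equally
print(weighted_average(polls))  # larger polls pull the estimate toward them
```

In this toy example the largest poll shows the lowest share, so the weighted average comes out below the simple average.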
Typical Poll Report Format

Smith…….53%
Jones……47%
Margin of Error +/- 3% (which includes possibility the race is really 50/50)

True value of parameter will be within point estimate +/- MoE, with 95% confidence

Extra 2.5% on this side would also indicate winning; hence Smith would have…
97.5% probability of winning, hardly a “statistical dead-heat”

Based on Ayres, Super Crunchers, pp
Normal curve from:
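The 97.5% figure can be reproduced with a short calculation. This is a sketch under stated assumptions: a two-candidate race, a normally distributed sampling error, and a margin of error that is the half-width of a 95% interval (1.96 standard errors) for the leading candidate's share. The function names are illustrative only.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def win_probability(share, moe, z95=1.96):
    """Probability that the true share exceeds 50%, given a poll share and its 95% MoE."""
    standard_error = moe / z95           # MoE is assumed to be 1.96 standard errors
    z = (share - 50.0) / standard_error  # how many standard errors the lead is above 50%
    return normal_cdf(z)


print(round(win_probability(53.0, 3.0), 3))  # 0.975
```

A lead equal to the margin of error sits exactly 1.96 standard errors above 50%, so only the lower 2.5% tail corresponds to Smith actually trailing.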
Which is consistent with…
A More Empirical Approach
State-by-State Candidate Win Percentages Can Then Be Used to Conduct Simulations of the Overall Election

As a simplified example, suppose our best estimate is that Obama has a 60% probability of winning a given state and McCain has a 40% probability…

OBAMA “WINS” McCAIN “WINS”

HAVE COMPUTER GENERATE A RANDOM NUMBER BETWEEN

ONE “ELECTION” CONSISTS OF A SIMULATION FOR EVERY STATE; THOUSANDS OF ELECTIONS CAN BE SIMULATED
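The simulation idea can be sketched in a few lines of Python. Everything specific here is an assumption for illustration: the three-state "country", its electoral votes and win probabilities, the uniform draw on [0, 1), and the treatment of states as independent and winner-take-all. Real sites may model correlation between states.

```python
import random


def simulate_election(states, rng):
    """Return the candidate's electoral-vote total for one simulated election."""
    total = 0
    for electoral_votes, p_win in states.values():
        # Draw a uniform number in [0, 1); the candidate "wins" the state
        # when the draw falls below his estimated win probability.
        if rng.random() < p_win:
            total += electoral_votes
    return total


def win_share(states, needed, n_sims=10_000, seed=0):
    """Fraction of simulated elections in which the candidate reaches `needed` votes."""
    rng = random.Random(seed)
    wins = sum(simulate_election(states, rng) >= needed for _ in range(n_sims))
    return wins / n_sims


# Hypothetical data: state -> (electoral votes, candidate's win probability).
toy_states = {"A": (10, 0.60), "B": (7, 0.50), "C": (4, 0.30)}

# 21 electoral votes in total, so 11 are needed for a majority.
print(win_share(toy_states, needed=11))
```

For this toy example the exact answer can be worked out by hand as 0.45, so the simulated share should land close to that value; with 51 contests the hand calculation is impractical, which is why simulation is used.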
10,000 simulations per day
100,000 simulations per day
Thanks to Peter Westfall for bringing this to my attention
LOcally Estimated Scatterplot Smoothing (LOESS)* Regression

*Also LOWESS, with W for Weighted
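For readers who want to see what a local regression does mechanically, here is a minimal pure-Python sketch: at each point of interest it fits a weighted straight line to the nearest fraction of the data, using tricube weights. The function name, the `frac` default, and the poll margins are assumptions for illustration, and the sketch omits the robustness iterations that fuller LOWESS implementations apply. It is not the procedure of any particular polling site.

```python
def loess_at(xs, ys, x0, frac=0.5):
    """Local linear fit at x0 using tricube weights on the nearest `frac` of points."""
    n = len(xs)
    k = max(2, int(round(frac * n)))
    # Bandwidth: distance to the k-th nearest point, widened slightly so that
    # the farthest neighbour keeps a small positive weight.
    dists = sorted(abs(x - x0) for x in xs)
    h = dists[k - 1] * 1.0001 + 1e-12
    weights = []
    for x in xs:
        u = abs(x - x0) / h
        weights.append((1 - u ** 3) ** 3 if u < 1 else 0.0)
    # Weighted least-squares line through the neighbourhood.
    sw = sum(weights)
    mx = sum(w * x for w, x in zip(weights, xs)) / sw
    my = sum(w * y for w, y in zip(weights, ys)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(weights, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(weights, xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return my + slope * (x0 - mx)


# Hypothetical daily poll margins (candidate lead in points).
days = list(range(10))
margin = [2.0, 3.5, 1.0, 4.0, 2.5, 5.0, 3.0, 6.0, 4.5, 7.0]
trend = [loess_at(days, margin, d) for d in days]
print([round(t, 2) for t in trend])
```

A smaller `frac` follows the individual polls more closely; a larger one gives a smoother trend line.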
Other Useful Websites

(not to be confused with
Scroll down to “Probability of Win by State,” then click on color bars
Also see: