Published by Jerome Craig. Modified over 6 years ago.
1
Computational Campaign Coverage with PollyVote.com
Andreas Graefe, Tow Center, September 13, 2016
2
PollyVote History
The PollyVote started in 2004 as a forecasting research project and was applied
  Prospectively to the U.S. presidential elections in 2004, 2008, and 2012, and to the 2013 German federal election
  Retrospectively to the U.S. presidential elections in 1992, 1996, and 2000
3
PollyVote Goals
  Demonstrate the power of combining forecasts for improving accuracy
  Assess the relative accuracy of different methods over time
  Develop models that aid decision-making
  Validate (new) methods for forecasting elections
  Transparency and accessibility
  Distribute PollyVote to other countries
  Provide automated campaign coverage (texts and visualizations)
  Make the project sustainable
4
How does the PollyVote work?
Mechanical combination of forecasts within different component methods, by calculating unweighted averages.
(Note: indicate that this is an example of 2 methods. Also show the averages for each.)
5
How does the PollyVote work?
Mechanical combination of forecasts within and across different component methods, by calculating unweighted averages.
Combined forecasts are accurate because
  They use more information
  Errors of individual forecasts cancel out
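The two-step combination can be sketched in a few lines of Python. This is a minimal illustration, not the PollyVote code: the function names are hypothetical and the forecast values are made up for the example.

```python
from statistics import mean

# Hypothetical two-party vote-share forecasts (percent), grouped by component method.
# The numbers are illustrative only and are not PollyVote data.
forecasts = {
    "polls": [52.0, 50.0, 51.0],
    "prediction_markets": [53.0, 55.0],
}

def combine_within(component_forecasts):
    """Unweighted average of the individual forecasts inside one component."""
    return mean(component_forecasts)

def combine_across(forecasts_by_component):
    """Unweighted average of the component averages."""
    return mean(combine_within(values) for values in forecasts_by_component.values())

# Step 1: one average per component. Step 2: average those averages.
component_averages = {name: combine_within(values) for name, values in forecasts.items()}
combined = combine_across(forecasts)
```

Averaging within components first means that a component with many individual forecasts (such as polls) does not outweigh a component with few.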
6
Relative accuracy of forecasting methods
On average across the last 100 days prior to each of the past 6 elections…
  Polls performed worst (MAE: 2.8 percentage points)
  Expectations (citizen forecasts) performed best (MAE: 1.2 percentage points)
  Performance of methods varies (within and) across elections

Mean absolute error (percentage points)
Method              1992  1996  2000  2004  2008  2012  Average
Polls                4.8   4.7   2.7   1.5   1.7   1.3      2.8
Values surviving for the other rows (original column positions not recoverable):
Prediction markets   0.8, 1.2, 0.6
Econometric models   3.6, 3.9, 3.2, 1.1, 2.4
Expert judgment      1.4
Index models         0.4, 2.3
Citizen forecasts    0.7, 1.8
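As a sketch of how such figures are computed, the snippet below defines the mean absolute error (MAE) of a series of daily forecasts and reproduces the "Average" value for polls from the per-election MAEs in the Polls row. The function name and the sample daily forecasts are assumptions for illustration.

```python
from statistics import mean

def mean_absolute_error(forecasts, actual):
    """MAE of a series of daily forecasts against the final outcome."""
    return mean(abs(f - actual) for f in forecasts)

# Hypothetical daily forecasts (percent) against a hypothetical outcome of 51.0.
example_mae = mean_absolute_error([50.0, 52.0, 53.0], 51.0)

# Per-election MAEs for polls, as listed in the Polls row of the table.
polls_mae_by_election = {1992: 4.8, 1996: 4.7, 2000: 2.7, 2004: 1.5, 2008: 1.7, 2012: 1.3}

# The "Average" column is the unweighted mean across the six elections.
average_mae = round(mean(polls_mae_by_election.values()), 1)
```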
7
Power of combining forecasts
On average across the last 100 days prior to each of the past 6 elections, the PollyVote
  missed the final election outcome by 1 percentage point.
  was more accurate than each of the (combined) component forecasts.
Also, since 2004, the PollyVote has always predicted the correct winner, on any given day.

Mean absolute error (percentage points)
Method              1992  1996  2000  2004  2008  2012  Average
Polls                4.8   4.7   2.7   1.5   1.7   1.3      2.8
Values surviving for the other rows (original column positions not recoverable):
Prediction markets   0.8, 1.2, 0.6
Econometric models   3.6, 3.9, 3.2, 1.1, 2.4
Expert judgment      1.4
Index models         0.4, 2.3
Citizen forecasts    0.7, 1.8
PollyVote            0.9, 1.0
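One reason combining helps can be shown with a small sketch: when component forecasts fall on opposite sides of the outcome, their errors partly cancel in the average. The numbers and function names below are hypothetical. The sketch only shows that the combined error never exceeds the typical component error; it does not show that the combination beats every single component.

```python
from statistics import mean

def combined_error(forecasts, actual):
    """Absolute error of the unweighted average of the forecasts."""
    return abs(mean(forecasts) - actual)

def typical_component_error(forecasts, actual):
    """Average absolute error of the individual forecasts."""
    return mean(abs(f - actual) for f in forecasts)

# Forecasts on opposite sides of the outcome: errors of 2 and 3 partly cancel.
bracketing = ([49.0, 54.0], 51.0)
# Forecasts on the same side of the outcome: nothing cancels, combining is no worse.
same_side = ([52.0, 54.0], 51.0)
```

With the bracketing example, the combined error is 0.5 against a typical component error of 2.5. With the same-side example both are 2.0.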
8
PollyVote is a resource for election observers
Provides accurate forecasts of the election outcome
9
PollyVote is a resource for election observers
Provides accurate forecasts of the election outcome
Allows for
  Comparing methods/models
  Discovering “interesting” stories
10
Our project
Goals:
Make our content more accessible by
  Visualizing data
  Providing first drafts of a story
Develop automated news
  Based on data from our component methods / models
  In English and German
11
Automated (“Robot”) Journalism
Process of using software or algorithms to automatically generate news stories without human intervention
…after the initial programming of the algorithm
12
Automated journalism (AJ) in newsrooms
Leading media outlets have started to experiment with automated news.
Early expansion phase
  Few news organizations
  Limited topics (sports, finance, crime, weather)
  No prior work for political news (but see WaPo's Heliograf)
15
Status quo
The good news
  Completely automated process, from collecting the forecasting data to publishing texts in both languages
  More than 13,000 texts published since April
But…
  Many texts contain errors
  Some texts are missing, others are published many times
  This is due to errors
    In the raw data
    With the data export / import
    In the algorithm
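Missing and repeated texts can be caught with a simple audit of what was published against what was expected. The sketch below is an assumption about how such a check might look, not the project's tooling: the key format (date, language) and the function name are hypothetical.

```python
from collections import Counter

def audit_publications(expected_keys, published_keys):
    """Compare expected texts with published ones.

    Returns (missing, duplicates): keys never published, and keys published
    more than once together with their counts.
    """
    counts = Counter(published_keys)
    missing = sorted(set(expected_keys) - set(counts))
    duplicates = {key: n for key, n in counts.items() if n > 1}
    return missing, duplicates

# Hypothetical example: each text is identified by (date, language).
expected = [("2016-09-01", "en"), ("2016-09-01", "de"), ("2016-09-02", "en")]
published = [("2016-09-01", "en"), ("2016-09-01", "en"), ("2016-09-02", "en")]
missing, duplicates = audit_publications(expected, published)
```

Here the German text for 2016-09-01 is reported as missing and the English one as published twice.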
16
Key learnings to date (I)
Data management
  Clean and accurate data are most crucial
  But difficult to maintain
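One way to keep data clean is to validate each record before any text is generated from it. The following is a minimal sketch under assumed field names (`pollster`, `date`, `share_a`, `share_b`); the actual PollyVote data schema is not described in these slides.

```python
def validate_poll_row(row):
    """Return a list of problems found in one poll record (empty list if none)."""
    problems = []
    # Required fields must be present and non-empty.
    for field in ("pollster", "date", "share_a", "share_b"):
        if row.get(field) in (None, ""):
            problems.append(f"missing {field}")
    # Vote shares must parse as numbers.
    shares = []
    for field in ("share_a", "share_b"):
        value = row.get(field)
        if value in (None, ""):
            continue
        try:
            shares.append(float(value))
        except (TypeError, ValueError):
            problems.append(f"{field} is not a number")
    # Shares are percentages, so each must lie in 0-100 and both together at most 100.
    if any(s < 0 or s > 100 for s in shares):
        problems.append("share outside 0-100")
    if len(shares) == 2 and sum(shares) > 100:
        problems.append("shares sum to more than 100")
    return problems
```

Records with a non-empty problem list could be held back from publication instead of producing a faulty text.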
17
Key learnings to date (II)
Article quality
Simple texts are easy to automate
  That is, texts based on a single row in the data
  Example: Description of poll results
Tradeoff: adding complexity (i.e., insights) vs. risk of introducing new errors
  Additions to texts had to be delayed due to constant error fixing
  Source of error can be difficult to detect
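A text based on a single row can be as simple as a filled-in template. The sketch below is illustrative only: the field names, the wording, and the sample record are hypothetical and do not reflect the NLG software used in the project.

```python
def describe_poll(row):
    """Render one sentence from a single poll record (hypothetical field names)."""
    a, b = row["candidate_a"], row["candidate_b"]
    share_a, share_b = row["share_a"], row["share_b"]
    if share_a == share_b:
        return f"A new {row['pollster']} poll has {a} and {b} tied at {share_a:.0f}%."
    # Put the leading candidate first.
    if share_b > share_a:
        a, b, share_a, share_b = b, a, share_b, share_a
    margin = share_a - share_b
    unit = "point" if margin == 1 else "points"
    return (f"A new {row['pollster']} poll has {a} ahead of {b}, "
            f"{share_a:.0f}% to {share_b:.0f}% (a lead of {margin:.0f} {unit}).")

# Made-up record for illustration.
sample = {"pollster": "Example Research", "candidate_a": "Candidate A",
          "candidate_b": "Candidate B", "share_a": 41, "share_b": 45}
sentence = describe_poll(sample)
```

Anything beyond this, such as comparing the poll with earlier ones, needs more than one row and is where the tradeoff between insight and new errors begins.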
18
Key learnings to date (III)
Algorithm development
Large onboarding effort to educate the NLG provider about our project
NLG software difficult to use (but constantly improving)
Very challenging to set the rules for the algorithm, for example:
  How to refer to the margin in polls?
  When does a candidate have momentum?
  When is there a trend in the data?
Boundaries of automation?
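To make the difficulty concrete, here is a sketch of what such rules might look like in code. Every threshold and label below is an arbitrary assumption chosen for illustration; deciding on these values is exactly the editorial question the slide raises.

```python
def describe_margin(margin, narrow=2.0, wide=7.0):
    """Map a poll margin (percentage points) to a phrase. Thresholds are hypothetical."""
    size = abs(margin)
    if size == 0:
        return "tied"
    if size < narrow:
        return "narrow lead"
    if size < wide:
        return "clear lead"
    return "wide lead"

def has_trend(values, min_steps=3):
    """True if the last `min_steps` changes all point in the same direction."""
    if len(values) < min_steps + 1:
        return False
    recent = values[-(min_steps + 1):]
    changes = [later - earlier for earlier, later in zip(recent, recent[1:])]
    return all(c > 0 for c in changes) or all(c < 0 for c in changes)
```

Under these assumed rules, a margin of 1.5 points is a "narrow lead", and the series 50, 50.5, 51, 51.4 counts as a trend, while 50, 51, 50.5, 51 does not. Small changes to `narrow`, `wide`, or `min_steps` change the published wording.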