Audit Thoughts Ronald L. Rivest MIT CSAIL Audit Working Meeting

Slides:



Advertisements
Similar presentations
Chapter 6 Sampling and Sampling Distributions
Advertisements

ThreeBallot, VAV, and Twin Ronald L. Rivest – MIT CSAIL Warren D. Smith - CRV Talk at EVT’07 (Boston) August 6, 2007 Ballot Box Ballot Mixer Receipt G.
A technical analysis of the VVSG 2007 Stefan Popoveniuc George Washington University The PunchScan Project.
HYPOTHESIS TESTING Four Steps Statistical Significance Outcomes Sampling Distributions.
Chapter 7 Sampling and Sampling Distributions
This material in not in your text (except as exercises) Sequence Comparisons –Problems in molecular biology involve finding the minimum number of edit.
August 6, 2007Electronic Voting Technology 2007 On Estimating the Size and Confidence of a Statistical Audit Javed A. Aslam College of Computer and Information.
Part III: Inference Topic 6 Sampling and Sampling Distributions
Control Charts for Attributes
Introduction to Hypothesis Testing for μ Research Problem: Infant Touch Intervention Designed to increase child growth/weight Weight at age 2: Known population:
 The “4 Steps” of Hypothesis Testing: 1. State the hypothesis 2. Set decision criteria 3. Collect data and compute sample statistic 4. Make a decision.
Perspectives on “End-to-End” Voting Systems Ronald L. Rivest MIT CSAIL NIST E2E Workshop George Washington University October 13, 2009 Ballot Bob Ballot.
Audit Purpose of Audit Quality assurance procedure Check accuracy of machine tally of ballots Ballots for a contest are sampled, manually verified, and.
Sample Size Determination Donna McClish. Issues in sample size determination Sample size formulas depend on –Study design –Outcome measure Dichotomous.
ENSEMBLE LEARNING David Kauchak CS451 – Fall 2013.
Section 8.1 Estimating  When  is Known In this section, we develop techniques for estimating the population mean μ using sample data. We assume that.
Andreas Steffen, , LinuxTag2009.ppt 1 LinuxTag 2009 Berlin Verifiable E-Voting with Open Source Prof. Dr. Andreas Steffen Hochschule für Technik.
1 Chapter 10: Introduction to Inference. 2 Inference Inference is the statistical process by which we use information collected from a sample to infer.
1 Chapter 7 Sampling Distributions. 2 Chapter Outline  Selecting A Sample  Point Estimation  Introduction to Sampling Distributions  Sampling Distribution.
Section 10.1 Confidence Intervals
Excursions in Modern Mathematics, 7e: Copyright © 2010 Pearson Education, Inc. 15 Chances, Probabilities, and Odds 15.1Random Experiments and.
Small Vote Manipulations Can Swing Elections By Anthony Di Franco, Andrew Petro, Emmett Shear, and Vladimir Vladimirov October 2004.
Education 793 Class Notes Decisions, Error and Power Presentation 8.
Electronic Voting R. Newman. Topics Defining anonymity Need for anonymity Defining privacy Threats to anonymity and privacy Mechanisms to provide anonymity.
VVPAT Building Confidence in U.S. Elections. WHAT IS VVPAT ? Voter-verifiable paper audit trail Requires the voting system to print a paper ballot containing.
8.2 Estimating a Population Proportion Objectives SWBAT: STATE and CHECK the Random, 10%, and Large Counts conditions for constructing a confidence interval.
CSC 395 – Software Engineering Lecture 27: White-Box Testing.
The accuracy of averages We learned how to make inference from the sample to the population: Counting the percentages. Here we begin to learn how to make.
Auditability and Verifiability of Elections Ronald L. Rivest MIT ACM-IEEE talk March 16, 2016.
Chapter 6 Sampling and Sampling Distributions
Lecture Notes and Electronic Presentations, © 2013 Dr. Kelly Significance and Sample Size Refresher Harrison W. Kelly III, Ph.D. Lecture # 3.
Audit Sampling: An Overview and Application to Tests of Controls
Ronald L. Rivest MIT NASEM Future of Voting Meeting June 12, 2017
Predicting Elections adapted and updated from a feature on ABC News’ Nightline.
Confidence Intervals for Proportions
Chapter Eight Estimation.
One-Sample Tests of Hypothesis
Sampling.
CHAPTER 9 Testing a Claim
Perspectives on “End-to-End” Voting Systems
Understanding Sampling Distributions: Statistics as Random Variables
Significance Test for the Difference of Two Proportions
ThreeBallot, VAV, and Twin
Election Audit?? What in the world?.
The Diversity of Samples from the Same Population
Understanding Results
Hypothesis Testing and Confidence Intervals (Part 1): Using the Standard Normal Lecture 8 Justin Kern October 10 and 12, 2017.
Chapter 8: Inference for Proportions
Ronald L. Rivest MIT NASEM Future of Voting December 7, 2017
Chapter 8 Inference for Proportions
Texas Secretary of State Elections Division
The 2004 Ohio Presidential Election: Cuyahoga County Analysis
Chapter 9 Testing A Claim
Statistical Process Control
One-Sample Tests of Hypothesis
A 95% confidence interval for the mean, μ, of a population is (13, 20)
ISI Day – 20th Anniversary
Texas Secretary of State Elections Division
Multiple Choice Review 
Auditability and Verifiability of Elections
WARM – UP A local newspaper conducts a poll to predict the outcome of a hotly contested state ballot initiative that would legalize gambling. After conducting.
Econ 3790: Business and Economics Statistics
Four-Cut: An Approximate Sampling Procedure for Election Audits
Bayesian audits (by example)
Keller: Stats for Mgmt & Econ, 7th Ed Sampling Distributions
One-Sample Tests of Hypothesis
Chapter 8 Inference for Proportions
Risk Limiting Audits Nuts, Bolts, and Paperclips
Ronald L. Rivest MIT ShafiFest January 13, 2019
Presentation transcript:

Audit Thoughts Ronald L. Rivest MIT CSAIL Audit Working Meeting ASA, Alexandria, VA October 24, 2009

Viewpoint on high-tech We should think of a computer (or other forms of “high tech”) as a very fast and well-trained four-year old child. The child may be very helpful (she is fast, and well-trained!) but may not always do the right thing (she’s only four!). For something as important as an election, a ``grown-up’’ should always check her work.

Audit Method Types Post-Election Manual Tally (PEMT) Audit tallying by ``batches’’ Single-ballot methods (e.g. CHF’07) Convert all ballots to electronic form first Audit the conversion; then tally is easy (even IRV) End-to-End Voting Systems Scantegrity II, Pret A Voter, … Takoma Park election (Nov 2009)

Assume ``chain of custody’’ is OK ??

Voting Steps recorded as intended cast (and collected) as recorded counted as cast Ballot Bob Box Bob 42 Sue 31

End-to-End Voting Steps verifiably recorded as intended (by voter) verifiably cast (and collected) as recorded (‘’) verifiably counted as cast (by anyone) Verified! Ballot Bob Box Bob 42 Sue 31

PEMT considerations PEMT is also about tradeoffs between Cost Level of assurance provided Simplicity / Understandability If you’re already spending $6 / voter, spending another $0.10 on integrity/audit is ``low order’’ (e.g. auditing 20% at $0.50/ballot)

PEMT considerations Precincts have variable sizes! A small amount of ``interpretation error’’ is expected (e.g. people see voter intent differently than a scanner does) Late batches vs. fast start Staged audits vs. tight timescale Multiple, overlapping, contests

Detection vs. Correction Much initial work (e.g. APR) strove for high-probability detection of error sufficient to have changed outcome. Models tended to ignore interpretation error. APR and similar works also treated ``what to do’’ (correction) lightly. E.g. assuming that full recount would be done if error was detected (which would then make them two-stage risk-limiting audits). See Stark for more discussion of turning detection  correction.

Margin-based audits Let M = reported margin of victory Want smaller audit when M is large Assume n batches Let u_i be upper bound on error in i-th batch U = sum_i u_i is their total Let e_i be actual error in i-th batch towards changing outcome (determinable by audit) Want to know if sum_i e_i >= M Many approaches (Saltman; SAFE; …)

PPEBWR [APR’07] Probability proportional to error bound, with replacement Pick batch i with probability proportional to u_i / U; do this t times (with replacement). Chance that precincts with error of total magnitude M is never picked is <= ( 1 – M/U )t To get this chance < alpha (e.g. alpha = 0.05): t > ln( alpha ) / ln( 1 – M/U )

NEGEXP [APR’07] Batch i is picked independently with probability p_i = 1 - alpha ^ ( ui / M ) When total error is at least M , the chance of not detecting any errors in sampled batches is less than alpha .

PPEBWR and NEGEXP Require that you know margins to get started Both require more sophisticated sampling than simple random (uniform) sampling. Are not risk-limiting unless you do full recount when error detected (or embed them otherwise in an appropriate escalation procedure).

Escalation of Sample Size PPEBWR fairly straightforward: you are effectively just increasing t and continuing the drawing process. For NEGEXP: Easier to think of this as decreasing alpha ; so p_i’s are increasing. (Imagine having a random x_i for batch i ; where x_i is in [0,1]. Batch i is audited iff x_i <= p_i Increasing p_i’s will cause more to be audited, in a nice telescoping way. )

Combining multiple races Assume that there are ``economies’’ – hard part is fetching ballots, easy to audit multiple races once you have ballots… (Is this true??) With NEGEXP, each race gives probability of audit for a batch: p’_i , p’’_i , p’’’_i, ... We can then audit batch with probability p_i = max( p’_i, p’’_i, p’’’_i, …) and satisfy auditing conditions for all races simultaneously…

KYBS* * Keep Your Batches Small!

The End