Using Multiple Methods to Reduce Errors in Survey Estimation: The Case of US Farm Numbers Jaki McCarthy, Denise Abreu, Mark Apodaca, and Leslee Lohrenz National Agricultural Statistics Service US Department of Agriculture Paper presented at the International Total Survey Error Workshop Stowe, VT June 2010
What is the goal of reducing TSE? Surveys used to estimate a construct Goal of TSE reduction is to more accurately estimate construct in that survey Construct: Number of Farms in the US Measured in multiple ways: – June Agricultural Survey – Census of Agriculture
The Council of Advisors Multiple sources provide advice Each is likely biased in some direction We assume that collective advice is better than any single source
Comparisons between “advisors” may uncover errors in both or either In most cases, there is no 100% accurate source Each estimate makes different assumptions, uses different procedures
Farm Number Estimates June Agricultural Survey (JAS) – Purpose: direct estimates of acreage and measures of sampling coverage – Area Frame Based – In Person Data Collection – Sample Survey – Voluntary – All non-response is manually estimated Census of Agriculture (COA) – Purpose: detailed county level agricultural data on all commodities produced and expenses, income and operator characteristics – List Based – Primarily self administered mail data collection – Census – Mandatory – Non-response weighting adjustment
Census Undercoverage and Misclassification Historically, JAS is a benchmark for COA – Area frame has theoretically complete coverage – Flagship survey for NASS with personal interviews Classification Error Survey uncovered errors in both JAS and COA identification of farms, but with most in JAS
What perspective? Advisor #1: JAS – Primary objective is to produce acreage estimates, farm numbers are secondary Advisor #2: COA – Primary objective is to collect information on ALL farms
US Farm Numbers
So we begin with 2 independent estimates of farm numbers….. To improve JAS estimate, additional follow up was conducted to estimate number of farms in subset that were originally estimated or classified as NOT farms Result: additional farms missed (misclassified) in the JAS This can be added to original JAS estimates
US Farm Numbers
ADD another advisor: another independent estimate of farms Assumption that operations reporting themselves as farms on the 2007 COA, but not JAS were misclassified in JAS Another regression estimate based on this assumption applied to 2009 JAS survey data
US Farm Numbers
The Council of Advisors is used to set the “official” farm number Each of the methods is measuring the same construct Each of the methods is independent, has different emphasis, and has its unique errors Objective is not to “fix” an individual survey estimate or measure its errors Objective is to combine all of these estimates to produce the “best” number: Reducing Total Construct Error
My questions to you: Do you use similar practices? How do you combine multiple sources of information? What is the best way to do this? How does this fit into the TSE context?