Some ACS Data Issues and Statistical Significance (MOEs) Table Release Rules Statistical Filtering & Collapsing Disclosure Review Board Statistical Significance.


1 Some ACS Data Issues and Statistical Significance (MOEs) Table Release Rules Statistical Filtering & Collapsing Disclosure Review Board Statistical Significance Testing & Margins of Error (MOEs)

2 Table Release Rules February 28, 2007

3 “B” and “C” Tables

4 Full Table – PASSED FILTERING Statistically too Small

5 Collapsed Table

6 The Census Bureau Story Why did we collect all this data if we were not going to release it?

7 ACS Data Release Rules Doug Hillmer Data Products Area American Community Survey Office U.S. Census Bureau October 11, 2006

8 The Census Bureau Will Not Release All Available Estimates to the Public. Limitation of Disclosure Risk: The Census Bureau’s Disclosure Review Board (DRB) must clear all data products prior to their release to the public. Assurance of Statistical Reliability: Data users need to be able to use ACS estimates as official Census Bureau data. Thus, some rules must be in place to ensure minimum reliability of estimates. Statistical reliability is assured by: (a) population size thresholds below which estimates are not released; (b) data release testing and collapsing of tables that fail.

9 The ACS “Identity Crisis” on Reliability. Ultimately, the 5-year estimates, with no “data release rules,” act as a long-form replacement. The single-year ACS sample is more like a current demographic survey, although much larger in size. Question to answer for single-year estimates: Do we accept less detail in our measures of characteristics, or do we allow more detail but with data release rules in place? Less detail punishes those areas with the diversity to support the detail.

10 Choices for displaying estimates in ACS data products. No suppression: 1. Publish full detail with no suppression but a higher pop threshold (e.g., 500,000). 2. Publish a limited set of estimates for all areas with 65,000+ pop. 3. Publish more detailed estimates for a higher pop threshold and a limited set for a lower threshold. With suppression or warnings: 4. Define a very detailed set of estimates for all geo areas with 65,000+ pop and suppress estimates that fail the reliability test. 5. Define a very detailed set of estimates for all geo areas with 65,000+ pop and flag estimates that fail the reliability test.

11 Filtering. Goal: to identify “weak” tables. Some tables have many zero or “near zero” cells and relatively large standard errors. Filtering rule used during 2000-2004 ACS: drop tables if (a) the universe is less than 500 (weighted) or (b) the average cell size is less than 2 cases (unweighted). Filtering rule used now: accept if the median coefficient of variation is less than or equal to 61%; otherwise, collapse and review again.
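The 2000-2004 filtering rule on this slide can be sketched in a few lines of Python. This is only an illustration: the function name and its parameters are hypothetical, and "average cell size" is assumed to mean unweighted cases divided by the number of cells in the table.

```python
def passes_2000_2004_filter(weighted_universe, unweighted_cases, n_cells):
    """Return True if a table survives the 2000-2004 style filter.

    Assumption: average cell size = unweighted cases / number of cells.
    """
    # Drop the table if the weighted universe is below 500.
    if weighted_universe < 500:
        return False
    # Drop the table if the average unweighted cell size is below 2 cases.
    if unweighted_cases / n_cells < 2:
        return False
    return True


small_universe = passes_2000_2004_filter(450, 100, 10)
sparse_cells = passes_2000_2004_filter(5000, 15, 10)
healthy_table = passes_2000_2004_filter(5000, 40, 10)
```

Here only `healthy_table` passes: the first call fails on universe size and the second on average cell size.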

12

13 Why not just use cell suppression as is done for the Economic products? Advantages: Gets rid of the “bad” estimates. Keeps the “good” estimates (depends on complementary suppression). Disadvantages: Creates “holes” in distributions. Makes new problems for combined estimates (e.g., in derived products, such as data profiles). Produces a new set of problems for year-to-year comparisons.

14 Data Release Testing, Step by Step. 1. Compute coefficients of variation: coefficient of variation = standard error / estimate; standard error = (upper bound - estimate) / 1.65; if the estimate = 0, set coefficient of variation = 100%. 2. Ignore total and sub-total lines in the base table. 3. Sort coefficients of variation in descending order. 4. Find the middle value (the median). 5. If the median is greater than 61%, the table FAILS (median > 61% means more than half of the cells have a lower bound of 0; i.e., these cells are not statistically different from 0). 6. If the median is 61% or less, the table PASSES.
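The steps on this slide translate fairly directly into code. The following Python sketch assumes each detail cell is given as an (estimate, upper bound) pair with totals and sub-totals already removed. The function names are hypothetical, and `statistics.median` averages the two middle values when the number of cells is even, which the slide does not specify.

```python
from statistics import median


def standard_error(estimate, upper_bound, z=1.65):
    # Slide 14: standard error = (upper bound - estimate) / 1.65
    return (upper_bound - estimate) / z


def coefficient_of_variation(estimate, se):
    # Slide 14: a zero estimate is assigned a CV of 100%.
    if estimate == 0:
        return 1.0
    return se / estimate


def table_passes(cells, threshold=0.61):
    """cells: list of (estimate, upper_bound) for detail lines only."""
    cvs = [
        coefficient_of_variation(est, standard_error(est, ub))
        for est, ub in cells
    ]
    # The table passes when the median CV is 61% or less.
    return median(cvs) <= threshold


reliable = table_passes([(100, 133), (200, 233), (0, 20)])
unreliable = table_passes([(10, 30), (0, 15), (5, 20)])
```

In the first table the CVs are roughly 20%, 10%, and 100%, so the median is about 20% and the table passes. In the second, every CV is 100% or more, so the table fails and would be sent to collapsing.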

15 Collapsing Goal: release a simplified version of a base table for a geographic area that otherwise would get nothing Decisions on design of collapsed tables are made by subject-matter experts at the Census Bureau For operational reasons, only one collapsed version of each base table will be available regardless of geographic area

16 How the Data Release Rules will Work with Collapsed Versions of Base Tables

17 More About Collapsing Collapsed Tables are designed to assure that derived products (profiles, ranking tables, subject tables,…) can still be sourced from the base tables 2005 Tables: if a table passes filtering and a collapsed version exists, publish both the original version and the collapsed version for that geographic area

18 Problems to fix in the current implementation of the data release rules: Collapsed versions missing in some cases. Collapsed versions that aren’t working. Poor choices in “sourcing” for derived products (e.g., profiles).

19 Statistical Significance Testing Why should I do it? When should I do it? How do I do it?

20 Testing is Important

21 Statements you might want to make: Estimate X is bigger than Y. Estimate X this year is larger than X last year. Estimate X is smaller than Census 2000 value. State Z has the highest value.

22 How do I do a significance test? 1. Get the Margin of Error (MOE) from ACS. 2. Calculate the Standard Error (SE) [SE = MOE / 1.645]. 3. Solve for Z, where A and B are the two estimates: Z = (A - B) / sqrt(SE_A^2 + SE_B^2). 4. If Z < -1.645 or Z > 1.645, the difference is significant at 90% confidence.
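A minimal Python sketch of this test follows, assuming the usual test statistic for two independent estimates, Z = (A - B) / sqrt(SE_A^2 + SE_B^2). The function names are hypothetical, and the sketch does not handle the case where both MOEs are zero.

```python
import math


def se_from_moe(moe, z=1.645):
    # Step 2: convert a 90% margin of error to a standard error.
    return moe / z


def z_score(a, moe_a, b, moe_b):
    # Step 3: difference of the estimates over the SE of the difference.
    se_a = se_from_moe(moe_a)
    se_b = se_from_moe(moe_b)
    return (a - b) / math.sqrt(se_a ** 2 + se_b ** 2)


def is_significant(a, moe_a, b, moe_b, critical=1.645):
    # Step 4: significant at 90% confidence if |Z| exceeds 1.645.
    return abs(z_score(a, moe_a, b, moe_b)) > critical


big_gap = is_significant(1000, 100, 800, 100)
small_gap = is_significant(1000, 100, 950, 100)
```

With a MOE of 100 on each estimate, a difference of 200 gives Z of about 2.33 and is significant, while a difference of 50 gives Z of about 0.58 and is not.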

23 Obtaining Standard Errors is the Key. Simple formulas for: Sum or Difference of Estimates; Proportions and Percents; Means and Other Ratios. Where….
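The formulas themselves did not survive in this transcript. As a hedged sketch, the Python below uses the approximations commonly given in Census Bureau ACS guidance for derived estimates. The function names are hypothetical, and the formulas should be checked against the current ACS documentation before being relied on.

```python
import math


def se_sum_or_difference(se_a, se_b):
    # SE(A + B) and SE(A - B) share the same approximation.
    return math.sqrt(se_a ** 2 + se_b ** 2)


def se_ratio(a, se_a, b, se_b):
    # Ratio R = A / B where A is not a subset of B (e.g., means).
    r = a / b
    return math.sqrt(se_a ** 2 + r ** 2 * se_b ** 2) / b


def se_proportion(a, se_a, b, se_b):
    # Proportion P = A / B where A is a subset of B.
    p = a / b
    radicand = se_a ** 2 - p ** 2 * se_b ** 2
    if radicand < 0:
        # Usual fallback: use the ratio formula when the radicand is negative.
        return se_ratio(a, se_a, b, se_b)
    return math.sqrt(radicand) / b


combined = se_sum_or_difference(3.0, 4.0)
share = se_proportion(50, 5.0, 100, 6.0)
```

For a percent, multiply the proportion and its standard error by 100. A margin of error at 90% confidence is then 1.645 times the standard error.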

24 There is HELP off in the wings

25 But what if I am using 2000 non-ACS Data? Where are my MOEs?

26

27 Let's get to work on the Standard Error: SE = sqrt(5Y × (1 - Y/N)) × Survey Design Factor, where N = Size of publication area (population) and Y = Estimate of characteristic.

28 Design factor tables: www.census.gov/prod/cen2000/doc/tablec-xx.pdf (xx = fl). Mode to Work design factors: 1.4, 1.2, 0.9, 0.7

29 N = Size of publication area (population) = 362,563. Y = Estimate of characteristic = 126,540. 5Y = 5 × 126,540 = 632,700. Y/N = 126,540 / 362,563 = 0.3490152. 1 - (Y/N) = 1 - 0.3490152 = 0.6509848. SE = sqrt(632,700 × 0.6509848) = 641.7772

30 Multiply by the Survey Design Factor. Unadjusted SE = 641.777. Y/N = 126,540 / 362,563 = 35%. Survey Design Factor = 0.7. Final Adjusted SE = 641.777 × 0.7 ≈ 450
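The worked example on the last few slides can be reproduced with a short Python sketch. It assumes the unadjusted standard error is sqrt(5Y(1 - Y/N)), which is what the slide arithmetic implies. The function names are hypothetical.

```python
import math


def census2000_unadjusted_se(y, n):
    # y: estimate of the characteristic; n: population of the area.
    return math.sqrt(5 * y * (1 - y / n))


def census2000_adjusted_se(y, n, design_factor):
    # Scale the unadjusted SE by the published survey design factor.
    return census2000_unadjusted_se(y, n) * design_factor


unadjusted = census2000_unadjusted_se(126_540, 362_563)
adjusted = census2000_adjusted_se(126_540, 362_563, 0.7)
moe_90 = 1.645 * adjusted  # 90% margin of error, comparable to ACS MOEs
```

The unadjusted value comes out near 641.78 and the adjusted value near 449.2, which the slide rounds to 450. Multiplying by 1.645 gives a margin of error that can be used in the same significance test as the ACS estimates.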

31

32

33

34 Tempting Green is OK This is NOT

35 Want to do an exercise on your own?

