Exploring Error and the American Community Survey.


1 Exploring Error and the American Community Survey

2 Class prep Go to https://tufts.box.com/s/u7jgivo4w7uyi25brald. Download and unzip “American Community Survey Error Exploration” to your Desktop. Make the folder writable: right-click on the folder => Properties => uncheck Read-only. Double-click on “Exploring Error in the American Community Survey.mxd”

3 Class prep (1) Using Windows File Manager, go to the following folder on your Desktop: American Community Survey Error Exploration\AFF_data_tables\Median_HH_Income_tract (2) Open: ACS_10_SF4_B19013_metadata.csv (the metadata file for the ACS data) and ACS_10_SF4_B19013_Med_HH_Income.xlsx (the data table, median household income)

4 Test: why do we need to use ACS data in policy / environmental analysis?

5 Because it has important information about our communities…

6

7 So we need to learn to use the information reliably… And especially to understand the margin of error for ACS estimates

8 Review – What is the ACS? American Community Survey: a continuous monthly survey of households, with a long set of questions covering many topics. Data is released once a year: 1-year averages for areas with a population of 65,000+; 3-year averages for areas with a population of 20,000+; 5-year averages for all other areas (including census tracts and block groups). E.g., average number of people commuting by bicycle for 2007-2011

9 Use Census 2010 data where possible because it is a 100% survey and thus is not subject to sampling error: population counts (age, race / Hispanic ethnicity); housing unit counts and tenure (rented, owner-occupied); household and family relationships

10 ACS and Margin of Error Means of transportation for commute, tract level, ACS 2005-2009 5-year estimates. Universe is workers 16 and over

11 ACS: Use the highest aggregation you can in terms of tables (can be hard to find)

12

13 Open the Excel files… ACS_10_SF4_B19013_Med_HH_Income.xlsx (the data table, median household income) and ACS_10_SF4_B19013_metadata.csv (the metadata file for the ACS data)

14 Metadata file and data table…

15 So let’s understand the margin of error…

16 What is Sampling Error? Definition: The uncertainty associated with an estimate that is based on data gathered from a sample of the population rather than the full population

17 Illustration of Sampling Error Estimate the average number of children per household for a population with 3 households living in a block: Household A has 1 child; Household B has 2 children; Household C has 3 children. The block average based on the full population is two children per household: (1+2+3)/3 = 2

18 Conceptualizing Sampling Error Three different samples of 2 households: 1. Households A and B (1 child, 2 children) 2. Households B and C (2 children, 3 children) 3. Households A and C (1 child, 3 children) Three different averages based on which sample is used: 1. (1 + 2) / 2 = 1.5 children 2. (2 + 3) / 2 = 2.5 children 3. (1 + 3) / 2 = 2 children
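
The three-household illustration can be reproduced in a few lines of Python. This is only a sketch of the idea on the slide; the variable names are made up for illustration.

```python
from itertools import combinations
from statistics import mean

# Children per household, as given on the slide
children = {"A": 1, "B": 2, "C": 3}

# "True" value based on the full population of three households
population_mean = mean(children.values())

# Every possible sample of 2 households and the average it would give
sample_means = {
    pair: mean([children[h] for h in pair])
    for pair in combinations(sorted(children), 2)
}

for pair, sample_mean in sample_means.items():
    # The difference from the population mean is that sample's sampling error
    print(pair, sample_mean, sample_mean - population_mean)
```

Only one of the three possible samples lands exactly on the population average; the other two miss by half a child in either direction.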

19 Sampling Error Census 2010 is a 100% survey, so it is not subject to sampling error. ACS data is based on samples, so its error is larger. The smaller the geography, the larger the error (because the sample is smaller). Especially true for variables that describe a small number of people, e.g., bike commuters

20 ACS and Margin of Error Means of transportation for commute, tract level, ACS 2005-2009 5-year estimates. Universe is workers 16 and over

21 American Community Survey and sampling error The margin of error is calculated and published with each estimate Calculated at 90% confidence level What does that mean?

22 ACS and Margin of Error Means of transportation for commute, tract level, ACS 2005-2009 5-year estimates. Universe is workers 16 and over

23 Confidence level of 90% We don’t know for sure how many people in Tract 3.02 take public transit to work. Based on the ACS sample, our estimate over 5 years is that an average of 747 people took transit, +/- 226 at the 90% confidence level. If we did many, many samples of that same tract, 90% of the time the resulting range (521-973 people for this sample) would contain the real number of commuters taking transit. 10% of the time it would not

24 Confidence level of 90% The confidence level of a margin of error indicates the likelihood that the true population value (real number) falls within the estimate plus or minus the margin of error. We can be 90% confident that somewhere between 521 and 973 people take transit to work in Tract 3.02

25 Also, we know that Tract 3.02 has somewhere between 1958 and 2684 workers, and between 521 and 973 of those take transit to work. So maybe half the workers take transit, or maybe just a fifth of them do. Ugh!!!
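
The arithmetic behind the Tract 3.02 example can be checked with a short Python sketch. The estimate (747) and margin of error (226) come from the slides; the function name is hypothetical. The share calculation simply pairs the extreme ends of the two ranges, which is a rough bounding exercise and not the formal method for the margin of error of a derived proportion.

```python
def interval(estimate, moe):
    """Return the (low, high) range for estimate +/- margin of error."""
    return estimate - moe, estimate + moe

# Transit commuters in Tract 3.02: estimate and 90% MOE from the slide
transit_low, transit_high = interval(747, 226)

# Total workers: the slides give only the range, not the estimate and MOE
workers_low, workers_high = 1958, 2684

# Rough extremes: fewest transit riders over most workers, and the reverse
share_low = transit_low / workers_high
share_high = transit_high / workers_low

print(f"Transit commuters: {transit_low} to {transit_high}")
print(f"Transit share of workers: {share_low:.0%} to {share_high:.0%}")
```

The resulting share runs from roughly a fifth to roughly a half of workers, which is the spread the slide complains about.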

26 If using ACS data, pay attention to margin of error!

27 ACS table from American Factfinder….

28 Use metadata file plus AFF web site This table is showing Educational Attainment for universe of people 25 years and older

29 Use AFF web site plus metadata file

30 Bottom line for ACS More up-to-date information. Continuous versus point-in-time measurement. 5-year estimates are the most reliable because they have the largest samples. But… poorer precision at finer scales (e.g., census tract) or areas of low population (rural areas), and poorer precision for variables with low numbers (e.g., people who bike to work)

31 Don’t go any lower than tracts for mapping ACS data

32 Geographic Hierarchy

33 Measures associated with sampling error

34 Measures Associated with Sampling Error Standard Error (SE); Margin of Error (MOE); Coefficient of Variation (CV)

35 Standard Error (SE) Definition: A measure of the variability of an estimate due to sampling. Depends on variability in the population and sample size. Formula: SE = MOE / 1.645 (for 90% confidence level)
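
As a minimal sketch, the SE formula translates directly into Python. The function and constant names are made up for illustration; the MOE of 226 is the transit figure for Tract 3.02 used earlier in the deck.

```python
# z-value the slide gives for the 90% confidence level used by ACS MOEs
Z_90 = 1.645

def standard_error(moe, z=Z_90):
    """Convert a published margin of error back to a standard error."""
    return moe / z

# Transit commuters in Tract 3.02: MOE of 226 from the slides
se_transit = standard_error(226)
print(round(se_transit, 1))
```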

36 Look at the Med_HH_Income Error Analysis worksheet Excel file

37 To calculate CV, we first calculate the SE: SE = (MOE / 1.645)

38 Coefficient of Variation (CV) Definition: The relative amount of sampling error associated with a sample estimate (by estimate, we mean the value, like number of people biking to work). The CV is a measure of reliability. Formula: CV = (Standard Error / Estimate) * 100%

39 Then the CV% formula is: CV = (SE / estimate)*100
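
Putting the two formulas together, a CV can be computed from the published estimate and MOE alone. The sketch below uses a hypothetical helper name and the Tract 3.02 transit figures from earlier slides. The guard against a zero estimate is an added assumption, since the formula is undefined there.

```python
def coefficient_of_variation(estimate, moe, z=1.645):
    """Return CV as a percentage, from an estimate and its 90% MOE."""
    if estimate == 0:
        raise ValueError("CV is undefined when the estimate is zero")
    se = moe / z                  # SE = MOE / 1.645
    return se / estimate * 100    # CV = (SE / estimate) * 100

# 747 transit commuters +/- 226 in Tract 3.02
cv_transit = coefficient_of_variation(747, 226)
print(round(cv_transit, 1))
```

In Excel the same calculation would be a single column formula applied to the estimate and MOE columns of the data table.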

40 CV% is a measure of reliability. So what is a good CV%? No agreement; depends on purpose. Census case studies: less than 15% - may be reliable; 15-30% - not reliable, be very careful; over 30% - not reliable, use with extreme caution
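
The three bands can be written as a small lookup, sketched below. The function name is hypothetical, and how to treat values of exactly 15% or 30% is an assumption, since the slide does not say which band they fall into.

```python
def reliability_class(cv_percent):
    """Label a CV (in percent) using the three bands from the slide."""
    if cv_percent < 15:
        return "may be reliable"
    if cv_percent <= 30:  # assumption: exactly 15 and 30 fall in the middle band
        return "not reliable, be very careful"
    return "not reliable, use with extreme caution"

for cv in (8.0, 18.4, 45.0):
    print(cv, reliability_class(cv))
```

The same breaks (15, 30, and the maximum value) are the ones used later when mapping CV in three classes.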

41 Two examples Median household income and biking to work

42

43

44 Why do you think median household income generally shows lower CVs (more reliable estimates)?

45 Exploring Error and the American Community Survey The American Community Survey Margin of Error Tutorial goes through all this, so do this on your own time for practice

46 Census data table modifications Preparing data takes understanding and time. Probably best to do it in Excel ahead of time. Always remember to process the GeoID2 field to make it text. To be compatible with the shapefile: column names must be 10 characters max, with no spaces or symbols
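
If you would rather script these fixes than do them by hand in Excel, the two rules might be sketched in Python as follows. Both function names are hypothetical. Treating the underscore as an allowed character and padding tract IDs to 11 digits (state + county + tract) are assumptions. The sample GeoID is for illustration only and is not taken from the class data.

```python
import re

def shapefile_safe(name, max_len=10):
    """Shorten a column name to 10 characters, with no spaces or symbols.

    Assumption: underscores are acceptable, so spaces become underscores.
    """
    cleaned = re.sub(r"[^A-Za-z0-9_]", "", name.replace(" ", "_"))
    return cleaned[:max_len].rstrip("_")

def geoid_as_text(value, width=11):
    """Store a tract GeoID as text, restoring any leading zero lost as a number."""
    if isinstance(value, float):
        value = int(value)  # spreadsheets may hand back numeric IDs as floats
    return str(value).zfill(width)

print(shapefile_safe("Median HH Income"))
print(shapefile_safe("CV (%)"))
print(geoid_as_text(1001020100))
```

Truncation can make two different column names collide, so it is worth checking that the shortened names are still unique before joining the table.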

47 Close Excel tables before opening ArcMap

48 From the Desktop, open the following map file: American Community Survey Error Exploration\Exploring Error in the American Community Survey.mxd

49 Showing in ArcMap Join the fixed Median Household Income table to the Census Tract shapefile. Create a map of median household income: 5 classes by quantiles. Right-click and copy the tract layer. Right-click on Layers and choose Paste Layers. Map CV: 3 classes, with breaks at 15, 30, and the max value

50 Symbolizing CV with hatch patterns

51 Hands on exploration of commute data

52 GIS Tools for Mapping ACS Estimates and Data Quality Information http://gesg.gmu.edu/

53 For your census mapping assignment You need to make 6 maps showing 6 different census variables (not necessarily from 6 different tables). At least two of the maps have to show ACS variables. You don’t have to show CV on your maps, but if you want to experiment, it’s good practice!

54 For your census mapping assignment You can use census data you find from GIS clearinghouses – e.g., MassGIS Instructions for clipping coastal tracts on GIS Tips and Tutorials web site

55 ACS and Error Always be aware of error Have a statement about error if you are making maps Might be good to visualize the CV as well, at least as an inset? In tables, include the margin of error It’s your reputation that’s at stake!

