Published by Rebecca Strickland. Modified over 7 years ago.
1
ACS Public Use Microdata Samples DataFerrett SACOG
Luz M Castillo Data Dissemination Specialist Los Angeles Regional Office U.S. Census Bureau
2
Outline
Summary Data vs. Microdata
Fundamentals of PUMS Data
Geography and the PUMS
Accessing PUMS Data
Documentation and Guidance
3
Summary Data Versus Microdata
Summary data: premade or published tables. Easy to get, even for small areas. Limitation: fixed content.
Microdata: dataset of individual responses to the questionnaire. Enables custom tables and analyses. Limitations: edits to protect privacy; can’t study small areas.
4
Summary Data Source: 2010 ACS 1-year Estimates. Table B FIRST ANCESTRY REPORTED
5
Microdata Source: 2010 ACS 1-year PUMS file
6
Microdata in SAS Source: 2010 ACS 1-year PUMS file.
7
Outline
Summary data vs. Microdata
Fundamentals of PUMS Data
Geography and the PUMS
Accessing PUMS Data
Documentation and Guidance
8
What are PUMS data?
Public Use: anonymized, downloadable
Microdata: records of individual people
Sample: a representative sample of the population
9
PUMS Overview
The PUMS sample is a subsample of ACS interviews, one percent of all US households
PUMS is a “weighted” sample; weighting variables must be used in analysis
A set of two files: housing units and persons
Available as SAS files, CSV files, via DataFerrett, and from redistributors such as IPUMS
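To illustrate why the weighting variables matter, here is a minimal sketch that compares a raw record count with a weighted estimate. The five records are made up, and the column names (SERIALNO, AGEP, PWGTP) are assumed to follow the PUMS person-file layout; check the data dictionary for the file you download.

```python
import csv
import io

# Hypothetical five-row extract of a PUMS person file (values are made up).
# PWGTP is assumed to be the person weight: the number of people each
# sample record represents.
SAMPLE = """SERIALNO,AGEP,PWGTP
1,34,120
1,36,95
2,71,40
3,8,210
3,41,180
"""

def weighted_count(rows, predicate):
    """Sum person weights over the records that satisfy predicate."""
    return sum(int(r["PWGTP"]) for r in rows if predicate(r))

def is_adult(record):
    return int(record["AGEP"]) >= 18

rows = list(csv.DictReader(io.StringIO(SAMPLE)))

unweighted = sum(1 for r in rows if is_adult(r))  # records in the sample
estimate = weighted_count(rows, is_adult)         # people represented

print(unweighted, estimate)
```

The unweighted figure only counts sample records; the weighted figure is the population estimate.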
10
Why Use PUMS?
Data needed for a tabulation or a specific universe not supported by standard ACS tables (e.g., population groups by single year of age)
Statistical analysis required to understand relationships between economic, demographic, or housing variables (e.g., correlation analysis)
Can create new measures using multiple variables or other people in the household (spouse’s occupation, same-sex couples, number of kids)
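As one possible illustration of a measure built from other people in the household, the sketch below counts children per household. The records are invented, SERIALNO is assumed to be the identifier shared by everyone in a housing unit, and the age cutoff of 17 is a hypothetical choice.

```python
from collections import Counter

# Hypothetical person records: (SERIALNO, AGEP). SERIALNO is assumed to be
# the household identifier shared by all persons in the same housing unit.
persons = [
    ("A", 40), ("A", 38), ("A", 9), ("A", 5),
    ("B", 67),
    ("C", 29), ("C", 2),
]

def children_per_household(records, max_child_age=17):
    """Count persons at or below max_child_age in each household."""
    counts = Counter()
    for serialno, age in records:
        # Adding 0 still registers the household, so childless ones appear.
        counts[serialno] += 1 if age <= max_child_age else 0
    return dict(counts)

kids = children_per_household(persons)
print(kids)
```

The resulting dictionary could then be attached to each person or housing record as a new variable.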
11
ACS PUMS Availability
Produced every year since 2000
Person-level files include about 250 variables
Housing unit files include about 200 variables
Includes people in housing units and group quarters
Includes many useful constructed variables (e.g., poverty status, subfamily identification)
Includes collapsed codes for some variables (e.g., race, Hispanic origin, ancestry, place of birth, industry, occupation)
12
Person records in ACS PUMS and in ACS complete data
Columns: year; person records in ACS PUMS (millions); person records in ACS complete data (millions); population represented (millions)
2001 1.2 285
2002 287
2003 290
2004 293
2005 2.9 4.5 296
2006 3.0 298
2007 301
2008 304
2009 307
2010 3.1 309
2011 5.0 312
13
Types of PUMS Files Released
We release 3 new PUMS files every year:
1-year PUMS (example: year PUMS): October
3-year PUMS (example: year PUMS): discontinued after 2013
5-year PUMS (example: year PUMS): January
Most documentation is released one week prior to the data
14
Modifications to Multiyear PUMS
Multiyear PUMS have the same cases and geography as their component 1-year files
How are multiyear PUMS different from single-year PUMS?
Weights are produced using the latest population estimate “vintages”
Coding schemes and dollar amounts are standardized
Why use the multiyear PUMS files?
For studying small groups, where more cases are needed
When the analysis also makes use of multiyear summary data
15
Outline
Summary data vs. Microdata
Fundamentals of PUMS Data
Geography and the PUMS
Accessing PUMS Data
Documentation and Guidance
16
Limited Geographic Detail
Geographic identifiers are region, division, state, and PUMA
PUMAs can be used to identify most cities of 100,000+ and many metropolitan areas, but not all
PUMAs are combinations of adjacent counties and census tracts within states, and also divisions of geographic areas (counties/cities)
PUMS is not designed for statistical analysis of small geographic areas
17
Public Use Microdata Area (PUMA)
Defined after each census by the states in coordination with the Census Bureau’s Geography Division
Redefined PUMAs for 2012 PUMS files
Forthcoming multiyear files to have dual PUMA vintages
Large enough to meet disclosure avoidance requirements
An area with a population of 100,000 or more
To determine population, housing, or land ratios, visit the Missouri State Data Center site
PUMAs are identified by a five-digit number, unique within each state
18
Public Use Microdata Areas
19
PUMA Maps
20
PUMA Maps
21
2010 Census – PUMA Reference Map: Sacramento City (Central/Downtown & Midtown)
22
Outline
Summary data vs. Microdata
Fundamentals of PUMS Data
Geography and the PUMS
Accessing PUMS Data
Documentation and Guidance
23
American FactFinder
24
American FactFinder (cont’d)
25
American FactFinder (cont’d)
Main benefit of accessing PUMS via AFF: Convenient access if comfortable with AFF from regular use of summary tables
26
Census Bureau FTP Site
27
Census Bureau FTP Site (cont’d)
Main benefit of accessing PUMS via FTP: Complete listing of files by year and state
28
DataFerrett
29
DataFerrett (cont’d)
Main benefit of accessing PUMS via DF:
Menu-driven system doesn’t require knowledge of a stats package (e.g., SAS, SPSS)
Ability to download variables individually
30
Powerful Tabulation Capabilities
Simple table layout that supports:
Flexible design
Frequencies and trends
Spreadsheet math for robust analysis
Complex nesting
Hide columns/rows
Applies weighting variables
Fast results using large datasets
Save as HTML, PDF & JPEG
31
Data Visualization
Highlight spreadsheet rows or columns to create:
Maps
Graphs
32
What We’re Working On
Calculating variances on-the-fly for microdata tabulations
Calculating margins of error for custom summations of aggregate data
Integrating Google maps with DataFerrett thematic maps
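For readers who want variances for their own microdata tabulations in the meantime, one common approach uses the replicate weights shipped with the PUMS files (80 per record). The sketch below is not DataFerrett's implementation. It assumes the tabulation has already been re-run once per replicate weight, and it applies the successive difference replication formula with a 90 percent confidence multiplier of 1.645. The numbers are hypothetical, and only four replicates are used to keep the example short.

```python
import math

def replicate_se(full_estimate, replicate_estimates):
    """Standard error from replicate estimates: sqrt(4/R * sum((x_r - x)^2))."""
    r = len(replicate_estimates)
    squared = sum((x - full_estimate) ** 2 for x in replicate_estimates)
    return math.sqrt(4.0 / r * squared)

def margin_of_error(se, z=1.645):
    """Margin of error at the 90 percent confidence level (z = 1.645)."""
    return z * se

# Hypothetical full-sample estimate and four replicate estimates.
estimate = 1000.0
replicates = [990.0, 1010.0, 1005.0, 995.0]

se = replicate_se(estimate, replicates)
print(se, margin_of_error(se))
```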
33
Outline
Summary data vs. Microdata
Fundamentals of PUMS Data
Geography and the PUMS
Accessing PUMS Data
Documentation and Guidance
34
PUMS Documentation
Subjects in the PUMS
Code Lists
PUMS Top Coded and Bottom Coded Values
PUMS Estimates for User Verification
Accuracy of the PUMS
35
PUMS Guidance
Compass Handbook on Using PUMS: soup-to-nuts overview of getting and using the data
Training PPT on Using PUMS: overview of PUMS basics
36
Exercise 1
In Placer County, how many foreign-born individuals entered before 2000, between 2000 and 2009, and in 2010 or later?
37
Exercise 1 – Nativity and Year of Entry
Access: American Community Survey, Year Estimates PUMS
Foreign Born and Year of Entry Variables
Create a Recode for Year of Entry
All PUMAs within Placer County
Create a Table
38
Go to www.census.gov. Type ‘DataFerrett’ in the Search Box.
39
Click ‘TheDataWeb – DataFerrett’
40
Launch DataFerrett
41
CAUTION: Do Not Navigate Away or Close This Window While DataFerrett is Loading
42
Enter Your Email Address and Click ‘Ok’
43
Click ‘Get Data Now’
44
American Community Survey with PUMS and Other Datasets
45
Select American Community Survey. Open Public Use Microdata Sample to view years. Select a year. Click ‘View Variables’ (drop down).
46
Click ‘Selectable Geographies’ and ‘Population’. Click ‘Search Variables’
47
Click on ‘Variable Label’ to Alphabetize Column
48
Select ‘Nativity’. Hold control button down and select ‘Year of Entry (YOEP)’. Click ‘Browse/Select Highlighted Variable’ (Blue Button).
49
Check the box next to ‘Select’ for ‘ACS Nativity’.
50
Highlight ‘ACS YOEP’. Check the box next to ‘Select’ for ‘ACS YOEP Year of entry’. Click ‘OK’.
51
You have added 2 variables to your DataBasket. Click ‘OK’.
52
Double click the ‘Selectable Geographies’ variable. Click ‘Browse/Select Highlighted Variables’ (Blue Button).
53
Select ‘Public Use Microdata Area’ from ‘Types of Geographies Available’. Highlight the PUMA code in the Hierarchies section and click ‘Use Hierarchy’.
54
Double click ‘California’ from ‘Select State of current residence’. Highlight ‘California’ in middle box and click ‘Next Level’
55
Note: All PUMAs in California are listed by county. Double click, or highlight and drag, the PUMA(s) to the box on the far right. Click ‘Finish’.
56
Note: There are 3 variables in the DataBasket. Click on ‘Step2: DataBasket/Download/Make A Table’.
57
Highlight ‘Year of Entry’ variable. Click ‘Recode Variable’ from right side of screen
58
Rename ‘Recode1’ to ‘Year of Entry Recode’
59
Highlight the categories from ‘1921 to 1999’ and click ‘Recode’ button below
60
Highlight all of the categories from ‘2000 to 2009’ and click ‘Recode’ button below
61
Note: there are three categories for the new recoded variable. Change the ‘Label’ names by double clicking inside the cells. (Make sure to hit the Enter key when completed.)
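The same recode can be expressed outside DataFerrett. The sketch below collapses year of entry into the three groups used in this exercise and sums person weights per group. The records are made up, the category labels are hypothetical, and the field order (YOEP, PWGTP) is an assumption.

```python
def recode_year_of_entry(yoep):
    """Collapse a year of entry into the three groups used in the exercise."""
    if yoep <= 1999:
        return "Before 2000"
    if yoep <= 2009:
        return "2000 to 2009"
    return "2010 or later"

# Hypothetical (YOEP, PWGTP) pairs for foreign-born persons.
records = [(1985, 100), (1999, 50), (2000, 70), (2009, 30), (2011, 20)]

totals = {}
for yoep, weight in records:
    label = recode_year_of_entry(yoep)
    totals[label] = totals.get(label, 0) + weight

print(totals)
```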
62
Note: ‘Year of Entry Recode’ is now listed. Click ‘Make a Table’.
63
Click ‘OK’
64
You Will Now Make A Nested Table Using the Variables
65
Drag the ‘Geog-101 PUMA’ to ‘C1,R2’
66
Drag ‘RECODE1 Year of Entry’ to ‘C2,R1’
67
Nest ‘Nativity’ variable by dropping it onto any of the ‘Year of Entry Labels’.
68
Click ‘GO Get Data’
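The nested table built in the preceding slides (PUMA in rows, year-of-entry group with nativity nested in columns) amounts to a weighted cross-tabulation. A minimal sketch with invented records; the PUMA codes and labels are hypothetical, and the weight is assumed to be the person weight.

```python
from collections import defaultdict

# Hypothetical records: (PUMA, entry_group, nativity, PWGTP).
records = [
    ("00101", "Before 2000", "Foreign born", 100),
    ("00101", "2000 to 2009", "Foreign born", 60),
    ("00102", "Before 2000", "Foreign born", 80),
    ("00101", "Before 2000", "Foreign born", 40),
]

def nested_table(rows):
    """Return {puma: {(entry_group, nativity): weighted total}}."""
    table = defaultdict(lambda: defaultdict(int))
    for puma, group, nativity, weight in rows:
        table[puma][(group, nativity)] += weight
    return {p: dict(cols) for p, cols in table.items()}

table = nested_table(records)
for puma, cols in sorted(table.items()):
    print(puma, cols)
```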
69
From File, Click ‘Save As’
70
You Can Save to Your Desktop. Save File as Text Document – Comma Delimited (Excel).
71
Exercise 2
In Sacramento County, which age group under 50 has the highest estimate of individuals with a disability?
72
Exercise 2 - Age and Disability
Access: American Community Survey, Year Estimates PUMS
Population with a Disability
Create a Recode for Age Disaggregation
All PUMAs within Sacramento County
Create a Pivot Table
73
Go to the ‘Step1’ Tab and Click ‘Empty DataBasket’
74
Select American Community Survey. Open Public Use Microdata Sample to view years. Select a year. Click ‘View Variables’ (drop down).
75
Click ‘Selectable Geographies’ and ‘Population’. Click ‘Search Variables’
76
Click ‘Variable Label’ to Alphabetize Column
77
Select ‘Age’. Hold the control button down and select ‘Disability Recode’. Click ‘Browse/Select Highlighted Variables’ (Blue Button).
78
Check the box next to ‘Select’ for ‘ACS AGEP’.
79
Highlight ‘ACS Disability Recode’. Check the box next to ‘Select’ for ‘ACS DIS Disability Recode’. Click ‘OK’.
80
You have added 2 variables to your DataBasket. Click ‘OK’.
81
Note: 2 variables are selected in the DataBasket. Double click the ‘Selectable Geographies’ variable. Click ‘Browse/Select Highlighted Variables’ (Blue Button).
82
Select ‘Public Use Microdata Area’ from ‘Types of Geographies Available’. Highlight the PUMA code in the Hierarchies section and click ‘Use Hierarchy’.
83
Double click ‘California’ from ‘Select State of current residence’. Highlight ‘California’ in middle box and click ‘Next Level’
84
Note: All PUMAs in California are listed by county. Double click, or highlight and drag, the PUMA(s) to the box on the far right. Click ‘Finish’.
85
Note: There are 3 variables in the DataBasket. Click on ‘Step2: DataBasket/Download/Make A Table’.
86
Highlight ‘Age’ variable. Click ‘Recode Variable’ from right side of screen
87
Rename ‘Recode1’ to ‘Age Recode’
88
1. Change Range to ‘1 through 17’ and Click ‘Recode’
2. Change Range to ‘18 through 19’ and Click ‘Recode’
3. Change Range to ‘20 through 24’ and Click ‘Recode’
89
Change the rest of the age groups and recode
90
Note: there are nine categories for the new recoded variable. Change the ‘Label’ names by double clicking inside the cell. (Make sure to hit the Enter key when completed.)
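A range-based recode like this one can be sketched as a lookup over sorted lower bounds. Only the first three ranges (1 through 17, 18 through 19, 20 through 24) come from the slides; the remaining cut points and all labels are hypothetical, chosen only so that there are nine categories.

```python
import bisect

# Lower bound of each age group. The first three ranges come from the
# slides; the remaining cut points are hypothetical.
LOWER_BOUNDS = [1, 18, 20, 25, 30, 40, 50, 65, 75]
LABELS = ["1-17", "18-19", "20-24", "25-29", "30-39",
          "40-49", "50-64", "65-74", "75+"]

def recode_age(age):
    """Return the label of the group containing age, or None if age < 1."""
    index = bisect.bisect_right(LOWER_BOUNDS, age) - 1
    return LABELS[index] if index >= 0 else None

print([recode_age(a) for a in (0, 17, 18, 24, 49, 80)])
```

Ages below the first lower bound fall outside every range and are returned as None, mirroring a recode that starts at 1.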
91
Note: ‘Age Recode’ is now listed. Click ‘Make a Table’.
92
Click ‘OK’
93
You Will Now Make A Nested Table Using the Variables
94
Making a Pivot Table
1. Drag and Drop “Recode 1 Age” to C1, R2
2. Drag and Drop “GEOG-101” to C2, R1
3. Drag and Drop “Disability” above R1
95
Click ‘GO Get Data’
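Once the pivot table is available, answering the exercise question is a matter of comparing weighted totals across age groups. The sketch below uses invented records and hypothetical age labels, and it assumes the disability recode is coded 1 = with a disability and 2 = without a disability; verify the coding in the data dictionary.

```python
from collections import defaultdict

# Hypothetical records: (age_group, DIS, PWGTP). DIS is assumed to be coded
# 1 = with a disability, 2 = without a disability.
records = [
    ("18-19", 1, 30), ("18-19", 2, 400),
    ("20-24", 1, 55), ("20-24", 2, 900),
    ("40-49", 1, 120), ("40-49", 2, 1500),
    ("65-74", 1, 300), ("65-74", 2, 700),
]
# Hypothetical labels for the age groups that fall under 50.
UNDER_50 = {"1-17", "18-19", "20-24", "25-29", "30-39", "40-49"}

pivot = defaultdict(lambda: {1: 0, 2: 0})
for group, dis, weight in records:
    pivot[group][dis] += weight

# Age group under 50 with the largest weighted count of people with a disability.
top = max((g for g in pivot if g in UNDER_50), key=lambda g: pivot[g][1])
print(top, pivot[top][1])
```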
96
From ‘File’ drop-down, Click ‘Save As’
97
Save Your Table
98
Exercise 3
For each race group, which age group has the highest estimated number of males and females?
99
Exercise 3 – Sex by Race and Age
Access: American Community Survey, Year Estimates PUMS
Add more variables
Create a Race, Sex and Age Disaggregation
All PUMAs within Sacramento County
Create a Table, Chart and Map
100
Close Table and Click ‘Step1’ Tab
101
Select American Community Survey. Open Public Use Microdata Sample to view years. Select a year. Click ‘View Variables’ (drop down).
102
Click ‘Population’. Click ‘Search Variables’
103
Click on ‘Variable Label’ to Alphabetize Column
104
Select ‘RAC1P-Recoded Detailed Race Code’. Hold the control button down and select ‘Sex’. Click ‘Browse/Select Highlighted Variables’ (Blue Button).
105
1. Click ‘Select ALL Variables’
2. Click ‘OK’
3. Confirm that you have modified 2 Variables by Clicking ‘OK’
106
Click ‘Step2’ Tab and Click ‘Make a Table’
107
1. Drag and Drop “RAC1P” to C1, R2
2. Drag and Drop “GEOG-101” to C2, R1
3. Drag and Drop “Recode1 Age” to C1, R2 (On top of ‘Total RAC1P’)
108
Click ‘GO Get Data’
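The question in this exercise can also be answered programmatically by totaling weights per race, sex, and age group and keeping the largest age group for each race and sex pair. The records, age labels, and numeric codes below are invented for illustration; look up the real RAC1P and SEX codes in the data dictionary.

```python
from collections import defaultdict

# Hypothetical records: (RAC1P, SEX, age_group, PWGTP). Codes are used
# as-is; their meanings should be looked up in the PUMS data dictionary.
records = [
    (1, 1, "20-24", 200), (1, 1, "40-49", 350), (1, 2, "40-49", 330),
    (1, 2, "20-24", 360), (2, 1, "20-24", 90), (2, 1, "40-49", 60),
    (2, 2, "20-24", 80), (2, 2, "20-24", 30),
]

totals = defaultdict(int)
for race, sex, group, weight in records:
    totals[(race, sex, group)] += weight

# For each (race, sex) pair, keep the age group with the largest total.
largest = {}
for (race, sex, group), total in totals.items():
    best = largest.get((race, sex))
    if best is None or total > best[1]:
        largest[(race, sex)] = (group, total)

print(largest)
```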
110
To Create a Bar Chart or Map, Change the Variable Label, Highlight the Estimates in That Row, and Click the Chart or Map Icons
112
Resources: Need Assistance?
Data Dissemination Branch
Customer Liaison and Marketing Services Office
U.S. Census Bureau
(844) ASK-DATA (Toll Free)
Cell: