The Integrated Public Use Microdata Series (IPUMS) database, www.ipums.org. Lab 1: Background on the IPUMS and SPSS.

2 The Integrated Public Use Microdata Series (IPUMS) database, www.ipums.org. Lab 1: Background on the IPUMS and SPSS

3 Lab 1: Introduction to the datasets
  -- What is the IPUMS?
  -- Who uses IPUMS?
  -- What research is IPUMS best for?
  -- Other IPUMS-like datasets
  -- Getting and using the data

4 Census Samples Included in the IPUMS

5 WHAT ARE MICRODATA?
  -- Individual-level data
    -- every record represents a separate person
    -- all of that person's individual characteristics are recorded
    -- users must manipulate the data themselves
  -- Different from aggregate/summary/tabular data, such as
    -- a disability table from www.factfinder.census.gov
    -- an occupation table from a published census volume from the library

6 1930 Census Population Schedule, made public April 2002

7 Raw Census Microdata from IPUMS

8 IPUMS Data Structure
  -- Household record (shaded) followed by a person record for each member of the household
  -- For each type of record, specific columns correspond to different variables
  -- Variables labeled on the slide: Relationship, Age, Sex, Race, Birthplace, Mother’s birthplace, Occupation
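The household-then-person layout described on this slide can be read with a short loop. The following is a minimal sketch in Python, not the IPUMS file specification: the record-type flag in the first column ("H" or "P"), the field names, and every column position are assumptions made up for illustration. Real positions come from the codebook supplied with the data.

```python
# Hypothetical fixed-width layout, for illustration only. Real column
# positions come from the codebook that accompanies each data file.
HOUSEHOLD_FIELDS = {"serial": (1, 6), "state": (6, 8)}
PERSON_FIELDS = {"relate": (1, 3), "age": (3, 6), "sex": (6, 7)}


def slice_fields(line, fields):
    """Cut one fixed-width line into named string fields."""
    return {name: line[start:end] for name, (start, end) in fields.items()}


def read_hierarchical(lines):
    """Group person records under the household record that precedes them."""
    households = []
    for line in lines:
        record_type = line[0]  # assumed flag: "H" = household, "P" = person
        if record_type == "H":
            household = slice_fields(line, HOUSEHOLD_FIELDS)
            household["persons"] = []
            households.append(household)
        elif record_type == "P":
            if not households:
                raise ValueError("person record before any household record")
            households[-1]["persons"].append(slice_fields(line, PERSON_FIELDS))
    return households


# Invented sample lines: two households with two and one members.
sample = [
    "H0000145",
    "P010341",
    "P020312",
    "H0000245",
    "P010671",
]
households = read_hierarchical(sample)
for household in households:
    print(household["serial"], len(household["persons"]), "person record(s)")
```

Keeping each person nested under a household is what later makes it possible to look up the characteristics of everyone a person lived with.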

9 The Advantages of Microdata
  -- Combination of all of a person’s characteristics
  -- Characteristics of everyone with whom a person lived
  -- Freedom to make any table you need
  -- Freedom to make models to look at multivariate relationships
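"Freedom to make any table you need" amounts to counting person records by whatever combination of variables interests you. A minimal sketch in Python follows. The records and the variable names (birthplace, occupation) are invented for illustration and are not census data.

```python
from collections import Counter

# Invented person records; values are illustrative, not census data.
persons = [
    {"birthplace": "South", "occupation": "Laborer"},
    {"birthplace": "South", "occupation": "Farmer"},
    {"birthplace": "North", "occupation": "Laborer"},
    {"birthplace": "South", "occupation": "Laborer"},
]


def crosstab(records, row_var, col_var):
    """Count records for every (row value, column value) pair."""
    return Counter((record[row_var], record[col_var]) for record in records)


table = crosstab(persons, "birthplace", "occupation")
for (row, col), count in sorted(table.items()):
    print(f"{row:<6} {col:<8} {count}")
```

Any two variables on the person record can be swapped in for the row and column, which a published table does not allow.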

10 INTEGRATION: What the IPUMS actually does to the original census samples

11 IPUMS Translation Table for RACE
  -- Labels on the slide: original codes for “Black”; IPUMS assigned codes; column location in original samples
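A translation table of this kind maps each census year's original code to one harmonized code, so the same category carries the same value in every sample. The sketch below shows the idea in Python. Every code in it, original and harmonized alike, is a made-up placeholder, not a value from the actual IPUMS translation table.

```python
# (census year, original code) -> harmonized code.
# All codes below are hypothetical placeholders for illustration.
TRANSLATION = {
    (1900, "W"): 1,
    (1900, "B"): 2,
    (1960, "1"): 1,
    (1960, "2"): 2,
}


def harmonize(year, original_code):
    """Return the harmonized code for one year's original code."""
    try:
        return TRANSLATION[(year, original_code)]
    except KeyError:
        raise ValueError(f"no translation for code {original_code!r} in {year}")


# Two different original codes from different years land on one harmonized code.
print(harmonize(1900, "B"), harmonize(1960, "2"))
```

Raising an error on an unmapped code, instead of passing it through, keeps an untranslated original value from silently mixing with harmonized ones.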

12 IPUMS Translation Table for RELATIONSHIP

13 IPUMS Documentation: Farm Status Variable

14 Additional ways in which IPUMS improves the original samples
  -- Additional documentation, including all enumeration forms and instructions
  -- Consistent occupation/industry classifications
  -- Consistent metropolitan classifications
  -- Constructed family variables
  -- Locator variables for spouse and parents
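A locator variable lets you attach a spouse's or parent's characteristics to a person's own record. The sketch below assumes the locator holds the spouse's person number within the household, with 0 meaning no spouse is present. The field names (pernum, sploc, occupation) and the records are illustrative, so check the IPUMS documentation for the actual variable definitions.

```python
# One invented household. "sploc" is assumed to hold the spouse's person
# number within the household, with 0 meaning no spouse is present.
household = [
    {"pernum": 1, "sploc": 2, "occupation": "Farmer"},
    {"pernum": 2, "sploc": 1, "occupation": "Teacher"},
    {"pernum": 3, "sploc": 0, "occupation": "Student"},
]


def attach_spouse_occupation(persons):
    """Return copies of the records with the spouse's occupation added."""
    by_pernum = {person["pernum"]: person for person in persons}
    linked = []
    for person in persons:
        spouse = by_pernum.get(person["sploc"])  # None when sploc is 0
        spouse_occupation = spouse["occupation"] if spouse else None
        linked.append({**person, "spouse_occupation": spouse_occupation})
    return linked


linked = attach_spouse_occupation(household)
for person in linked:
    print(person["pernum"], person["occupation"], person["spouse_occupation"])
```

The same lookup pattern would work for a mother or father locator by changing which field is followed.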

15 Lab 1: Introduction to the datasets
  -- What is the IPUMS?
  -- Who uses IPUMS?
  -- What research is IPUMS best for?
  -- Other IPUMS-like datasets
  -- Getting and using the data

16 Quantity of IPUMS Data Downloaded

17 Who uses the data? Profile of IPUMS users
  -- Approximately 9,000 registered users
  -- About 90% are affiliated with universities
    -- Among those, 40% are economists and 25% are sociologists
    -- Most other academics are from the social sciences
  -- Other main users include journalists and policy-makers

18 How do people get IPUMS data?
  -- 15% download complete datasets
    -- 1850-1970 datasets are less than 1GB each
    -- 1980-2000 datasets are about 5GB each
    -- We provide raw data and command files
  -- 85% make “extracts” using the online interface
    -- Choose the variables you want
    -- We provide customized data and command files
  -- ?? go to data redistributors
    -- Querylogic (www.querylogic.com)
    -- PDQ (www.pdq.com)

19 Lab 1: Introduction to the datasets
  -- What is the IPUMS?
  -- Who uses IPUMS?
  -- What research is IPUMS best for?
  -- Other IPUMS-like datasets
  -- Getting and using the data

20 Four Key Strengths of the Census Microdata Samples
  -- National in scope
    -- Results aren’t subject to local peculiarities
    -- Moreover, they provide context for local studies
  -- Large
    -- Have more cases than any comparable datasets
    -- Enable study of relatively small populations
  -- Long-term
    -- Provide historical depth
  -- Microdata
    -- Can make your own tabulations
    -- Can apply multivariate techniques

21 Limitations of the Census Microdata Samples
  -- Geographic detail
    -- Confidentiality restrictions 1940-2000
  -- Decennial
    -- Any historical analysis must use 10-year gaps
  -- Cross-sectional data
    -- Not longitudinal
  -- Need knowledge of a statistical package
  -- 1-in-100 samples (1-in-20 for 1970-2000)
    -- Too small to answer some questions

22 What type of question is IPUMS best suited for?
  -- Studies that do not need to identify geographic areas of fewer than 100,000 people after 1940 (e.g., you cannot identify Clemson, SC, but you can identify a group of several counties of which Clemson is a part).
  -- Subjects that are likely to involve at least 10,000 people, preferably more. In a 1-in-100 sample, 10,000 individuals will generate about 100 cases in IPUMS. Anything less than this is probably too small a sample for useful analysis.
  -- Any analysis of a census-related question that is not answered by the published census volumes or summary files.
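The rule of thumb on this slide is simple sampling arithmetic: expected cases equal the population of interest divided by the sampling interval. A small Python sketch makes the calculation reusable. The function names are hypothetical, and the 100-case threshold is the slide's guideline, not a statistical law.

```python
def expected_cases(population, one_in):
    """Expected number of sample records for a 1-in-`one_in` sample."""
    return population / one_in


def minimum_population(target_cases, one_in):
    """Population needed to expect `target_cases` records in the sample."""
    return target_cases * one_in


# The slide's example: 10,000 people in a 1-in-100 sample.
print(expected_cases(10_000, 100))
# The same group in a denser 1-in-20 sample.
print(expected_cases(10_000, 20))
# Population needed to expect 100 cases in a 1-in-100 sample.
print(minimum_population(100, 100))
```

These are expected values only. The count actually found in a given sample will vary around them, which is one reason the slide recommends populations well above the minimum.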

23 An example: Southern migrants in the North 1870-1970
  -- Published census volumes can tell you
    -- How many southern-born persons of each race lived in each state in 1900, 1920, 1930, and 1960
    -- The occupations of all African-Americans in the North
  -- But you’re also interested in
    -- The jobs held by actual migrants
    -- How their jobs compared to those who stayed home
    -- How their jobs compared to northern-born blacks
    -- How their settlement changed from 1870 onward

24 An example: Why this analysis works
  -- The numbers are very large
    -- over 500,000 southerners are in the North in every decade from 1870 on
  -- I don’t need to know particular towns
    -- state of residence is available in every census
    -- a sub-state designation known as State Economic Area (SEA) is even available for every census
  -- Data not available anywhere else
    -- and so it is worth the trouble

25 An example: What you can’t do with the IPUMS
  -- How did the southerners do in Pittsburgh?
    -- IPUMS has data on 90 employed southern black men in Pittsburgh in 1970, fewer in previous years.
  -- Were the migrants segregated in the North?
    -- you don’t know their street, tract, or ward
    -- all you know is their city, and only if it was a pretty big one (>100K for 1940-50 and 1980-90; >250K for 1960-70; >100K in 2000).
  -- Did migrants’ jobs improve over time?
    -- The census samples are cross-sectional databases, not longitudinal ones

26 Lab 1: Introduction to the datasets
  -- What is the IPUMS?
  -- Who uses IPUMS?
  -- What research is IPUMS best for?
  -- Other IPUMS-like datasets
  -- Getting and using the data

27 Ongoing data projects at the MPC: New high-density Public Use files
  -- 1880
    -- 100% data for selected variables
    -- 20% sample for minorities (all variables)
    -- 10% sample for entire population (all variables)
  -- 1900: 10% sample
  -- 1930: 5% sample
  -- 1960: 5% sample

28 Ongoing data projects at the MPC: New high-density Public Use files, number of person records in each file
  [Chart: number of person records (0 to 20,000,000) by census year, 1850-2000. Series: existing samples; samples planned and in progress.]

29 Ongoing data projects at the MPC: New harmonized intercensal series
  -- American Community Survey
    -- Available for 2001-2002 on main IPUMS site
    -- 2003 data will be available in the Fall of 2004
  -- March Current Population Survey
    -- Spans 1962-2003
    -- Available at http://beta.ipums.org/cps
    -- Includes special questions on labor markets

30 IPUMS International
  -- Currently contains 22 samples from 6 countries
  -- About 80 variables currently available
  -- IPUMS Latin America
    -- 15-country project
    -- Got underway this year
  -- IPUMS Europe
    -- 18-country project
    -- Got underway this year

31 Lab 1: Introduction to the datasets
  -- What is the IPUMS?
  -- Who uses IPUMS?
  -- What research is IPUMS best for?
  -- Other IPUMS-like datasets
  -- Getting and using the data

