SJTU CMGPD 2012 Methodological Lecture Recommended Acknowledgments Contemporary Applications of Historical Data Origins of the CMGPD-LN Key Features.


CMGPD-LN
Public release at ICPSR supported by the United States Department of Health and Human Services, National Institutes of Health, Eunice Kennedy Shriver National Institute of Child Health and Human Development (R01 HD A1), with funds from the American Recovery and Reinvestment Act.

Acknowledging the CMGPD
- Please acknowledge and cite the CMGPD in your publications
  - This will allow us to document use of the CMGPD
  - It will facilitate future applications for support to release additional databases by providing evidence of demand
- Please also send us copies of any papers that result from use of the CMGPD

Recommended acknowledgement
Please include in ALL publications:
This research made use of the CMGPD-LN dataset. Preparation of the CMGPD-LN and documentation for public release via ICPSR DSDR was supported by United States Department of Health and Human Services National Institutes of Health Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD) R01 HD A1 "Multi-Generational Family and Life History Panel Dataset" with funds from the American Recovery and Reinvestment Act.

Recommended citations
Please include in ALL publications:
- User guide: Lee, James Z., Cameron Campbell, and Shuang Chen. China Multi-Generational Panel Dataset, Liaoning (CMGPD-LN) User Guide. Ann Arbor, MI: Inter-university Consortium for Political and Social Research.
- Dataset: Lee, James Z., and Cameron D. Campbell. China Multi-Generational Panel Dataset, Liaoning (CMGPD-LN), [Computer file]. ICPSR27063-v5. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], doi: /ICPSR27063

Contemporary Topics
- Family contextual effects on individual outcomes
- Neighborhood and community context
- Life-course processes
  - Conditions in childhood
  - Long-term effects of socioeconomic status
- Economic, climatic and other shocks
- Multigenerational processes
  - Interactions with stratification and inequality

Limitations of contemporary data
- Time depth
  - Panel/cohort studies are recent
  - Prospective data only for portions of life span
  - Exceptions: British Cohort Studies
- Family context
  - Limited to parents, sometimes siblings
  - Typically co-resident
  - Exceptions: PSID, WLS

Limitations of contemporary data
- Event counts
  - When mortality is low, 'degrees of freedom' problem in all but the largest datasets
  - Difficult to explore complex interactions
- Exogenous shocks
  - Rare enough that their consequences are studied individually
  - Indonesian Tsunami, Hurricane Katrina etc.

Historical population databases
- Individual life histories
- Prospective
- In some cases…
  - Multigenerational
  - Household and community context
  - Kinship
- Exogenous shocks: price spikes, climate fluctuations, disease epidemics
- High mortality levels
- Examples: CMGPD-LN, HSN, PRDH, UAS, UPD

History of the CMGPD-LN Early 1980s: Ju Deyuan at the First Historical Archives alerted James Lee to the existence of the registers at the Liaoning Provincial Archives (LPA) in the early James Lee visits LPA three times Lee and Campbell visit LPA 1987 LPA provides Daoyi registers (dataset 1) that become basis of Fate and Fortune

History of the CMGPD-LN Datasets 3 and 2 obtained from LPA in early nineties and coded : datasets 4-10 – Datasets became available from the Genealogical Society of Utah – Data entry carried out in the United States : datasets – Data entry carried out in China

CMGPD-LN Organization of the Release
- Basic Dataset (DS-001)
  - Identifiers for data management, basic variables
- Restricted Dataset (DS-002)
  - Names and village locations
- Analytic Dataset (DS-003)
  - Richer set of socioeconomic status variables
- Kinship Dataset (DS-004)
  - Ancestry identifiers, constructed kin counts
- Additional files with

CMGPD-LN Contents
- Longitudinal
  - Individuals and households can be linked from one register to the next
  - 1.5 million observations of 260,000 people
  - 1,051 paternal descent groups identified through record linkage
  - 698 communities
- Generational depth generations
- (Relatively) easy to use
  - Resembles longitudinally linked censuses
  - Discrete-time event history (logistic regression etc.)

CMGPD-LN Contents
- Demographic outcomes
  - Mortality
  - Marriage
  - Reproduction (based on surviving children)
  - Migration
- Timing of events
  - Closed, can identify individuals at risk
- Health and disability
  - In early registers, annotation of specific conditions for adult males
  - In later registers, indicator of whether or not disabled for adult males

CMGPD-LN Contents
- Socioeconomic characteristics
  - Attainment of official position for adult males
  - Status as an exam candidate, indicative of high education
- Given name
  - Flag variables for types of name
    - Diminutive, indicative of low status or aspirations
    - Non-Han, indicative of expressed ethnicity
  - Pinyin transcriptions in restricted release

CMGPD-LN Contents
- Geographic context
  - Villages distributed across a region the size of New Jersey
  - Wide variety of economic and ecological contexts
- Basic release
  - Region
  - Unique village identifier
- Restricted release
  - Geocodes for villages accounting for 95% of population

CMGPD-LN Contents
- Household and family context
  - Household of residence
  - Relationship to head
- Relatives can be linked to reconstruct descent groups
  - Via automated record linkage based on household relationship and longitudinal linkage of individual records
- Kin outside the household
  - Basic kinship variables, including parent identifiers and counts of close kin, available now
  - Additional constructed kinship variables available next year

REGION (approximate)

DISTRICT (approximate)

CMGPD-LN Format
- Similar in format to a series of triennial censuses
  - Individuals listed in the same order and easy to link across time
- Organized by community, kin group, household
- Detailed specification of relationship to household head
- Events since the previous register are annotated
  - Basis for construction of flag variables specifying occurrence of events between current register and the next
- Discrete-time event history analysis
  - Typically, logistic regression or complementary log-log regression
  - Outcome: death in the next three years
  - Restricting to registers for which the immediately succeeding register is also available
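To make the discrete-time event history setup concrete, here is a minimal Stata sketch on simulated records. It does not use CMGPD-LN data, and every variable name in it (age, male, next_available, died_next3) is a hypothetical stand-in, not a variable from the release. It shows the two steps named on the slide: restrict to observations whose succeeding register is available, then fit a logistic or complementary log-log model for death in the next three years.

```stata
* Simulated person-register records (NOT CMGPD-LN data); all names are illustrative
clear
set seed 12345
set obs 5000
generate age = floor(1 + 70*runiform())
generate male = runiform() < 0.5

* Hypothetical flag: is the immediately succeeding register available?
generate next_available = runiform() < 0.9

* Hypothetical outcome: death within the next three years, risk rising with age
generate died_next3 = runiform() < invlogit(-4 + 0.04*age + 0.2*male)

* Keep only observations whose outcome could actually have been observed
keep if next_available

* One record per person per register, so each model is a discrete-time hazard model
logit died_next3 c.age i.male
cloglog died_next3 c.age i.male
```

With real data the restriction step matters because a death annotated in a register can only be attributed to the preceding interval if that preceding observation is followed directly by the next register.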

CMGPD Processing
- Images scanned from microfilm
- Provided to coders in China
- Coders in China transcribe contents to Excel spreadsheets
  - Copy previous spreadsheet over and update based on contents of new register
- Stata programs import the contents of the spreadsheets and perform error-checking
  - Inconsistencies across registers
- Reports sent to coders for cleaning
  - Original registers coded 'as is', so if an inconsistency is in the original register we leave it
- Stata programs carry out automated linking of kin and generation of variables for analysis

Pre-1789 Format
Feidi Yimiancheng, 1783

Post-1789 Format
Feidi Yimiancheng, 1792

Daoyi 1816 (annotated register image: Illegal Escape; 23 sui; 74 sui; Dead; 42 sui)

Daoyi 1819 (annotated register image: Dead; New arrival)

Using the Data: RECORD_NUMBER
- RECORD_NUMBER identifies the same observation across the different datasets
- Use as the basis for one-to-one merge

    local cmgpd_ln_location "..\CMGPD-LN from ICPSR\ICPSR_27063"
    use "`cmgpd_ln_location'\DS0001\ Data"
    merge 1:1 RECORD_NUMBER using "`cmgpd_ln_location'\DS0003\ Data"

Using the Data: RECORD_NUMBER
- If the merged datasets won't fit into memory, make use of options on use and merge to load specific variables
- The merge key RECORD_NUMBER must be among the variables loaded by use

    use RECORD_NUMBER YEAR SEX using "`cmgpd_ln_location'\DS0001\ Data"
    merge 1:1 RECORD_NUMBER using "`cmgpd_ln_location'\DS0003\ Data", keepusing(NON_HAN_NAME)
    tab YEAR if SEX == 2, sum(NON_HAN_NAME)
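The merge pattern can be tried without the release files. The sketch below builds two tiny made-up datasets as stand-ins for the basic and analytic files (the values are invented, and the tempfile names basic and analytic are hypothetical), merges them one-to-one on RECORD_NUMBER, and checks that every record matched before dropping the _merge indicator.

```stata
* Toy stand-ins for two release files (NOT the real data); values are made up
clear
input RECORD_NUMBER YEAR SEX
1 1789 1
2 1789 2
3 1792 2
end
tempfile basic
save `basic'

clear
input RECORD_NUMBER NON_HAN_NAME
1 0
2 1
3 0
end
tempfile analytic
save `analytic'

* Load the first file, then attach a single variable from the second
use `basic', clear
merge 1:1 RECORD_NUMBER using `analytic', keepusing(NON_HAN_NAME)

* Stop with an error if any record failed to match
assert _merge == 3
drop _merge
```

Checking _merge right after the merge is a cheap safeguard: an unmatched record would otherwise carry missing values silently into later tabulations.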

Using the Data: Missing Values
- Following standard practice, missing values are coded as -98 or -99
  - -98 is structural missing
  - -99 is missing
- These are not the same as Stata missing, so observations will not be excluded automatically
- Especially in regressions, computations of means, etc., either manually exclude these, or recode to force exclusion

    recode ZHI_SHI_REN (-98 -99 = .)

  or

    summ ZHI_SHI_REN if ZHI_SHI_REN != -98 & ZHI_SHI_REN != -99
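As one possible alternative to recode, Stata's mvdecode can map each code to its own extended missing value, so the difference between structural missing and ordinary missing is kept. The sketch below runs on five invented records, not on CMGPD-LN data; only the variable name ZHI_SHI_REN and the codes -98 and -99 come from the slide.

```stata
* Toy data (NOT CMGPD-LN records) with both missing codes mixed in
clear
input RECORD_NUMBER ZHI_SHI_REN
1 0
2 1
3 -98
4 -99
5 0
end

* Naive summary treats -98 and -99 as real values and distorts the mean
summarize ZHI_SHI_REN

* Map -98 to .a (structural missing) and -99 to .b (missing)
mvdecode ZHI_SHI_REN, mv(-98 = .a \ -99 = .b)

* Stata now excludes both automatically: 3 observations remain in the summary
summarize ZHI_SHI_REN
```

After the mvdecode step, commands such as summarize, tabulate and regress skip these observations without an explicit if condition, while missing(ZHI_SHI_REN) still identifies them.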