CMGPD-LN Methodological Lecture Day 1
Why Use Historical Data?
Origins of the CMGPD-LN
Basic Characteristics of the CMGPD-LN

Contemporary Topics
Family contextual effects on individual outcomes
Neighborhood and community context
Life-course processes
– Conditions in childhood
– Long-term effects of socioeconomic status
Economic, climatic and other shocks
Multigenerational processes
– Interactions with stratification and inequality

Limitations of contemporary data
Time depth
– Panel/cohort studies are recent
– Prospective data only for portions of life span
– Exceptions: British Cohort Studies
Family context
– Limited to parents, sometimes siblings
– Typically co-resident
– Exceptions: PSID, WLS

Limitations of contemporary data
Event counts
– When mortality is low, ‘degrees of freedom’ problem in all but the largest datasets
– Difficult to explore complex interactions
Exogenous shocks
– Rare enough that their consequences are studied individually
– Indonesian Tsunami, Hurricane Katrina etc.

Historical population databases
Individual life histories
Prospective
In some cases…
– Multigenerational
– Household and community context
– Kinship
Exogenous shocks: price spikes, climate fluctuations, disease epidemics
High mortality levels
Examples: CMGPD-LN, HSN, PRDH, UAS, UPD

CMGPD-LN
Public release at ICPSR supported by the United States Department of Health and Human Services. National Institutes of Health. Eunice Kennedy Shriver National Institute of Child Health and Human Development (R01 HD A1) with funds from the American Recovery and Reinvestment Act

Recommended acknowledgement
This research made use of the CMGPD-LN dataset. Preparation of the CMGPD-LN and documentation for public release via ICPSR DSDR was supported by United States Department of Health and Human Services National Institutes of Health Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD) R01 HD A1 "Multi-Generational Family and Life History Panel Dataset" with funds from the American Recovery and Reinvestment Act.

History of the CMGPD-LN Early 1980s: Ju Deyuan at the First Historical Archives alerted James Lee to the existence of the registers at the Liaoning Provincial Archives (LPA) in the early James Lee visits LPA three times Lee and Campbell visit LPA 1987 LPA provides Daoyi registers (dataset 1) that become basis of Fate and Fortune

History of the CMGPD-LN
Datasets 3 and 2 obtained from LPA in early nineties and coded
: datasets 4-10
– Datasets became available from the Genealogical Society of Utah
– Data entry carried out in the United States
: datasets
– Data entry carried out in China

Recommended citations
User guide
– Lee, James Z., Cameron Campbell, and Shuang Chen. China Multi-Generational Panel Dataset, Liaoning (CMGPD-LN) User Guide. Ann Arbor, MI: Inter-university Consortium for Political and Social Research.
Dataset
– Lee, James Z., and Cameron D. Campbell. China Multi-Generational Panel Dataset, Liaoning (CMGPD-LN), [Computer file]. ICPSR27063-v5. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], doi: /ICPSR27063

CMGPD-LN Organization of the Release
Basic Dataset (DS-001)
– Identifiers for data management
– Basic kinship and demographic variables
Restricted Dataset (DS-002)
– Names and village locations
Analytic Dataset (DS-003)
– Richer set of socioeconomic status variables
Kinship Dataset (DS-004)
– Ancestry identifiers
– Constructed kin counts

CMGPD-LN Contents
Longitudinal
Individuals and households can be linked from one register to the next
1.5 million observations of 260,000 people
1,051 paternal descent groups identified through record linkage
698 communities
Generational depth generations
(Relatively) Easy to Use
Resembles longitudinally linked censuses
Discrete-time event history (logistic regression etc.)

CMGPD-LN Contents
Demographic outcomes
Mortality
Marriage
Reproduction (based on surviving children)
Migration
Timing of events
Closed, can identify individuals at risk
Health and Disability
In early registers, annotation of specific conditions for adult males.
In later registers, indicator of whether or not disabled for adult males.

CMGPD-LN Contents
Socioeconomic characteristics
Attainment of official position for adult males
Status as an exam candidate, indicative of high education
Given name
Flag variables for types of name
Diminutive, indicative of low status or aspirations
Non-Han, indicative of expressed ethnicity
Pinyin transcriptions in restricted release

CMGPD-LN Contents
Geographic context
Villages distributed across a region the size of New Jersey
Wide variety of economic and ecological contexts
Basic release
Region
Unique village identifier
Restricted release
Geocodes for villages accounting for 95% of population

CMGPD-LN Contents
Household and family context
Household of residence
Relationship to head
Relatives can be linked to reconstruct descent groups
Via automated record linkage based on household relationship and longitudinal linkage of individual records
Kin outside the household
Basic kinship variables, including parent identifiers, and counts of close kin, available now
Additional constructed kinship variables available next year

REGION (approximate)

DISTRICT (approximate)

CMGPD-LN Format
Similar in format to a series of triennial censuses
– Individuals listed in the same order and easy to link across time
Organized by community, kin group, household
Detailed specification of relationship to household head
Events since the previous register are annotated
– Basis for construction of flag variables specifying occurrence of events between current register and the next
Discrete-time event history analysis
– Typically, logistic regression or complementary log-log regression
– Outcome: death in the next three years
Restricting to registers for which the immediately succeeding register is also available
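The discrete-time setup described on this slide can be sketched roughly as follows. This is a minimal illustration on simulated data, not the authors' code: every variable name (person_id, year, age, male, died_next), the register years, and the simulated mortality pattern are assumptions invented for the example and are not CMGPD-LN variables or results.

```stata
* Toy person-register panel. All variable names and values here are
* hypothetical illustrations, not CMGPD-LN variables or results.
clear
set seed 12345
set obs 2000
gen long person_id = _n
gen age_first = 1 + floor(60 * runiform())
gen byte male = runiform() < 0.5

* Four registers per person; pretend the 1810 register was lost,
* so the available registers are 1801, 1804, 1807 and 1813.
expand 4
bysort person_id: gen int year = 1801 + 3 * (_n - 1)
replace year = 1813 if year == 1810
gen age = age_first + (year - 1801)

* Simulated outcome: death recorded by the next register.
gen byte died_next = runiform() < invlogit(-3 + 0.03 * age)

* Drop person-registers that would follow a simulated death.
bysort person_id (year): gen prior_deaths = sum(died_next) - died_next
drop if prior_deaths > 0

* Keep only registers whose immediate successor (three years on) exists.
keep if inlist(year + 3, 1801, 1804, 1807, 1813)

* Discrete-time hazard models; standard errors clustered on person.
logit died_next age male, vce(cluster person_id)
cloglog died_next age male, vce(cluster person_id)
```

In this toy setup only the 1801 and 1804 registers survive the restriction, because 1807 and 1813 have no register three years later. Clustering on the person identifier is one common way to allow for repeated observations of the same individual; it is a choice made for the sketch, not something the slides prescribe.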

CMGPD Processing
Images scanned from microfilm
Provided to coders in China
Coders in China transcribe contents to Excel spreadsheets
– Copy previous spreadsheet over and update based on contents of new register
STATA programs import the contents of the spreadsheets and perform error-checking
– Inconsistencies across registers
Reports sent to coders for cleaning
– Original registers coded ‘as is’, so if an inconsistency is in the original register we leave it
STATA programs carry out automated linking of kin and generation of variables for analysis

Pre-1789 Format
Feidi Yimiancheng, 1783

Post-1789 Format
Feidi Yimiancheng, 1792

Daoyi 1816 (register image; annotations: Illegal Escape; 23 sui; 74 sui; Dead; 42 sui)

Daoyi 1819 (register image; annotations: Dead; New arrival)

Using the Data
RECORD_NUMBER
RECORD_NUMBER identifies the same observation across the different datasets
Use as the basis for one-to-one merge
local cmgpd_ln_location "..\CMGPD-LN from ICPSR\ICPSR_27063"
use "`cmgpd_ln_location'\DS0001\ Data"
merge 1:1 RECORD_NUMBER using "`cmgpd_ln_location'\DS0003\ Data"

Using the Data
RECORD_NUMBER
If the merged datasets won’t fit into memory, use the options on use and merge to load only specific variables. The merge key, RECORD_NUMBER, must be among the variables loaded.
use RECORD_NUMBER YEAR SEX using "`cmgpd_ln_location'\DS0001\ Data"
merge 1:1 RECORD_NUMBER using "`cmgpd_ln_location'\DS0003\ Data", keepusing(NON_HAN_NAME)
tab YEAR if SEX == 2, sum(NON_HAN_NAME)
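The selective load-and-merge pattern can be tried without the ICPSR files. The sketch below builds two tiny stand-in datasets in temporary files; the variable names come from the slides, but every value is invented for illustration and says nothing about how the real variables are coded. It also adds two checks the slides do not mention: isid, to confirm the key is unique before merging, and an assert on _merge, to confirm every record matched.

```stata
* Stand-in for the basic dataset (values are invented).
clear
input long RECORD_NUMBER int YEAR byte SEX
1 1789 1
2 1789 2
3 1792 2
end
isid RECORD_NUMBER              // key must uniquely identify observations
tempfile basic
save `basic'

* Stand-in for the analytic dataset (values are invented).
clear
input long RECORD_NUMBER byte NON_HAN_NAME
1 0
2 1
3 0
end
isid RECORD_NUMBER
tempfile analytic
save `analytic'

* Load only the variables needed, then bring in one variable from the other file.
use RECORD_NUMBER YEAR SEX using `basic', clear
merge 1:1 RECORD_NUMBER using `analytic', keepusing(NON_HAN_NAME)
assert _merge == 3              // every record found in both files
drop _merge
tab YEAR if SEX == 2, sum(NON_HAN_NAME)
```

Dropping _merge after checking it keeps the variable from blocking a later merge with another dataset in the release.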

Using the Data
Missing Values
Following standard practice, missing values are coded as -98 or -99
– -98 is structural missing
– -99 is missing
These are not the same as STATA missing, so observations will not be excluded automatically
Especially in regressions, computations of means, etc., either manually exclude these, or recode to force exclusion
– recode ZHI_SHI_REN (-98 -99 = .)
or
– summ ZHI_SHI_REN if ZHI_SHI_REN != -98 & ZHI_SHI_REN != -99
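A small worked example shows how much the -98 and -99 codes can distort a summary statistic. The sketch uses five invented values for ZHI_SHI_REN (the real coding of that variable is not assumed here) and uses mvdecode, an alternative to the recode on the slide, which maps the two codes to the extended missing values .a and .b so that the distinction between structural missing and missing is preserved. The choice of .a and .b is an assumption made for this illustration.

```stata
* Five invented values; -98 and -99 are the dataset's missing codes.
clear
input int ZHI_SHI_REN
0
1
-98
-99
1
end

* Naive mean treats the codes as real numbers: (0 + 1 - 98 - 99 + 1) / 5 = -39
summarize ZHI_SHI_REN
scalar naive_mean = r(mean)

* Convert the codes to Stata extended missing values, keeping them distinct.
mvdecode ZHI_SHI_REN, mv(-98 = .a \ -99 = .b)

* Now only the three valid observations are used.
summarize ZHI_SHI_REN
count if ZHI_SHI_REN == .a      // structural missing
count if ZHI_SHI_REN == .b      // missing
```

One caution once the codes are Stata missing values: Stata sorts missing values above every number, so a condition such as ZHI_SHI_REN > 0 is true for missing observations. Adding & !missing(ZHI_SHI_REN) to such conditions avoids picking them up.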