Deriving and Modelling Fertility Variables in the NCDS and BCS70 Dylan Kneale, Institute of Education Supervisors: Professor Heather Joshi & Dr Jane Elliott.



Pathways to Parenthood: Exploring the Influence of Context as a Predictor of Timing to Parenthood
Overall aims:
1. Define early parenthood (teenage?)
2. Explore the strength of different 'known' sets of predictors of early parenthood
3. Explore the influence of context as a predictor of early parenthood
4. Explore the influence of context on the other side of the spectrum: postponement and childlessness

Data collection sweeps:
NCDS: 1958 (birth), 1965 (age 7), 1969 (age 11), 1974 (age 16), 1981 (age 23), 1991 (age 33), 2000 (age 42), 2004 (age 46)
BCS70: 1970 (birth), 1975 (age 5), 1980 (age 10), 1986 (age 16), 1996 (age 26), 2000 (age 30), 2004 (age 34)
i. Deriving fertility variables (NCDS)
ii. Modelling fertility variables (NCDS & BCS70)

Deriving fertility variables (NCDS) I
- Fertility variables were collected at all sweeps since childhood (ages 23, 33, 41/42 and 46 years).
- The 2004 sweep allows analysis of the full fertility schedule, adding to previous analyses of the NCDS cohort, e.g. Holdsworth & Elliott (2001).
- Want to create variables for Event History Analysis.
- A first attempt at a variable for modelling entry into parenthood in event history terms could work as:
  Time to first parenthood (event) = minimum recorded child's date of birth (ages 23, 33, 41/42, 46 years)
  Childless cohort members (censored) = maximum recorded interview date (ages 23, 33, 41/42, 46 years)
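A minimal sketch of this first-attempt derivation, assuming each cohort member's child dates of birth and interview dates have already been pooled across sweeps. The function name, argument names and all dates are hypothetical illustrations, not NCDS variables.

```python
from datetime import date

def first_parenthood_record(birth_date, child_dobs, interview_dates):
    """Return (duration_in_years, event) for one cohort member.

    event = 1: duration runs to the earliest recorded child's date of birth.
    event = 0: no child recorded, so duration runs to the latest interview date.
    """
    if child_dobs:
        end, event = min(child_dobs), 1
    else:
        end, event = max(interview_dates), 0
    return (end - birth_date).days / 365.25, event

# Hypothetical cohort members (dates invented for illustration).
born = date(1958, 3, 5)
parent = first_parenthood_record(
    born,
    child_dobs=[date(1987, 6, 1), date(1984, 9, 5)],
    interview_dates=[date(1981, 8, 1), date(1991, 7, 1)],
)
childless = first_parenthood_record(
    born,
    child_dobs=[],
    interview_dates=[date(1981, 8, 1), date(2004, 10, 1)],
)
print(parent, childless)
```

The rule takes every recorded date at face value, so it is only as good as the completeness of the fertility histories feeding it.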

Deriving fertility variables (NCDS) II
Using this method produces the following summary Kaplan-Meier (KM) statistics:

      Median age at 1st parenthood    % childless at last observation (46)
♂     30.6 years                      26.7%
♀     27.0 years                      20.4%

- Median estimates of entry to parenthood are higher than those from other sources for the NCDS.
- Of more concern, estimates of childlessness using data up to 46 years do not differ significantly from those up to 33 years: the equivalent of only an additional 3.3% of women becoming mothers (Holdsworth & Elliott 2001).
- ONS estimates put the transition between 33 and 46 years at twice this rate.
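For readers unfamiliar with how such KM summaries are obtained, the following is a small pure-Python sketch of the Kaplan-Meier estimator. The median is read off as the first time the survival curve falls to 0.5 or below. The five durations are invented and the function names are illustrative; this is not the code used to produce the figures above.

```python
def kaplan_meier(durations, events):
    """Return [(time, S(time))] at each distinct event time."""
    data = sorted(zip(durations, events))
    at_risk = len(data)
    surv, curve = 1.0, []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = n_leaving = 0
        # Group everyone whose duration ends at time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            surv *= 1 - n_events / at_risk
            curve.append((t, surv))
        at_risk -= n_leaving
    return curve

def km_median(curve):
    """First time at which the survival estimate is 0.5 or lower."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached

# Hypothetical ages at first birth (event=1) or at last interview (event=0).
ages = [22, 25, 27, 30, 46]
events = [1, 1, 1, 1, 0]
curve = kaplan_meier(ages, events)
median_age = km_median(curve)
still_childless = curve[-1][1]  # survival estimate at the last event time
print(median_age, still_childless)
```

In this framing "survival" means remaining childless, so the final value of the curve corresponds to the "% childless at last observation" column.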

Deriving fertility variables (NCDS) III
Possible discrepancy at age 41/42 years: only a partially complete fertility history was collected.
- The ^ symbol reflects a 'text fill', used in CAPI questionnaires to tailor the questionnaire to the respondent.
- "Since 1991" was meant to apply to all those present at age 33 years, but not to those missing at that sweep.
- Is it possible that this filter was also used for those rejoining the study at age 41/42 and 46 years, when it was not needed?
- Build up evidence for this:

Deriving fertility variables (NCDS) IV
Evidence 1: Those rejoining the study had a lower number of births recorded before 1991 than those continuing.
- 6.3% of births recorded at 41/42 years occurred before the previous interview for those continuing in the study
- 3.7% of births for those re-entering the study occurred before 1991
Evidence 2: Those recorded as childless had children according to information from other sources. Of 880 cohort members recorded as childless at 41/42 years and not present at data collection at 33 years:
- 12% had children living elsewhere (natural?)
- Conclusive proof: 44% had natural children over 9 years old living with them in the household

Deriving fertility variables (NCDS) V
Evidence suggests that the filter was applied both to those continuing in the study and to those rejoining it. In that case, fertility histories that do not include age 33 years may have to be capped or excluded:

Present at ages (years)      Number    Truncation/adjustment
23, 33, 41-42, 46              7138    Censored at 46 years
33, 41-42, 46                   947    Censored at 46 years
41-42, 46                       294    Not used
23, 46                          104    Censored at 23 years
23, [41-42]                     383    Censored at 23 years
23, 33                          887    Censored at 33 years
23, 33, [41-42]                1444    Censored at 42 years
33, 46                           63    Censored at 46 years
23                             1591    Censored at 23 years
33                              310    Censored at 33 years
[41-42]                         203    Not used
33, [41-42]                     320    Censored at 42 years
23, 33, 46                      298    Censored at 46 years
23, 41-42, 46                   690    Censored at 23 years
Total potentially included    14672
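The truncation pattern in the table appears to follow a simple rule. If the age 33 sweep is present, the history is usable up to the latest sweep attended. If it is absent, the history is usable only up to age 23, or not at all when age 23 is also missing. The sketch below encodes that rule as inferred from the table, not as the author's own code. The function name is hypothetical and the 41/42 sweep is coded as 42.

```python
def truncation_age(present):
    """Age at which a fertility history is censored, or None if it is not used.

    `present` is the set of adult sweeps attended, coded as 23, 33, 42, 46.
    Inferred rule: the age 33 sweep anchors the later 'since 1991' histories.
    """
    present = set(present)
    if 33 in present:
        return max(present)   # complete up to the last sweep attended
    if 23 in present:
        return 23             # later sweeps cannot be trusted without age 33
    return None               # no usable fertility history

# A few participation patterns from the table.
for pattern in [(23, 33, 42, 46), (23, 42, 46), (33, 46), (42, 46)]:
    print(pattern, truncation_age(pattern))
```

Encoding the rule once and applying it to every participation pattern avoids hand-editing censoring dates for each group.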

Deriving fertility variables (NCDS) VI
The new method gives the following summary KM statistics:

      Median age at 1st parenthood    % childless at last observation (46)
♂     29.4 years                      20.7%
♀     26.5 years                      15.6%

- More importantly, those detected as possible parents at a later wave of data collection, but with no accurate fertility history, are censored at an earlier point. This applies to 430+ cohort members.
- This has implications for the whole fertility schedule for men and women.
- This factor could be responsible for the inflated estimates of childlessness among NCDS cohort members found in other sources.
- The method used here results in a slightly smaller sample, but one that errs on the side of caution.

Modelling fertility variables (NCDS & BCS70)
- The highest hazard of entry into parenthood is reached earlier among the NCDS cohort than among the BCS70 cohort.
- The hazard for the NCDS has an inverse bathtub shape. For BCS70, the shape is a little more variable.
- However, this applies to the whole distribution. My particular interest is entry to early parenthood.

Modelling fertility (NCDS & BCS70) I
Strategies for event history modelling using parenthood data:
- Have continuous (as opposed to discrete) data: the first stage in guiding model selection.
- Began with a Cox Proportional Hazards Model, as I find it intuitive and easier to compute and interpret. The same model can also be used when the hazard differs between datasets, as no assumption is made about its shape.
- Basic model: at each event time, the model is estimated by comparing the characteristics of the individual experiencing the event with those of the individuals who remain in the risk set.
- Used Tenure as an example to assess suitability. Tenure is a universal predictor in other models.
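The model equation itself did not survive in this transcript. In its standard textbook form the Cox model is h(t | x) = h0(t) * exp(beta * x), and the risk-set comparison described above is the partial likelihood. A minimal sketch for a single covariate follows, with invented data and a hypothetical 0/1 council tenure indicator. Tied event times are handled as in the Breslow approximation.

```python
import math

def cox_partial_loglik(beta, durations, events, x):
    """Partial log-likelihood for h(t | x) = h0(t) * exp(beta * x)."""
    total = 0.0
    for i, (t_i, d_i) in enumerate(zip(durations, events)):
        if not d_i:
            continue  # censored cases only contribute through risk sets
        # Risk set: everyone still under observation at time t_i.
        risk = sum(
            math.exp(beta * x[j])
            for j, t_j in enumerate(durations)
            if t_j >= t_i
        )
        total += beta * x[i] - math.log(risk)
    return total

# Hypothetical ages at first birth or censoring, with a council tenure flag.
ages = [19, 21, 22, 24, 26]
events = [1, 1, 0, 1, 0]
council = [1, 1, 0, 0, 0]

ll_null = cox_partial_loglik(0.0, ages, events, council)
ll_one = cox_partial_loglik(1.0, ages, events, council)
print(ll_null, ll_one)
```

The baseline hazard h0(t) cancels out of every term, which is why no assumption about its shape is needed.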

Modelling fertility (NCDS & BCS70) II
- A fundamental assumption of the Cox Proportional Hazards Model is that hazards remain proportional throughout the observation period.
- Assessed the validity of this assumption graphically and through a statistical test. There are numerous ways of assessing it graphically.
- Under the PH assumption, the difference in cumulative hazard between groups varies in absolute terms, but the difference in log cumulative hazard should remain constant, with no systematic variation over time (Singer and Willett 2003).
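The graphical check rests on a simple identity: if H1(t) = exp(beta) * H0(t), then log H1(t) - log H0(t) = beta at every t. The sketch below illustrates this numerically with a Weibull-shaped baseline cumulative hazard. The baseline, its parameters and the value of beta are arbitrary assumptions chosen only for illustration.

```python
import math

def weibull_cum_hazard(t, scale, shape):
    """Cumulative hazard H(t) = (t / scale) ** shape."""
    return (t / scale) ** shape

beta = 0.5                    # hypothetical log hazard ratio
times = [2.0, 4.0, 6.0, 8.0]  # hypothetical years of exposure

rows = []
for t in times:
    h0 = weibull_cum_hazard(t, scale=10.0, shape=1.5)
    h1 = math.exp(beta) * h0  # proportional hazards by construction
    rows.append((t, h1 - h0, math.log(h1) - math.log(h0)))

for t, abs_gap, log_gap in rows:
    print(f"t={t:4.1f}  gap in H={abs_gap:.4f}  gap in log H={log_gap:.4f}")
```

In real data the two log cumulative hazard curves are estimated separately, and a gap that widens or narrows systematically over time signals non-proportionality.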

Modelling fertility (NCDS & BCS70) III
Also tested the PH assumption through the Schoenfeld residual test, which examines departure from 0. A significant difference suggests the hazards are not proportional, e.g.:

BCS70 female model (entry up to 23 yrs)
Tenure             ρ      χ²     p-value
Owner Occ          -      -      -
Council
Private/Oth
Full Model Test

Possible solutions:
1. Limit observation time
2. Consider using time varying covariates
3. Consider using a different model

Modelling fertility (NCDS & BCS70) IV
- Limiting the observation time to a narrower range of years did work, but it goes against the message and evidence presented in the rest of the thesis.
- Using a time varying covariate is acceptable for Tenure, as the data support this. However, it may be a poorer strategy in terms of data for larger models, and it is computationally difficult.

Modelling fertility (NCDS & BCS70) V
- Stratification is not really an option with my data: numerous factors are known to predict early parenthood, and stratifying on them would potentially split the sample.
- Tried interacting Tenure with time to make the model explicitly non-proportional. The interaction terms are significant.
- However, in extended models, interacting time with covariates will be computationally difficult and also difficult to interpret.
- Use the AIC from these models to compare with other modelling strategies.

Modelling fertility (NCDS & BCS70) VI
Alternative modelling strategies. Want an alternative that:
- Can be used for both genders and both cohorts
- Can deal with hazards that are known not to be monotonic
Can rule out PH models. Can also rule out models that allow only monotonic hazards: the Weibull, Gompertz and Exponential distributions (Wu and Chuang 2002).
Left with 3 types of Accelerated Failure Time Models: Gamma, Log-logistic and Lognormal models.
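To illustrate why the monotonic families are ruled out, the sketch below evaluates a Weibull hazard and a lognormal hazard on a grid and checks each for monotonicity. The parameter values are arbitrary assumptions, not estimates from either cohort.

```python
import math

def weibull_hazard(t, scale, shape):
    """h(t) = (shape / scale) * (t / scale) ** (shape - 1)."""
    return (shape / scale) * (t / scale) ** (shape - 1)

def lognormal_hazard(t, mu, sigma):
    """h(t) = f(t) / S(t) for log T ~ Normal(mu, sigma^2)."""
    z = (math.log(t) - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / (t * sigma * math.sqrt(2 * math.pi))
    surv = 0.5 * math.erfc(z / math.sqrt(2))
    return pdf / surv

def is_monotonic(values):
    diffs = [b - a for a, b in zip(values, values[1:])]
    return all(d >= 0 for d in diffs) or all(d <= 0 for d in diffs)

times = [0.5 * k for k in range(1, 61)]  # 0.5 to 30 years of exposure
weibull = [weibull_hazard(t, scale=10.0, shape=1.5) for t in times]
lognormal = [lognormal_hazard(t, mu=2.0, sigma=0.8) for t in times]

print("Weibull monotonic:  ", is_monotonic(weibull))
print("Lognormal monotonic:", is_monotonic(lognormal))
print("Lognormal peak at t =", times[lognormal.index(max(lognormal))])
```

A Weibull hazard can only rise, fall or stay flat, whereas the lognormal hazard rises to a peak and then declines, which is closer to the pattern of entry into parenthood described earlier.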

Modelling fertility (NCDS & BCS70) VII
- Accelerated Failure Time Models are analogous to a simple linear model: they do not model the hazard directly but model survival time.
- The specification (distribution) of the δ term and intercept distinguishes between models: it follows the normal, logistic or gamma distribution.
- Test these models using tenure and compare the results using the AIC (Akaike's Information Criterion) to find the best fit.
- For the NCDS, all three models produced very similar results, with little differentiation in either parameter values or model fitting statistics, as in other studies (Kwong and Hutton 2003; Cleves, Gould et al. 2004; Ghilagaber 2005).
- AIC estimates for the three distributions are all similar, and all substantially lower than the AIC for the best fitting Cox model constructed (including the time interacted model).
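As a rough sketch of what is being compared, a lognormal AFT model can be written as log T = b0 + b1 * x + sigma * e, with e standard normal, and the AIC as 2k - 2 log L. The code below evaluates the right-censored log-likelihood at given parameter values and does no fitting. The parameterisation, the function names and the data are assumptions made for illustration.

```python
import math

def lognormal_aft_loglik(params, durations, events, x):
    """Right-censored log-likelihood for log T = b0 + b1 * x + sigma * e."""
    b0, b1, sigma = params
    total = 0.0
    for t, d, xi in zip(durations, events, x):
        z = (math.log(t) - b0 - b1 * xi) / sigma
        if d:
            # Observed event: log density of T at t.
            total += -0.5 * z * z - math.log(t * sigma * math.sqrt(2 * math.pi))
        else:
            # Censored: log probability of surviving beyond t.
            total += math.log(0.5 * math.erfc(z / math.sqrt(2)))
    return total

def aic(loglik, n_params):
    """Akaike's Information Criterion: lower values indicate better fit."""
    return 2 * n_params - 2 * loglik

# Hypothetical years from age 16 to first birth, with a council tenure flag.
years = [3.0, 5.0, 6.0, 8.0, 10.0]
events = [1, 1, 0, 1, 0]
council = [1, 1, 0, 0, 0]

ll = lognormal_aft_loglik((2.0, -0.1, 0.5), years, events, council)
print(ll, aic(ll, n_params=3))
```

Swapping the normal error for a logistic or generalised gamma error gives the other two candidate models, each with its own log-likelihood and AIC.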

                      Lognormal             Log-logistic          Generalised Gamma
                      ♂          ♀          ♂          ♀          ♂          ♀
Baseline: Owner Occupation
Council               -0.097**   -0.154**   -0.093**              **         -0.152**
Private               -0.038*    -0.106**   -0.038*               *          -0.106**
Tied and Other        -0.083**   -0.072**   -0.083**              **         -0.072**
Log-Likelihood
Akaike Information Criteria

Modelling fertility (NCDS & BCS70) X
- When examining differences in AIC, the Log-logistic model consistently gives marginally poorer fit, leaving the choice between the Gamma and Lognormal models.
- The Gamma model is particularly suitable for a "bath-tub" shaped distribution and is often used in demography for modelling mortality. The inverse shape is suitable for fertility and would suit modelling the whole NCDS fertility distribution.
- However, as I am modelling early fertility, the Gamma model is not as suitable: it tries to model a concave shape when one is not always present.
- Therefore using Lognormal models to model entry into first parenthood in early adulthood.

Modelling fertility (NCDS & BCS70) XI
Univariate results (time ratios):

                  NCDS                  BCS70
                  ♂          ♀          ♂          ♀
Baseline (Owner Occupation)
Council           0.907**    0.858**    0.849**    0.821**
Private           0.963*     0.900**               *
Tied and Other    0.921**    0.930**
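A time ratio in an AFT model is the exponentiated coefficient, exp(b). As a quick check, exponentiating the lognormal coefficients tabulated earlier reproduces the NCDS time ratios above to within rounding. This suggests, though the transcript does not say so, that the earlier coefficients are the NCDS estimates. The dictionary names below are illustrative.

```python
import math

# Lognormal coefficients (men, women) as tabulated earlier in the slides.
coefficients = {
    "Council": (-0.097, -0.154),
    "Private": (-0.038, -0.106),
    "Tied and Other": (-0.083, -0.072),
}
# NCDS time ratios (men, women) as reported on this slide.
reported = {
    "Council": (0.907, 0.858),
    "Private": (0.963, 0.900),
    "Tied and Other": (0.921, 0.930),
}

time_ratios = {
    tenure: tuple(math.exp(b) for b in pair)
    for tenure, pair in coefficients.items()
}

for tenure, pair in time_ratios.items():
    men, women = pair
    print(f"{tenure:15s} men {men:.3f}  women {women:.3f}  reported {reported[tenure]}")
```

A time ratio below 1 means a shorter expected time to first parenthood than the owner occupation baseline. A ratio of about 0.91, for example, corresponds to a time roughly 9% shorter.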

Cohort / Definition, by Tenure (baseline: Owner Occupation Only)
Tenure columns: Mixed Owner Occupation Tenure; Only Council Tenure; Some Council, no owner occupation tenure; Other

NCDS
Early Fatherhood: ** 2.172** 1.527
Very Early Fatherhood: 1.493* 1.522**
Teenage Fatherhood: Not significant in full model
Lognormal time to first fatherhood (16-23): ** 0.947* 0.954*
Lognormal time to first fatherhood (16-30): 0.960** 0.961** **

BCS70
Early Fatherhood: **
Very Early Fatherhood: *
Teenage Fatherhood: Not significant in full model
Lognormal time to first fatherhood (16-23): Not significant in full model
Lognormal time to first fatherhood (16-30): *

** p < 0.01; * p < 0.05

Conclusions: challenges I found when modelling fertility
- When deriving NCDS fertility variables, need to acknowledge that participation at Wave 5 (age 33 years) is crucial in determining inclusion criteria.
- Failure to adjust for this leads to a modest change in median survival time and larger changes in estimates of childlessness. CAPI filters?
- The traditional Cox model was not suited to my data, even after allowing for time varying covariates etc.
- The final choice was between Gamma and Lognormal Accelerated Failure Time models: Gamma is more suitable for the whole distribution, Lognormal for early parenthood.