Judith D. Singer & John B. Willett Harvard Graduate School of Education Modeling discontinuous and nonlinear change ALDA, Chapter Six “Things have changed”

Slides:



Advertisements
Similar presentations
Judith D. Singer & John B. Willett Harvard Graduate School of Education Discrete-time survival analysis ALDA, Chapters 10, 11, and 12 Times change, and.
Advertisements

Correlation and regression
Economics 20 - Prof. Anderson1 Multiple Regression Analysis y =  0 +  1 x 1 +  2 x  k x k + u 7. Specification and Data Problems.
Polynomial Regression and Transformations STA 671 Summer 2008.
This PowerPoint presentation shows you how to use the NRM 1.0.xls Excel Workbook to fit several popular regression models to experimental data. The models.
Probability Distributions CSLU 2850.Lo1 Spring 2008 Cameron McInally Fordham University May contain work from the Creative Commons.
11 Simple Linear Regression and Correlation CHAPTER OUTLINE
Multiple Regression Fenster Today we start on the last part of the course: multivariate analysis. Up to now we have been concerned with testing the significance.
Lecture 8 Relationships between Scale variables: Regression Analysis
John B. Willett & Judith D. Singer Harvard Graduate School of Education Introducing discrete-time survival analysis ALDA, Chapter Eleven “To exist is to.
1 Module II Lecture 3: Misspecification: Non-linearities Graduate School Quantitative Research Methods Gwilym Pryce.
Longitudinal Experiments Larry V. Hedges Northwestern University Prepared for the IES Summer Research Training Institute July 28, 2010.
1Prof. Dr. Rainer Stachuletz Multiple Regression Analysis y =  0 +  1 x 1 +  2 x  k x k + u 7. Specification and Data Problems.
Statistics for Managers Using Microsoft® Excel 5th Edition
Stat 112: Lecture 10 Notes Fitting Curvilinear Relationships –Polynomial Regression (Ch ) –Transformations (Ch ) Schedule: –Homework.
Statistics for Managers Using Microsoft® Excel 5th Edition
Econ 140 Lecture 131 Multiple Regression Models Lecture 13.
John B. Willett & Judith D. Singer Harvard Graduate School of Education Introducing the Multilevel Model for Change: ALDA, Chapter Three “When you’re finished.
Lecture 19: Tues., Nov. 11th R-squared (8.6.1) Review
Multiple Regression Models
Longitudinal Research: Present status and future prospects John B. Willett & Judith D. Singer Harvard University Graduate School of Education Contact us.
The Analysis of Variance
Basic Business Statistics, 11e © 2009 Prentice-Hall, Inc. Chap 15-1 Chapter 15 Multiple Regression Model Building Basic Business Statistics 11 th Edition.
Spreadsheet Problem Solving
10. Introduction to Multivariate Relationships Bivariate analyses are informative, but we usually need to take into account many variables. Many explanatory.
Unit 4c: Taxonomies of Logistic Regression Models © Andrew Ho, Harvard Graduate School of EducationUnit 4c – Slide 1
Copyright ©2011 Pearson Education 15-1 Chapter 15 Multiple Regression Model Building Statistics for Managers using Microsoft Excel 6 th Global Edition.
1 1 Slide © 2008 Thomson South-Western. All Rights Reserved Slides by JOHN LOUCKS & Updated by SPIROS VELIANITIS.
Unit 4c: Taxonomies of Logistic Regression Models © Andrew Ho, Harvard Graduate School of EducationUnit 4c – Slide 1
Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall 15-1 Chapter 15 Multiple Regression Model Building Statistics for Managers using Microsoft.
© Willett & Singer, Harvard University Graduate School of Education S077/Week #4– Slide 1 S077: Applied Longitudinal Data Analysis Week #4: What Are The.
BPS - 3rd Ed. Chapter 211 Inference for Regression.
Inference for Linear Regression Conditions for Regression Inference: Suppose we have n observations on an explanatory variable x and a response variable.
1 G Lect 6M Comparing two coefficients within a regression equation Analysis of sets of variables: partitioning the sums of squares Polynomial curve.
So far... We have been estimating differences caused by application of various treatments, and determining the probability that an observed difference.
1 1 Slide © 2014 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole.
1 Psych 5510/6510 Chapter 10. Interactions and Polynomial Regression: Models with Products of Continuous Predictors Spring, 2009.
© Willett & Singer, Harvard University Graduate School of EducationS077/Week #3– Slide 1 S077: Applied Longitudinal Data Analysis Week #3: What Topics.
© Willett & Singer, Harvard University Graduate School of EducationS077/Week#2– Slide 1 S077: Applied Longitudinal Data Analysis Week #2: What Topics Are.
1 G Lect 7M Statistical power for regression Statistical interaction G Multiple Regression Week 7 (Monday)
MGS3100_04.ppt/Sep 29, 2015/Page 1 Georgia State University - Confidential MGS 3100 Business Analysis Regression Sep 29 and 30, 2015.
+ Chapter 12: More About Regression Section 12.1 Inference for Linear Regression.
1 Chapter 10: Introduction to Inference. 2 Inference Inference is the statistical process by which we use information collected from a sample to infer.
Time series Model assessment. Tourist arrivals to NZ Period is quarterly.
1 11 Simple Linear Regression and Correlation 11-1 Empirical Models 11-2 Simple Linear Regression 11-3 Properties of the Least Squares Estimators 11-4.
Sample Size Considerations for Answering Quantitative Research Questions Lunch & Learn May 15, 2013 M Boyle.
© Willett & Singer, Harvard University Graduate School of Education S077/Week #5– Slide 1 S077: Applied Longitudinal Data Analysis Week #5: What Are The.
Stat 112 Notes 10 Today: –Fitting Curvilinear Relationships (Chapter 5) Homework 3 due Thursday.
Overview of Regression Analysis. Conditional Mean We all know what a mean or average is. E.g. The mean annual earnings for year old working males.
SEM Basics 2 Byrne Chapter 2 Kline pg 7-15, 50-51, ,
Chapter 14: Inference for Regression. A brief review of chapter 4... (Regression Analysis: Exploring Association BetweenVariables )  Bi-variate data.
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 14-1 Chapter 14 Multiple Regression Model Building Statistics for Managers.
Correlation They go together like salt and pepper… like oil and vinegar… like bread and butter… etc.
Section Copyright © 2014, 2012, 2010 Pearson Education, Inc. Chapter 10 Correlation and Regression 10-2 Correlation 10-3 Regression.
Chapter 5 Multilevel Models
Basic Business Statistics, 10e © 2006 Prentice-Hall, Inc. Chap 15-1 Chapter 15 Multiple Regression Model Building Basic Business Statistics 10 th Edition.
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 14-1 Chapter 14 Multiple Regression Model Building Statistics for Managers.
Biostatistics Regression and Correlation Methods Class #10 April 4, 2000.
BPS - 5th Ed. Chapter 231 Inference for Regression.
F73DA2 INTRODUCTORY DATA ANALYSIS ANALYSIS OF VARIANCE.
Chapter 4 More on Two-Variable Data. Four Corners Play a game of four corners, selecting the corner each time by rolling a die Collect the data in a table.
Yandell – Econ 216 Chap 15-1 Chapter 15 Multiple Regression Model Building.
Stats Methods at IC Lecture 3: Regression.
Chapter 15 Multiple Regression Model Building
Chapter 4 Basic Estimation Techniques
CHAPTER 12 More About Regression
POSC 202A: Lecture Lecture: Substantive Significance, Relationship between Variables 1.
Undergraduated Econometrics
CHAPTER 12 More About Regression
MGS 3100 Business Analysis Regression Feb 18, 2016
Presentation transcript:

Judith D. Singer & John B. Willett Harvard Graduate School of Education Modeling discontinuous and nonlinear change ALDA, Chapter Six “Things have changed” Bob Dylan

Chapter 6: Modeling discontinuous and nonlinear change Discontinuous individual change (§6.1)—especially useful when discrete shocks or time-limited treatments affect the life course Using transformations to model non-linear change (§6.2)—perhaps the easiest way of fitting non-linear change models Can transform either the outcome or TIME We already did this with ALCUSE (which was a square root of a sum of 4 items) Using polynomials of TIME to represent non-linear change (§6.3) While admittedly atheoretical, it’s very easy to do Probably the most popular approach in practice Truly non-linear trajectories (§6.4) Logistic, exponential, and negative exponential models, for example A world of possibilities limited only by your theory (and the quality and amount of data) Discontinuous individual change (§6.1)—especially useful when discrete shocks or time-limited treatments affect the life course Using transformations to model non-linear change (§6.2)—perhaps the easiest way of fitting non-linear change models Can transform either the outcome or TIME We already did this with ALCUSE (which was a square root of a sum of 4 items) Using polynomials of TIME to represent non-linear change (§6.3) While admittedly atheoretical, it’s very easy to do Probably the most popular approach in practice Truly non-linear trajectories (§6.4) Logistic, exponential, and negative exponential models, for example A world of possibilities limited only by your theory (and the quality and amount of data) General idea: All our examples so far have assumed that individual growth is smooth and linear. But the multilevel model for change is much more flexible: © Singer & Willett, page 2

Example for discontinuous individual change: Wage trajectories & the GED Sample: the same 888 male high school dropouts (from before) Research design Each was interviewed between 1 and 13 times after dropping out 34.6% (n=307) earned a GED at some point during data collection OLD research questions How do log(WAGES) change over time? Do the wage trajectories differ by ethnicity and highest grade completed? Additional NEW research questions: What is the effect of GED attainment? Does earning a GED: affect the wage trajectory’s elevation? affect the wage trajectory’s slope? create a discontinuity in the wage trajectory? Sample: the same 888 male high school dropouts (from before) Research design Each was interviewed between 1 and 13 times after dropping out 34.6% (n=307) earned a GED at some point during data collection OLD research questions How do log(WAGES) change over time? Do the wage trajectories differ by ethnicity and highest grade completed? Additional NEW research questions: What is the effect of GED attainment? Does earning a GED: affect the wage trajectory’s elevation? affect the wage trajectory’s slope? create a discontinuity in the wage trajectory? Data source: Murnane, Boudett and Willett (1999), Evaluation Review (ALDA, Section 6.1.1, pp ) © Singer & Willett, page 3

First steps: Think about how GED receipt might affect an individual’s wage trajectory (ALDA, Figure 6.1, p 193) Let’s start by considering four plausible effects of GED receipt by imagining what the wage trajectory might look like for someone who got a GED 3 years after labor force entry (post dropout) How do we model trajectories like these within the context of a linear growth model???  © Singer & Willett, page 4 GED EXPER LNW A: No effect of GED whatsoever B: An immediate shift in elevation; no difference in rate of change D: An immediate shift in rate of change; no difference in elevation F: Immediate shifts in both elevation & rate of change

Including a discontinuity in elevation, not slope (Trajectory B) (ALDA, Section 6.1.1, pp ) Key idea: It’s easy; simply include GED as a time-varying effect at level-1 Pre-GED (GED=0): Post-GED (GED=1): © Singer & Willett, page 5

Including a discontinuity in slope, not elevation (Trajectory D) Using an additional temporal predictor to capture the “extra slope” post-GED receipt (ALDA, Section 6.1.1, pp ) Pre-GED (POSTEXP=0): Post-GED (POSTEXP clocked in same cadence as EXPER): POSTEXP ij = 0 prior to GED POSTEXP ij = “Post GED experience,” a new TV predictor that clocks “TIME since GED receipt” (in the same cadence as EXPER) © Singer & Willett, page 6

Including a discontinuities in both elevation and slope (Trajectory F) Simple idea : Combine the two previous approaches (ALDA, Section 6.1.1, pp ) Pre-GED Post-GED © Singer & Willett, page 7

Many other types of discontinuous individual change trajectories are possible (ALDA, Section 6.1.1, pp ) How do we select among the alternative discontinuous models? Just like a regular regression model, the multilevel model for change can include discontinuities, non- linearities and other ‘non- standard’ terms Generally more limited by data, theory, or both, than by the ability to specify the model  Extra terms in the level-1 model translate into extra parameters to estimate What kinds of other complex trajectories could be used? Effects on elevation and slope can depend upon timing of GED receipt (ALDA pp ) You might have non-linear changes before or after the transition point The effect of GED receipt might be instantaneous but not endure The effect of GED receipt might be delayed Might there be multiple transition points (e.g., on entry in college for GED recipients) Think carefully about what kinds of discontinuities might arise in your substantive context © Singer & Willett, page 8

Let’s start with a “baseline model” (Model A) against which we’ll compare alternative discontinuous trajectories (ALDA, Section 6.1.2, pp ) (UERATE-7) is the local area unemployment rate (added in previous chapter as an example of a TV predictor), centered around 7% for interpretability -7 To appropriately compare this deviance statistic to more complex models, we need to know how many parameters have been estimated to achieve this value of deviance Benchmark against which we’ll evaluate discontinuous models 5 fixed effects 4 random effects © Singer & Willett, page 9

How we’re going to proceed… (ALDA, Section 6.1.2, pp ) © Singer & Willett, page 10 Instead of constructing tables of (seemingly endless) parameter estimates, we’re going to construct a summary table that presents the… Baseline just shown specific terms in the model n parameters (for d.f.) deviance statistic (for model comparison)

First steps: Investigating the discontinuity in elevation by adding the effect of GED (ALDA, Section 6.1.2, pp ) © Singer & Willett, page 11 B: Add GED as both a fixed and random effect (1 extra fixed parameter; 3 extra random)  Deviance=25.0, 4 df, p<.001—keep GED effect C: But does the GED discontinuity vary across people? (do we need to keep the extra VCs for the effect of GED?)  Deviance=12.8, 3 df, p<.01— keep VCs What about the discontinuity in slope?

Next steps: Investigating the discontinuity in slope by adding the effect of POSTEXP (without the GED effect producing a discontinuity in elevation) D: Adding POSTEXP as both a fixed and random effect (1 extra fixed parameter; 3 extra random)  Deviance=13.1, 4 df, p<.05— keep POSTEXP effect E: But does the POSTEXP slope vary across people? (do we need to keep the extra VCs for the effect of POSTEXP?)  Deviance=3.3, 3 df, ns—don’t need the POSTEXP random effects (but in comparison with A still need POSTEXP fixed effect) What if we include both types of discontinuity? (ALDA, Section 6.1.2, pp ) © Singer & Willett, page 12

Examining both discontinuities simultaneously (ALDA, Section 6.1.2, pp ) © Singer & Willett, page 13 F: Add GED and POSTEXP simultaneously (each as both fixed and random effects) comp. with D shows significance of GED comp. with B shows significance of POSTEXP

Can we simplify this model by eliminating the VCs for POSTEXP (G) or GED (H)? We actually fit several other possible models (see ALDA) but F was the best alternative—so…how do we display its results? (ALDA, Section 6.1.2, pp ) © Singer & Willett, page 14 Each results in a worse fit, suggesting that Model F (which includes both random effects) is better (even though Model E suggested we might be able to eliminate the VC for POSTEXP)

Displaying prototypical discontinuous trajectories (Log Wages for HS dropouts pre- and post-GED attainment) (ALDA, Section 6.1.2, pp ) LNW 12 th grade dropouts 9 th grade dropouts Highest grade completed Those who stay longer have higher initial wages This differential remains constant over time White/ Latino Black Race At dropout, no racial differences in wages Racial disparities increase over time because wages for Blacks increase at a slower rate earned a GED GED receipt has two effects Upon GED receipt, wages rise immediately by 4.2% Post-GED receipt, wages rise annually by 5.2% (vs. 4.2% pre- receipt) © Singer & Willett, page 15

Modeling non-linear change using transformations (ALDA, Section 6.2, pp ) When facing obviously non-linear trajectories, we usually begin by trying transformation: A straight line—even on a transformed scale—is a simple form with easily interpretable parameters Since many outcome metrics are ad hoc, transformation to another ad hoc scale may sacrifice little When facing obviously non-linear trajectories, we usually begin by trying transformation: A straight line—even on a transformed scale—is a simple form with easily interpretable parameters Since many outcome metrics are ad hoc, transformation to another ad hoc scale may sacrifice little Earlier, we modeled ALCUSE, an outcome that we formed by taking the square root of the researchers’ original alcohol use measurement We can ‘detransform’ the findings and return to the original scale, by squaring the predicted values of ALCUSE and re- plotting The prototypical individual growth trajectories are now non-linear: By transforming the outcome before analysis, we have effectively modeled non-linear change over time So…how do we know what variable to transform using what transformation? © Singer & Willett, page 16

The “Rule of the Bulge” and the “Ladder of Transformations” Mosteller & Tukey (1977): EDA techniques for straightening lines (ALDA, Section 6.2.1, pp ) Step 1: What kinds of transformations do we consider? Generic variable V expand scale compress scale Step 2: How do we know when to use which transformation? 1.Plot many empirical growth trajectories 2.You find linearizing transformations by moving “up” or “down” in the direction of the “bulge” © Singer & Willett, page 17

The effects of transformation for a single child in the Berkeley Growth Study (ALDA, Section 6.2.1, pp ) Down in TIME Up in IQ How else might we model non-linear change? © Singer & Willett, page 18

Representing individual change using a polynomial function of TIME (ALDA, Section 6.3.1, pp ) Polynomial of the “zero order” (because TIME 0 =1) Like including a constant predictor 1 in the level-1 model Intercept represents vertical elevation Different people can have different elevations Polynomial of the “first order” (because TIME 1 =TIME) Familiar individual growth model Varying intercepts and slopes yield criss-crossing lines Second order polynomial for quadratic change Includes both TIME and TIME 2  0i =intercept, but now both TIME and TIME 2 must be 0  1i =instantaneous rate of change when TIME=0 (there is no longer a constant slope)  2i =curvature parameter; larger its value, more dramatic its effect Peak is called a “stationary point”—a quadratic has 1. Third order polynomial for cubic change Includes TIME, TIME 2 and TIME 3 Can keep on adding powers of TIME Each extra polynomial adds another stationary point—a cubic has 2 © Singer & Willett, page 19

Example for illustrating use of polynomials in TIME to represent change Sample: 45 boys and girls identified in 1 st grade: Goal was to study behavior changes over time (until 6 th grade) Research design At the end of every school year, teachers rated each child’s level of externalizing behavior using Achenbach’s Child Behavior Checklist: 3 point scale (0=rarely/never; 1=sometimes; 2=often) 24 aggressive, disruptive, or delinquent behaviors Outcome: EXTERNAL—ranges from 0 to 68 (simple sum of these scores) Predictor: FEMALE—are there gender differences? Research question How does children’s level of externalizing behavior change over time? Do the trajectories of change differ for boys and girls? Sample: 45 boys and girls identified in 1 st grade: Goal was to study behavior changes over time (until 6 th grade) Research design At the end of every school year, teachers rated each child’s level of externalizing behavior using Achenbach’s Child Behavior Checklist: 3 point scale (0=rarely/never; 1=sometimes; 2=often) 24 aggressive, disruptive, or delinquent behaviors Outcome: EXTERNAL—ranges from 0 to 68 (simple sum of these scores) Predictor: FEMALE—are there gender differences? Research question How does children’s level of externalizing behavior change over time? Do the trajectories of change differ for boys and girls? Source: Margaret Keiley & colleagues (2000), J of Abnormal Child Psychology (ALDA, Section 6.3.2, p. 217) © Singer & Willett, page 20

Selecting a suitable level-1 polynomial trajectory for change Examining empirical growth plots (which invariably display great variability in temporal complexity) (ALDA, Section 6.3.2, pp ) Little change over time (flat line?) Linear decline (at least until 4 th grade) Quadratic change (but with varying curvatures) Two stationary points? (suggests a cubic) Three stationary points? (suggests a quartic!!!) When faced with so many different patterns, how do you select a common polynomial for analysis? © Singer & Willett, page 21

Examining alternative fitted OLS polynomial trajectories Order optimized for each child (solid curves) and a common quartic across children (dashed line) (ALDA, Section 6.3.2, pp ) First impression: Most fitted trajectories provide a reasonable summary for each child’s data Second impression: Maybe these ad hoc decisions aren’t the best? Quadratic? Would a quadratic do? Third realization: We need a common polynomial across all cases (and might the quartic be just too complex)? Using sample data to draw conclusions about the shape of the underlying true trajectories is tricky—let’s compare alternative models  © Singer & Willett, page 22

Using model comparisons to test higher order terms in a polynomial level-1 model (ALDA, Section 6.3.3, pp ) Add polynomial functions of TIME to person period data set Compare goodness of fit (accounting for all the extra parameters that get estimated) A: significant between- and within-child variation B: no fixed effect of TIME but significant var comps  Deviance=18.5, 3df, p<.01 C: no fixed effects of TIME & TIME 2 but significant var comps  Deviance=16.0, 4df, p<.01 D: still no fixed effects for TIME terms, but now VCs are ns also  Deviance=11.1, 5df, ns Quadratic (C) is best choice— and it turns out there are no gender differentials at all. © Singer & Willett, page 23

Example for truly non-linear change Sample: 17 1st and 2nd graders During a 3 week period, Terry repeatedly played a two- person checkerboard game called Fox ‘n Geese, (hopefully) learning from experience Fox is controlled by the experimenter, at one end of the board Children have four geese, that they use to try to trap the fox Great for studying cognitive development because: There exists a strategy that children can learn that will guarantee victory This strategy is not immediately obvious to children Many children can deduce the strategy over time Research design Each child played up to 27 games (each game is a “wave”) The outcome, NMOVES is the number of moves made by the child before making a catastrophic error (guaranteeing defeat)—ranges from 1 to 20 Research question: How does NMOVES change over time? What is the effect of a child’s reading (or cognitive) ability?—READ (score on a standardized reading test) Sample: 17 1st and 2nd graders During a 3 week period, Terry repeatedly played a two- person checkerboard game called Fox ‘n Geese, (hopefully) learning from experience Fox is controlled by the experimenter, at one end of the board Children have four geese, that they use to try to trap the fox Great for studying cognitive development because: There exists a strategy that children can learn that will guarantee victory This strategy is not immediately obvious to children Many children can deduce the strategy over time Research design Each child played up to 27 games (each game is a “wave”) The outcome, NMOVES is the number of moves made by the child before making a catastrophic error (guaranteeing defeat)—ranges from 1 to 20 Research question: How does NMOVES change over time? What is the effect of a child’s reading (or cognitive) ability?—READ (score on a standardized reading test) Data source : Terry Tivnan (1980) Dissertation at Harvard Graduate School of Education (ALDA, Section 6.4.1, pp ) © Singer & Willett, page 24

Selecting a suitable level-1 nonlinear trajectory for change Examining empirical growth plots (and asking what features should the hypothesized model display?) (ALDA, Section 6.4.2, pp ) A lower asymptote, because everyone makes at least 1 move and it takes a while to figure out what’s going on An upper asymptote, because a child can make only a finite # moves each game A smooth curve joining the asymptotes, that initially accelerates and then decelerates These three features suggest a level-1 logistic change trajectory,which unlike our previous growth models will be non-linear in the individual growth parameters © Singer & Willett, page 25

Understanding the logistic individual growth trajectory (which is anything but linear in the individual growth parameters) (ALDA, Section 6.4.2, pp )  0i is related to, and determines, the intercept  1i determines the rapidity with which the trajectory approaches the upper asymptote Upper asymptote in this particular model is constrained to be 20 (1+19) Higher the value of  0i, the lower the intercept When  1i is small, the trajectory rises slowly (often not reaching an asymptote) When  1i is large, the trajectory rises more rapidly Models can be fit in usual way using provided your software can do it  © Singer & Willett, page 26

Results of fitting logistic change trajectories to the Fox ‘n Geese data (ALDA, Section 6.4.2, pp ) Begins low and rises smoothly and non-linearly Not statistically significant (note small n’s), but better READers approach asymptote more rapidly © Singer & Willett, page 27

A limitless array of non-linear trajectories awaits… (each is illustrated in detail in ALDA, Section 6.4.3) (ALDA, Section 6.4.3, pp ) © Singer & Willett, page 28