Published by Teresa Hudson. Modified over 9 years ago.
© Willett, Harvard University Graduate School of Education, 1/19/2016
S052/I.2(a) – Slide 1
S052/I.2(a): Applied Data Analysis
Roadmap of the Course – What Is Today’s Topic Area?

More details can be found in the “Course Objectives and Content” handout on the course webpage.

Multiple Regression Analysis (MRA)

Do your residuals meet the required assumptions?
- Test for residual normality.
- Use influence statistics to detect atypical datapoints.

If your residuals are not independent, replace OLS by GLS regression analysis.
- Use individual growth modeling.
- Specify a multi-level model.

If your sole predictor is continuous, MRA is identical to correlational analysis.
If your sole predictor is dichotomous, MRA is identical to a t-test.
If your several predictors are categorical, MRA is identical to ANOVA.
If time is a predictor, you need discrete-time survival analysis…

If your outcome is categorical, you need to use…
- Binomial logistic regression analysis (dichotomous outcome).
- Multinomial logistic regression analysis (polytomous outcome).

If you have more predictors than you can deal with,
- Create taxonomies of fitted models and compare them.
- Form composites of the indicators of any common construct.
- Conduct a Principal Components Analysis.
- Use Cluster Analysis.

If your outcome vs. predictor relationship is non-linear,
- Use non-linear regression analysis.
- Transform the outcome or predictor.

How do you deal with missing data?

[Roadmap callout: Today’s Topic Area]
S052/I.2(a) – Slide 2
S052/§I.2(a): Dealing “Empirically” With Non-Linear Relationships
Printed Syllabus – What Is Today’s Topic?

Please check inter-connections among the Roadmap, the Daily Topic Area, the Printed Syllabus, and the content of today’s class when you pre-read the day’s materials.

Syllabus Section I.2(a), on Dealing Empirically with Non-Linear Relationships, includes:
- A disaster awaits, if you ignore nonlinearities (Slide 3).
- Introducing the BAYLEY data (Slide 4).
- Detecting nonlinearity graphically (Slides 5-6).
- Two ways to deal with nonlinearity in regression analysis (Slide 7).
- Implementing an “empirical” approach (Slide 8).
- Tukey’s Ladder & the Rule of the Bulge (Slides 9-11).
- Programming transformations in PC-SAS (Slide 12).
- Implementing an empirical approach with the BAYLEY data (Slides 13-18).
- Fine-tuning a transformation (Slides 19-22).
- What does it mean? (Slides 23-25).
- Appendix 1: Why do transformations work? (Slide 23).
S052/I.2(a) – Slide 3
OLS Regression Analysis Automatically Assumes A Linear Outcome/Predictor Relationship!

You may make a big mistake if you fail to check that the outcome/predictor relationship you are investigating with OLS regression analysis is linear … as in the following example from Process/Product research:
- Process: Teacher Behavior (e.g., teacher indirectness).
- Product: Student Learning (e.g., annual class-average change in reading score).

Initial analysis documented in an ERIC report (1966), Prof. Robert Soar:
- Investigated the relationship between student learning and teacher behavior.
- Examined no plots in his data-analysis.
- Out of thousands of results, reported very few statistically significant findings.

Subsequent re-analysis presented in the International Review of Education, 1972, 508-516, this time with graphics:

[Plot: Annual Class-Average Change in Reading versus Teacher Indirectness]
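To make the warning concrete, here is a minimal Python sketch (the deck itself uses PC-SAS). The data are entirely hypothetical, a perfectly symmetric inverted-U invented for illustration, and are not Soar's data. The helper name `pearson_r` is also an assumption of this sketch. It shows how a strong but curved relationship can produce a linear correlation of zero, which is the kind of result that could be misread as "no finding" if no plot is examined.

```python
# Hypothetical data only: a perfectly symmetric inverted-U relationship.
# Nothing here comes from Soar's study; it only illustrates the pitfall.

def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

x = list(range(11))                  # hypothetical predictor: 0, 1, ..., 10
y = [25 - (v - 5) ** 2 for v in x]   # outcome peaks at x = 5, falls off on both sides

r = pearson_r(x, y)
print(f"Linear correlation: {r:.3f}")
```

The outcome is completely determined by the predictor, yet a straight-line summary detects nothing, so a plot is the only safeguard.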
S052/I.2(a) – Slide 4
Data-Example For Use In Today’s Class – One Female From The Berkeley Growth Study?

Another example of a non-linear relationship between outcome and predictor is found in my BAYLEY data …

Dataset: BAYLEY.txt
Overview: IQ as a function of age for a female infant, from birth to age 60 months.
Source: Target child is a female infant (infant #8) from the Berkeley Growth and Guidance Study.
More Info: To learn more about the data, consult:
- The overview of the Oakland and Berkeley Growth and Guidance Studies at the Carolina Population Center.
- Glen Elder’s presentation on “Longitudinal Studies and the Life Course, the 1960s and 1970s,” prepared for the anniversary of the Institute of Human Development, UC Berkeley (2003).
Sample size: One infant, over 21 occasions of measurement.
Last updated: October 6, 2007

Structure of Dataset
Col. #   Variable Name   Variable Description                                         Variable Metric/Labels
1        IQ              Infant’s score on the Bayley Scales of Infant Development   Continuous raw score
2        AGE             Age of infant                                                Months

Data listing (T is age in months):
IQ    T
4     1
10    2
17    3
37    5
65    7
85    9
88    10
95    11
101   12
103   13
107   14
113   15
121   18
148   21
161   24
165   27
187   36
205   42
218   48
218   54
228   60
S052/I.2(a) – Slide 5
How Do You Know When Your Outcome/Predictor Relationship is Non-Linear?

Why not take a look? You can “magnify” any nonlinearity present in an outcome vs. predictor relationship, making it easier to detect, by fitting a linear trend to the potentially curvilinear relationship and examining the raw residuals, as follows:

*-------------------------------------------------------------------------------*
 Input the data, name and label the variables in the dataset
*-------------------------------------------------------------------------------*;
DATA BAYLEY;
   INFILE 'C:\DATA\S052\BAYLEY.txt';
   INPUT IQ AGE;
   LABEL IQ  = 'Bayley Infant IQ Score'
         AGE = 'Age (Months)';

*-------------------------------------------------------------------------------*
 Inspect the plotted bivariate relationship for potential non-linearity
*-------------------------------------------------------------------------------*;
PROC PLOT DATA=BAYLEY;
   PLOT IQ*AGE = '+';

*-------------------------------------------------------------------------------*
 Despite the evident non-linearity, it's often useful to examine the raw residuals
*-------------------------------------------------------------------------------*;
* Fit a linear trend to the scatterplot(!!!) to obtain the raw residuals;
PROC REG DATA=BAYLEY;
   M1: MODEL IQ = AGE;
   * Output the raw residuals to a diagnostic dataset;
   OUTPUT OUT=DIAGNOSE RESIDUAL=RAWRES;

* Don't believe the statistical inference, but plot the raw resids vs. the predictor(s);
PROC PLOT DATA=DIAGNOSE;
   PLOT RAWRES*AGE = '+';
RUN;

Annotations to the program:
- Read in, and label, the BAYLEY data.
- Produce a bivariate plot of outcome versus predictor.
- Regress IQ on AGE, and output the RAWRES into the DIAGNOSE dataset.
- Inspect a bivariate plot of raw residuals versus the predictor.

Data-Analytic Handout I.2(a).1
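For readers without PC-SAS, the same "magnifying" step can be sketched in plain Python. This is only an illustration under stated assumptions: the data are typed in from the Slide 4 listing, and the helper name `ols_fit` is invented here. It fits the (deliberately wrong) straight line by ordinary least squares and lists the raw residuals against age.

```python
# BAYLEY data as listed on Slide 4 (age in months, Bayley raw score).
AGE = [1, 2, 3, 5, 7, 9, 10, 11, 12, 13, 14, 15, 18, 21, 24, 27, 36, 42, 48, 54, 60]
IQ = [4, 10, 17, 37, 65, 85, 88, 95, 101, 103, 107, 113, 121, 148, 161, 165,
      187, 205, 218, 218, 228]

def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Fit the straight line, then compute observed minus fitted (the raw residuals).
intercept, slope = ols_fit(AGE, IQ)
rawres = [iq - (intercept + slope * age) for age, iq in zip(AGE, IQ)]

# A crude text listing in place of PROC PLOT: residual against age.
for age, res in zip(AGE, rawres):
    print(f"{age:3d}  {res:8.2f}")
```

The residuals are negative at the youngest and oldest ages and positive in between, which is the systematic arch that signals a curved relationship.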
S052/I.2(a) – Slide 6
What Is Revealed By Bivariate Plots of Outcome & Raw Residuals vs the Predictor?

[Plot: Bayley IQ versus Age (Months)]

[Plot: Raw Residual versus Age (Months)]

Make sure you don’t believe any of the associated statistical inference (SSE, MSE, R² statistic, standard errors, t-statistics & p-values), as they depend on the residuals, which have been estimated incorrectly because of the failure to model the curvilinear trend appropriately.
S052/I.2(a) – Slide 7
Two Approaches For Dealing With A Failure of the Linearity Assumption

Rational Approach (Next Class!!!! Hard to apply, easy to interpret!!!)
- Use theory, or knowledge of the field, to postulate a non-linear model for the hypothesized relationship between outcome and predictor.
- Use nonlinear regression analysis to fit the postulated trend in the real world, and conduct all of your statistical inference there.
- Interpret the parameter estimates directly, and produce relevant prototypical plots.

Empirical Approach (Easy to apply, hard to interpret!!!)
- Find an ad-hoc transformation of either the outcome or the predictor, or both, that renders their relationship linear.
- Use regular linear regression analysis to fit a linear trend in the transformed world, and conduct all of your statistical inference there.
- De-transform the fitted model to produce plots of the findings for prototypical cases and tell the substantive story in the untransformed world.

Data-Analytic Handout I_2a_2
S052/I.2(a) – Slide 8
Implementing the Empirical Approach – Transformations & Trial and Error?

- What kinds of transformations are available?
- How do you know which transformation to choose?
- How do you know whether to apply the transformation to the outcome or the predictor?

In an infinite universe, there are an infinite number of equally effective transformations…
- Squares, cubes, fourth power, fifth power …
- Square roots, cube roots, fourth roots …
- Logarithms & antilogarithms.
- Inverses …
- Trigonometric functions …
- Hyperbolic functions …
- Any combination of the above …
- Functions humans have yet to conceive of …

Guided Heurism?
S052/I.2(a) – Slide 9
Implementing the Empirical Approach – Tukey’s Ladder of Transformations

Amidst infinite impossibilities, implementing the Empirical Approach is easier if you know two simple rules …

Tukey’s Ladder of Transformations helps you organize the available transformations.

[Figure: Tukey’s Ladder. Labels: UP (Bigger Impact); Middle rung: No transformation (exponent of V = 1); DOWN (Bigger Impact)]
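The rungs of the ladder are not recoverable from the slide image, so the following Python sketch rests on assumptions: it uses the common convention that the rung at power 0 is the logarithm, and that negative powers are negated so the transformed values keep the original ordering (the deck's own SAS code does use `LOG(AGE)` and `-1/AGE`, which is consistent with this). The function name `ladder` is invented for illustration.

```python
import math

def ladder(v, power):
    """Transform a positive value v by one rung of a power ladder.

    power > 1 moves up the ladder, power < 1 moves down.
    Assumed conventions: power 0 is the natural log, and negative
    powers are negated so that larger v still gives a larger result.
    """
    if v <= 0:
        raise ValueError("this sketch only handles positive values")
    if power == 0:
        return math.log(v)
    if power < 0:
        return -(v ** power)
    return v ** power

# Walk down the ladder from the cube to the inverse square.
for p in (3, 2, 1, 0.5, 0, -1, -2):
    print(p, [round(ladder(v, p), 3) for v in (1, 2, 4)])
```

Rungs further from the middle (power 1) bend the scale more strongly, which is the "bigger impact" noted on the figure.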
S052/I.2(a) – Slide 10
Implementing the Empirical Approach – The Rule of the Bulge

The Rule of the Bulge helps you choose a transformation from Tukey’s Ladder.

Just find the quadrant that matches the shape of the curve you wish to linearize, and then follow either of the instructions that border it.

[Figure: Rule of the Bulge, four quadrants of curve shapes. Bordering instructions: Up Ladder on Y; Down Ladder on X; Down Ladder on Y; Up Ladder on X]
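Because the four-quadrant figure did not survive extraction, here is a small Python lookup that encodes the rule as it is usually stated (an assumption of this sketch, not a transcription of the slide): describe which way the curve bulges, and read off the two bordering instructions. The names `RULE_OF_THE_BULGE` and `suggest` are hypothetical.

```python
# Rule of the Bulge as commonly stated: the direction in which the curve
# bulges tells you which way to move Y and/or X on the ladder.
# Keys are (vertical direction, horizontal direction) of the bulge.
RULE_OF_THE_BULGE = {
    ("up", "left"): ("up ladder on Y", "down ladder on X"),
    ("up", "right"): ("up ladder on Y", "up ladder on X"),
    ("down", "left"): ("down ladder on Y", "down ladder on X"),
    ("down", "right"): ("down ladder on Y", "up ladder on X"),
}

def suggest(vertical, horizontal):
    """Return the two candidate instructions for a given bulge direction."""
    return RULE_OF_THE_BULGE[(vertical, horizontal)]

# A curve that rises steeply and then flattens bulges up and to the left.
print(suggest("up", "left"))
```

Either instruction in the returned pair can be tried; the rule narrows the search but the final choice is still made by inspecting plots.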
S052/I.2(a) – Slide 11
Applying Tukey’s Ladder/Rule of the Bulge to the Bayley Data

[Plot: Bayley Infant IQ Score versus Age (Months)]

[Figure: Rule of the Bulge. Bordering instructions: Up Ladder on Y; Down Ladder on X; Down Ladder on Y; Up Ladder on X]
S052/I.2(a) – Slide 12
How Do You Program the Selected Transformations and Plots in PC-SAS?

Let’s try some of them out …

*--------------------------------------------------------------------*
 Input the data, name and label the variables in the dataset
*--------------------------------------------------------------------*;
DATA BAYLEY;
   INFILE 'C:\DATA\S052\BAYLEY.txt';
   INPUT IQ AGE;
   LABEL IQ  = 'Bayley Infant IQ Score'
         AGE = 'Age (Months)';
   * Create Transformations of IQ, going up Tukey's Ladder;
   IQ2   = IQ**2;
   IQ2_5 = IQ**2.5;
   IQ3   = IQ**3;
   IQ3_5 = IQ**3.5;
   * Create Transformations of AGE, going down Tukey's Ladder;
   AGE_RT2   = AGE**(1/2);
   AGE_RT2_5 = AGE**(1/2.5);
   AGE_RT3   = AGE**(1/3);
   AGE_RT3_5 = AGE**(1/3.5);
   Log_AGE   = LOG(AGE);
   Log2_AGE  = LOG2(AGE);
   Inv_AGE   = -1/AGE;
   Inv_AGE2  = -1/(AGE**2);

*---------------------------------------------------------------------*
 Inspect the plotted bivariate relationship for potential non-linearity
*---------------------------------------------------------------------*;
PROC PLOT DATA=BAYLEY;
   * Check out the success of the IQ transforms;
   PLOT (IQ2 IQ2_5 IQ3 IQ3_5)*AGE = '+';
   * Check out the success of the AGE transforms;
   PLOT IQ*(AGE_RT2 AGE_RT2_5 AGE_RT3 AGE_RT3_5) = '+';
   PLOT IQ*(Log_AGE Log2_AGE Inv_AGE Inv_AGE2) = '+';
RUN;

Annotations to the program:
- Transform IQ (“Y”) by going up Tukey’s Ladder, in steps of 0.5.
- Transform AGE (“X”) by going down Tukey’s Ladder, into the roots, in steps of 0.5.
- Transform AGE (“X”) by going down Tukey’s Ladder, into the logs.
- Transform AGE (“X”) by going down Tukey’s Ladder, into the inverses.
- Plotting it all out, for inspection.

Data-Analytic Handout I.2(a).2
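The deck judges each transformation by eye from its plot. As a rough numerical companion (an addition of this sketch, not the deck's method), the Python below recreates a few of the transformed variables from the Slide 4 data and reports the Pearson correlation for each pairing; values closer to 1 suggest a straighter relationship. The helper `pearson_r` and the dictionary labels are hypothetical names.

```python
import math

# BAYLEY data as listed on Slide 4.
AGE = [1, 2, 3, 5, 7, 9, 10, 11, 12, 13, 14, 15, 18, 21, 24, 27, 36, 42, 48, 54, 60]
IQ = [4, 10, 17, 37, 65, 85, 88, 95, 101, 103, 107, 113, 121, 148, 161, 165,
      187, 205, 218, 218, 228]

def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# A subset of the Slide 12 transformations: (predictor values, outcome values).
candidates = {
    "IQ vs AGE": (AGE, IQ),
    "IQ**2 vs AGE": (AGE, [v ** 2 for v in IQ]),
    "IQ**3 vs AGE": (AGE, [v ** 3 for v in IQ]),
    "IQ vs AGE**(1/2)": ([a ** 0.5 for a in AGE], IQ),
    "IQ vs AGE**(1/3)": ([a ** (1 / 3) for a in AGE], IQ),
    "IQ vs LOG(AGE)": ([math.log(a) for a in AGE], IQ),
    "IQ vs -1/AGE": ([-1 / a for a in AGE], IQ),
}

results = {name: pearson_r(x, y) for name, (x, y) in candidates.items()}
for name, r in results.items():
    print(f"{name:<18} r = {r:.3f}")
```

A single correlation can hide a systematic bend, so this listing only helps rank candidates; the plots remain the real check.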
S052/I.2(a) – Slide 13
Implementing the Empirical Approach – Which Is Best for the Bayley Data?

Do these transformations do the job?

[Plot: IQ2 versus Age (Months)]

[Plot: IQ2_5 versus Age (Months)]
S052/I.2(a) – Slide 14
Implementing the Empirical Approach – Which Is Best for the Bayley Data?

Do these transformations do the job?

[Plot: IQ_3 versus Age (Months)]

[Plot: IQ3_5 versus Age (Months)]
S052/I.2(a) – Slide 15
Implementing the Empirical Approach – Which Is Best for the Bayley Data?

Do these transformations do the job?

[Plot: Bayley Infant IQ Score versus AGE_RT2]

[Plot: Bayley Infant IQ Score versus AGE_RT2_5]
S052/I.2(a) – Slide 16
Implementing the Empirical Approach – Which Is Best for the Bayley Data?

Do these transformations do the job?

[Plot: Bayley Infant IQ Score versus AGE_RT3]

[Plot: Bayley Infant IQ Score versus AGE_RT3_5]
S052/I.2(a) – Slide 17
Implementing the Empirical Approach – Which Is Best for the Bayley Data?

Do these transformations do the job?

[Plot: Bayley Infant IQ Score versus Log_AGE]

[Plot: Bayley Infant IQ Score versus Log2_AGE]
S052/I.2(a) – Slide 18
Implementing the Empirical Approach – Which Is Best for the Bayley Data?

Do these transformations do the job?

[Plot: Bayley Infant IQ Score versus Inv_AGE]

[Plot: Bayley Infant IQ Score versus Inv_AGE2]
S052/I.2(a) – Slide 19
You Can “Fine-Tune” Your Transformation by Adding a Small Constant Before Transforming

“Fine-Tune” your transformation by adding a “small” constant to a variable before you transform it!!

When, and How Much?

With some transformations, it’s often a good idea to add a constant most of the time …
- 1/(V+1) is often better than 1/V.
- Log(V+1) is often better than log(V).
But, why?

Sometimes, there is a constant that works better than 1: When “counts” have been measured, for instance!

With other transformations, it’s just an empirical question…

Data-Analytic Handout I.2(a).3
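The slide leaves its "But, why?" open. One reason that holds on purely arithmetic grounds (offered here as this sketch's suggestion, not as the author's stated answer) is that log(V) and 1/V are undefined at V = 0, so any variable that can be zero, such as a count, cannot be transformed without a starting constant. The Python below illustrates this with hypothetical counts; the function names `started_log` and `started_inverse` are invented.

```python
import math

def started_log(v, start=1.0):
    """Natural log of (v + start); defined at v = 0 whenever start > 0."""
    return math.log(v + start)

def started_inverse(v, start=1.0):
    """Inverse of (v + start); defined at v = 0 whenever start > 0."""
    return 1.0 / (v + start)

counts = [0, 1, 3, 10]  # hypothetical counts, invented for illustration

# Without a starting constant the zero count cannot be transformed at all.
try:
    math.log(counts[0])
    plain_log_works = True
except ValueError:
    plain_log_works = False

print("log(0) is defined:", plain_log_works)
print("log(V+1):", [round(started_log(c), 3) for c in counts])
print("1/(V+1): ", [round(started_inverse(c), 3) for c in counts])
```

With a start of 1, a count of zero maps to log(1) = 0 and 1/1 = 1, so the zero stays on the scale instead of being lost.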
S052/I.2(a) – Slide 20
Programming Fine-Tuning in PC-SAS

*-----------------------------------------------------------*
 Input the data, name and label the variables in the dataset
*-----------------------------------------------------------*;
DATA BAYLEY;
   INFILE 'C:\DATA\S052\BAYLEY.txt';
   INPUT IQ AGE;
   LABEL IQ  = 'Bayley Infant IQ Score'
         AGE = 'Age (Months)';
   * Create a set of "started" transformations;
   IQ3_S1 = IQ**3;
   IQ3_S2 = (IQ+10)**3;
   IQ3_S3 = (IQ+50)**3;
   IQ3_S4 = (IQ+100)**3;
   IQ3_S5 = (IQ+200)**3;

*-----------------------------------------------------------*
 Inspect the plotted bivariate relationship for linearity
*-----------------------------------------------------------*;
PROC PLOT DATA=BAYLEY;
   * Check out the success of the fine tuning;
   PLOT (IQ3_S1 IQ3_S2 IQ3_S3 IQ3_S4 IQ3_S5)*AGE = '+';
RUN;

Annotations to the program:
- Fine-tuning the “best” transformation so far, by adding constants of gradually increasing sizes.
- Plotting it all out, for inspection.

Data-Analytic Handout I.2(a).3
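The same five "started" cubes can be sketched in Python to get a numerical feel for what the starting constant does. This is an illustration under assumptions: the data are typed in from the Slide 4 listing, `pearson_r` is a hypothetical helper, and the squared correlation is used only as a rough linearity index that the deck itself does not compute.

```python
# BAYLEY data as listed on Slide 4.
AGE = [1, 2, 3, 5, 7, 9, 10, 11, 12, 13, 14, 15, 18, 21, 24, 27, 36, 42, 48, 54, 60]
IQ = [4, 10, 17, 37, 65, 85, 88, 95, 101, 103, 107, 113, 121, 148, 161, 165,
      187, 205, 218, 218, 228]

def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# Starting constants used for IQ3_S1 ... IQ3_S5 in the SAS program.
STARTS = {"IQ3_S1": 0, "IQ3_S2": 10, "IQ3_S3": 50, "IQ3_S4": 100, "IQ3_S5": 200}

# Squared correlation of each started cube with AGE.
r_squared = {
    name: pearson_r(AGE, [(iq + start) ** 3 for iq in IQ]) ** 2
    for name, start in STARTS.items()
}

for name, value in r_squared.items():
    print(f"{name}: start = {STARTS[name]:3d}, R-squared with AGE = {value:.3f}")
```

A larger start softens the cube: for a start c much bigger than IQ, (IQ + c)**3 is approximately c**3 + 3*c**2*IQ, which is nearly linear in IQ. The constant therefore acts as a dial between the full cube (c = 0) and almost no transformation.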
S052/I.2(a) – Slide 21
Implementing the Fine-Tuning – Which Is Best for the Bayley Data?

Do these transformations do the job?

[Plot: IQ3_S2 versus Age (Months)]

[Plot: IQ3_S3 versus Age (Months)]
S052/I.2(a) – Slide 22
Implementing the Fine-Tuning – Which Is Best for the Bayley Data?

Do these transformations do the job?

[Plot: IQ3_S4 versus Age (Months)]

[Plot: IQ3_S5 versus Age (Months)]
S052/I.2(a) – Slide 23
Interpreting the Final Fitted Model?
You might be tempted to ask what this fitted model actually means, after all this transformation. And the answer is … erk! It’s “Easy to Apply, Hard to Interpret!!!”
1. Let’s admit that it’s hard to interpret what the fitted intercept and slope actually mean in the transformed specification, but on the other hand we can be confident that any statistical inference has been conducted appropriately in the transformed world.
2. Now, protected against failures of inference, we can de-transform the fitted model and examine the fitted relationship back in the untransformed world … Data-Analytic Handout I.2(a).4
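The de-transformation in point 2 can be written out symbolically. The sketch below assumes that the chosen outcome IQ3_S4 equals (IQ + 100) cubed, which is inferred from the reverse transformation used in the course's SAS code rather than stated on this slide. The coefficients are left as symbols because the fitted values are not reported here.

```latex
\[
\widehat{(IQ + 100)^{3}} = \hat{\beta}_0 + \hat{\beta}_1\,AGE
\quad\Longrightarrow\quad
\widehat{IQ} = \left(\hat{\beta}_0 + \hat{\beta}_1\,AGE\right)^{1/3} - 100
\]
```

The fitted relationship is a straight line in the transformed world and a curve in the untransformed world, which is why the intercept and slope are hard to interpret directly.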
S052/I.2(a) – Slide 24
Interpreting the Final Fitted Model?
Fit the model in the transformed world, & obtain predicted values:
*---------------------------------------------------------------------*
  Complete the linear regression analysis in the transformed world
*---------------------------------------------------------------------*;
PROC REG DATA=BAYLEY;
  M1: MODEL IQ3_S4 = AGE;
  * Output the predicted values in order to plot a fitted trend line;
  OUTPUT OUT=BAYLEY PREDICTED=P_IQ3_S4;
Plot, & overlay, the observed and predicted values of the outcome against the predictor, in the transformed world:
* Plot the observed & fitted relationships, in the transformed world;
PROC PLOT DATA=BAYLEY;
  PLOT IQ3_S4*AGE = '0' P_IQ3_S4*AGE = 'P' / OVERLAY;
[Plot: IQ3_S4 versus Age (Months), 0 to 60; '0' = observed, 'P' = predicted]
Data-Analytic Handout I.2(a).4
S052/I.2(a) – Slide 25
Interpreting the Final Fitted Model?
De-transform the predicted values:
*---------------------------------------------------------------------------------*
  Detransform the fitted values and examine the fitted relationship
  back in the untransformed world
*---------------------------------------------------------------------------------*;
* First reverse the transformation of the fitted values in a data step;
DATA BAYLEY;
  SET BAYLEY;
  * Here's the reverse transformation;
  P_IQ = (P_IQ3_S4)**(1/3)-100;
Plot, & overlay, the observed and predicted values of the outcome against the predictor, in the untransformed world:
* Now plot the observed & fitted relationships for comparison, but in the untransformed world;
PROC PLOT DATA=BAYLEY;
  PLOT IQ*AGE = '0' P_IQ*AGE = 'P' / HAXIS=0 TO 60 BY 10 VAXIS=0 TO 250 BY 25 OVERLAY;
[Plot: Bayley Infant IQ Score (0 to 250) versus Age (Months), 0 to 60; '0' = observed, 'P' = predicted]
Data-Analytic Handout I.2(a).4
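The reverse transformation takes a cube root and then subtracts 100. One caveat: SAS typically returns a missing value (with an invalid-argument note in the log) when a negative number is raised to a fractional power, so a fitted value below zero in the transformed world would drop out of the plot. The fitted values shown here all appear to be positive, so this is a defensive variant rather than a correction. It is a minimal sketch, and the data set name BAYLEY_BACK is hypothetical.

```sas
* Hedged sketch: a signed cube root that also handles negative fitted values;
DATA BAYLEY_BACK;   * hypothetical output data set name;
  SET BAYLEY;
  * Same result as (P_IQ3_S4)**(1/3)-100 whenever P_IQ3_S4 is positive;
  P_IQ = SIGN(P_IQ3_S4) * ABS(P_IQ3_S4)**(1/3) - 100;
RUN;
```

Because `**` binds more tightly than `*`, the cube root is applied to the absolute value first, and the sign is restored afterwards.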
S052/I.2(a) – Slide 26
Appendix I: What Do Transformations Actually Do?
Question: Why do transformations do what they do?
Answer: Because they affect different-sized numbers differently!
[Figure: two pairs of number lines running from 0 to 25, each comparing Original with Transformed values, one pair for the Square Root and one pair for the Square]
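The answer above can be checked with a small table. The following is a minimal sketch that lists the integers 0 to 25 alongside their square roots and squares; the data set name LADDER and the variable names are hypothetical and chosen only for illustration.

```sas
* Hedged sketch: tabulate how two transformations treat small and large values;
DATA LADDER;   * hypothetical data set name;
  DO ORIGINAL = 0 TO 25;
    ROOT   = SQRT(ORIGINAL);   * square root compresses large values;
    SQUARE = ORIGINAL**2;      * squaring stretches large values;
    OUTPUT;
  END;
RUN;

PROC PRINT DATA=LADDER NOOBS;
RUN;
```

Moving from 0 to 1 changes the square root by 1, whereas moving from 24 to 25 changes it by only about 0.10. Squaring does the opposite: the step from 0 to 1 adds 1, while the step from 24 to 25 adds 49. Unequal stretching of this kind is what lets a transformation straighten a curved relationship.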