Testing statistical significance of differences between coefficients
Jane E. Miller, PhD
The Chicago Guide to Writing about Multivariate Analysis, 2nd Edition.


Overview
Review: Inferential statistical tests for coefficients
Testing statistical significance of differences
  – Between coefficients in the same model
  – Between coefficients in independent models
Standard error of the difference
Presenting results of tests of differences between coefficients

Review: Statistical significance of βs
In the standard output from a regression model, inferential statistics provide the information to test whether the coefficient on an independent variable (IV) is statistically significantly different from zero
For continuous independent variables
  – Whether the marginal effect of a one-unit increase in that IV is different from zero
For categorical independent variables
  – Whether the difference between the mean of the dependent variable (DV) for the specified group and for the reference category is different from zero

Estimated coefficients from an OLS model of birth weight in grams

                                         Coefficient   Standard error
Intercept                                3,317.8**     25.1
Mother’s age at child’s birth (years)    10.7**        1.2
Mother’s education
  < High school (<HS)                    –55.5**       19.3
  = High school (=HS)                    –53.9**       14.8
  (> High school; >HS)

** denotes p < 0.01
Reference category in parentheses

Example: β on a continuous IV
OLS model of birth weight in grams includes mother’s age in years as an independent variable
β_mother’s age = 10.7 with a standard error (s.e.) of 1.2, p < 0.01
  – Thus we reject the null hypothesis H0: β_mother’s age = 0
We conclude that the slope of the association between mother’s age and birth weight is statistically significantly different from zero at p < 0.01

Example: β on a categorical IV
The birth weight model includes an ordinal measure of mother’s educational attainment
>HS is the reference category
β_<HS = –55.5 with a standard error (s.e.) of 19.3, p < 0.01
  – Thus we reject the null hypothesis H0: β_<HS = 0
  – Equivalently, H0: mean birth weight for <HS = mean birth weight for >HS
We conclude that mean birth weight for infants born to mothers with <HS is statistically significantly different from that for infants born to mothers with >HS
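The review above can be checked numerically. The sketch below is a minimal illustration, not part of the original slides: it divides each coefficient in the birth weight table by its standard error and, as an assumption, uses the standard normal distribution (a large-sample approximation to the t distribution) for the two-sided p-value. The function names are hypothetical.

```python
import math

def t_statistic(coef, se):
    """Test statistic for H0: beta = 0."""
    return coef / se

def two_sided_p_normal(t):
    """Two-sided p-value from the standard normal (large-sample approximation)."""
    return math.erfc(abs(t) / math.sqrt(2))

# Coefficients and standard errors from the birth weight table
estimates = {
    "mother's age": (10.7, 1.2),
    "<HS": (-55.5, 19.3),
    "=HS": (-53.9, 14.8),
}

results = {}
for name, (coef, se) in estimates.items():
    t = t_statistic(coef, se)
    results[name] = (t, two_sided_p_normal(t))
    print(f"{name}: t = {t:.2f}, p = {results[name][1]:.4f}")
```

Each p-value falls below 0.01, which agrees with the ** markers in the table.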

Testing other hypotheses
For some research questions, you might need to test a hypothesis in addition to H0: β_i = 0. E.g., whether
  – Two βs in a given model are statistically significantly different from one another
    E.g., H0: β_<HS = β_=HS
  – The size and statistical significance of a β changes across models when additional covariates such as confounders or mediators are included in the model
    E.g., H0: β_non-Hispanic black (I) = β_non-Hispanic black (II)
  – The effect of a covariate differs across models estimated for independent subgroups (stratified models)
    E.g., H0: β_<HS is the same for males as for females

Testing statistical significance of differences between coefficients
To formally test statistical significance of differences between coefficients, e.g., H0: β_j = β_k
  – Divide the difference between the estimated coefficients (β_j − β_k) by the standard error of the difference to obtain the test statistic
  – Compare the calculated test statistic against the pertinent critical value with one degree of freedom

Standard error of the difference
The standard error of the difference β_j − β_k is calculated as
  √[var(β_j) + var(β_k) − (2 × cov(β_j, β_k))]   (√ denotes the square root)
  – var(β_j) and var(β_k) are the variances of β_j and β_k
  – cov(β_j, β_k) is the covariance between β_j and β_k
When β_j and β_k are from different models
  – Considered statistically independent of one another: cov(β_j, β_k) = 0
When β_j and β_k are from within one regression model
  – Not independent of one another: cov(β_j, β_k) ≠ 0
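As a minimal sketch of this calculation: the variance of a difference between two estimates is the sum of their variances minus twice their covariance, so a positive covariance shrinks the standard error of the difference. The function name and the input values below are hypothetical, chosen only to show the role of the covariance term.

```python
import math

def se_of_difference(var_j, var_k, cov_jk=0.0):
    """Standard error of (b_j - b_k): sqrt(var_j + var_k - 2 * cov_jk)."""
    return math.sqrt(var_j + var_k - 2.0 * cov_jk)

# Hypothetical values, not taken from the slides
same_model = se_of_difference(4.0, 9.0, cov_jk=2.0)  # coefficients from one model
independent = se_of_difference(4.0, 9.0)             # coefficients from separate models

print(same_model, independent)
```

Leaving `cov_jk` at its default of zero corresponds to the independent-models case.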

Testing differences of βs from one model
When β_j and β_k are from the same model, you must include the covariance in the calculation of the standard error of the difference
  √[var(β_j) + var(β_k) − (2 × cov(β_j, β_k))]
The complete variance-covariance matrix for a regression can be requested as part of the output
The variance of each coefficient can be calculated from its standard error (s.e.)
  var(β_j) = [s.e.(β_j)]²
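When the full variance-covariance matrix V is available, the same quantity can be written as the quadratic form c′Vc for a contrast vector c. The sketch below assumes the 2 × 2 block of V for β_<HS and β_=HS, using the variances (370.9, 218.8) and covariance (137.8) reported elsewhere in this presentation. The helper name is hypothetical.

```python
import math

# Variances (diagonal) and covariance (off-diagonal) for b_<HS and b_=HS
vcov = [[370.9, 137.8],
        [137.8, 218.8]]
contrast = [1.0, -1.0]  # picks out b_<HS - b_=HS

def contrast_variance(c, v):
    """Compute c' V c for a contrast vector c and covariance matrix V."""
    n = len(c)
    return sum(c[i] * v[i][j] * c[j] for i in range(n) for j in range(n))

var_diff = contrast_variance(contrast, vcov)
se_diff = math.sqrt(var_diff)
print(round(var_diff, 1), round(se_diff, 1))
```

The off-diagonal entries enter twice with a negative sign, which is where the −2 × cov term in the formula comes from.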

Example: Testing whether β_<HS = β_=HS
From the table, β_<HS = –55.5 and β_=HS = –53.9
The difference between β_<HS and β_=HS is calculated
  β_<HS − β_=HS = –55.5 − (–53.9) = –1.6 (1.6 grams in absolute value)
For that model, var(β_<HS) = 370.9, var(β_=HS) = 218.8, and cov(β_<HS, β_=HS) = 137.8
Plugging those values into the formula for the standard error of the difference yields
  √[370.9 + 218.8 − (2 × 137.8)] = 17.7

Example, cont.: Testing β_<HS = β_=HS
To calculate the test statistic, divide the difference between β_<HS and β_=HS by the standard error of the difference:
  |β_<HS − β_=HS| / s.e.(β_<HS − β_=HS) = 1.6/17.7 = 0.09 < 1.96
  (1.96 is the critical value for a t-test with ∞ degrees of freedom at p < 0.05)
Cannot reject the null hypothesis that β_<HS = β_=HS
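The arithmetic in this example can be reproduced in a few lines. This is a sketch using only the values quoted in the slides; the variable names are illustrative.

```python
import math

b_lt_hs, b_eq_hs = -55.5, -53.9                    # coefficients from the table
var_lt_hs, var_eq_hs, cov = 370.9, 218.8, 137.8    # from the variance-covariance matrix

diff = b_lt_hs - b_eq_hs
se_diff = math.sqrt(var_lt_hs + var_eq_hs - 2 * cov)
t_stat = diff / se_diff

CRITICAL_05 = 1.96  # two-sided critical value at p < 0.05, infinite degrees of freedom
reject_null = abs(t_stat) > CRITICAL_05

print(f"difference = {diff:.1f}, s.e. = {se_diff:.1f}, |t| = {abs(t_stat):.2f}")
print("reject H0" if reject_null else "cannot reject H0")
```

The sign of the test statistic depends only on which coefficient is subtracted from which; the decision uses its absolute value.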

TEST statement
Many software packages can do these calculations for you
To test other contrasts among categories, request the test statistic for equality of coefficients for pairs of coefficients: H0: β_j = β_k
  – E.g., to test whether predicted birth weight is statistically significantly different for infants born to mothers with <HS than for those with =HS (neither is the reference category)
    Specify “TEST ‘<HS’ = ‘=HS’” in your SAS syntax
    Output for H0: β_<HS = β_=HS reports an F-statistic of 0.01 with a p-value of 0.93
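The reported F-statistic can be related to the hand calculation: for a test of a single restriction, the F-statistic equals the square of the t-statistic. The sketch below is not SAS output. It recomputes the statistic from the values quoted in the slides and, as an assumption, uses the standard normal distribution (the large-sample limit) for the p-value.

```python
import math

diff = -55.5 - (-53.9)                              # b_<HS - b_=HS
se_diff = math.sqrt(370.9 + 218.8 - 2 * 137.8)      # s.e. of the difference
t_stat = diff / se_diff

f_stat = t_stat ** 2                                  # F with 1 numerator degree of freedom
p_value = math.erfc(abs(t_stat) / math.sqrt(2))       # two-sided, normal approximation

print(f"F = {f_stat:.2f}, p = {p_value:.2f}")
```

Rounded to two decimals, these values agree with the F-statistic and p-value quoted from the software output.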

Testing differences of βs from independent models
When β_j and β_k are from different models they can be assumed to be independent of one another: cov(β_j, β_k) = 0
Thus the formula for the standard error of the difference
  √[var(β_j) + var(β_k) − (2 × cov(β_j, β_k))]
simplifies to
  √[var(β_j) + var(β_k)]
Reminder: var(β_j) and var(β_k) can be calculated from the standard errors reported in the regression output
  var(β_j) = [s.e.(β_j)]²

Example: Change in βs across nested models
In nested models I and II, the βs on non-Hispanic black are
  β_NHB(I) = –244.5, s.e. = 16.7
  β_NHB(II) = –147.2, s.e. = 17.6
The change in β between models I and II: –244.5 − (–147.2) = –97.3 (97.3 grams in absolute value)
Plugging the standard errors for β_NHB(I) and β_NHB(II) into the formula for the standard error of the difference yields
  s.e.(difference) = √[(16.7)² + (17.6)²] = 24.3

Change in βs across nested models, cont.
The t statistic for the difference in β is calculated:
  (difference in β) / s.e.(difference in β)
Plugging in the absolute values from the previous slide: 97.3 ÷ 24.3 = 4.0 (4.01 using the unrounded standard error)
That exceeds the critical value of 2.58 for p < 0.01, so we conclude that the change in β_NHB between models I and II is statistically significant at p < 0.01
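A short sketch of the same calculation, following the slides’ assumption that coefficients from the two models are independent so that the covariance term drops out. Variable names are illustrative.

```python
import math

b_model_1, se_model_1 = -244.5, 16.7   # non-Hispanic black, model I
b_model_2, se_model_2 = -147.2, 17.6   # non-Hispanic black, model II

diff = b_model_1 - b_model_2
se_diff = math.sqrt(se_model_1 ** 2 + se_model_2 ** 2)  # covariance assumed to be 0
t_stat = diff / se_diff

CRITICAL_01 = 2.576  # two-sided critical value at p < 0.01, infinite degrees of freedom
significant = abs(t_stat) > CRITICAL_01

print(f"difference = {diff:.1f}, s.e. = {se_diff:.1f}, |t| = {abs(t_stat):.2f}")
```

Carrying the unrounded standard error through the division gives the t statistic of 4.01 quoted in the example prose later in the presentation.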

Tables to present multivariate results
In the table of multivariate statistics, for each independent variable in the model, present
  – The estimated coefficient (β)
  – The standard error
See chapters 5 and 11 of Writing about Multivariate Analysis, 2nd Edition for guidelines and examples of multivariate tables

Prose to present results of differences between coefficients
Introduce the substantive reason behind the test for difference between βs, given your
  – Research question
  – Variables (categories, units)
Report and interpret the results of the formal statistical test of difference between coefficients
  – Test statistic
  – Accompanying degrees of freedom
Explain the conclusions you draw from that test about specification of your model

Poor presentation: Results of test of differences between βs
“From table 15.3, Model III we have β_<HS = –55.5 and β_=HS = –53.9, so the difference between β_<HS and β_=HS is β_<HS − β_=HS = –55.5 − (–53.9) = –1.6. For that model, var(β_<HS) = 370.9, var(β_=HS) = 218.8, and cov(β_<HS, β_=HS) = 137.8. Plugging those values into the formula for the standard error of the difference yields √[370.9 + 218.8 − (2 × 137.8)] = 17.7. To calculate the test statistic, divide the difference between β_<HS and β_=HS by the standard error of the difference: |β_<HS − β_=HS| / s.e.(β_<HS − β_=HS) = 1.6/17.7 = 0.09, which is less than the critical value of 1.96 for a t-test with ∞ degrees of freedom at p < 0.05. Thus we cannot reject the null hypothesis that β_<HS = β_=HS.”
  – Except for an assignment in a course where you must demonstrate that you know this logic, spare your readers the statistics lesson!

Better presentation: Results of test of differences between βs
“The 1.6 unit (gram) birth weight difference between the estimated coefficients for ‘less than high school’ and ‘high school graduate’ in Model III is not statistically significant (F-statistic for the test of difference = 0.01; p = 0.93).”
  – Mentions the
    Dependent variable
    Independent variable (educational attainment)
    Units or categories
    Purpose of the test (difference between two coefficients in the same model)
    Magnitude
    Statistical significance
    Direction (not mentioned because the difference is trivially small)

Example presentation: Change in β across nested models
“As shown in table 15.3, the coefficient on non-Hispanic black shrinks in magnitude by 97.3 points (grams), from –244.5 in model I to –147.2 in model II (t = 4.01; p < 0.01). Thus, the addition of controls for socioeconomic characteristics is associated with a large, statistically significant decrease in the birth weight deficit for non-Hispanic black compared to non-Hispanic white infants.”
  – Mentions the
    Dependent variable
    Independent variables and their units or categories
    Purpose of the test for a change in β_NHB across nested models
    Direction
    Magnitude
    Statistical significance

Summary
To test hypotheses other than H0: β_i = 0, calculate a test statistic from the difference in coefficients and the standard error of the difference
Compare that test statistic against the critical value
βs from different models are considered statistically independent of one another, so the covariance is not needed to compute the standard error of the difference
  – E.g., nested models, stratified models
βs from the same model are not statistically independent of one another, so the covariance is needed to compute the standard error of the difference

Summary, cont.
If coefficients are not statistically significantly different from one another, the model specification often can be simplified by combining terms
Then test the effect of the simplified specification on overall fit using model goodness-of-fit (GOF) statistics
Present results of difference between coefficients
  – Use a combination of tables and prose
  – Describe conclusions, not process
  – Relate to topic at hand

Suggested resources
Miller, J. E. The Chicago Guide to Writing about Multivariate Analysis, 2nd Edition. University of Chicago Press. Chapters 11 and 15.
Freedman, David, Robert Pisani, and Roger Purves. Statistics, 4th Edition. New York: W. W. Norton.
Gujarati, Damodar N. Basic Econometrics, 4th Edition. New York: McGraw-Hill/Irwin.

Suggested online resources
Podcasts on
  – Interpreting coefficients from OLS and logit models
  – Comparing overall goodness of fit across models
  – Testing whether a multivariate specification can be simplified

Suggested practice exercises
Study guide to The Chicago Guide to Writing about Multivariate Analysis, 2nd Edition.
  – Questions #2, 3, and 5 in the problem set for chapter 11
  – Suggested course extensions for chapter 11
    “Reviewing” exercise #2
    “Applying statistics and writing” exercise #3

Contact information
Jane E. Miller, PhD
Online materials available at