The Chicago Guide to Writing about Multivariate Analysis, 2nd edition. Resolving the Goldilocks problem: Model specification. Jane E. Miller, PhD.



Overview
Model specification approaches to resolving the Goldilocks problem include
– Standardized coefficients
– Logarithmic transformation
– Other specification issues

Standardized coefficients

Unstandardized coefficients
Unstandardized βs estimate the effect of a 1-unit increase in X_i on Y, where the effect size is measured in the original units of Y.
A “one-size-fits-all” approach to interpreting βs can be misleading because variables
– Represent different levels of measurement,
– Have different units of measurement,
– Have varying distributions of values,
– Occur in different real-world circumstances.

Standardized coefficients
A standardized coefficient estimates the effect of a one-standard-deviation increase in X_i on Y
– Measured in standard deviation units of Y; e.g., an effect size of 0.3 would mean 30% of a standard deviation in the dependent variable
– Similar to standardized scores or z-scores
Standardized βs provide a consistent metric in which to compare the relative sizes of the βs on continuous independent variables with different ranges and scales.
– The contrast for each IV is its standard deviation
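The relationship between the two kinds of coefficients can be sketched in a few lines. This is a minimal illustration, not the procedure any particular statistical package uses: it relies on the standard identity that a standardized β equals the unstandardized β multiplied by SD(X) / SD(Y). The toy data and the function names `ols_slope` and `standardized_beta` are hypothetical.

```python
import statistics

def ols_slope(x_values, y_values):
    """Unstandardized slope from a bivariate OLS regression of Y on X."""
    mean_x = statistics.mean(x_values)
    mean_y = statistics.mean(y_values)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(x_values, y_values))
    s_xx = sum((x - mean_x) ** 2 for x in x_values)
    return s_xy / s_xx

def standardized_beta(b_unstd, x_values, y_values):
    """Rescale an unstandardized slope into standard-deviation units."""
    return b_unstd * statistics.stdev(x_values) / statistics.stdev(y_values)

# Hypothetical toy data, invented for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [10.0, 30.0, 20.0, 50.0, 40.0]

b = ols_slope(x, y)                      # change in Y per 1-unit increase in X
beta_std = standardized_beta(b, x, y)    # change in SDs of Y per 1-SD increase in X
print(f"unstandardized b = {b:.2f}, standardized beta = {beta_std:.2f}")
```

With these made-up values a 1-unit increase in X corresponds to 8 units of Y, while a one-standard-deviation increase in X corresponds to 0.8 of a standard deviation in Y.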

Using standardized coefficients
Commonly used for psychological or attitudinal scales for which the units have no inherent meaning.
Should not be used for variables for which a one-standard-deviation increase lacks an intuitive interpretation, e.g.,
– dummy variables
– interaction terms

Specifying a model with standardized coefficients
Easily specified as an option to an OLS model in most statistical packages.
Identify the dependent and independent variables as usual.
– Enter them in the model specification in their original, untransformed versions.
– Do not create versions in the metric of standard deviations. The software will do that for you!
Request “standardized betas.”

Descriptive statistics to report if you use standardized coefficients
In the table of descriptive statistics, report the mean, minimum and maximum values, and standard deviation in the original units for
– each independent variable (IV)
– the dependent variable (DV)

Describing standardized coefficients in prose
In the results section, interpret the effect sizes for different IVs in terms of multiples or percentages of the standard deviation in the DV
– E.g., “A one-standard-deviation increase in the income-to-poverty ratio (IPR) is associated with an increase of 19.6% of a standard deviation in birth weight (about 38 grams), roughly twice the size of the corresponding standardized coefficient on mother’s age (9.7%).”

Reporting the effect size in original units
“A one-standard-deviation increase in the income-to-poverty ratio (IPR) is associated with an increase of 19.6% of a standard deviation in birth weight (about 38 grams), roughly twice the size of the corresponding standardized coefficient on mother’s age (9.7%).”
Note that the effect size is also reported back in the original units of the DV (grams in this case), to facilitate intuitive understanding in the context of the specific research question and variables.
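The back-conversion to original units is a single multiplication, sketched here. The standardized coefficients 0.196 and 0.097 come from the example sentence. The standard deviation of birth weight is not reported in this transcript, so `SD_BIRTH_WEIGHT_GRAMS` is a hypothetical value chosen only so the result lands near the 38 grams quoted.

```python
def effect_in_original_units(std_beta, sd_y):
    """Convert a standardized coefficient into the original units of the DV."""
    return std_beta * sd_y

STD_BETA_IPR = 0.196         # from the example sentence (19.6% of an SD)
STD_BETA_MOTHERS_AGE = 0.097  # from the example sentence (9.7% of an SD)

# Hypothetical: the source does not report the SD of birth weight.
SD_BIRTH_WEIGHT_GRAMS = 194.0

grams = effect_in_original_units(STD_BETA_IPR, SD_BIRTH_WEIGHT_GRAMS)
relative_size = STD_BETA_IPR / STD_BETA_MOTHERS_AGE

print(f"Effect of a 1-SD increase in IPR: about {grams:.0f} grams")
print(f"IPR coefficient is {relative_size:.1f} times the one on mother's age")
```

The ratio of the two standardized coefficients is what supports the phrase “roughly twice the size” in the example sentence.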

Logarithmic specifications

Logarithmic specifications
Another approach to comparing βs across variables with different ranges and scales is to take logarithms of the
– dependent variable (Y),
– independent variable(s) (the X_i),
– or both.
The βs on the transformed variable(s) lend themselves to straightforward interpretations such as percentage change.

Types of logarithmic specifications
– Lin-lin
– Lin-log
– Log-lin
– Log-log, also known as “double log”

Lin-lin specifications
Review: For OLS models in which neither the IV nor the DV is logged, β_1 measures the change in Y for a 1-unit increase in X_1,
– the changes are measured in the respective units of the IV and DV.
In the lingo of logarithmic specifications, these models are termed “lin-lin” models because they are linear in both the IV and DV:
Y = β_0 + β_1 X_1

Lin-log specifications
Lin-log models are of the form Y = β_0 + β_1 ln X_1, where ln X_1 is the natural log (base e) of X_1.
For such models, β_1 ÷ 100 gives the approximate change in the original units of the DV for a 1 percent increase in the IV.
E.g., in a model of earnings, the β on log(hours worked) = 5,905.3:
– “Each 1 percent increase in monthly hours worked is associated with a NT$ 59 increase in monthly earnings.”
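The arithmetic behind the lin-log example can be checked with a short sketch. The coefficient 5,905.3 is taken from the slide. The function names are hypothetical, and the "exact" version simply evaluates β_1 × ln(1 + p/100), which the β_1 ÷ 100 rule approximates for small percentage changes.

```python
import math

def lin_log_change_approx(beta, pct_increase=1.0):
    """Rule-of-thumb change in Y: beta / 100 per 1 percent increase in X."""
    return beta * pct_increase / 100.0

def lin_log_change_exact(beta, pct_increase=1.0):
    """Exact change in Y implied by Y = b0 + beta * ln(X)."""
    return beta * math.log(1.0 + pct_increase / 100.0)

BETA_LOG_HOURS = 5905.3  # coefficient on log(hours worked) from the slide

approx = lin_log_change_approx(BETA_LOG_HOURS)
exact = lin_log_change_exact(BETA_LOG_HOURS)
print(f"approximate: NT$ {approx:.1f}, exact: NT$ {exact:.1f}")
```

For a 1 percent contrast the two versions differ by well under one NT$, which is why the simpler rule is usually adequate. The gap widens for larger percentage contrasts.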

Log-lin specifications
Log-lin models are of the form ln Y = β_0 + β_1 X_1.
For such models, 100 × (e^(β_1) − 1) gives the percentage change in Y for a 1-unit increase in X_1,
– where the increase in X_1 is in its original units.
E.g., “For each additional child a woman has, her monthly earnings are reduced by 3.6 percent.”
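The log-lin formula translates directly into code. In this sketch the coefficient value is hypothetical: the slide reports only the resulting 3.6 percent reduction, so `BETA_CHILDREN` was picked to give roughly that figure and is not taken from the source.

```python
import math

def log_lin_pct_change(beta, delta_x=1.0):
    """Percentage change in Y for a delta_x-unit increase in X when ln(Y) is the DV."""
    return 100.0 * (math.exp(beta * delta_x) - 1.0)

# Hypothetical coefficient on number of children (not reported in the slide).
BETA_CHILDREN = -0.0367

pct = log_lin_pct_change(BETA_CHILDREN)
naive = 100.0 * BETA_CHILDREN  # reading beta directly as a percentage

print(f"formula: {pct:.2f}%   naive 100*beta: {naive:.2f}%")
```

For coefficients this close to zero, the naive reading and the formula nearly agree. The difference grows as the coefficient gets larger in absolute value, so the formula is the safer habit.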

Log-log specifications
Log-log models are of the form ln Y = β_0 + β_1 ln X_1.
For such models, β_1 estimates the percentage change in Y for a 1 percent increase in X_1.
– This measure is known in economics as the elasticity (Gujarati 2002).
E.g., “A 1 percent increase in monthly hours worked is associated with a 0.6% increase in monthly earnings.”
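A brief sketch shows how an elasticity converts into percentage changes for contrasts of different sizes. The elasticity of 0.6 is inferred from the example sentence rather than stated as a coefficient in the slide, so treat it as an assumption. The function name is hypothetical.

```python
def log_log_pct_change(elasticity, pct_increase_x=1.0):
    """Exact percentage change in Y for a given percentage increase in X."""
    return 100.0 * ((1.0 + pct_increase_x / 100.0) ** elasticity - 1.0)

ELASTICITY_HOURS = 0.6  # assumed from the example sentence

pct_for_1 = log_log_pct_change(ELASTICITY_HOURS, 1.0)
pct_for_10 = log_log_pct_change(ELASTICITY_HOURS, 10.0)

print(f"1% more hours  -> {pct_for_1:.2f}% more earnings")
print(f"10% more hours -> {pct_for_10:.2f}% more earnings")
```

Reading β_1 directly as the percentage change works well for a 1 percent contrast. For a 10 percent contrast the exact figure falls slightly below 10 × β_1.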

Choice of contrast size for logarithmic models
Caveat: The scale of the logged variable must be taken into account when choosing an appropriate-sized contrast.
E.g., a 1-unit increase in ln(monthly hours worked) from 5.3 to 6.3 is equivalent to an increase from 200 to 544 hours per month.
– That contrast is roughly a 2.7-fold increase in hours, because a 1-unit increase in a natural log multiplies the original value by e.
– Implies working three-quarters of all day and night-time hours, 7 days a week.
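The caveat can be verified by exponentiating the logged values. The values 5.3 and 6.3 come from the slide. The 30-day month used to compute the share of available hours is an assumption made for this sketch.

```python
import math

LN_HOURS_START = 5.3
LN_HOURS_END = 6.3  # a 1-unit increase in the logged variable

hours_start = math.exp(LN_HOURS_START)
hours_end = math.exp(LN_HOURS_END)
fold_increase = hours_end / hours_start  # always e for a 1-unit change in ln(X)

HOURS_IN_MONTH = 24 * 30  # assumes a 30-day month
share_of_month = hours_end / HOURS_IN_MONTH

print(f"{hours_start:.0f} -> {hours_end:.0f} hours per month")
print(f"fold increase: {fold_increase:.2f}")
print(f"share of all hours in the month: {share_of_month:.0%}")
```

Running a check like this before writing up results makes it obvious when a 1-unit change in a logged variable corresponds to an implausibly large real-world contrast.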

Review: Assess whether a 1-unit increase in the variable is the right-sized contrast
Always consider whether a 1-unit increase in the variable as specified in the model makes sense in its real-world context!
– Topic
– Distribution in the data
If not, use theoretical and empirical criteria for choosing a fitting-sized contrast.
– See the podcast on measurement and variables approaches to resolving the Goldilocks problem.

Descriptive statistics to report if you use a logarithmic specification
In a table of descriptive statistics, report the mean and range both
– In the original, untransformed units, such as income in dollars, which are more intuitively understandable and easier than the logged version to compare with values from other samples.
– In the logged units, so readers know the range and scale of values to apply to the estimated coefficients.

Interpreting coefficients from logarithmic specifications
Taking logs of the IV(s) and/or DV affects interpretation of the estimated coefficients.
If your models include any logged variables, report the pertinent units as you write about the βs, especially if
– your specifications include a mixture of logged and non-logged variables;
– you are testing the sensitivity of your findings to different logarithmic specifications.

Goldilocks issues for other types of specifications

Polynomial: Quadratic specification of IPR/birth weight pattern

Goldilocks issues for polynomials
In models involving polynomials such as X_i and X_i^2, the effect of a 1-unit increase in X_i on Y varies for different values of X_i.
– I.e., you cannot generalize the size of the effect of X_i on Y for all values of X_i.
To convey the shape of the association between X_i and Y:
– In the text, present the change in Y for each of several contrasts in values of X_i.
– Create a graph.
See the podcast on polynomials for more information.
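A small sketch makes the point about varying effects concrete. It assumes a quadratic specification of the form Y = β_0 + β_1 X + β_2 X^2. The coefficient values are hypothetical and are not the estimates from the IPR/birth weight model, which this transcript does not report.

```python
def quadratic_change(b1, b2, x_from, x_to):
    """Change in predicted Y between two values of X in a quadratic model."""
    return b1 * (x_to - x_from) + b2 * (x_to ** 2 - x_from ** 2)

# Hypothetical coefficients on X and X squared, for illustration only.
B1 = 80.0
B2 = -10.0

contrasts = [(0, 1), (1, 2), (2, 3)]  # each is a 1-unit increase in X
changes = [quadratic_change(B1, B2, lo, hi) for lo, hi in contrasts]

for (lo, hi), change in zip(contrasts, changes):
    print(f"X from {lo} to {hi}: change in Y = {change:+.1f}")
```

Each contrast is the same 1-unit increase in X, yet the change in Y shrinks as X rises, so no single number summarizes the effect across the whole range.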

Goldilocks issues for interactions
In models involving interactions, βs on main effect and interaction terms for two or more IVs must be combined to calculate the overall effect on the DV.
– You cannot examine the effect of a 1-unit change in only one of those variables based on its β alone.
See the chapter and podcasts on interactions.
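Combining main-effect and interaction coefficients can be sketched as follows. The sketch assumes a model of the form Y = β_0 + β_1 X + β_2 D + β_3 (X × D), where D is a dummy variable. All coefficient values and names here are hypothetical.

```python
def effect_of_x(b_x, b_interaction, d):
    """Effect of a 1-unit increase in X on Y at a given value of the dummy D."""
    return b_x + b_interaction * d

# Hypothetical coefficients, for illustration only.
B_X = 2.0    # main-effect coefficient on X
B_XD = 1.5   # coefficient on the X-by-D interaction term

effect_when_d0 = effect_of_x(B_X, B_XD, d=0)  # reference group
effect_when_d1 = effect_of_x(B_X, B_XD, d=1)  # group with D = 1

print(f"Effect of X when D = 0: {effect_when_d0}")
print(f"Effect of X when D = 1: {effect_when_d1}")
```

The main-effect coefficient alone describes the effect of X only for the reference group. For the other group the interaction coefficient must be added in.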

Summary
Certain model specifications can help reduce Goldilocks problems by imposing a consistent metric to facilitate comparison of βs across independent variables with different levels and ranges, e.g.,
– A 1-standard-deviation increase, from standardized coefficients
– A 1% increase, from log-log coefficients
Models involving non-linear functions or interactions complicate the Goldilocks issue because the effect of each variable involves several terms.

Suggested resources
Miller, J. E., The Chicago Guide to Writing about Multivariate Analysis, 2nd Edition.
– Chapter 10, on the Goldilocks problem, standardized coefficients, and polynomials
– Chapter 8, on standardized scores and z-scores
– Chapter 16, on interactions

More suggested resources
Miller, J. E. and Y. V. Rodgers, “Economic Importance and Statistical Significance: Guidelines for Communicating Empirical Research.” Feminist Economics 14 (2): 117–49.
Kachigan, Sam Kash. Multivariate Statistical Analysis: A Conceptual Introduction. 2nd ed. New York: Radius Press. On standardized coefficients.
Gujarati, Damodar N. 2002. Basic Econometrics. 4th ed. New York: McGraw-Hill/Irwin. On logarithmic specifications.

Supplemental online resources
Podcasts on
– Defining the Goldilocks problem
– Resolving the Goldilocks problem
  – Measurement and variables
  – Presenting results
– Calculating the shape of a polynomial
– Calculating the shape of an interaction pattern
Online appendix on interpreting coefficients from logarithmic specifications.

Suggested practice exercises
Study guide to The Chicago Guide to Writing about Multivariate Analysis, 2nd Edition.
– Suggested course extensions for chapter 10
  – “Applying statistics and writing” question #5.
  – “Revising” questions #1, 2, 3, and 9.

Contact information
Jane E. Miller, PhD
Online materials available at