
General Linear Models The theory of general linear models posits that many statistical tests, including t-tests and ANOVAs, can be solved as a regression analysis. General linear models become even more useful when our analysis includes both numeric (interval level) and categorical (nominal level) variables, since both can be entered directly into the analysis, and SPSS will do any needed dummy coding. In this example, we will demonstrate the equivalence of regression and ANOVA. We will use the SPSS General Linear Models procedure for a variety of tests in the future.

Homework problems: One-way Analysis of Variance – Specific Relationship Tested This problem uses the data set GSS2000R.Sav to compare the average score on the variable "highest year of school completed" [educ] for groups of survey respondents defined by the variable "subjective class identification" [class]. Using a one-way analysis of variance and a post hoc test with an alpha of .05, is the following statement true, true with caution, false, or an incorrect application of a statistic? Survey respondents who said they belonged in the working class completed fewer years of school (M = 12.58, SD = 2.50) than survey respondents who said they belonged in the middle class (M = 13.83, SD = 3.14). Answer options: True; True with caution; False; Incorrect application of a statistic. In the PowerPoint for One-Way ANOVA, we solved this problem using SPSS’ One-Way ANOVA command. Applying the theory of general linear models, we will solve this problem with linear regression.

Converting the One-Way ANOVA problem to a Regression problem To solve this problem with regression, we need to dummy code the independent variable. Since the problem includes a specific comparison (working class versus middle class), we need to select a reference group that makes this comparison possible. Specifically, we will use the working class category as the reference group, so that we can compare the difference between the middle class and the working class. We could just as easily have chosen the middle class as the reference category.

Coding scheme for new variables The coding scheme for the new variables is shown in the table below.

Original Variable Coding    Coding for New Variables
                            lowerClass   middleClass   upperClass
1 = lower class             1            0             0
2 = working class           0            0             0
3 = middle class            0            1             0
4 = upper class             0            0             1

The class variable contained the four categories in the first column. We will create three new dichotomous variables: lowerClass, middleClass, and upperClass. Each new variable will have a 1 in the matching category from the original variable and zeros for all of the other categories. The working class category, which is the reference group, has zeros on all three new variables.
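The coding scheme can also be written out as a small lookup. The following is only a sketch in Python rather than SPSS; the helper name dummy_code and the DUMMY_FOR_CODE table are hypothetical names introduced for illustration, and the numeric codes are the ones listed for the class variable.

```python
# Hypothetical lookup: which dummy variable is switched on by which class code.
# Code 2 (working class) is absent on purpose: it is the reference group.
DUMMY_FOR_CODE = {1: "lowerClass", 3: "middleClass", 4: "upperClass"}


def dummy_code(class_code):
    """Return the three dummy variables for one value of the class variable."""
    return {name: int(class_code == code) for code, name in DUMMY_FOR_CODE.items()}


# Print the full coding scheme, one row per original category.
for code in (1, 2, 3, 4):
    print(code, dummy_code(code))
```

The working class row comes out as all zeros, which is what makes it the group that every coefficient is compared against.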

Using Recoding in SPSS to Create New Variables Select the Recode > Into Different Variables command from the Transform menu.

Creating the lowerClass variable First, select the variable to be dummy-coded, class, from the list of variables and move it to the Numeric Variable -> Output Variable list box. Second, type in the name for the new variable. Third, click on the Change button to replace the ? with this new variable name.

Assigning values to new variable Next, click on the Old and New Values button to assign values to the new variable.

Preserving missing values First, mark the System- or user-missing option button on the Old Value panel. Second, mark the System-missing option button on the New Value panel. Third, click on the Add button to include this recoding for the variable. If we forget to explicitly assign missing values, cases with missing data will be recoded with a 0 and become part of the reference group.
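The consequence of skipping this step can be illustrated outside SPSS. This is a minimal Python sketch, assuming missing data is represented by None; both function names are hypothetical and exist only to contrast the two behaviours.

```python
def recode_lower_class(class_code):
    """Dummy-code lowerClass while keeping missing values missing."""
    if class_code is None:  # assumption: None stands in for a missing value
        return None
    return 1 if class_code == 1 else 0


def naive_recode(class_code):
    """The mistake: a missing value falls through to 'all other values'."""
    return 1 if class_code == 1 else 0


sample = [1, 2, None, 3]
print([recode_lower_class(c) for c in sample])  # [1, 0, None, 0]
print([naive_recode(c) for c in sample])        # [1, 0, 0, 0]
```

In the second list the case with missing data has silently joined the reference group, which would distort the comparison.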

Coding the lowerClass category First, to recode the 1 = lower class category to the dummy variable, mark the Value option button and type a 1 in the text box on the Old Value panel. Second, mark the Value option button and type a 1 in the text box on the New Value panel. This coding says: if they were originally in the lower class category, they are assigned a value of 1 for the lowerClass dummy variable. Third, click on the Add button to include this recoding for the variable.

Coding the other categories First, to identify subjects in the categories other than lower class, mark the All other values option button on the Old Value panel. Second, mark the Value option button and type a 0 in the text box on the New Value panel. This coding says: if they were originally NOT in the lower class category, they are assigned a value of 0 for the lowerClass dummy variable. Third, click on the Add button to include this recoding for the variable.

Completing the recoding When we have completed the coding for the new variable, click on the Continue button.

Completing the lowerClass variable Click on the OK button to create the new variable in the data editor.

Dummy variable coding for middleClass variable Following the same steps, we create the dummy variable for subjects who were 3 = middle class on the original class variable. The coding is similar to that for lower class subjects, except the category that was originally coded 3 = middle class is translated into a 1 on the new variable.

Dummy variable coding for upperClass variable Following the same steps, we create the dummy variable for subjects who were 4 = upper class on the original class variable. The coding is similar to that for lower class subjects, except the category that was originally coded 4 = upper class is translated into a 1 on the new variable.

Dummy-coded variables for class - 1 Subjects with a code value of 3 on the original class variable now have a 1 for middleClass and a 0 for the other new variables. Subjects with a code value of 2 on the original class variable now have a 0 for all the new variables.

Dummy-coded variables for class - 2 Subjects with a code value of 1 on the original class variable now have a 1 for lowerClass and a 0 for the other new variables. Subjects with a code value of 4 on the original class variable now have a 1 for upperClass and a 0 for the other new variables. Since it is very easy to make a mistake in recoding, it is imperative that we check the results of our recoding.
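One way to check a recoding systematically is to cross-tabulate the original codes against the new variables. The sketch below does this in Python on a few made-up rows; the rows, the EXPECTED table, and the function name are all hypothetical and are not taken from GSS2000R.Sav.

```python
from collections import Counter

# Hypothetical rows: (class, lowerClass, middleClass, upperClass).
rows = [
    (1, 1, 0, 0),
    (2, 0, 0, 0),
    (3, 0, 1, 0),
    (4, 0, 0, 1),
    (3, 0, 1, 0),
    (2, 0, 0, 0),
]

# The pattern each original code should produce under the coding scheme.
EXPECTED = {1: (1, 0, 0), 2: (0, 0, 0), 3: (0, 1, 0), 4: (0, 0, 1)}


def recode_is_consistent(rows):
    """True when every row's dummy pattern matches the coding scheme."""
    return all(EXPECTED[row[0]] == row[1:] for row in rows)


# Cross-tabulation: each distinct combination with its frequency.
for combination, count in sorted(Counter(rows).items()):
    print(combination, count)

print("recoding consistent:", recode_is_consistent(rows))
```

Any combination that does not appear in the coding scheme, such as a working class row with lowerClass = 1, makes the check fail.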

Regression of education on class variables - 1 Select the Regression > Linear command from the Analyze menu.

Regression of education on class variables - 2 First, we move the dependent variable, educ, to the Dependent Variable text box. Second, we move the three dummy coded variables to the list of Independents. Third, click on the OK button to produce the output.

Results of regression of education on class variables – overall relationship The overall relationship is statistically significant (F(3, 264) = 4.97, p < .01).

Comparison to One-way ANOVA of education by class – overall relationship The overall relationship is statistically significant (F(3, 264) = 4.97, p < .01). Moreover, all of the statistical values in the ANOVA table are identical to the results from regression.

Results of regression of education on class variables – individual relationships The tests of individual relationships compare each group to the reference group. The difference between the middle class group and the working class group is statistically significant.

Results of regression of education on class variables – individual relationships B coefficients are interpreted as the increase or decrease in the estimate of the dependent variable associated with the change from the reference group to the dummy-coded group. Subjects in the middle class had, on average, 1.249 more years of education than the working class.
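The equivalence described so far can be reproduced on a toy data set. The sketch below is plain Python, not SPSS; the education values are invented for illustration (they are not the GSS2000R data), and the helpers solve and ols are hypothetical names for a small least-squares routine based on the normal equations. It shows that, with working class as the reference group, the intercept is the working class mean, each b coefficient is a difference in means, and the regression F equals the one-way ANOVA F.

```python
# Hypothetical years of education by class code (not the GSS2000R data).
groups = {1: [10, 12], 2: [12, 13, 11], 3: [14, 15, 13], 4: [16, 18]}


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * p for x, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(X, y):
    """Least-squares coefficients from the normal equations X'X b = X'y."""
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


# Design matrix columns: intercept, lowerClass, middleClass, upperClass.
X, y = [], []
for code, values in groups.items():
    for v in values:
        X.append([1.0, float(code == 1), float(code == 3), float(code == 4)])
        y.append(float(v))

b = ols(X, y)
fitted = [sum(bi * xi for bi, xi in zip(b, row)) for row in X]
grand_mean = sum(y) / len(y)

# F statistic from the regression sums of squares.
ss_model = sum((f - grand_mean) ** 2 for f in fitted)
ss_resid = sum((yi - f) ** 2 for yi, f in zip(y, fitted))
f_regression = (ss_model / (len(b) - 1)) / (ss_resid / (len(y) - len(b)))

# F statistic from a one-way ANOVA computed directly from the group means.
means = {g: sum(v) / len(v) for g, v in groups.items()}
ss_between = sum(len(v) * (means[g] - grand_mean) ** 2 for g, v in groups.items())
ss_within = sum((x - means[g]) ** 2 for g, v in groups.items() for x in v)
f_anova = (ss_between / (len(groups) - 1)) / (ss_within / (len(y) - len(groups)))

print("intercept:", round(b[0], 3), "working class mean:", means[2])
print("middleClass b:", round(b[2], 3), "mean difference:", means[3] - means[2])
print("F regression:", round(f_regression, 3), "F ANOVA:", round(f_anova, 3))
```

Choosing a different reference group would change the intercept and the individual coefficients, but not the overall F.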

Comparison to One-way ANOVA of education by class – individual relationship In the post hoc test, the difference between the middle class and the working class was also 1.249 years of education, and was a statistically significant relationship.

Comparison to One-way ANOVA of education by class – individual relationship However, the calculations for the post hoc test are completely different from the test of the b coefficient in the regression, which is reasonable since they are very different tests. The test of the b coefficient is a test of the hypothesis that b is not equal to 0. Post hoc tests are not hypothesis tests. The only hypothesis tested in the One-Way ANOVA was that at least one of the group means was different from the others. The post hoc test provided additional information about the differences, but it is not a hypothesis test because no hypothesis was specified in advance of the statistical calculations. The significance of the test of the b coefficient was .001, while the significance of the post hoc test was .005. In this example we would make a similar interpretation, but that is not always the case.

Using linear contrasts to test specific group hypotheses - 1 It is possible to include a hypothesis test of differences between specific groups within the one-way ANOVA, using linear contrasts. Using the notation from the text, we would specify the linear contrast as the difference between the working class and the middle class. Since the problem indicated that middle class respondents had more education than working class respondents, we would write the contrast as: l = μ_{middle class} - μ_{working class}, where l is a linear contrast and the μ’s are group means.

Using linear contrasts to test specific group hypotheses - 2 If we explicitly include coefficients for the population means, the contrast equation l = μ_{middle class} - μ_{working class} becomes l = (+1) × μ_{middle class} + (-1) × μ_{working class}, and if we add in the means for the other groups with coefficients of zero, it becomes l = (0) × μ_{lower class} + (-1) × μ_{working class} + (+1) × μ_{middle class} + (0) × μ_{upper class}, which is the contrast we will enter into SPSS.
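The contrast can be evaluated by hand once the group means are known. The sketch below is a Python illustration under stated assumptions: it uses the equal-variance form of the test (pooled within-group mean square), the function name contrast_test is hypothetical, and the education values are invented rather than taken from GSS2000R.Sav.

```python
from math import sqrt


def contrast_test(groups, coefficients):
    """Estimate a linear contrast and its t statistic.

    groups: dict mapping group code -> list of scores.
    coefficients: dict mapping group code -> contrast coefficient.
    Assumes equal variances, so the pooled mean square error is used.
    Returns (estimate, standard_error, t, df).
    """
    n_total = sum(len(v) for v in groups.values())
    means = {g: sum(v) / len(v) for g, v in groups.items()}
    ss_within = sum((x - means[g]) ** 2 for g, v in groups.items() for x in v)
    df = n_total - len(groups)
    mse = ss_within / df
    estimate = sum(coefficients[g] * means[g] for g in groups)
    se = sqrt(mse * sum(coefficients[g] ** 2 / len(groups[g]) for g in groups))
    return estimate, se, estimate / se, df


# Hypothetical data keyed by class code: 1 lower, 2 working, 3 middle, 4 upper.
groups = {1: [10, 12], 2: [12, 13, 11], 3: [14, 15, 13], 4: [16, 18]}
# Coefficients in code order: 0, -1, +1, 0 (middle class minus working class).
coefficients = {1: 0, 2: -1, 3: 1, 4: 0}

estimate, se, t, df = contrast_test(groups, coefficients)
print("contrast estimate:", estimate)
print("t(%d) = %.3f" % (df, t))
```

The estimate is simply the middle class mean minus the working class mean, because the other two groups carry coefficients of zero.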

Testing a hypothesis comparing groups within One-Way ANOVA - 1 Select the Compare Means > One-Way ANOVA command from the Analyze menu.

Testing a hypothesis comparing groups within One-Way ANOVA - 2 First, move the dependent variable educ and the independent variable class into the list boxes. Second, click on the Contrasts button to add the linear contrast.

Testing a hypothesis comparing groups within One-Way ANOVA - 3 The contrast coefficients were: 0 for lower class, -1 for working class, +1 for middle class, and 0 for upper class. The contrasts must be entered in the same order that the variable is coded, i.e. from low to high codes for categories. First, type the contrast coefficient for the lower class group, 0, into the Coefficients text box. Second, click on the Add button to add the coefficient to the list box.

Testing a hypothesis comparing groups within One-Way ANOVA - 4 Add the contrast coefficients for the working class (-1), the middle class (+1), and the upper class (0) to the list box. Click on the Continue button to close the dialog box.

Testing a hypothesis comparing groups within One-Way ANOVA - 5 Click on the OK button to request the output.

Testing a hypothesis comparing groups within One-Way ANOVA - 6 The value and significance of the F-test are identical to the results obtained in the regression, as well as in the one-way ANOVA with the post hoc tests. Moreover, the results for the contrast test match the test of the b coefficient in the regression analysis (t(264) = 3.372, p < .01).

SPSS’ general linear models procedure SPSS has a command for directly computing general linear models that is much more versatile than the regression command that we just used. The procedure contains options and diagnostic statistics that are not available in its linear regression command. The default for group comparisons with this command is to compute contrasts with the group that has the highest numeric code. Since we want the comparison to be with the working class group, we will first change the numeric code for the group from 2 to 5 so that it is the highest numeric value.

Recoding the class variable - 1 To change the numeric coding for the working class category so it is the highest numeric value, we again select the Recode > Into Different Variables command from the Transform menu.

Recoding the class variable - 2 First, select the variable to be recoded, class, from the list of variables and move it to the Numeric Variable -> Output Variable list box. Second, type in the name for the new variable. Third, click on the Change button to replace the ? with this new variable name.

Recoding the class variable - 3 Next, click on the Old and New Values button to assign values to the new variable.

Recoding the class variable - 4 First, mark the System- or user-missing option button on the Old Value panel. Second, mark the System-missing option button on the New Value panel. Third, click on the Add button to include this recoding for the variable.

Recoding the class variable - 5 First, to recode the 2 = working class category to the new variable, mark the Value option button and type a 2 in the text box on the Old Value panel. Second, mark the Value option button and type a 5 in the text box on the New Value panel. This coding says: if they were originally in the working class category, they are assigned a value of 5 for the new variable. Third, click on the Add button to include this recoding for the variable.

Recoding the class variable - 6 First, since we want all of the other codes to remain the same, we click on the All other values option button. Second, mark the Copy old values option button to retain the codes for the remaining groups. Third, click on the Add button to include this recoding for the variable.

Recoding the class variable - 7 When we have completed the coding for the new variable, click on the Continue button.

Recoding the class variable - 8 Click on the OK button to create the new variable in the data editor.

Recoding the class variable - 9 We check the values in the data editor to make sure the recode worked as anticipated. In this example, we see that the 2’s for class are correctly recoded as 5’s.
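The three recoding rules (missing stays missing, 2 becomes 5, everything else is copied) can be summarised in a few lines. This is a Python sketch rather than SPSS; the function name recode_class is hypothetical and None is assumed to stand in for a missing value.

```python
def recode_class(class_code):
    """Move working class (2) to the highest code (5); copy all other codes."""
    if class_code is None:  # assumption: None represents a missing value
        return None
    return 5 if class_code == 2 else class_code


original = [1, 2, 3, 4, None, 2]
recoded = [recode_class(c) for c in original]
print(recoded)  # [1, 5, 3, 4, None, 5]

# Check mirroring the visual inspection in the data editor:
# every 2 became a 5 and nothing else changed.
assert all(new == 5 for old, new in zip(original, recoded) if old == 2)
assert all(new == old for old, new in zip(original, recoded) if old != 2)
```

After the recode, working class carries the highest numeric code, so it becomes the default comparison group described for the general linear model command.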

Using SPSS’ general linear models - 1 To solve the problem using SPSS’ General Linear Model command, select General Linear Model > Univariate from the Analyze menu. The univariate command indicates that we have a single dependent variable.

Using SPSS’ general linear models - 2 First, we move the dependent variable to the Dependent Variable text box. Second, we move the newly created independent variable to the Fixed Factors list box. Third, click on the Options button to specify additional output. While the univariate GLM command has numerous specifications, we only need one request for this problem. Notes on the list boxes: Fixed factors are those for which all possible codes are represented in the data set. Random factors are categorical variables which can take on values different from those in our data set. Covariates are interval level variables or variables we wish to treat as interval level.

Using SPSS’ general linear models - 3 First, mark the check box for Parameter estimates. This will compute and test the coefficients. Second, click on the Continue button to close the dialog box.

Using SPSS’ general linear models - 4 Click on the OK button to produce the output.

SPSS’ general linear models output The value and significance of the F-test are identical to the results obtained in the regression and the one-way ANOVA with the post hoc tests. Subjects in the middle class (code 3) had, on average, 1.249 more years of education than the working class. The difference is statistically significant and identical to the findings from the other comparisons (t(264) = 3.372, p < .01).