Categorical Variables in Regression


1 Categorical Variables in Regression
Let’s categorize students. What’s the purpose of ANOVA? If I wanted to know PSGE 7211

2 Guiding Questions How are ANOVA and regression related?
How do I analyze categorical variables in regression?

3 Reading Test Scores by Sex (T-test)

4 Reading Test Scores by Sex (BR)

5 What’s your birthday?
Let’s divide the room according to birthdays.
Count: Winter birthday, Spring birthday, Summer birthday, Autumn birthday

6 Methods of Coding
When is your birthday? 1. Winter 2. Spring 3. Summer 4. Fall
vs.
Is your b-day (YES / NO): 1. Winter? 1 2. Spring? 3. Summer? 4. Fall?

7 Methods of Coding
What is your religious affiliation? 1. Protestant 2. Catholic 3. Jewish 4. Muslim 5. Other (or none)
vs.
Are you (YES / NO): 1. Protestant? 1 2. Catholic? 3. Jewish? 4. Muslim? 5. Other?

8 Methods of Coding What is your religious affiliation? 1. Protestant
2. Catholic 3. Jewish 4. Muslim 5. Other (or none)

9 Methods of Coding: DUMMY CODES
Are you (YES / NO): 1. Protestant? 1 2. Catholic? 3. Jewish? 4. Muslim? 5. Other?

10 False Memory and Sexual Abuse
Bremner, Shobe, & Kihlstrom, 2000

11 False Memory ANOVA

12 ANOVA in SPSS
For one-way ANOVA (one factor)

13 ANOVA in SPSS

14 What about in Regression?
Step 1: Create categorical variables for group membership.
Need to create as many dummy variables as there are categories (g) minus 1.
3 categories = we need to create 2 dummy variables.
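The g - 1 rule can be sketched outside SPSS in a few lines of Python. The function name `dummy_code`, the group labels, and the choice of reference group are assumptions made for illustration; they do not come from the slides.

```python
def dummy_code(groups, reference):
    """Return (column names, rows) with one 0/1 column per category except the reference."""
    categories = sorted(set(groups))
    kept = [c for c in categories if c != reference]
    # Each case gets a 1 in the column of its own group and 0 elsewhere;
    # cases in the reference group get 0 in every column.
    rows = [[1 if g == c else 0 for c in kept] for g in groups]
    return kept, rows


# Hypothetical group labels: 3 categories, so 2 dummy variables.
groups = ["abused_ptsd", "abused_no_ptsd", "nonabused", "abused_ptsd"]
names, rows = dummy_code(groups, reference="nonabused")

print(names)
for g, r in zip(groups, rows):
    print(g, r)
```

The reference group is the one whose mean the intercept will reflect, so it is usually chosen to be the comparison or control group.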

15 Converting to Dummy Variables
Group                  VARIABLE 1: Abused, PTSD    VARIABLE 2: Abused, Non-PTSD
Abused, PTSD           1                           0
Abused, Non-PTSD       0                           1
Nonabused, Non-PTSD    0                           0
Intercept: the predicted value when the other variables = 0, so the intercept reflects the Non-abused, Non-PTSD group.
VARIABLE 1: examining effect of Abused, PTSD while controlling for Abused, Non-PTSD.
VARIABLE 2: examining effect of Abused, Non-PTSD while controlling for Abused, PTSD.
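A small Python sketch of what the dummy-coded coefficients mean. The scores below are hypothetical, not the false memory data. With only dummy variables as IVs, the least-squares intercept is the reference group's mean and each coefficient is that group's mean minus the reference mean; the check at the end confirms the residuals sum to zero within every group, which is what the least-squares normal equations require for an intercept plus dummy columns.

```python
from statistics import mean

# Hypothetical scores for three groups (illustration only).
scores = {
    "abused_ptsd": [5, 7, 6],
    "abused_no_ptsd": [3, 4, 2],
    "nonabused": [1, 2, 3],
}
reference = "nonabused"  # group coded 0 on both dummy variables

# Intercept = predicted value when both dummies are 0 = reference group mean.
intercept = mean(scores[reference])

# Each dummy coefficient = that group's mean minus the reference group's mean.
coefs = {g: mean(v) - intercept for g, v in scores.items() if g != reference}

# Check: residuals sum to zero within each group.
residual_sums = {}
for g, values in scores.items():
    predicted = intercept + coefs.get(g, 0)
    residual_sums[g] = sum(y - predicted for y in values)

print(intercept, coefs, residual_sums)
```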

16 Remember to check your recoding
List var group abusepts no_PTSD.

17 False Memory with Dummy Vars
Post-hoc info and the means for all three groups are provided

18 Need for g - 1 Dummy Variables
DV: Group Membership IVs: 2 dummy variables Note: 100% of variance is accounted for!
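Why g - 1 is enough can be shown with a short Python sketch. The group codes 1, 2, 3 and the six cases are hypothetical. Two dummy variables reproduce group membership exactly (so they account for 100% of its variance), and a third dummy would be a perfect linear combination of the first two.

```python
# Hypothetical group codes for six cases.
group = [1, 1, 2, 2, 3, 3]

d1 = [int(g == 1) for g in group]
d2 = [int(g == 2) for g in group]
d3 = [int(g == 3) for g in group]  # the dummy we do NOT need

# Group membership is an exact linear function of d1 and d2,
# so regressing group on d1 and d2 leaves no residual (R squared = 1).
predicted_group = [3 - 2 * a - 1 * b for a, b in zip(d1, d2)]

# The third dummy carries no new information: d3 = 1 - d1 - d2.
rebuilt_d3 = [1 - a - b for a, b in zip(d1, d2)]

print(predicted_group == group, rebuilt_d3 == d3)
```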

19 Was Regression Necessary?
Doesn’t make much sense to run a multiple regression with only categorical IVs (much easier to run an ANOVA) But regression is great if you want to include both categorical and continuous variables

20 Effect Coding
Group                  EFFECT 1 (Abused, PTSD)    EFFECT 2 (Abused, Non-PTSD)
Abused, PTSD           1                          0
Abused, Non-PTSD       0                          1
Nonabused, Non-PTSD    -1                         -1
Similar to dummy coding, but the group that does not get its own variable in the analysis (the contrast group) is assigned -1 on every effect variable.
Intercept reflects the grand mean (or overall mean) of all three groups. With unequal group sizes, this is the unweighted mean of the three group means.
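A Python sketch of how effect-coded coefficients relate to the group means. The scores are hypothetical and deliberately unbalanced. Because three coefficients fit three group means exactly, the intercept is the unweighted mean of the group means and each effect is that group's mean minus the intercept. The contrast group (coded -1, -1) is predicted as intercept minus both effects.

```python
from statistics import mean

# Hypothetical, unbalanced data (illustration only).
scores = {
    "abused_ptsd": [5, 7, 6],
    "abused_no_ptsd": [3, 4, 2],
    "nonabused": [2, 3, 4, 3],  # contrast group, coded -1 on both effects
}
group_means = {g: mean(v) for g, v in scores.items()}

# Intercept under effect coding: unweighted mean of the group means.
intercept = mean(group_means.values())

# Each effect = group mean minus the intercept.
effects = {g: group_means[g] - intercept for g in ("abused_ptsd", "abused_no_ptsd")}

# The contrast group is recovered as intercept - effect1 - effect2.
contrast_prediction = intercept - sum(effects.values())

# For comparison: the weighted grand mean over all cases.
all_scores = [y for v in scores.values() for y in v]
grand_mean = sum(all_scores) / len(all_scores)

print(intercept, effects, contrast_prediction, grand_mean)
```

With equal group sizes the intercept and the weighted grand mean coincide; here they differ slightly because the contrast group has one extra case.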

21 Criterion Scaling Do you have a categorical variable with a lot of categories? Instead of creating many dummy-coded variables, you can use criterion scaling where you form a single variable but each member of each group is coded with that group’s mean score
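Criterion scaling as described here can be sketched in Python. The function name `criterion_scale` and the data are hypothetical. Every case's category code is replaced by the mean of the DV within that category, giving one predictor instead of many dummy variables.

```python
from statistics import mean


def criterion_scale(groups, outcome):
    """Replace each case's category with the mean outcome of that category."""
    by_group = {}
    for g, y in zip(groups, outcome):
        by_group.setdefault(g, []).append(y)
    group_means = {g: mean(v) for g, v in by_group.items()}
    return [group_means[g] for g in groups]


# Hypothetical categories and DV scores.
groups = ["a", "b", "a", "c", "b"]
outcome = [2, 4, 4, 9, 6]
scaled = criterion_scale(groups, outcome)
print(scaled)
```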

22 Summary

23 Recoding in SPSS Step 1

24 Recoding in SPSS
Step 2: Select the variable you want to dummy code (gender), then enter info about the new variable (the variable name for the new dummy-coded variable) in “output variable” and click “change”.

25 Recoding in SPSS
Step 3: Select “Old and New Values”

26 Recoding in SPSS
Step 4: Change old values (1,2) into new ones (0,1)
OLD NEW

27 Recoding in SPSS
Step 4: Change old values (1,2) into new ones (0,1)
Press continue, and okay. Done!

28 Recoding in SPSS
RECODE gender (MISSING=SYSMIS) (2=0) (1=1) INTO DummyGender.
EXECUTE.
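For readers working outside SPSS, the same recode can be sketched in Python. The function name `recode_gender` is hypothetical, and `None` stands in for a missing value. The mapping (2 to 0, 1 to 1) is the one in the syntax above; any other code is returned as missing here, which is an assumption about how unlisted values should be handled.

```python
def recode_gender(value):
    """Map the original codes to a 0/1 dummy; missing stays missing."""
    mapping = {2: 0, 1: 1}
    if value is None:
        return None
    # Codes not listed in the mapping are treated as missing (assumption).
    return mapping.get(value)


gender = [1, 2, None, 9]
dummy_gender = [recode_gender(v) for v in gender]
print(dummy_gender)
```

As with the LIST check earlier in the deck, it is worth printing old and new values side by side before running any analysis.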

29 HW 7
The purpose of this HW is to get a better understanding of the interrelationship between ANOVA and MR. To this end, run the following analyses on a dataset of your choice:
A one-way ANOVA with an IV with at least three categories or factors (e.g.,
Repeat this same analysis using Multiple Regression with dummy-coded variables
Demonstrate mathematically that these analyses are essentially the same. Make frequent references to the output (specific stats) to demonstrate equivalence across the ANOVA and your regression analysis.
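One way to see the equivalence, sketched in Python with hypothetical scores: a regression on g - 1 dummy variables predicts each case's group mean, so its regression sum of squares equals the ANOVA between-groups sum of squares. R squared is then SS between over SS total, and the regression F computed from R squared matches the ANOVA F. The function name `anova_and_regression_f` is an assumption for illustration.

```python
def anova_and_regression_f(groups):
    """Return (ANOVA F, R squared, regression F) for a dict of group -> scores."""
    all_scores = [y for v in groups.values() for y in v]
    n, g = len(all_scores), len(groups)
    grand_mean = sum(all_scores) / n

    # One-way ANOVA sums of squares.
    ss_between = sum(len(v) * (sum(v) / len(v) - grand_mean) ** 2 for v in groups.values())
    ss_within = sum((y - sum(v) / len(v)) ** 2 for v in groups.values() for y in v)
    f_anova = (ss_between / (g - 1)) / (ss_within / (n - g))

    # Dummy-variable regression: fitted values are the group means,
    # so SS regression = SS between and SS residual = SS within.
    r2 = ss_between / (ss_between + ss_within)
    f_regression = (r2 / (g - 1)) / ((1 - r2) / (n - g))
    return f_anova, r2, f_regression


# Hypothetical data for three groups.
data = {"g1": [5, 7, 6], "g2": [3, 4, 2], "g3": [1, 2, 3]}
f_anova, r2, f_regression = anova_and_regression_f(data)
print(f_anova, r2, f_regression)
```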

30 Moderation = Interaction
Interaction effects = Moderation: the magnitude of the effect of one variable depends on the level of another.
There is also a handout on this topic in the files containing answers to exercises.

31 Self-esteem, Sex, & Achv’t

32 Self-esteem on Sex & Achv’t
NEWSEX = 1 = FEMALES

33 Possible Interaction?
Example of a cross-over or disordinal interaction
Example of how gender moderates effect of achievement on self-esteem

34 Step 1 - Center Center the CONTINUOUS IV of interest (the one that will be used in the interaction term)

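35 Centering
Multicollinearity occurs when IVs are highly correlated (r > .8 or .9).
Multicollinearity makes regression equations unstable: coefficient estimates get large standard errors.
Strictly, regression assumes only that no IV is a perfect linear combination of the others. High but imperfect correlation among IVs does not violate an assumption, but it makes individual coefficients hard to estimate and interpret.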

36 Centering
To center, you create a new variable by subtracting the mean from the original achievement variable.
Compute ACH_CENT=BYTESTS – EXECUTE.
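Centering itself is a one-line operation, sketched here in Python. The achievement scores are hypothetical, and `center` is an illustrative helper name. After centering, the new variable has a mean of zero, so a score of 0 means "at the sample mean".

```python
from statistics import mean


def center(values):
    """Subtract the sample mean from every value."""
    m = mean(values)
    return [v - m for v in values]


# Hypothetical achievement scores.
achievement = [40, 50, 60, 70]
ach_cent = center(achievement)
print(ach_cent, sum(ach_cent))
```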

37 Step 2: Create Cross-Product Term
To create the interaction (or cross-product) term, you multiply the two variables (gender x achievement), using the centered continuous variable.
Compute SEX_ACH=Sex * ACH_CENT.
EXECUTE.

38 Self-esteem, Sex, and Achv’t
Centering reduces multicollinearity
Without centering, this r would be higher
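The effect of centering on the cross-product can be checked with a small Python sketch. The data are hypothetical, and the correlation shown is the one between the sex dummy and the cross-product term, which is an assumption about which r the slide refers to. `pearson_r` is a hand-written helper so that the example depends only on the standard library.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation computed from deviation scores."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical data: sex dummy and raw achievement scores.
sex = [0, 1, 0, 1, 0, 1]
achievement = [40, 45, 50, 55, 60, 65]

ach_mean = sum(achievement) / len(achievement)
ach_cent = [a - ach_mean for a in achievement]

# Cross-product built from the raw score versus the centered score.
product_raw = [s * a for s, a in zip(sex, achievement)]
product_cent = [s * a for s, a in zip(sex, ach_cent)]

r_raw = pearson_r(sex, product_raw)
r_centered = pearson_r(sex, product_cent)
print(round(r_raw, 3), round(r_centered, 3))
```

In this made-up sample the cross-product built from raw scores is almost redundant with the dummy, while the centered version is only weakly correlated with it.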

39 Step 3: Run Regression Analysis
Model 1: SEX, ACH_CENT
Model 2: SEX_ACH (added)

40 Self-esteem, Sex, & Achv’t
No significant R² change, so the interaction is not significant (the regression lines for boys and girls do not differ reliably from parallel).
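The test of the R² change between two nested models uses the standard F ratio, sketched here in Python. The function name `r2_change_f` and the numbers in the example are hypothetical; they are not the values from the self-esteem output. The result is compared with the critical F for (df1, df2).

```python
def r2_change_f(r2_reduced, r2_full, n, k_reduced, k_full):
    """F test for the increase in R squared when predictors are added.

    n is the sample size; k_reduced and k_full are the numbers of
    predictors in the smaller and larger models.
    """
    df1 = k_full - k_reduced
    df2 = n - k_full - 1
    f = ((r2_full - r2_reduced) / df1) / ((1 - r2_full) / df2)
    return f, df1, df2


# Hypothetical values: adding one cross-product term raises R squared from .20 to .21.
f, df1, df2 = r2_change_f(0.20, 0.21, n=104, k_reduced=2, k_full=3)
print(round(f, 3), df1, df2)
```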

41 Self-esteem, Sex, & Achv’t
Since interaction is not significant, concentrate on interpreting coefficients from Model 1... What does the intercept now represent?
Note: Interpretation on p. 136

42 Testing Interactions in MR
The procedure is generally the same for testing interactions with two continuous variables.
Interactions are less stable than main effects; replication of interaction effects is somewhat rare.
Don’t throw out the model just because you don’t get a statistically significant main effect or interaction! Lack of statistical significance is sometimes equally important.

43 Ethnic background & Achv’t on Self-esteem
A significant interaction?

44 Ethnicity x Achievement
Conduct follow up to see where the statistical significance comes from: Is regression of SE on Achv’t significant for Whites or Non-Whites?

45 Self-esteem, Ethnicity & Achv’t
Majority = 1
Since interaction was statistically significant, interpret coefficients from Model 2... What does the intercept represent?

46 Recap
Center the IV of interest
Create cross-products by multiplying centered IV x the dummy variable
Regress DV on IVs - use centered IV
Add the cross-products sequentially
Is step statistically significant?
If so, graph and conduct follow up analyses
If not, then interpret the findings without the cross-product
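The recap can be sketched end to end in Python using only the standard library. Everything here is an illustration under stated assumptions: the data are made up, `ols` is a small hand-rolled least-squares solver (normal equations with Gauss-Jordan elimination) rather than SPSS's REGRESSION procedure, and the variable names only loosely mirror the slides.

```python
def ols(columns, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))]
         for r in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[r][p] / A[r][r] for r in range(p)]


def r_squared(columns, y):
    """Proportion of variance in y explained by the fitted model."""
    b = ols(columns, y)
    n = len(y)
    fitted = [b[0] + sum(bj * col[i] for bj, col in zip(b[1:], columns)) for i in range(n)]
    y_bar = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Hypothetical data: dummy-coded sex, achievement, and self-esteem.
sex = [0, 0, 0, 0, 1, 1, 1, 1]
achievement = [1, 2, 3, 4, 1, 2, 3, 4]
self_esteem = [3.1, 3.9, 5.2, 5.8, 5.0, 7.1, 8.9, 11.0]

# 1. Center the continuous IV.
ach_mean = sum(achievement) / len(achievement)
ach_cent = [a - ach_mean for a in achievement]

# 2. Create the cross-product from the centered IV and the dummy.
sex_ach = [s * a for s, a in zip(sex, ach_cent)]

# 3. Regress sequentially: main effects first, then add the cross-product.
r2_model1 = r_squared([sex, ach_cent], self_esteem)
r2_model2 = r_squared([sex, ach_cent, sex_ach], self_esteem)
r2_change = r2_model2 - r2_model1

print(round(r2_model1, 4), round(r2_model2, 4), round(r2_change, 4))
```

Whether the R² change is statistically significant is a separate question answered by the F test for the added step; this sketch only computes the two R² values.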

47 HW 8
The purpose of this homework is to investigate interaction effects, specifically interactions between categorical and continuous variables. Said differently, the purpose of this assignment is to investigate moderator effects. For this assignment, conduct a sequential (or hierarchical) MR that tests for a specific interaction. Remember to center any continuous independent variables. Note that you need not report a statistically significant interaction effect; however, if you do find statistical significance, make sure you follow the procedures for interpreting interactions as outlined in your textbook.
For the write up:
Tweak the introduction to justify looking at interaction
Make sure your research questions/hypotheses indicate examining moderation
Write up results (formal)
Interpret results (discussion)

48 SPSS - gender, SE on Achv’t
Step 1 - Center the IV of interest (self-efficacy)
Run descriptives to determine mean of self-efficacy
DESCRIPTIVES VARIABLES=efficyw2 /STATISTICS=MEAN STDDEV MIN MAX.

49 SPSS - gender, SE on Achv’t
Step 1 - Center the IV of interest (self-efficacy)
Compute centered IV (you should also center all other IVs in your model)
COMPUTE cefficyw2=efficyw EXECUTE.
DESCRIPTIVES VARIABLES=cefficyw2 /STATISTICS=MEAN STDDEV MIN MAX.

50 SPSS - gender, SE on Achv’t
Step 2 - Create cross-products by multiplying centered IV x the dummy variable (gender)
COMPUTE efficyXgender = cefficyw2 * gender.
EXECUTE.

51 SPSS - gender, SE on Achv’t
Step 3 - Regress DV on IVs - use centered IV Add the cross-products sequentially

52 SPSS - gender, SE on Achv’t
Step 4 - Regress DV on IVs - use centered IV
Is step significant?
If so, graph and conduct follow up analyses:
Graph (plot self-efficacy on Achievement by gender)
Split file, run regressions by gender
If not, then interpret the findings without the cross-product
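The split-file follow-up amounts to running a separate bivariate regression within each group. A Python sketch with hypothetical data follows; the helper names `simple_regression` and `regress_by_group` are assumptions for illustration, and the made-up scores are built so the two groups have clearly different slopes.

```python
def simple_regression(x, y):
    """Return (intercept, slope) for a bivariate least-squares regression."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope


def regress_by_group(group, x, y):
    """Run the bivariate regression separately within each group."""
    results = {}
    for g in sorted(set(group)):
        xs = [a for k, a in zip(group, x) if k == g]
        ys = [b for k, b in zip(group, y) if k == g]
        results[g] = simple_regression(xs, ys)
    return results


# Hypothetical data: gender dummy, self-efficacy (IV), course points (DV).
gender = [0, 0, 0, 1, 1, 1]
efficacy = [1, 2, 3, 1, 2, 3]
course_points = [3, 5, 7, 3, 2, 1]

by_gender = regress_by_group(gender, efficacy, course_points)
print(by_gender)
```

Differing slopes across groups are what a significant interaction looks like when graphed; similar slopes correspond to roughly parallel lines.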

53 Graphing

54 Graphing
Make sure you fit line at subtotal

55 Split File
Then run your bivariate regression analysis:
DV – TotalCoursepts
IV – efficyw2

56 Split File No gender effects!

57 Questions and Clarification
What is still confusing?

