Department of Cognitive Science
Michael J. Kalsher
Adv. Experimental Methods & Statistics, PSYC 4310 / COGS 6310
Dummy Coding
PSYC 4310/6310 Advanced Experimental Methods and Statistics © 2011, Michael Kalsher
Dummy Coding: What is it?
Linear regression works with score-level variables (e.g., age, income level, weight). But not all data are score level:
–Sex (male vs. female)
–Location (RPI vs. Sage vs. HVCC)
–Political affiliation (Democrat, Independent, Republican)
Dummy coding is a way of representing groups of people using only zeros and ones.
Nominal values as binary numbers
Binary: 0 = off/absent, 1 = on/present
Nominal variables can be treated as the presence (or absence) of features.
Example: presence of maleness / presence of femaleness
Creating Dummy Variables
Dummy coding produces N-1 “dummy variables” (where N = the original number of categories).
Consider Political Affiliation: Democrat, Independent, Republican
–Here there are three categories, which will be recoded into two dummy-coded variables.
Steps:
–1. Assign one category as the “Baseline” group.
–2. Create two variables, X and Y, representing the other two groups.
–3. X=1 if in group X; Y=1 if in group Y; both are 0 if in the baseline group.
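The three steps above can be sketched in a few lines of Python. This is only an illustration, not part of the original slides: the function name `dummy_code`, the choice of “Independent” as the baseline, and the four sample values are all assumptions made for the example.

```python
def dummy_code(values, baseline):
    """Return N-1 dummy variables (lists of 0/1), one per non-baseline category."""
    categories = sorted(set(values))
    if baseline not in categories:
        raise ValueError("baseline must be one of the observed categories")
    return {
        category: [1 if value == category else 0 for value in values]
        for category in categories
        if category != baseline  # the baseline group gets no variable of its own
    }


# Hypothetical sample values, chosen only to show the coding scheme.
affiliation = ["Democrat", "Independent", "Republican", "Democrat"]
dummies = dummy_code(affiliation, baseline="Independent")

for name, column in dummies.items():
    print(name, column)
```

Three categories yield two dummy variables here, and the “Independent” case is identified by having 0 on both of them.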
Example: Presidential Ratings
During class in the Fall 2010 semester, students in PSYC 4310 were asked to indicate:
–Rating of President Obama’s performance (1=Strongly Disapprove; 5=Strongly Approve)
–Party affiliation (1=Democrat; 2=Independent; 3=Republican)
One approach is to analyze the data using an “experimental technique” such as ANOVA. But suppose you wanted to analyze the data in the context of regression with several other “predictor” variables? Another approach is to recode Party Affiliation into Dummy Coded Variables so that we can use this variable in a regression analysis.
Figure: President’s performance ratings as a function of party affiliation. Key: Affiliation: 1 = Democrat, 2 = Independent, 3 = Republican. Rating: 1 = Poor Performance, 5 = Great Performance.
#1: Select the variable.
#2: Name the 1st Dummy Code, then click on “Change”.
Next, click on “Old and New Values”.
In this step, we want all “Democrats” to get a code of “1” and everyone else to get a code of “0”. To do this, we type in “1” for the “Old Value” and “1” for the “New Value”, then click on “Add”.
Next, we need to change the remaining groups to have a value of “0” for the first dummy variable.
1. Select “All other values”.
2. Type in “0” adjacent to “Value” in the section labeled “New Value”.
3. Click on “Add”.
4. Click on “Continue”.
You should now be back at the main “Recode into Different Variables” dialogue box. Click “Ok”, then check the Data View and you will see the “Democrat Dummy Code”.
1. Click “Reset”.
2. Name the 2nd Dummy Code (Republican).
3. Click on “Change”.
4. Click on “Old and New Values”.
Change the “Old Value” for Republicans (3) to the “New Value” (1) and “All other values” to “0”, then click on “Continue”, which takes you back to the main dialogue box. Click “Ok”, then check the Data View and you will see the “Republican Dummy Code”.
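The recoding rule applied in the dialogue boxes (one old value becomes 1, all other values become 0) can be mirrored in a short Python sketch. The helper `recode` and the six party codes below are hypothetical and serve only to illustrate the rule; they are not the class data.

```python
def recode(old_values, old_to_new, all_other):
    """Map the listed old values to new values; every other value gets all_other."""
    return [old_to_new.get(value, all_other) for value in old_values]


# Hypothetical party codes (1=Democrat, 2=Independent, 3=Republican).
party = [1, 2, 3, 1, 3, 2]

democrat_dummy = recode(party, {1: 1}, all_other=0)    # old value 1 -> 1, else 0
republican_dummy = recode(party, {3: 1}, all_other=0)  # old value 3 -> 1, else 0

# Show the original variable next to the two new columns.
print("party democrat republican")
for p, d, r in zip(party, democrat_dummy, republican_dummy):
    print(f"{p:>5} {d:>8} {r:>10}")
```

Independents (code 2) receive 0 on both new variables, which is what makes them the baseline group.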
The Data View now shows two columns containing the recoded “Democrat” and “Republican” Dummy Codes.
Now we can run the regression analysis using the Dummy Coded variables as predictors and Rating as the dependent variable (DV).
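To see what such a regression estimates, here is a minimal Python sketch that fits ordinary least squares through the normal equations. It is an assumption-laden illustration rather than the procedure used in class: the helpers `solve` and `ols` are hypothetical, and the nine ratings are made up for the example, not the Fall 2010 data.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(predictors, y):
    """Least-squares coefficients (intercept first) from the normal equations."""
    rows = [[1.0] + [float(col[i]) for col in predictors] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Made-up illustrative data (1=Democrat, 2=Independent, 3=Republican).
party = [1, 1, 1, 2, 2, 2, 3, 3, 3]
rating = [5, 4, 4, 3, 3, 4, 2, 1, 2]
democrat = [1 if p == 1 else 0 for p in party]
republican = [1 if p == 3 else 0 for p in party]

intercept, b_democrat, b_republican = ols([democrat, republican], rating)
print(round(intercept, 2), round(b_democrat, 2), round(b_republican, 2))
```

With dummy-coded predictors, the intercept equals the mean rating of the baseline group (Independents in this sketch), and each coefficient equals the difference between that group's mean and the baseline mean. For the made-up values above, that is an intercept of about 3.33, a Democrat coefficient of about 1.00, and a Republican coefficient of about -1.67.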