DTC Quantitative Methods Three (or more) Variables: Extensions to Analyses Using Cross-tabulations or ANOVA Thursday 1st March 2012.

Presentation transcript:


Multivariate analysis So far we have tended to concentrate on two-way relationships (e.g. between gender and participation in sports). But we have started to look at three-way relationships (e.g. the gendering of the relationship between age and participation in sports). Social relationships and phenomena are usually more complex than is allowed for in a bivariate analysis. Multivariate analyses are thus commonly used as a reflection of this complexity. Hence, this week we will look briefly at the rationale for multivariate analysis and think about both cross-tabular and Analysis of Variance (ANOVA) techniques for conducting this form of analysis.

Multivariate analysis De Vaus (1996: 198) suggests that we can use multivariate analysis to elaborate bivariate relationships, in order to answer the following questions:
1. Why does the relationship [between two variables] exist? What are the mechanisms and processes by which one variable is linked to another?
2. What is the nature of the relationship? Is it causal or non-causal?
3. How general is the relationship? Does it hold for people in general, or is it specific to certain subgroups?
This is because multivariate analysis enables the identification of:
- Spurious relationships
- Intervening variables
- The replication of relationships
- The specification of relationships

Spurious relationships A spurious relationship exists where two variables are not related but a relationship between them is generated by their relationships with a third variable. For example: [Diagram: Age, Height and Reading ability, with one link labelled 'Spurious relationship']

Intervening variables Sometimes, although there is a real (non-spurious) relationship between two variables, we want to establish why that relationship exists. For example, if we discover that there is a relationship between risk of unemployment and ethnicity, we want to know why that is the case. One possibility is that some ethnic groups have lower educational levels and that this has implications for their ability to get work. In this case education would be an intervening variable. Intervening variables enable us to answer questions about the bivariate relationship between two variables, suggesting that (in this case) the relationship between ethnicity and unemployment is not direct but (at least in part) occurs via educational levels. [Diagram: Ethnicity, Education, Unemployment]

Is it spurious or intervening? When we do statistical tests we will obtain similar results for a spurious relationship and an intervening variable: in both cases the effect of the independent variable on the dependent variable will be reduced once we control for the third variable. So how do we know whether this third variable provides evidence of a spurious relationship or is an intervening variable?
– There is no hard-and-fast statistical rule for deciding this.
– But if we are suggesting that a variable is intervening, the logic of the process must make sense, i.e. you must have a cogent theoretical reason for thinking that your independent variable affects the intervening variable, which in turn affects the dependent variable.
– This kind of causal process is easiest to argue for when the timing of events supports it, i.e. when the intervening variable can be seen to occur in between the independent and dependent variables (e.g. education in the earlier example of the relationship between ethnicity and unemployment).

Replication Sometimes when we have found a basic (‘zero-order’) relationship between two variables (e.g. ethnicity and unemployment), we want to demonstrate that this relationship exists within different subgroups of the population (e.g. for both men and women; for those of different ages…). Where the relationship is replicated we can rule out the possibility that it is produced by the variable in question, either as an intervening variable or in a spurious way.

Specification Sometimes a particular variable only has an effect in specific situations. The variable that determines these situations is said to interact with the independent variable. For instance, an example in De Vaus’s book suggests that going to a religious school makes boys more religious but has little or no effect on girls. In this case type of school interacts with gender: religious education only affects students’ religiosity in combination with being male.

Specification (interactions) Graphical representation of the relationship between religious education and religiousness, controlling for sex: [Two line graphs, each plotting Religiousness (low to high) against 'How religious was your education?' (Not at all to Very), with separate lines for boys and girls. Panel titles: 'Interaction between sex and religiousness of school' and 'No interaction'.]
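One way to put a number on the idea of specification is to treat an interaction in a two-by-two layout as a difference between differences. The sketch below does this with invented cell means; the scores, the 0-10 scale and the names `mean_religiousness`, `schooling_effect` and `interaction_contrast` are all hypothetical and are not taken from De Vaus or from the slides.

```python
# Hypothetical mean religiousness scores (0-10 scale) by sex and type of schooling.
# These numbers are invented purely to illustrate what an interaction looks like.
mean_religiousness = {
    ("boys", "not religious"): 3.0,
    ("boys", "very religious"): 7.0,
    ("girls", "not religious"): 5.0,
    ("girls", "very religious"): 5.5,
}


def schooling_effect(means, sex):
    """Mean religiousness after very religious schooling minus after non-religious schooling."""
    return means[(sex, "very religious")] - means[(sex, "not religious")]


def interaction_contrast(means):
    """Boys' schooling effect minus girls' schooling effect (zero means no interaction)."""
    return schooling_effect(means, "boys") - schooling_effect(means, "girls")


effect_boys = schooling_effect(mean_religiousness, "boys")
effect_girls = schooling_effect(mean_religiousness, "girls")
contrast = interaction_contrast(mean_religiousness)
print(effect_boys, effect_girls, contrast)
```

In graphical terms, a contrast of zero corresponds to parallel lines for boys and girls, and a non-zero contrast to lines with different slopes. Whether a non-zero contrast is statistically significant is a separate question that this sketch does not address.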

Using Cramér’s V to classify a multivariate situation If we use SPSS to produce a cross-tabulation of two variables, then we can elaborate this relationship by introducing a third variable as a layer variable. Examining the Cramér’s V values for the original cross-tabulation and for the layers of the elaborated cross-tabulation tells us what kind of situation we are looking at:
- If the Cramér’s V values for the layers are all similar, then we have a situation of replication.
- If the Cramér’s V values are smaller for the layered cross-tabulation than the value for the original cross-tabulation, then we either have a situation where the third variable is acting as an intervening variable, or one where it is inducing a spurious relationship between the original two variables. Deciding between these two options involves reflecting on whether the third variable makes sense conceptually as part of some causal mechanism linking the original two variables.

Using Cramér’s V to classify a multivariate situation (continued)
- If the Cramér’s V values for the layered cross-tabulation vary in size, perhaps with some being smaller than the original value and some being as large or larger than it, then the situation is one of specification.
- However, if one or more of the Cramér’s V values is larger than the original value, then a failure to take account of the third variable in the first instance may also have been suppressing an underlying relationship between the two variables. This latter situation is a variation on the theme of spuriousness: in this case, the absence of a bivariate relationship is spurious rather than the presence of one!
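The rules of thumb on these two slides can be sketched in a few lines of Python. This is an illustration only: the counts are invented, the function names are hypothetical, and the tolerance `tol` used to decide what counts as 'similar' is an assumption, since the slides give no numerical threshold.

```python
from math import sqrt


def chi_square(table):
    """Pearson chi-square statistic for a table given as a list of rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def cramers_v(table):
    """Cramer's V = sqrt(chi-square / (n * (min(rows, columns) - 1)))."""
    n = sum(sum(row) for row in table)
    k = min(len(table), len(table[0])) - 1
    return sqrt(chi_square(table) / (n * k))


def classify(original_v, layer_vs, tol=0.05):
    """Apply the slides' rules of thumb; tol is an assumed threshold for 'similar'."""
    if all(abs(v - original_v) <= tol for v in layer_vs):
        return "replication"
    if all(v < original_v - tol for v in layer_vs):
        return "intervening or spurious"
    if any(v > original_v + tol for v in layer_vs):
        return "specification, possibly with suppression"
    return "specification"


# Invented counts: the two layers add up, cell by cell, to the original table.
original = [[30, 10], [10, 30]]
layers = [[[20, 5], [5, 20]], [[10, 5], [5, 10]]]

v_original = cramers_v(original)
v_layers = [cramers_v(layer) for layer in layers]
print(v_original, v_layers, classify(v_original, v_layers))
```

The classification is only a starting point: as the slides stress, choosing between an intervening variable and a spurious relationship is a conceptual decision, not one a function can make.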

More generally… Multivariate analyses can utilise a variety of techniques (depending on the form of the data, research questions to be addressed, etc.; we will be looking at multiple (linear) regression, but other ‘popular’ techniques include logistic regression and log-linear models), in order to determine whether the relationship between two variables persists or is altered when we ‘control for’ a third (or fourth, or fifth...) variable. Multivariate analysis can also enable us to establish which variable(s) has/have the greatest impact on a dependent variable, e.g. is sex more important than ‘race’ in determining income? It is often important for a multivariate analysis to check for interactions between the effects of independent variables, as discussed earlier under the heading of specification.

An example (from BSA 2006) View on whether pre-marital sex wrong, by whether respondent has a religion (row percentages)

Has religion?   Always   Mostly   Sometimes   Rarely   Not at all   Total
No              …        3.2%     8.8%        10.5%    76.1%        100.0%
Yes             …        10.8%    16.2%       10.3%    52.3%        100.0%
Total           …        7.3%     12.8%       10.4%    63.3%        100.0%

χ² (4 d.f.) = … (p < 0.001); Cramér’s V = 0.290

But if we split the crosstabulation by age... Under 45: χ² (4 d.f.) = … (p < 0.001); Cramér’s V = … 45 or over: χ² (4 d.f.) = … (p < 0.001); Cramér’s V = … Hence, to some extent, part of the bivariate relationship was a spurious consequence of age (since … = 72.66, which is less than 86.97), and the Cramér’s V values show elements both of replication (since there is a statistically significant relationship for both age groups) and of specification (since the relationship appears weaker for the younger age group, i.e. the effects of religion and age interact).

Testing for interactions Unfortunately, as mentioned last week, testing for an interaction in a three-way cross-tabulation requires knowledge of an additional technique (log-linear models). Testing for an interaction within an Analysis of Variance involving one dependent variable and two independent variables (Two-way ANOVA) is rather more straightforward…

Starting with some means… BSA 2006: At what age did you retire work? (Q296)

NS-SEC class                                  N    Mean
Employers in large org.; higher manag. & pr   …    …
Lower profess & manag; higher techn. & su     …    …
Intermediate occupations                      …    …
Employers in small org.; own account work     …    …
Lower supervisory & technical occupation      …    …
Semi-routine occupations                      …    …
Routine occupations                           …    …
Total                                         …    …

… and then a One-Way ANOVA BSA 2006: At what age did you retire work? (Q296)

                 Sum of Squares   df   Mean Square   F   Sig.
Between Groups   …                …    …             …   …
Within Groups    …                …    …
Total            …                …

Since p=0.001 < 0.05, there is a significant relationship between occupational class (NS-SEC) and retirement age. … but we need to remember to reflect on whether the assumptions of ANOVA are met in this case!
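For readers who want to see where the columns of a one-way ANOVA table come from, here is a minimal sketch in plain Python. The three groups of retirement ages are invented for illustration (they are not the BSA 2006 figures), and `one_way_anova` is a hypothetical helper name. The sketch stops at the F statistic; obtaining the significance value would additionally require the F distribution, which is not covered here.

```python
def one_way_anova(groups):
    """Return the sums of squares, degrees of freedom and F for a list of groups."""
    values = [y for group in groups for y in group]
    grand_mean = sum(values) / len(values)
    group_means = [sum(group) / len(group) for group in groups]

    # Between-groups: how far each group mean is from the grand mean.
    ss_between = sum(
        len(group) * (m - grand_mean) ** 2 for group, m in zip(groups, group_means)
    )
    # Within-groups: how far each observation is from its own group mean.
    ss_within = sum(
        (y - m) ** 2 for group, m in zip(groups, group_means) for y in group
    )

    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "df_between": df_between,
        "df_within": df_within,
        "F": f_ratio,
    }


# Invented retirement ages for three hypothetical occupational classes.
retirement_ages = [[61, 63, 65], [58, 60, 62], [55, 57, 59]]
result = one_way_anova(retirement_ages)
print(result)
```

The mean squares in the table are the sums of squares divided by their degrees of freedom, and F is the ratio of the between-groups mean square to the within-groups mean square.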

Assumptions: a reminder ANOVA makes an assumption of homogeneity of variance (i.e. that the spread of values is the same in each of the groups). Furthermore, ANOVA assumes that the variable has (approximately) a normal distribution within each of the groups. Levene’s test of the former assumption results in p<0.001, i.e. the assumption is not plausible. … and it is also not self-evident that retirement ages would have a normal distribution!
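The logic of Levene's test can be sketched as a one-way ANOVA carried out on each observation's absolute distance from its own group mean: if the groups have similar spreads, those distances have similar averages and the statistic is small. The code below uses the classic mean-centred form; that this matches the SPSS output exactly is an assumption, the data are invented, and `levene_w` is a hypothetical name.

```python
def f_ratio(groups):
    """One-way ANOVA F statistic for a list of groups of numbers."""
    values = [y for group in groups for y in group]
    grand_mean = sum(values) / len(values)
    means = [sum(group) / len(group) for group in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


def levene_w(groups):
    """Levene's W: the ANOVA F ratio of absolute deviations from each group mean."""
    deviations = []
    for group in groups:
        group_mean = sum(group) / len(group)
        deviations.append([abs(y - group_mean) for y in group])
    return f_ratio(deviations)


# Invented data: the third group is much more spread out than the other two.
unequal_spread = [[61, 63, 65], [58, 60, 62], [50, 57, 64]]
equal_spread = [[61, 63, 65], [58, 60, 62], [55, 57, 59]]

w_unequal = levene_w(unequal_spread)
w_equal = levene_w(equal_spread)
print(w_unequal, w_equal)
```

A larger W indicates more evidence against equal variances; as with the ANOVA itself, turning W into a p-value requires the F distribution.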

Nevertheless… We might ask ourselves the question whether some of the class difference in retirement ages reflects gender. And hence there is a motivation to carry out a Two-way ANOVA to look at the effects of class and gender simultaneously.

Two-way ANOVA results BSA2006: At what age did you retire work Q296 (Type III sums of squares)

Source            Sum of Sq.   df   Mean Sq.   F   Sig.
Corrected Model   …            …    …          …   …
RClass            …            …    …          …   …
RSex              …            …    …          …   …
RClass * RSex     …            …    …          …   …
Error             …            …    …
Corrected Total   …            …

… so what do the results mean? The overall variation explained by the two variables is greater (… compared to …). But the between-groups variation which is unique to class is no longer significant (p=0.210 > 0.05), whereas the between-groups variation which is unique to sex is significant (p<0.001). … but sex and class do not have interacting effects (p=0.332). Note that the class, sex and interaction sums of squares don’t add up to the overall ‘explained’ sum of squares because some of the effects of class and sex overlap.
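The remark about sums of squares not adding up is easier to appreciate against the contrasting case. In a balanced design (the same number of cases in every class-by-sex cell) the two factors do not overlap, and the class, sex and interaction sums of squares add up exactly to the model sum of squares. The sketch below shows this for invented, balanced data; it does not compute the Type III sums of squares used for the unbalanced BSA data, and all names and values are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def balanced_two_way_ss(cells):
    """Sums of squares for a balanced two-way layout.

    cells maps (level_a, level_b) to a list of observations, with the same
    number of observations in every cell and every combination present.
    """
    a_levels = sorted({a for a, _ in cells})
    b_levels = sorted({b for _, b in cells})
    n = len(next(iter(cells.values())))
    assert all(len(v) == n for v in cells.values()), "design must be balanced"

    grand = mean([y for v in cells.values() for y in v])
    a_mean = {a: mean([y for b in b_levels for y in cells[(a, b)]]) for a in a_levels}
    b_mean = {b: mean([y for a in a_levels for y in cells[(a, b)]]) for b in b_levels}
    cell_mean = {key: mean(v) for key, v in cells.items()}

    ss_a = n * len(b_levels) * sum((a_mean[a] - grand) ** 2 for a in a_levels)
    ss_b = n * len(a_levels) * sum((b_mean[b] - grand) ** 2 for b in b_levels)
    ss_ab = n * sum(
        (cell_mean[(a, b)] - a_mean[a] - b_mean[b] + grand) ** 2
        for a in a_levels
        for b in b_levels
    )
    ss_model = n * sum((m - grand) ** 2 for m in cell_mean.values())
    ss_error = sum((y - cell_mean[key]) ** 2 for key, v in cells.items() for y in v)
    return {"a": ss_a, "b": ss_b, "interaction": ss_ab, "model": ss_model, "error": ss_error}


# Invented retirement ages, two cases per cell (factor a = class, factor b = sex).
ages = {
    ("routine", "male"): [64, 66],
    ("routine", "female"): [59, 61],
    ("managerial", "male"): [63, 65],
    ("managerial", "female"): [60, 62],
}
ss = balanced_two_way_ss(ages)
print(ss)
```

With unbalanced survey data, class and sex are themselves related, so part of the explained variation cannot be attributed uniquely to either factor; that shared part is what is missing when the three 'unique' sums of squares are added together.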

A multivariate conclusion! The class differences in retirement age observed in the One-way ANOVA are shown by the Two-way ANOVA to be a spurious consequence of the relationships between gender and class and between gender and retirement age!