Sociology 690 Multivariate Analysis Log Linear Models.

Presentation transcript:

Sociology 690 Multivariate Analysis Log Linear Models

The Analysis of Categories. Models are classified by whether the independent variable (IV) and the dependent variable (DV) are categories or quantities. Linear Models: 1) Analysis of Variance Models (ANOVA), 2) Structural Equation Models (SEM). Category Models: 3) Log Linear Models (LLM), 4) Logistic Regression Models (LRM).

Cross-classification. Ironically, while categorical data are among the most prevalent forms of information collected in sociology, until recently the dominant types of statistical analysis have been based on continuous data: e.g. t-tests, ANOVA, correlation, regression; in short, the general linear model.

Typical Goodness of Fit Model. The analysis of effects among categorical variables has traditionally been accomplished through cross-tabulation tables, using a goodness-of-fit method such as chi-square. To the extent that the observed frequencies deviate from the expected cell frequencies, we reject the assumption that the variables are independent and accept the alternative that they are related.

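Example of Chi Square. Suppose we have the following cross-classification of observed frequencies for two categorical variables, Attend College (Yes, No) by Sex (Male, Female): the Yes row holds 40 and 10 for the two sexes (row total 50), the No row holds 35 and 65 (row total 100), the column totals are 75 and 75, and the grand total is 150. Chi-square would be determined by the following formula: $\chi^2 = \sum (f_o - f_e)^2 / f_e$, where $f_o$ are the observed frequencies and the expected frequencies $f_e$ are determined by the formula $(f_c \times f_r) / f_t$, i.e. column total times row total, divided by the grand total.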

Chi Square Calculation. Here chi-square would be calculated as follows: $\chi^2 = (40-25)^2/25 + (35-50)^2/50 + (10-25)^2/25 + (65-50)^2/50 = 9 + 4.5 + 9 + 4.5 = 27$, with 1 d.f., $(r-1) \times (c-1)$. Significance: this exceeds the .05 critical value of 3.84 for 1 d.f., so the relationship is statistically significant. The measure of association would then be derived from chi-square (e.g. ).
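The calculation above can be checked with a short sketch in plain Python. The observed counts are the ones implied by the slide's own arithmetic; which column belongs to which sex is not recoverable from the transcript, so the columns are left unlabeled. The slide's example of a measure of association did not survive, so the phi coefficient used here is an assumption: one common choice for a 2 x 2 table, not necessarily the one the slide showed.

```python
# Observed frequencies implied by the slide's arithmetic.
# Rows: attend college (yes, no). Columns: the two sex categories
# (the transcript does not preserve which column is which).
observed = [[40, 10],
            [35, 65]]


def expected_frequencies(table):
    """Expected count per cell under independence: (fr * fc) / ft."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square(table):
    """Pearson chi-square: sum over cells of (fo - fe)^2 / fe."""
    expected = expected_frequencies(table)
    return sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))


def phi_coefficient(table):
    """Phi = sqrt(chi-square / N). Assumed here as the measure of association."""
    n = sum(sum(row) for row in table)
    return (chi_square(table) / n) ** 0.5


expected = expected_frequencies(observed)
chi2 = chi_square(observed)
df = (len(observed) - 1) * (len(observed[0]) - 1)
phi = phi_coefficient(observed)

print("expected:", expected)
print("chi-square:", chi2, "with", df, "d.f.")
print("phi:", round(phi, 3))
```

Keeping the expected-frequency step in its own function makes it reusable for any r x c table, not only the 2 x 2 case shown here.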

What chi square does not cover. But what if we wanted to examine more than two categorical variables (as in a 2 x 2 x 2 cross-classification table)? This kind of multi-way frequency analysis (sometimes called MFA) could be done by calculating chi-squares on all the possible two-way tables. However, that would (among other things) prevent us from estimating any interactions among the variables.

Purpose of Log Linear Analysis. Log-linear models are typically used with multi-way dichotomous or categorical variables. They focus on a procedure for accounting for the distribution of cases in a cross-tabulation of categorical variables. Based on the association of categorical data (rather than the causal sequencing of independent and dependent variables), log-linear analysis (LLA) looks at all levels of possible interaction effects. In this sense, log-linear analysis is a type of multi-way frequency analysis (MFA), and it is sometimes simply labeled MFA.

Definitions in Log Linear Analysis. $\ln(F_{ij}) = \mu + \lambda_i^A + \lambda_j^B + \lambda_{ij}^{AB}$, where: $\ln(F_{ij})$ = the log of the expected cell frequency of the cases for cell ij in the contingency table; $\mu$ = the overall mean of the natural log of the expected frequencies; $\lambda$ = terms that each represent effects which the variables have on the cell frequencies; A and B = the variables; i and j = the categories within the variables.
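To make the definitions concrete, here is a minimal sketch that computes the terms of the saturated two-variable model for a 2 x 2 table. It assumes effect coding (each set of lambda terms sums to zero), which is the constraint under which the constant equals the mean of the log frequencies, as the definition above states. The counts are those of the college-attendance example; the function and variable names are illustrative only.

```python
import math

# Cell frequencies of the 2 x 2 example (rows: variable A, columns: variable B).
observed = [[40, 10],
            [35, 65]]


def saturated_loglinear_parameters(table):
    """Return (mu, lambda_A, lambda_B, lambda_AB) under sum-to-zero constraints.

    In the saturated model the expected frequencies equal the observed ones,
    so the parameters can be read directly off the log frequencies.
    """
    logs = [[math.log(f) for f in row] for row in table]
    n_rows, n_cols = len(logs), len(logs[0])

    # mu: overall mean of the log frequencies.
    mu = sum(sum(row) for row in logs) / (n_rows * n_cols)
    # Main effects: row (or column) mean of the logs minus the overall mean.
    lam_a = [sum(row) / n_cols - mu for row in logs]
    lam_b = [sum(logs[i][j] for i in range(n_rows)) / n_rows - mu
             for j in range(n_cols)]
    # Interaction: whatever is left after the constant and the main effects.
    lam_ab = [[logs[i][j] - mu - lam_a[i] - lam_b[j] for j in range(n_cols)]
              for i in range(n_rows)]
    return mu, lam_a, lam_b, lam_ab


mu, lam_a, lam_b, lam_ab = saturated_loglinear_parameters(observed)


def fitted_frequency(i, j):
    """Rebuild F_ij from ln(F_ij) = mu + lambda_i^A + lambda_j^B + lambda_ij^AB."""
    return math.exp(mu + lam_a[i] + lam_b[j] + lam_ab[i][j])


for i in range(2):
    for j in range(2):
        print(i, j, round(fitted_frequency(i, j), 6))
```

Because every effect is included, the rebuilt frequencies match the observed counts; dropping the interaction term is what turns this into the independence model.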

Procedure for Log Linear Analysis. 1) Choosing the model 2) Fitting the model 3) Estimating the parameters 4) Testing the goodness of fit

Choosing the Model. Saturated vs. Unsaturated: If all possible effects are included in the model, it is considered saturated. A saturated model has as many effects as there are cells (as would be the case for the full model of a 2 x 2 table), so it reproduces the observed frequencies exactly; unsaturated models, which omit one or more effects, are the ones whose fit can be tested. Hierarchical vs. Non-Hierarchical: The former implies that if we have a higher-order interaction effect in our model (e.g. AxBxC), we must also include the lower-order effects it contains (e.g. AxB).
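The difference between a saturated and an unsaturated model can be illustrated with a small sketch on the 2 x 2 college-attendance counts. The unsaturated model here is the independence model (main effects only, no AxB term). Fit is measured with the likelihood-ratio chi-square, G² = 2 Σ fo ln(fo / fe); that statistic is an assumption on my part, chosen because it is commonly used for log-linear models, and the slides do not say which statistic they use.

```python
import math

observed = [[40, 10],
            [35, 65]]


def fit_independence(table):
    """Expected frequencies for the model with main effects only (no AxB term)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def likelihood_ratio_g2(obs, exp):
    """G^2 = 2 * sum(fo * ln(fo / fe)); empty cells contribute nothing."""
    return 2 * sum(o * math.log(o / e)
                   for obs_row, exp_row in zip(obs, exp)
                   for o, e in zip(obs_row, exp_row)
                   if o > 0)


# Saturated model: fitted values equal the observed counts, so G^2 is zero
# and there are no degrees of freedom left to test.
g2_saturated = likelihood_ratio_g2(observed, observed)

# Unsaturated (independence) model: one effect dropped, one d.f. gained.
g2_independence = likelihood_ratio_g2(observed, fit_independence(observed))
df_independence = (len(observed) - 1) * (len(observed[0]) - 1)

CRITICAL_VALUE_05_DF1 = 3.841  # chi-square critical value, alpha = .05, 1 d.f.
independence_rejected = g2_independence > CRITICAL_VALUE_05_DF1

print("G2 saturated:", g2_saturated)
print("G2 independence:", round(g2_independence, 2), "d.f.:", df_independence)
print("independence model rejected:", independence_rejected)
```

A poorly fitting unsaturated model, as here, means the omitted AxB effect is needed to account for the cell frequencies.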

Estimating Parameters. Odds and Odds Ratios: In our original cross-tabulation table, the odds of being female are 75/75 or 1.0. The odds of being in college are 40/10 or 4.0 and the odds of not being in college are 35/65 or .54. An odds ratio is the conditional odds of one category divided by the conditional odds of the other category. Hence the odds ratio for women being in college is 4.0/.54, or about 7.4. An odds ratio of one indicates no relationship; an odds ratio greater than one, as here, indicates a relationship.
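The odds arithmetic above can be reproduced in a few lines. This is only a sketch using the four cell counts quoted in the text; the variable names are illustrative. Note that the slide divides the rounded odds (4.0 / .54), whereas the code uses unrounded values, so the two results differ slightly in the second decimal place.

```python
import math

# Cell counts quoted in the text: 40 and 10 in the "attend college" row,
# 35 and 65 in the "do not attend" row.
attend = (40, 10)
not_attend = (35, 65)

# Conditional odds within each row.
odds_college = attend[0] / attend[1]              # 40 / 10
odds_not_college = not_attend[0] / not_attend[1]  # 35 / 65

# Odds ratio: one conditional odds divided by the other.
odds_ratio = odds_college / odds_not_college

# Equivalent cross-product form for a 2 x 2 table: (a * d) / (b * c).
cross_product_ratio = (attend[0] * not_attend[1]) / (attend[1] * not_attend[0])

# Log-linear models work with logs, where "no relationship" is 0 rather than 1.
log_odds_ratio = math.log(odds_ratio)

print("odds (college):", odds_college)
print("odds (not college):", round(odds_not_college, 2))
print("odds ratio:", round(odds_ratio, 2))
print("log odds ratio:", round(log_odds_ratio, 3))
```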

SPSS Input

SPSS Output