

Log-linear Models
Please read Chapter Two.

We are interested in relationships between variables.
                 White Victim           Black Victim
White Prisoner   151 (151/160 = 0.94)   9
Black Prisoner   63 (63/166 = 0.38)     103

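Pearson chi-square test of independence
Based on $P(A,B) = P(A)\,P(B)$.
Cell probabilities with their marginals:
$$\begin{array}{cccc|c}
p_{11} & p_{12} & p_{13} & p_{14} & p_{1+} \\
p_{21} & p_{22} & p_{23} & p_{24} & p_{2+} \\ \hline
p_{+1} & p_{+2} & p_{+3} & p_{+4} & p_{++} = 1
\end{array}$$
Observed frequencies with their marginals:
$$\begin{array}{cccc|c}
x_{11} & x_{12} & x_{13} & x_{14} & x_{1+} \\
x_{21} & x_{22} & x_{23} & x_{24} & x_{2+} \\ \hline
x_{+1} & x_{+2} & x_{+3} & x_{+4} & x_{++} = N
\end{array}$$
Under $H_0$ of independence, $p_{ij} = p_{i+}\,p_{+j}$.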

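Computing the Pearson chi-square test of independence
– Calculate the (estimated) expected frequencies $\hat{m}_{ij} = x_{i+}\,x_{+j}/N$.
– Calculate $X^2 = \sum_{i=1}^{I}\sum_{j=1}^{J} (x_{ij} - \hat{m}_{ij})^2 / \hat{m}_{ij}$.
– For large samples, $X^2$ has an approximate chi-square distribution if $H_0$ is true.
– Degrees of freedom: $(I-1)(J-1)$.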

Numerical example of Pearson chi-square (expected frequencies in parentheses)
                 White Victim   Black Victim   Total
White Prisoner   151 (105)      9 (55)         160
Black Prisoner   63 (109)       103 (57)       166
Total            214            112            326
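The arithmetic in this example can be checked by hand or with a few lines of code. The R session from the slides is not part of this transcript, so what follows is only a minimal Python sketch using the standard library. The variable names are illustrative, and the p-value line relies on the fact that, for 1 degree of freedom, P(chi-square > x) = erfc(sqrt(x/2)).

```python
import math

# Observed counts: rows = race of prisoner, columns = race of victim
observed = [[151, 9],
            [63, 103]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Estimated expected frequencies under independence: row total * column total / N
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Pearson chi-square statistic: sum over cells of (observed - expected)^2 / expected
x2 = sum((o - e) ** 2 / e
         for obs_row, exp_row in zip(observed, expected)
         for o, e in zip(obs_row, exp_row))

df = (len(row_totals) - 1) * (len(col_totals) - 1)

# Upper-tail chi-square probability; this closed form is valid only when df == 1
p_value = math.erfc(math.sqrt(x2 / 2))

print("expected:", [[round(e, 1) for e in row] for row in expected])
print("X2 =", round(x2, 2), " df =", df, " p =", p_value)
```

The expected frequencies round to the values shown in parentheses in the table, and the statistic is far beyond any conventional critical value.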

With R

Conclusions
– $X^2 = 115$, df = $(2-1)(2-1) = 1$.
– The critical value at $\alpha = 0.05$ is 3.84.
– Reject $H_0$.
– "Conclude race of prisoner and race of victim are not independent." That’s not good enough!
– Murder victims and the persons convicted of murdering them tend to be of the same race. (Say what happened!)

Two treatments for kidney stones
              Treatment A   Treatment B
Effective
Ineffective   77            61
$X^2$ = , df = 1, p =
These results are consistent with no difference in effectiveness between treatments.

All this applies to the multinomial, but there are 3 main sampling models:
– Multinomial
– Poisson
– Product multinomial
Fortunately, the same statistical methods work with all.

Poisson
– Independent Poisson processes generate the counts in each category (for example, traffic accidents).
– In homework you proved that, conditionally on the total number of events, the joint distribution of the counts is multinomial.
– This justifies the use of multinomial theory.
– But in hard cases, Poisson probability calculations can be easier.
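For reference, the homework result can be sketched as follows. The symbols $\lambda_j$ for the Poisson means and $\lambda = \sum_j \lambda_j$ are notation assumed here, not taken from the slides. With independent $X_j \sim \text{Poisson}(\lambda_j)$, the total $\sum_j X_j$ is Poisson($\lambda$), and dividing the joint probability by the probability of the total gives a multinomial with $p_j = \lambda_j/\lambda$.

```latex
\begin{align*}
P\Big(X_1 = x_1, \ldots, X_k = x_k \,\Big|\, \sum_{j} X_j = n\Big)
  &= \frac{\prod_{j=1}^{k} e^{-\lambda_j} \lambda_j^{x_j} / x_j!}
          {e^{-\lambda} \lambda^{n} / n!}
     \qquad \Big(\sum_j x_j = n\Big) \\
  &= \frac{n!}{x_1! \cdots x_k!}
     \prod_{j=1}^{k} \Big(\frac{\lambda_j}{\lambda}\Big)^{x_j}.
\end{align*}
```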

Product multinomial
– Take independent random samples of sizes $N_1, N_2, \ldots, N_I$ from $I$ sub-populations.
– In each, observe a multinomial with $J$ categories. Compare.
– Examples: Vitamin C study, kidney stone study.
– Likelihood: a product of $I$ multinomial likelihoods, because of independent sampling from sub-populations.
– This is almost always the right model for experimental studies.

Suppose the null hypothesis is no differences among the $I$ vectors of multinomial probabilities.
Then under $H_0$, the MLE of the (common) $p_j$ is the sample proportion, pooling data across the $I$ rows: $\hat{p}_j = x_{+j}/N$.
And the expected cell frequency is $\hat{m}_{ij} = N_i\,\hat{p}_j = x_{i+}\,x_{+j}/N$.
$$\begin{array}{cccc|c}
x_{11} & x_{12} & x_{13} & x_{14} & x_{1+} = N_1 \\
x_{21} & x_{22} & x_{23} & x_{24} & x_{2+} = N_2 \\ \hline
x_{+1} & x_{+2} & x_{+3} & x_{+4} & x_{++} = N
\end{array}$$
Same as for the usual chi-square test of independence.
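The pooled estimate can be derived in a few lines. This is a sketch in notation assumed here: $L$ for the likelihood, $\ell$ for its log, and $c$ for a Lagrange multiplier. Under $H_0$ every row shares the same probabilities $p_1, \ldots, p_J$, so the product of the $I$ multinomial likelihoods depends on the data only through the column totals.

```latex
\begin{align*}
L(p_1, \ldots, p_J)
  &= \prod_{i=1}^{I} \frac{N_i!}{\prod_{j} x_{ij}!} \prod_{j=1}^{J} p_j^{x_{ij}}
   \;\propto\; \prod_{j=1}^{J} p_j^{x_{+j}}, \\
\ell(p_1, \ldots, p_J)
  &= \text{const} + \sum_{j=1}^{J} x_{+j} \log p_j
  \quad \text{subject to } \sum_{j=1}^{J} p_j = 1, \\
\frac{\partial}{\partial p_j}\Big[\ell - c\Big(\sum_{l} p_l - 1\Big)\Big]
  &= \frac{x_{+j}}{p_j} - c = 0
  \;\Longrightarrow\; \hat{p}_j = \frac{x_{+j}}{N}
  \quad (\text{since } c = N).
\end{align*}
```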

So let’s concentrate on the multinomial

Assume a multinomial and test independence? Messy!
$$\begin{array}{cc|c}
p_1 & p_2 & p_1 + p_2 \\
p_3 & p_4 & p_3 + p_4 \\ \hline
p_1 + p_3 & p_2 + p_4 &
\end{array}$$
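One way to see the messiness, offered as an illustration rather than as the slide's own argument: in terms of the four cell probabilities, independence is a nonlinear restriction. Each cell must equal the product of its row and column marginals. For the first cell, using $p_1 + p_2 + p_3 + p_4 = 1$:

```latex
\begin{align*}
p_1 &= (p_1 + p_2)(p_1 + p_3) \\
    &= p_1 (p_1 + p_2 + p_3) + p_2 p_3 \\
    &= p_1 (1 - p_4) + p_2 p_3
\quad\Longleftrightarrow\quad p_1 p_4 = p_2 p_3 .
\end{align*}
```

In a 2 x 2 table this single equation is equivalent to independence in all four cells, which agrees with the 1 degree of freedom of the test.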

Log-linear models
– Linear model for the (natural) logs of the expected frequencies
– Looks like ANOVA notation (STA332)
– First, one-factor (not in the text)
– Then two-factor (in the text)
– Start with the familiar normal example, testing for differences among means.

Compare 3 means
– Grand mean
– Effects are deviations from the grand mean
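The formulas on this slide did not survive in the transcript. The usual one-factor effects parameterization is sketched below. The symbols $\mu$ for the grand mean and $\alpha_j$ for the effects are assumptions and may differ from the slide's notation.

```latex
\begin{align*}
\mu &= \frac{\mu_1 + \mu_2 + \mu_3}{3}, &
\alpha_j &= \mu_j - \mu, &
\mu_j &= \mu + \alpha_j \quad (j = 1, 2, 3), \\
\alpha_1 + \alpha_2 + \alpha_3 &= 0, &
H_0 &: \mu_1 = \mu_2 = \mu_3
  & &\Longleftrightarrow\; \alpha_1 = \alpha_2 = \alpha_3 = 0 .
\end{align*}
```

Because the effects sum to zero, only two of the three are free. Together with $\mu$, that makes three parameters for three means.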

Single categorical variable, $k$ categories
– Linear model for the log of the expected frequencies
– No probability can equal zero!

This is a re-parameterization
– Substitute into the likelihood function and do maximum likelihood.
– How many parameters, $k$ or $k-1$?

There are still $k-1$ parameters
– All "effects" zero corresponds to equal probabilities.
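The model equations are missing from the transcript. A plausible reconstruction in ANOVA-style notation is sketched below. The symbols $m_j$ (expected frequency), $\mu$ and $\alpha_j$, and the sum-to-zero constraint, are assumptions.

```latex
\begin{align*}
\log m_j &= \mu + \alpha_j, \qquad \sum_{j=1}^{k} \alpha_j = 0,
  \qquad m_j = N p_j, \\
p_j &= \frac{m_j}{N} = \frac{e^{\alpha_j}}{\sum_{l=1}^{k} e^{\alpha_l}},
  \qquad \mu = \log N - \log \sum_{l=1}^{k} e^{\alpha_l}, \\
\alpha_1 = \cdots = \alpha_k = 0
  &\;\Longrightarrow\; p_j = \frac{1}{k} \quad \text{for every } j .
\end{align*}
```

Under this reading, $\mu$ is determined by $N$ and the effects, and $\alpha_k = -(\alpha_1 + \cdots + \alpha_{k-1})$, so $k-1$ free parameters remain. That matches the $k-1$ free probabilities of the multinomial. Taking logs is also why no probability can equal zero.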

Maximum Likelihood

Log Likelihood
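The log likelihood itself is not in the transcript. The sketch below assumes the one-factor model $p_j = e^{\alpha_j} / \sum_l e^{\alpha_l}$ with $\sum_j \alpha_j = 0$, and writes $x_j$ for the observed count in category $j$, with $N = \sum_j x_j$. This notation is assumed.

```latex
\begin{align*}
\ell(\alpha)
  &= \log \frac{N!}{x_1! \cdots x_k!} + \sum_{j=1}^{k} x_j \log p_j
   = \text{const} + \sum_{j=1}^{k} x_j \alpha_j
     - N \log \sum_{l=1}^{k} e^{\alpha_l}, \\
\hat{p}_j &= \frac{x_j}{N}, \qquad
\hat{\alpha}_j = \log \hat{p}_j - \frac{1}{k} \sum_{l=1}^{k} \log \hat{p}_l
  \quad (\text{all } x_j > 0).
\end{align*}
```

The second line uses the invariance of maximum likelihood estimates under re-parameterization. A numerical search over the $\alpha_j$ should arrive at the same values.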

k = 3 Categories Numerical MLE

Remember the employment study?
– 106 employed in a job related to field of study
– 74 employed in a job unrelated to their field of study
– 20 unemployed
Use R to
– Estimate the effects
– Test equal probabilities (senseless)

Generic MLE with R

Estimate the probabilities and test
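The R session for these slides is not in the transcript. As a stand-in, here is a minimal Python sketch of the same computation using only the standard library. It assumes the sum-to-zero effects parameterization $p_j = e^{\alpha_j} / \sum_l e^{\alpha_l}$. It computes the effects in closed form and also by plain gradient ascent, as a simple substitute for a generic optimizer, then tests equal probabilities with the Pearson and likelihood ratio statistics. All names are illustrative. The p-value formula $e^{-x/2}$ is valid only for 2 degrees of freedom.

```python
import math

counts = [106, 74, 20]   # related job, unrelated job, unemployed
N = sum(counts)
k = len(counts)

def probs(alpha):
    """Category probabilities p_j = exp(alpha_j) / sum_l exp(alpha_l)."""
    top = max(alpha)                       # subtract the max for numerical stability
    weights = [math.exp(a - top) for a in alpha]
    total = sum(weights)
    return [w / total for w in weights]

# Closed form: p_hat_j = x_j / N, then effects by invariance of the MLE
p_hat = [x / N for x in counts]
mean_log = sum(math.log(p) for p in p_hat) / k
alpha_hat = [math.log(p) - mean_log for p in p_hat]

# Numerical MLE: gradient ascent over the k-1 free effects,
# with the last effect set to minus the sum of the others
free = [0.0] * (k - 1)
step = 1.0 / (3 * N)                       # conservative fixed step size
for _ in range(2000):
    alpha = free + [-sum(free)]
    p = probs(alpha)
    score = [x - N * pj for x, pj in zip(counts, p)]   # d loglik / d alpha_j
    grad = [score[j] - score[k - 1] for j in range(k - 1)]
    free = [a + step * g for a, g in zip(free, grad)]
alpha_num = free + [-sum(free)]

# Test of equal probabilities: expected count N/k in every category
expected = N / k
x2 = sum((x - expected) ** 2 / expected for x in counts)   # Pearson
g2 = 2 * sum(x * math.log(x / expected) for x in counts)   # likelihood ratio
df = k - 1
p_x2 = math.exp(-x2 / 2)                   # chi-square upper tail, df == 2 only
p_g2 = math.exp(-g2 / 2)

print("p_hat     =", p_hat)
print("alpha_hat =", [round(a, 4) for a in alpha_hat])
print("alpha_num =", [round(a, 4) for a in alpha_num])
print("X2 =", round(x2, 2), " G2 =", round(g2, 2), " df =", df)
print("p-values:", p_x2, p_g2)
```

Both routes give the same effects, and both tests reject equal probabilities overwhelmingly, which is unsurprising given that the hypothesis was described as senseless.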

This seems like a lot of trouble just to estimate some probabilities and test if they are equal. But the payoff comes for tables of two or more dimensions.