Conditional Test Statistics

Suppose that we are considering two log-linear models and that Model 2 is a special case of Model 1.
That is, the parameters of Model 2 are a subset of the parameters of Model 1. Also assume that Model 1 has been shown to fit the data adequately.

In this case one is interested in testing whether the differences in the expected frequencies between Model 1 and Model 2 are simply due to random variation. The likelihood ratio chi-square statistic that achieves this goal is:

G2(2|1) = G2(2) – G2(1), with d.f. = d.f.(Model 2) – d.f.(Model 1)
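As a minimal sketch of how the conditional statistic can be computed from fitted counts: the 2x2 counts below are hypothetical (not from the slides), Model 1 is taken to be the independence model and Model 2 the model with a row effect only, and the function names `g2` and `conditional_g2` are illustrative. The sketch also checks that comparing the two sets of fitted values directly gives the same number as the difference G2(2) – G2(1).

```python
import math

def g2(observed, expected):
    """Likelihood ratio statistic G2 = 2 * sum(O * ln(O / E))."""
    return 2.0 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)

def conditional_g2(observed, expected_1, expected_2):
    """Conditional statistic G2(2|1) = 2 * sum(O * ln(E1 / E2))."""
    return 2.0 * sum(
        o * math.log(e1 / e2) for o, e1, e2 in zip(observed, expected_1, expected_2)
    )

# Hypothetical 2x2 table, cells in the order (1,1), (1,2), (2,1), (2,2).
observed = [30, 20, 10, 40]
# Model 1 (independence): row total * column total / n.
fitted_1 = [20.0, 30.0, 20.0, 30.0]
# Model 2 (row effect only): row total / 2. It is a special case of Model 1.
fitted_2 = [25.0, 25.0, 25.0, 25.0]

direct = conditional_g2(observed, fitted_1, fitted_2)
by_difference = g2(observed, fitted_2) - g2(observed, fitted_1)
print(f"G2(2|1) = {direct:.3f}, G2(2) - G2(1) = {by_difference:.3f}")
```

Here Model 1 has 1 degree of freedom and Model 2 has 2, so the conditional statistic would be referred to a chi-square distribution with 1 degree of freedom.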

Example

Goodness of Fit test for the all k-factor models
Conditional tests for zero k-factor interactions

Conclusions
The four-factor interaction is not significant: G2(3|4) = 0.7 (p = 0.705).
The all three-factor model fits the data adequately: G2(3) = 0.7 (p = 0.705).
The three-factor interactions are not significantly different from 0: G2(2|3) = 9.2 (p = 0.239).
The all two-factor model fits the data adequately: G2(2) = 9.9 (p = 0.359).
There are significant two-factor interactions: G2(1|2) = 33.0 (p = …).
Conclude that the model should contain main effects and some two-factor interactions.

There may also be a natural sequence of progressively more complicated models that one might want to identify. In the laundry detergent example the variables are:
1. Softness of laundry used
2. Previous use of brand M
3. Temperature of laundry water used
4. Preference of brand X over brand M

A natural order of increasingly complex models to consider might be:
[1][2][3][4]: the all-main-effects model (independence among all four variables).
[1][3][24]: since previous use of brand M may be highly related to preference for brand M, add the 2-4 interaction first.
[1][34][24]: brand M is recommended for hot water, so add the 3-4 interaction second.
[13][34][24]: brand M is also recommended for soft laundry, so add the 1-3 interaction third.
[13][234], [134][234]: finally, add some possible 3-factor interactions.

Likelihood Ratio G2 for various models
Model                    d.f.    G2
[1][3][24]               17      22.4
[1][24][34]              16      18
[13][24][34]             14      11.9
[13][23][24][34]         13      11.2
[12][13][23][24][34]     11      10.1
[1][234]                         14.5
[134][24]                10      12.2
[13][234]                12      8.4
[24][34][123]            9
[123][234]               8       5.6
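The degrees of freedom in this table can be checked by counting parameters: every variable subset contained in a generator contributes the product of (levels – 1) over its variables, and the d.f. is the number of cells minus the number of parameters. The sketch below is illustrative only. The numbers of categories (3, 2, 2, 2 for variables 1 to 4) are an assumption, not stated in the slides, chosen because they reproduce the tabulated d.f.; the names `LEVELS`, `parse` and `degrees_of_freedom` are hypothetical.

```python
from itertools import combinations
from math import prod

# ASSUMPTION: category counts for variables 1-4 (not given in the text).
LEVELS = {"1": 3, "2": 2, "3": 2, "4": 2}

def parse(model):
    """Split a generating class such as '[13][234]' into ['13', '234']."""
    return model.replace("[", " ").replace("]", " ").split()

def degrees_of_freedom(model, levels=LEVELS):
    """Cells minus independent parameters of a hierarchical log-linear model."""
    terms = {()}  # the empty tuple stands for the constant term
    for generator in parse(model):
        for size in range(1, len(generator) + 1):
            # Hierarchy: every subset of a generator is also in the model.
            terms.update(combinations(sorted(generator), size))
    n_params = sum(prod(levels[v] - 1 for v in term) for term in terms)
    n_cells = prod(levels.values())
    return n_cells - n_params

for model in ["[1][3][24]", "[13][24][34]", "[134][24]", "[13][234]", "[123][234]"]:
    print(model, degrees_of_freedom(model))
```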

Stepwise selection procedures
Forward Selection
Backward Elimination

Forward Selection: Starting with a model that underfits the data, log-linear parameters that are not in the model are added step by step until a model that does fit is achieved. At each step the log-linear parameter that is most significant is added to the model. To determine the significance of a parameter added we use the statistic:
G2(2|1) = G2(2) – G2(1)
where Model 1 contains the parameter and Model 2 does not.

Backward Elimination: Starting with a model that overfits the data, log-linear parameters that are in the model are deleted step by step until we reach a model that continues to fit the data and has the smallest number of significant parameters. At each step the log-linear parameter that is least significant is deleted from the model. To determine the significance of a parameter deleted we use the statistic:
G2(2|1) = G2(2) – G2(1)
where Model 1 contains the parameter and Model 2 does not.
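The statistic used at each step can be sketched with the d.f. and G2 values tabulated earlier for the detergent models, following a path in which one interaction is added at a time. This is only an illustration of the test, not a full stepwise procedure (which would refit and compare every candidate term at each step). The helper `chi2_sf` is a hand-written chi-square upper-tail function for integer d.f., included so the sketch needs only the standard library; it and `conditional_tests` are hypothetical names.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) of a chi-square variable, integer df >= 1."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{r=0}^{df/2-1} (x/2)^r / r!
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= (x / 2) / r
            total += term
        return math.exp(-x / 2) * total
    # Odd df: erfc(sqrt(x/2)) + 2*phi(sqrt(x)) * sum_r x^(r-1/2) / (1*3*...*(2r-1))
    term, total = math.sqrt(x), 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    phi = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    return math.erfc(math.sqrt(x / 2)) + 2 * phi * total

# (model, d.f., G2) taken from the table of likelihood ratio statistics.
PATH = [
    ("[1][3][24]", 17, 22.4),
    ("[1][24][34]", 16, 18.0),
    ("[13][24][34]", 14, 11.9),
    ("[13][23][24][34]", 13, 11.2),
    ("[12][13][23][24][34]", 11, 10.1),
]

def conditional_tests(path):
    """G2(2|1) = G2(2) - G2(1) for each consecutive pair of nested models."""
    results = []
    for (model_2, df_2, g2_2), (model_1, df_1, g2_1) in zip(path, path[1:]):
        delta, ddf = g2_2 - g2_1, df_2 - df_1
        results.append((model_2, model_1, delta, ddf, chi2_sf(delta, ddf)))
    return results

for model_2, model_1, delta, ddf, p in conditional_tests(PATH):
    print(f"{model_2} -> {model_1}: G2(2|1) = {delta:.1f} on {ddf} d.f., p = {p:.3f}")
```

Along this path the additions of [34] and [13] give p-values below 0.05, while the later additions of [23] and [12] do not.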

K = Knowledge, N = Newspaper, R = Radio, S = Reading, L = Lectures

Continuing after 10 steps

The final step

The best model was found at the previous step:
[LN][KLS][KR][KN][LR][NR][NS]

Logit Models
To date we have not worried whether any of the variables were dependent or independent variables. The logit model is used when we have a single binary dependent variable.

The variables
Type of seedling (T): Longleaf seedling, Slash seedling
Depth of planting (D): Too low, Too high
Mortality (M) (the dependent variable): Dead, Alive

The Log-linear Model
Note: mij1 = # dead when T = i and D = j.
mij2 = # alive when T = i and D = j.
mij1/mij2 = mortality ratio when T = i and D = j.

Hence since

The logit model: where

Thus, corresponding to a log-linear model there is a logit model predicting the log ratio of the expected frequencies of the two categories of the dependent variable. Also, (k + 1)-factor interactions with the dependent variable in the log-linear model determine k-factor interactions in the logit model:
k + 1 = 1: constant term in the logit model
k + 1 = 2: main effects in the logit model
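The correspondence can be written out for the seedling example. The derivation below is a sketch under stated assumptions: the log-linear model is taken to be [TM][TD][DM], the u-term and w-term notation is assumed (the slides' own notation did not survive), and the parameters are assumed to satisfy the usual sum-to-zero constraints, so that for the binary variable M each term at level 2 is minus the term at level 1.

```latex
\begin{align*}
\ln m_{ijk} &= u + u_{T(i)} + u_{D(j)} + u_{M(k)}
              + u_{TD(ij)} + u_{TM(ik)} + u_{DM(jk)} \\[4pt]
\ln\frac{m_{ij1}}{m_{ij2}} &= \ln m_{ij1} - \ln m_{ij2} \\
  &= \bigl(u_{M(1)} - u_{M(2)}\bigr)
   + \bigl(u_{TM(i1)} - u_{TM(i2)}\bigr)
   + \bigl(u_{DM(j1)} - u_{DM(j2)}\bigr) \\
  &= 2u_{M(1)} + 2u_{TM(i1)} + 2u_{DM(j1)} \\
  &= w + w_{T(i)} + w_{D(j)}
\end{align*}
```

The terms that do not involve M (u, the T and D main effects, and the TD interaction) cancel in the difference. The one-factor term for M gives the constant w, and the two-factor terms TM and DM give the main effects of T and D in the logit model, each equal to twice the corresponding log-linear parameter.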

1 = Depth, 2 = Mort, 3 = Type

Log-Linear parameters for Model: [TM][TD][DM]

Logit Model for predicting the Mortality

The best model found by forward selection was
[LN][KLS][KR][KN][LR][NR][NS]
To fit a logit model to predict K (Knowledge) we need to fit a log-linear model containing the important interactions with K, namely
[LNRS][KLS][KR][KN]
The logit model will contain:
Main effects for L (Lectures), N (Newspapers), R (Radio), and S (Reading)
A two-factor interaction effect for L and S
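The step from the selected log-linear model to the logit model can be sketched mechanically: keep the generators that involve the dependent variable, add one generator containing all the explanatory variables, and read off the logit terms by dropping the dependent variable from each kept generator. The function name `logit_from_loglinear` is hypothetical, and the sketch assumes each variable is denoted by a single letter.

```python
from itertools import combinations

def logit_from_loglinear(generators, response):
    """Return (log-linear generators to fit, terms of the implied logit model)."""
    predictors = sorted({v for g in generators for v in g if v != response})
    with_response = [g for g in generators if response in g]
    # Full interaction among the explanatory variables, then the response terms.
    loglinear = ["".join(predictors)] + with_response
    logit_terms = set()
    for g in with_response:
        rest = [v for v in g if v != response]
        # A (k+1)-factor term with the response implies all logit terms up to k factors.
        for size in range(1, len(rest) + 1):
            for combo in combinations(rest, size):
                logit_terms.add("".join(combo))
    return loglinear, sorted(logit_terms, key=lambda t: (len(t), t))

selected = ["LN", "KLS", "KR", "KN", "LR", "NR", "NS"]
loglinear, logit_terms = logit_from_loglinear(selected, "K")
print("log-linear model:", "".join(f"[{g}]" for g in loglinear))
print("logit terms:", logit_terms)
```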

The Logit Parameters for the Model: [LNRS][KLS][KR][KN]
(Multiplicative effects are given in brackets; logit parameters = 2 × log-linear parameters)
The constant term: (0.798)
The main effects on Knowledge:
Lectures: Lect (1.307), None (0.765)
Newspaper: News (1.383), None (0.723)
Reading: Solid (1.405), Not (0.712)
Radio: Radio (1.162), None (0.861)
The two-factor interaction effect of Reading and Lectures on Knowledge
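One quick consistency check on the bracketed values: if the multiplicative effects are the exponentials of parameters that sum to zero over the two categories of each variable (an assumption, since the additive values are not shown here), then the two effects for each variable should multiply to approximately 1. The dictionary below simply copies the main-effect values listed above; the name `effects` is illustrative.

```python
import math

# Multiplicative main effects on Knowledge, copied from the list above.
effects = {
    "Lectures": (1.307, 0.765),
    "Newspaper": (1.383, 0.723),
    "Reading": (1.405, 0.712),
    "Radio": (1.162, 0.861),
}

for name, (present, absent) in effects.items():
    product = present * absent
    # On the log scale the two effects should be roughly equal and opposite.
    print(f"{name}: product = {product:.4f}, "
          f"logs = {math.log(present):+.3f}, {math.log(absent):+.3f}")
```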
