Conditional Test Statistics

Suppose that we are considering two log-linear models and that Model 2 is a special case of Model 1; that is, the parameters of Model 2 are a subset of the parameters of Model 1. Also assume that Model 1 has been shown to fit the data adequately.

In this case one is interested in testing whether the differences in the expected frequencies between Model 1 and Model 2 are simply due to random variation. The likelihood ratio chi-square statistic that achieves this goal is

G²(2|1) = G²(2) - G²(1),

which, when Model 2 is correct, has an approximate chi-square distribution with degrees of freedom equal to the difference between the degrees of freedom of the two models.
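This test is easy to carry out numerically. Below is a minimal Python sketch; the degrees of freedom (19 and 12) are assumed for illustration, while the G² values mirror the worked example in this section.

```python
# Sketch: conditional likelihood-ratio test G2(2|1) = G2(2) - G2(1).
# The degrees of freedom below are assumed for illustration.
from scipy.stats import chi2

def conditional_g2_test(g2_restricted, df_restricted, g2_full, df_full):
    """Test Model 2 (restricted) against Model 1 (full, adequately fitting).

    Returns the conditional statistic, its degrees of freedom, and the
    chi-square p-value.
    """
    g2_diff = g2_restricted - g2_full    # G2(2|1) = G2(2) - G2(1)
    df_diff = df_restricted - df_full    # the restricted model has more d.f.
    return g2_diff, df_diff, chi2.sf(g2_diff, df_diff)

# Assumed example: G2(2) = 9.9 on 19 d.f., G2(1) = 0.7 on 12 d.f.
g2, df, p = conditional_g2_test(9.9, 19, 0.7, 12)
print(round(g2, 1), df, round(p, 3))
```

A large p-value here means the extra parameters of Model 1 are not needed and the simpler Model 2 can be retained.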

Example

Goodness-of-fit tests for the all-k-factor models; conditional tests for zero k-factor interactions.

Conclusions
1. The four-factor interaction is not significant: G²(3|4) = 0.7 (p = 0.705).
2. The all-three-factor model provides an adequate fit: G²(3) = 0.7 (p = 0.705).
3. The three-factor interactions are not significantly different from 0: G²(2|3) = 9.2 (p = 0.239).
4. The all-two-factor model provides an adequate fit: G²(2) = 9.9 (p = 0.359).
5. There are significant two-factor interactions: G²(1|2) = 33.0.
Conclude that the model should contain the main effects and some two-factor interactions.

There may also be a natural sequence of progressively more complicated models that one would want to examine. In the laundry detergent example the variables are:
1. Softness of laundry used
2. Previous use of Brand M
3. Temperature of laundry water used
4. Preference of Brand X over Brand M

A natural order for increasingly complex models might be:
1. [1][2][3][4]: the all-main-effects model (independence among all four variables).
2. [1][3][24]: since previous use of Brand M may be highly related to preference for Brand M, add the 2-4 interaction first.
3. [1][34][24]: Brand M is recommended for hot water, so add the 3-4 interaction second.
4. [13][34][24]: Brand M is also recommended for soft laundry, so add the 1-3 interaction third.
5. [13][234] and
6. [134][234]: finally, add some possible three-factor interactions.

Likelihood ratio G² for various models:

Model                    d.f.    G²
[1][3][24]
[1][24][34]              16      18
[13][24][34]
[13][23][24][34]
[12][13][23][24][34]
[1][234]
[134][24]
[13][234]                12      8.4
[24][34][123]            9       8.4
[123][234]               8       5.6
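Fitted values for hierarchical models like those in the table are usually obtained by iterative proportional fitting (IPF), after which G² = 2 Σ obs·log(obs/fitted). A minimal sketch follows, using an invented 2×2×2 table rather than the detergent data.

```python
# Sketch: fitting a hierarchical log-linear model by iterative proportional
# fitting (IPF) and computing G2. The 2x2x2 table is invented for
# illustration; it is not the laundry-detergent data.
import numpy as np

def ipf_fit(obs, margins, n_iter=100):
    """Fit expected counts for the hierarchical model whose minimal
    sufficient statistics are the given margins (tuples of axis indices)."""
    fitted = np.full(obs.shape, obs.sum() / obs.size)
    for _ in range(n_iter):
        for margin in margins:
            other = tuple(ax for ax in range(obs.ndim) if ax not in margin)
            obs_m = obs.sum(axis=other, keepdims=True)
            fit_m = fitted.sum(axis=other, keepdims=True)
            fitted = fitted * obs_m / fit_m   # rescale to match this margin
    return fitted

def g2(obs, fitted):
    """Likelihood ratio statistic G2 = 2 * sum(obs * log(obs / fitted))."""
    mask = obs > 0
    return 2.0 * np.sum(obs[mask] * np.log(obs[mask] / fitted[mask]))

table = np.array([[[10, 20], [15, 5]],
                  [[12, 8], [30, 25]]], dtype=float)
# Model [1][23]: variable 1 independent of variables 2 and 3 jointly.
fitted = ipf_fit(table, margins=[(0,), (1, 2)])
print(round(g2(table, fitted), 3))
```

For this particular model IPF converges in one cycle to the closed-form fit n(i++)·n(+jk)/N; for models without closed-form estimates the same loop simply runs until the margins stabilize.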

Stepwise selection procedures:
1. Forward selection
2. Backward elimination

Forward selection: starting with a model that underfits the data, log-linear parameters not in the model are added step by step until a model that does fit is achieved. At each step the most significant of the absent parameters is added to the model. To determine the significance of an added parameter we use the statistic

G²(2|1) = G²(2) - G²(1),

where Model 1 contains the parameter and Model 2 does not.

Backward elimination: starting with a model that overfits the data, log-linear parameters in the model are deleted step by step until a model that still fits the data, with the smallest number of significant parameters, is reached. At each step the least significant parameter is deleted from the model. To determine the significance of a deleted parameter we use the same statistic

G²(2|1) = G²(2) - G²(1),

where Model 1 contains the parameter and Model 2 does not.
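One elimination step can be sketched as follows. The candidate models and their (G², d.f.) values are invented for illustration; in practice each entry would come from fitting the corresponding log-linear model.

```python
# Sketch of one backward-elimination step using G2(2|1) = G2(2) - G2(1).
# All (G2, d.f.) values below are invented for illustration.
from scipy.stats import chi2

# Current (fuller) model, and the models obtained by deleting one term each.
current = ("[12][13][23]", 2.1, 1)      # (name, G2, d.f.)
candidates = [
    ("[12][13]", 3.0, 2),
    ("[12][23]", 15.8, 2),
    ("[13][23]", 2.9, 2),
]

def least_significant(current, candidates, alpha=0.05):
    """Pick the deletion whose conditional test has the largest p-value;
    delete it only if that p-value exceeds alpha, otherwise stop."""
    _, g2_full, df_full = current
    best = max(candidates,
               key=lambda m: chi2.sf(m[1] - g2_full, m[2] - df_full))
    p = chi2.sf(best[1] - g2_full, best[2] - df_full)
    return (best, p) if p > alpha else (None, p)

model, p = least_significant(current, candidates)
print(model, round(p, 3))
```

Repeating this step until no deletion has p > alpha gives the full backward-elimination procedure; forward selection works the same way with `min` over p-values of added terms.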

K = Knowledge, N = Newspaper, R = Radio, S = Reading, L = Lectures

Continuing after 10 steps

The final step

The best model was found at the previous step: [LN][KLS][KR][KN][LR][NR][NS]

Logit Models. To date we have not distinguished whether any of the variables were dependent or independent variables. The logit model is used when we have a single binary dependent variable.

The variables
1. Type of seedling (T): (a) longleaf seedling, (b) slash seedling.
2. Depth of planting (D): (a) too low, (b) too high.
3. Mortality (M, the dependent variable): (a) dead, (b) alive.

The log-linear model. Note: μ_ij1 = the expected number dead when T = i and D = j, and μ_ij2 = the expected number alive when T = i and D = j, so that μ_ij1/μ_ij2 = the mortality ratio when T = i and D = j.

Hence

log(μ_ij1/μ_ij2) = log μ_ij1 - log μ_ij2,

and since the log-linear model expresses each log μ_ijk as a sum of u-terms, every term not involving M cancels from this difference, while each term involving M appears with M at level 1 in log μ_ij1 and at level 2 in log μ_ij2. Under the usual sum-to-zero constraints the M-terms at the two levels are negatives of each other, so each surviving term equals twice the corresponding log-linear parameter.

The logit model:

log(μ_ij1/μ_ij2) = w + w_T(i) + w_D(j),

where each w-term is twice the corresponding log-linear parameter involving M (w = 2u_M(1), w_T(i) = 2u_TM(i1), and so on); a w_TD(ij) term would also appear if the TDM interaction were in the log-linear model.

Thus, corresponding to a log-linear model, there is a logit model predicting the log ratio of the expected frequencies of the two categories of the dependent variable. A (k + 1)-factor interaction with the dependent variable in the log-linear model determines a k-factor term in the logit model: k + 1 = 1 gives the constant term of the logit model, k + 1 = 2 gives the main effects, and so on.
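A tiny numeric check of this correspondence, with assumed parameter values rather than the fitted values from the example:

```python
# Check (with assumed values) that the logit parameters are twice the
# loglinear parameters when the dependent variable M has two levels
# whose effects sum to zero.
u0, uT, uD, uTD = 1.0, 0.3, -0.2, 0.15    # terms not involving M (assumed)
uM, uTM, uDM = 0.20, 0.10, -0.05          # terms involving M, at level M = 1

log_mu_1 = u0 + uT + uD + uTD + uM + uTM + uDM   # log mu_ij1 (M = 1)
log_mu_2 = u0 + uT + uD + uTD - uM - uTM - uDM   # log mu_ij2 (M = 2)

logit = log_mu_1 - log_mu_2
print(logit)   # equals 2 * (uM + uTM + uDM) = 0.5
```

Every term shared by the two cells cancels, leaving twice the M-terms, which is exactly why the slides later report "logit parameters = 2 loglinear parameters".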

1 = Depth, 2 = Mort, 3 = Type

Log-Linear parameters for Model: [TM][TD][DM]

Logit model for predicting the mortality ratio, in additive or, equivalently, multiplicative form.

The best model found by forward selection was [LN][KLS][KR][KN][LR][NR][NS]. To fit a logit model to predict K (Knowledge) we need to fit a log-linear model that retains the interactions involving K, namely [LNRS][KLS][KR][KN]. The logit model will contain main effects for L (Lectures), N (Newspapers), R (Radio), and S (Reading), and a two-factor interaction effect for L and S.

The logit parameters for the model [LNSR][KLS][KR][KN] (multiplicative effects are given in brackets; logit parameters = 2 loglinear parameters):

Constant term: -0.226 (0.798)

Main effects on Knowledge:
  Lectures:   Lect  0.268 (1.307);  None -0.268 (0.765)
  Newspaper:  News  0.324 (1.383);  None -0.324 (0.723)
  Reading:    Solid 0.340 (1.405);  Not  -0.340 (0.712)
  Radio:      Radio 0.150 (1.162);  None -0.150 (0.861)

The two-factor interaction effect of Reading and Lectures on Knowledge.
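As a usage sketch, the multiplicative effects above can be combined to give the fitted odds of knowledge for a respondent exposed to all four media (ignoring the Lectures-by-Reading interaction, whose value is not listed here):

```python
# Combine the multiplicative logit effects from the table to get the
# fitted odds (and probability) of "knowledge" for someone with lectures,
# newspapers, solid reading, and radio. The L-by-S interaction is omitted
# because its value is not given in the table.
const = 0.798                                            # constant term
effects = {"Lect": 1.307, "News": 1.383, "Solid": 1.405, "Radio": 1.162}

odds = const
for e in effects.values():
    odds *= e                     # multiplicative effects simply multiply
prob = odds / (1 + odds)          # convert odds to a probability
print(round(odds, 2), round(prob, 2))
```

Multiplying effects on the odds scale is equivalent to adding the logit parameters (0.268 + 0.324 + 0.340 + 0.150 plus the constant) and exponentiating.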