Advanced statistics for master students. Loglinear models II: The best model selection and models for ordinal variables.

Three procedures in SPSS:
1) Loglinear (today and in the next lecture, including models with ordinal variables)
2) Model selection (today)
3) Logit (not covered)

Ordinal Loglinear Models
Literature: Agresti (2002), Wiley; Simonoff (2003), Springer; Ishii-Kuntz (1994), Sage

Model selection procedure
- Aim: find the "best" hierarchical model.
- Logic: chi-square tests that compare the LR (likelihood-ratio) statistics of two nested models.
- Approach: start with the saturated model and work backward to the best model. (The opposite strategy, a forward method starting from the model of independence, can also be applied, but it is sometimes not recommended in the literature.)
- All variables are treated as nominal.
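The comparison of two nested models can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `chi2_sf` and `compare_nested` are hypothetical, the numbers in the usage line are made up rather than SPSS output, and the p-value uses the closed-form chi-square tail so that only the standard library is needed.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: finite Poisson-type sum.
        total = sum(half ** k / math.factorial(k) for k in range(df // 2))
        return math.exp(-half) * total
    # Odd df: start from erfc and add the half-integer terms.
    total = sum(half ** (k - 0.5) / math.gamma(k + 0.5)
                for k in range(1, (df - 1) // 2 + 1))
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total

def compare_nested(g2_simple, df_simple, g2_complex, df_complex):
    """LR test of a simpler model against a more complex model that contains it."""
    delta_g2 = g2_simple - g2_complex
    delta_df = df_simple - df_complex
    if delta_df <= 0:
        raise ValueError("the simpler model must have more degrees of freedom")
    return delta_g2, delta_df, chi2_sf(delta_g2, delta_df)

# Hypothetical G^2 and df values, for illustration only.
delta_g2, delta_df, p_value = compare_nested(12.4, 6, 3.1, 4)
print(delta_g2, delta_df, p_value)
```

A small p-value means that the terms dropped in the simpler model matter, so the more complex model is kept; a large p-value supports the simpler model.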

Model selection: two tests
1) Test that all k-way interactions are zero
2) Test that all k-way and higher-order interactions are zero

Model selection procedure in SPSS:
1. Estimate the saturated model.
2. Run the two tests above.
3. Propose removing the nonsignificant highest-order interactions and estimate the new model.
4. Repeat steps 2 and 3 until nothing more can be removed (the result is the best model).
5. Compute the parameters of the best model.
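For a two-way table, the test that all two-way interactions are zero is the LR test of the independence model. The following is a minimal sketch of that computation, not the SPSS implementation; the function name `independence_fit` and the counts in `toy` are assumptions made up for illustration.

```python
import math

def independence_fit(table):
    """Return (expected, G2, df) for the independence model of a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected counts under independence: row total * column total / n.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    # LR statistic G^2 = 2 * sum(observed * ln(observed / expected)); empty cells add 0.
    g2 = 2.0 * sum(
        o * math.log(o / e)
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
        if o > 0
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return expected, g2, df

# Made-up 3x3 counts, for illustration only.
toy = [[30, 20, 10], [20, 20, 20], [10, 20, 30]]
expected, g2, df = independence_fit(toy)
print(g2, df)
```

A large G^2 relative to its degrees of freedom indicates that the interaction cannot be removed.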

Model selection: limits of the procedure
1) Only hierarchical models
2) Based on LR tests only; it does not take the principle of parsimony into account (see AIC, BIC, etc. below)
3) Only models for nominal variables
BUT: it can be useful for most analytical tasks. The procedure is very quick and can be recommended for a first insight into your data.

Ordinal Loglinear Models
- One or more variables are treated as ordinal.
- We save parameters and gain degrees of freedom (e.g. instead of a parameter for every row, a single parameter for the whole variable can be used; the same applies to interactions).
- There are many models in the literature; this lecture covers only models for two and three variables.
- SPSS offers only limited support for ordinal models.

Ordinal Loglinear Models
- Row and column effect models: one variable is ordinal, one is nominal.
- Row effect model: the row variable is nominal and the column variable is ordinal; interactions are built from the values (scores) of the column variable instead of from the columns themselves. (Example, 3x3 table: 4 interaction parameters for two nominal variables, only 2 for the row effect model.)
- Uniform association: two ordinal variables; the interaction is formed by multiplying the values of the two variables. (Example, 3x3 table: 4 interaction parameters for two nominal variables, only 1 for the uniform association model.)
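The parameter counts in the 3x3 example generalise to any table with I rows and J columns. A small sketch of the arithmetic follows; the function name `association_parameters` is a hypothetical helper, and the residual degrees of freedom are taken relative to the saturated model.

```python
def association_parameters(n_rows, n_cols):
    """Interaction parameters and residual df for three two-way loglinear models."""
    full = (n_rows - 1) * (n_cols - 1)  # nominal-by-nominal interaction (saturated)
    params = {
        "nominal": full,
        "row effects": n_rows - 1,      # one parameter per row, minus one constraint
        "uniform association": 1,       # a single association parameter
    }
    # Each entry: (interaction parameters, residual degrees of freedom).
    return {name: (p, full - p) for name, p in params.items()}

counts = association_parameters(3, 3)
for model, (n_params, resid_df) in counts.items():
    print(model, n_params, resid_df)
```

The fewer interaction parameters a model uses, the more degrees of freedom remain for testing its fit.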

Ordinal Loglinear Models
Formulas, equations
- Row and column effect model
- Uniform association
Interpretation of parameters and odds in models
- Row and column effect model
- Uniform association
Models for three variables
- Model of independence
- Model of constant fluidity, partial association
- Saturated model
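The formulas themselves did not survive in the transcript. For reference, the standard textbook forms of the two-variable models (following the usual presentation in Agresti) are sketched below. The notation is an assumption, not the slide's own: $m_{ij}$ is the expected count in row $i$ and column $j$, $u_i$ and $v_j$ are fixed row and column scores, $\tau_i$ is a row effect and $\beta$ is the association parameter.

$$
\begin{aligned}
\text{Independence:} \quad & \log m_{ij} = \lambda + \lambda_i^{X} + \lambda_j^{Y} \\
\text{Row effects:} \quad & \log m_{ij} = \lambda + \lambda_i^{X} + \lambda_j^{Y} + \tau_i v_j, \qquad \textstyle\sum_i \tau_i = 0 \\
\text{Uniform association:} \quad & \log m_{ij} = \lambda + \lambda_i^{X} + \lambda_j^{Y} + \beta\, u_i v_j
\end{aligned}
$$

For adjacent rows and columns the local log odds ratio is

$$
\log \theta_{ij} = \log \frac{m_{ij}\, m_{i+1,j+1}}{m_{i,j+1}\, m_{i+1,j}} =
\begin{cases}
(\tau_{i+1} - \tau_i)(v_{j+1} - v_j) & \text{row effects} \\
\beta\,(u_{i+1} - u_i)(v_{j+1} - v_j) & \text{uniform association}
\end{cases}
$$

With unit-spaced scores the uniform association model therefore implies the same local odds ratio $e^{\beta}$ everywhere in the table, which is where its name comes from.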

Ordinal Loglinear Models: the "best" model
- Tests of LR statistics
- Goodman, AIC and BIC criteria
Goodman index: $G = G^2/\mathrm{df}$, where $G^2$ is the LR statistic (overall test of fit) and df is the degrees of freedom.
Akaike information criterion: $\mathrm{AIC} = G^2 + 2p$, where $p$ is the number of parameters in the model.

Ordinal Loglinear Models: Goodman, AIC, BIC (continued)
Bayesian (Schwarz) information criterion: $\mathrm{BIC} = G^2 - \mathrm{df}\,\ln n$, where $n$ is the number of respondents.
- The lower the better: this is the logic of all the criteria.
- Problem: different criteria favour different models.
Note: These criteria can be used in many statistical techniques (regression analysis, multilevel models, SEM, etc.).
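The three criteria can be computed directly from the formulas on these slides. The sketch below is a hedged illustration: the function name `fit_criteria` and the input values are made up, and the Goodman index is left undefined (`None`) for a saturated model, where df is 0.

```python
import math

def fit_criteria(g2, df, n_params, n_obs):
    """Goodman index, AIC and BIC as defined on the slides (lower is better)."""
    return {
        "goodman": g2 / df if df > 0 else None,  # G = G^2 / df
        "aic": g2 + 2 * n_params,                # AIC = G^2 + 2p
        "bic": g2 - df * math.log(n_obs),        # BIC = G^2 - df * ln(n)
    }

# Hypothetical fit statistics for one model, for illustration only.
criteria = fit_criteria(g2=10.0, df=5, n_params=4, n_obs=100)
for name, value in criteria.items():
    print(name, value)
```

Computing the same dictionary for each candidate model and comparing the values side by side makes it easy to see when the criteria disagree about which model is best.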

Ordinal Loglinear Models
The best model selection: other methods
- Residuals: tests
- Residuals: charts
- Principle of parsimony

Ordinal Loglinear Models: summary
Recommendations for model selection (Ishii-Kuntz 1994: 53-54)
1) Prefer the model with fewer parameters (parsimony).
2) Prefer the model with the simpler interpretation.
3) Prefer a model in which all parameters are statistically significant.
4) A higher Sig. for the overall test is good, but a very high Sig. can be a sign that the model includes too many parameters.
5) For ordinal variables it is recommended to start with models for nominal data and then move to an appropriate model with ordinal variables.
6) The most important rule is to follow theory and use a model proposed in the literature. Do not apply (or try) all possible models (data-driven analysis); have a hypothesis about the model in advance and test it (theory-driven analysis). (Petr S slide 12)

HW
1) Use the general loglinear model procedure to find an appropriate model for at least 3 variables. Interpret the results and tests.
2) Find the best model for your data with Model selection.
3) Fit an ordinal model to your data and interpret the results. Compare the ordinal model with the best hierarchical model (degrees of freedom, LR, criteria, etc.).