The Art of Model Building and Statistical Tests

Presentation transcript:

The Art of Model Building and Statistical Tests

2 Outline
  The art of model building
  Using software output
  The t-statistic
  The likelihood ratio test
  The use of goodness-of-fit
  Other tests
  Tests of model structure
  Test of the IIA assumption
  Test of taste variations
  Test of unequal variances (heteroscedasticity)
  Prediction tests
  Outlier analysis
  Market segment prediction tests
  Policy forecasting tests

Estimation Results for Trinomial Mode Choice Model: Base Specification (Ben-Akiva, Table 7.1)

4 The Use of Goodness-of-Fit Measures
  Value of likelihood function
  The likelihood ratio index (rho-squared)
  Adjusted likelihood ratio index (rho-squared bar)
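As a rough illustration of the two indices on this slide, the sketch below computes rho-squared and rho-squared bar from log-likelihood values. It assumes the usual definitions, rho-squared = 1 - L(beta)/L(0) and rho-squared bar = 1 - (L(beta) - K)/L(0), where K is the number of estimated parameters. The function names and all numbers are hypothetical and are not taken from Table 7.1.

```python
def rho_squared(ll_zero, ll_model):
    """Likelihood ratio index, 1 - L(beta)/L(0), from two log-likelihoods."""
    return 1.0 - ll_model / ll_zero


def adjusted_rho_squared(ll_zero, ll_model, n_params):
    """Adjusted index, 1 - (L(beta) - K)/L(0), penalising extra parameters."""
    return 1.0 - (ll_model - n_params) / ll_zero


# Hypothetical log-likelihoods (not from the slides).
ll_zero = -1000.0   # log-likelihood with all coefficients set to zero
ll_model = -800.0   # log-likelihood at the estimated coefficients
n_params = 5        # number of estimated coefficients, K

rho2 = rho_squared(ll_zero, ll_model)
rho2_bar = adjusted_rho_squared(ll_zero, ll_model, n_params)
print(round(rho2, 3), round(rho2_bar, 3))
```

The adjusted index is always slightly lower than the unadjusted one when K > 0, which is why it is the more natural choice when comparing specifications with different numbers of variables.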

5 The Art of Model Building
  Informal tests of the coefficient estimates
  Signs and relative values
  Marginal rate of substitution
  Value of time
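The informal checks listed here can be sketched in a few lines. Assuming a linear-in-parameters utility with a travel time coefficient (per minute) and a travel cost coefficient (per dollar), the marginal rate of substitution between time and cost is the ratio of the two coefficients, which gives an implied value of time. The coefficient values and the function name below are hypothetical.

```python
def value_of_time(beta_time, beta_cost, minutes_per_hour=60.0):
    """Implied value of time in dollars per hour.

    beta_time: utility per minute of travel time (expected negative)
    beta_cost: utility per dollar of travel cost (expected negative)
    """
    # Informal sign check: both time and cost should reduce utility.
    if beta_time >= 0 or beta_cost >= 0:
        raise ValueError("time and cost coefficients are expected to be negative")
    # Marginal rate of substitution between time and cost, scaled to hours.
    return beta_time / beta_cost * minutes_per_hour


# Hypothetical estimates (not from the slides).
vot = value_of_time(beta_time=-0.05, beta_cost=-0.5)
print(f"implied value of time: {vot:.2f} dollars per hour")
```

An implied value of time far outside a plausible range (for instance, well above typical wage rates) is a signal to revisit the specification, even when every coefficient is individually significant.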

6 The t-Statistic
The Likelihood Ratio Test
  The statistic is $\chi^2$ distributed with $k$ degrees of freedom, where $k$ = number of parameters in the model

T-statistic
Discrete choice models use the t-ratio (the estimated coefficient divided by its standard error) to determine whether an estimated coefficient is statistically different from zero. The null hypothesis is that the coefficient equals zero; a standard t-test gives the significance level at which this null hypothesis can be rejected. The t-values are placed in brackets next to each estimated coefficient in this thesis. An absolute t-value greater than 2.58 rejects the null hypothesis at the 99% confidence level, and an absolute value between 1.96 and 2.58 is significant at the 95% confidence level. Absolute t-values between 1.50 and 1.96 are significant at roughly the 85% confidence level; such variables are generally left in the model, but care should be taken when interpreting the results.
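A minimal sketch of the t-ratio calculation follows. It assumes the large-sample normal approximation, so the two-sided p-value is computed from the standard normal distribution rather than from a t-distribution. The coefficient and standard error are hypothetical.

```python
import math


def t_ratio(coef, std_err):
    """t-ratio for the null hypothesis that the coefficient is zero."""
    return coef / std_err


def two_sided_p_value(t):
    """Two-sided p-value under the standard normal approximation."""
    return math.erfc(abs(t) / math.sqrt(2.0))


# Hypothetical estimate and standard error (not from the slides).
t = t_ratio(coef=-0.03, std_err=0.01)
p = two_sided_p_value(t)
print(f"t = {t:.2f}, two-sided p = {p:.4f}")

# The usual thresholds: |t| = 1.96 gives p near 0.05, |t| = 2.576 gives p near 0.01.
p_at_196 = two_sided_p_value(1.96)
p_at_2576 = two_sided_p_value(2.576)
```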

Likelihood ratio test
The likelihood ratio test measures the performance of one model relative to another. Typically in MNL modelling it is used to compare two models, one that includes additional variables and one that does not. The statistic is computed from the final log-likelihood values of both models using the following formula:
$L^* = -2[L(0) - L(\beta)]$
Here $L^*$ is the likelihood ratio statistic, $L(0)$ is the final log-likelihood of the base model, and $L(\beta)$ is the final log-likelihood of the model with a different number of variables. A significance level is chosen (0.1 or 0.05), and the critical value for the given degrees of freedom is read from the chi-squared tables. If the estimated value of the statistic exceeds the critical value at the specified level, the null hypothesis that the additional variables have no effect is rejected. That is, the model $L(\beta)$ fits significantly better than the base model $L(0)$.

9 The Likelihood Ratio Test (continued)
  $-2[L(\hat{\beta}_R) - L(\hat{\beta}_U)]$ is $\chi^2$ distributed with $k_U - k_R$ degrees of freedom
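The test between a restricted and an unrestricted model can be sketched as below. This is an illustration only: the log-likelihoods and parameter counts are hypothetical, and the chi-squared tail probability is computed with a hand-written series for the incomplete gamma function so that the example needs nothing beyond the standard library. A statistics package would normally be used instead.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-squared variable with df degrees of freedom.

    Uses the series expansion of the regularised lower incomplete gamma function.
    """
    if x <= 0:
        return 1.0
    a = df / 2.0
    z = x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while abs(term) > 1e-15 * abs(total):
        term *= z / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return 1.0 - lower


def likelihood_ratio_test(ll_restricted, ll_unrestricted, k_restricted, k_unrestricted):
    """Return the statistic -2[L(R) - L(U)], its degrees of freedom, and the p-value."""
    stat = -2.0 * (ll_restricted - ll_unrestricted)
    df = k_unrestricted - k_restricted
    return stat, df, chi2_sf(stat, df)


# Hypothetical log-likelihoods and parameter counts (not from the slides).
stat, df, p_value = likelihood_ratio_test(
    ll_restricted=-805.0, ll_unrestricted=-800.0,
    k_restricted=5, k_unrestricted=7,
)
print(f"statistic = {stat:.2f}, df = {df}, p = {p_value:.4f}")
```

With these made-up numbers the restriction would be rejected at the 5% level, meaning the two extra variables improve the fit by more than chance alone would explain.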

10 Testing new variables Ben-Akiva Table 7.3

11 The Use of Goodness-of-Fit Measure
  $\rho^2$, $\bar{\rho}^2$
The Likelihood Ratio Test
  $H_0$: $\beta_{12} = \beta_{14} = \beta_{15} = 0$
  -2( ) = 4.2 < 6.25 = critical value at the 90% level with 3 degrees of freedom
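The comparison on this slide (a statistic of 4.2 against a 90% critical value of 6.25 with 3 degrees of freedom) can be checked numerically. The sketch below uses the closed-form tail probability that exists for a chi-squared variable with exactly 3 degrees of freedom; the function name is hypothetical, and the log-likelihood values behind the 4.2 are not reproduced because the slide does not give them.

```python
import math


def chi2_sf_3df(x):
    """Upper-tail probability of a chi-squared variable with 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2.0)) + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)


statistic = 4.2        # value reported on the slide
critical_value = 6.25  # 90% critical value with 3 d.f., as on the slide

p_statistic = chi2_sf_3df(statistic)
p_critical = chi2_sf_3df(critical_value)
reject = statistic > critical_value

print(f"tail probability at 6.25: {p_critical:.3f}")
print(f"p-value for 4.2: {p_statistic:.3f}, reject at 10% level: {reject}")
```

Because the statistic falls below the critical value, the hypothesis that the three coefficients are jointly zero is not rejected at the 10% level.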

12 Test of Nonlinear Specifications (Ben-Akiva, Table 7.6)

13 Test of Nonlinear Specifications (continued)
[Figure: disutility of travel time plotted against travel time, comparing a piecewise linear approximation with a nonlinear specification]
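One common way to build a piecewise linear approximation is to split travel time into segments at chosen breakpoints and estimate a separate coefficient for each segment. The sketch below shows only the variable construction; the breakpoints (20 and 40 minutes) and the segment coefficients are hypothetical and are not the ones used in Table 7.6.

```python
def piecewise_segments(travel_time, breakpoints):
    """Split a travel time into the amounts falling in each segment.

    breakpoints must be sorted ascending; the last segment is open-ended.
    """
    segments = []
    lower = 0.0
    for upper in list(breakpoints) + [float("inf")]:
        segments.append(max(0.0, min(travel_time, upper) - lower))
        lower = upper
    return segments


def piecewise_disutility(travel_time, breakpoints, betas):
    """Utility contribution of travel time with one coefficient per segment."""
    segments = piecewise_segments(travel_time, breakpoints)
    return sum(b * s for b, s in zip(betas, segments))


# Hypothetical breakpoints (minutes) and segment coefficients.
breakpoints = [20.0, 40.0]
betas = [-0.02, -0.04, -0.06]

segments_50 = piecewise_segments(50.0, breakpoints)
disutility_50 = piecewise_disutility(50.0, breakpoints, betas)
print(segments_50, round(disutility_50, 2))
```

If all segment coefficients are constrained to be equal, the specification collapses to the linear one, so the linear model is the restricted model in a likelihood ratio test against the piecewise version.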

14 Test of Nonlinear Specifications (continued)

15 Estimation Problems
  Use of too many alternative-specific constants
  Incorrect specification of socioeconomic variables
  Perfect collinearity of variables
  Models with one or more unbounded coefficients

16 Constrained Estimation
  Inequality constraints
  Fixed value constraints
  Linear constraints
  Assumed value of time

17 Test of Model Structure
The key question:
  Is a multinomial logit model appropriate for the current data set?
  Or: are the basic assumptions required for the MNL structure true for the data?
Basic MNL assumptions:
  IIA: independence from irrelevant alternatives (test nested structures)
  No random taste variations (all significant differences in tastes are captured by socioeconomic variables); means and variances are constant

18 Taste Variation Test
General strategy: estimate models for market segments and compare with the full data set model
Market segments: groups based on socioeconomic variables (household size, income ranges, etc.)
Null hypothesis: the vector of estimated coefficients for each subset, $\hat{\beta}_i$, equals $\hat{\beta}_F$
Likelihood ratio test statistic: $-2[L_F(\hat{\beta}_F) - \sum_i L_i(\hat{\beta}_i)]$, distributed as $\chi^2$ with $\sum_i k_i - k_F$ degrees of freedom, where $k_i$ is the number of coefficients in the model for subset $i$ and $k_F$ is the number for the full model
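The statistic and its degrees of freedom can be sketched as follows. The log-likelihoods and coefficient counts are hypothetical (two segments, five coefficients each), and the critical value is passed in by the caller as it would be read from a chi-squared table; 11.07 is the tabulated 95% value for 5 degrees of freedom.

```python
def taste_variation_statistic(ll_full, ll_segments, k_full, k_segments):
    """Return (-2[L_F - sum_i L_i], sum_i k_i - k_F) for a market segmentation test."""
    stat = -2.0 * (ll_full - sum(ll_segments))
    df = sum(k_segments) - k_full
    return stat, df


# Hypothetical values (not from the slides).
ll_full = -800.0                # log-likelihood of the pooled (full data set) model
ll_segments = [-390.0, -400.0]  # log-likelihoods of the segment models
k_full = 5                      # coefficients in the pooled model
k_segments = [5, 5]             # coefficients in each segment model

stat, df = taste_variation_statistic(ll_full, ll_segments, k_full, k_segments)

critical_value_95 = 11.07       # chi-squared table, 5 d.f., 95% level
reject_equal_tastes = stat > critical_value_95
print(stat, df, reject_equal_tastes)
```

Rejecting the null hypothesis here suggests that the segments have different coefficients, so a single pooled model would hide systematic taste variation.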

Estimation Results for Trinomial Mode Choice Model Market Segmentation by Income

Taste Variation Test

21 Other Applications of Market Segmentation Tests
  Life-cycle/life-style groups
  Tests of transferability: models based on data from different years or urban areas, to determine stability of policy-relevant coefficients
  Unrestricted model: collection of models estimated for separate data sets
  Restricted model: combined data sets, critical coefficients constrained to single values