Power 14: Goodness of Fit & Contingency Tables

Presentation transcript:

1 Power 14 Goodness of Fit & Contingency Tables

2 II. Goodness of Fit & Chi Square
• Rolling a Fair Die
• The Multinomial Distribution
• Experiment: 600 Tosses

3 The Expected Frequencies

4 The Expected Frequencies & Empirical Frequencies

5 Hypothesis Test
• Null $H_0$: Distribution is multinomial
• Statistic: $\sum_i (O_i - E_i)^2 / E_i$, i.e. observed minus expected, squared, divided by expected, summed over the categories
• Set the Type I error at 5%, for example
• Distribution of the statistic is chi-square
One throw, side one comes up: multinomial distribution
$P(n_1=1, n_2=0, n_3=0, n_4=0, n_5=0, n_6=0) = \frac{1!}{1!\,0!\,0!\,0!\,0!\,0!} (1/6)^1 (1/6)^0 (1/6)^0 (1/6)^0 (1/6)^0 (1/6)^0$

6 Chi Square: $\chi^2 = \sum_i (O_i - E_i)^2 / E_i = 6.15$

7 Chi Square Density for 5 degrees of freedom [figure]
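The goodness-of-fit test for the die can be sketched in pure Python. The observed counts below are hypothetical (the transcript does not preserve the 600-toss data), and `chi2_sf` is an illustrative helper that evaluates the chi-square upper tail for integer degrees of freedom by a standard recurrence. The last line evaluates the tail probability of the statistic 6.15 quoted on the slide.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for chi-square with integer df >= 1."""
    half = x / 2.0
    if df % 2 == 1:
        q, k = math.erfc(math.sqrt(half)), 1   # closed form for 1 degree of freedom
    else:
        q, k = math.exp(-half), 2              # closed form for 2 degrees of freedom
    while k < df:                              # each pass raises the df by 2
        q += half ** (k / 2) * math.exp(-half) / math.gamma(k / 2 + 1)
        k += 2
    return q

def chi_square_stat(observed, expected):
    """Sum of (O_i - E_i)^2 / E_i over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical face counts for 600 tosses (NOT the data behind the slides).
observed = [95, 108, 92, 110, 98, 97]
expected = [600 / 6] * 6

stat = chi_square_stat(observed, expected)
p_value = chi2_sf(stat, df=len(observed) - 1)

# Tail probability of the statistic reported on the slide, 5 degrees of freedom.
p_slide = chi2_sf(6.15, df=5)
print(stat, p_value, p_slide)
```

With 5 degrees of freedom the 5% critical value is about 11.07, so a statistic of 6.15 does not reject the fair-die hypothesis.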

8 Contingency Table Analysis
• Tests for Association vs. Independence for Qualitative Variables

9 Does Consumer Knowledge Affect Purchases? Frost Free Refrigerators Use More Electricity

10 Marginal Counts

11 Marginal Distributions, f(x) & f(y)

12 Joint Distribution Under Independence: f(x,y) = f(x)*f(y)

13 Expected Cell Frequencies Under Independence

14 Observed Cell Counts

15 Contribution to Chi Square: (observed − expected)^2 / expected
• Upper left cell: (observed − 324)^2 / 324 = 100/324 = 0.31
• Chi square = sum of the cell contributions = 3.09
• (m−1)*(n−1) = 1*1 = 1 degree of freedom

16 [Figure: chi-square density for 1 degree of freedom; labels: 5%, 5.02]

17 Conclusion
• The test does not reject independence: no significant association between consumer knowledge about electricity use and consumer choice of a frost-free refrigerator
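The contingency-table steps (marginal totals, expected counts under independence, cell contributions, chi-square) can be sketched as follows. The 2×2 table is hypothetical, since the refrigerator counts are not preserved in the transcript; the function names are illustrative. For 1 degree of freedom the upper tail has the closed form erfc(sqrt(x/2)), which puts the 5% critical value near 3.84; the value 5.02 that appears next to the 5% label corresponds to an upper-tail area of about 2.5%. Either way, 3.09 falls below the cutoff.

```python
import math

def expected_counts(table):
    """Expected cell counts under independence: row total * column total / N."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    return [[r * c / n for c in col_tot] for r in row_tot]

def chi_square(table):
    """Sum of (observed - expected)^2 / expected over all cells."""
    exp = expected_counts(table)
    return sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, exp)
               for o, e in zip(obs_row, exp_row))

def p_value_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

# Hypothetical 2x2 table (NOT the refrigerator data from the slides).
table = [[30, 20],
         [20, 30]]
stat = chi_square(table)
df = (len(table) - 1) * (len(table[0]) - 1)

# Tail probability for the statistic reported on the slide.
p_slide = p_value_1df(3.09)
print(stat, df, p_slide)
```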

18 Using Goodness of Fit to Choose Between Competing Probability Models
• Men on base when a home run is hit

19 Men on base when a home run is hit

20 Conjecture
• Distribution is binomial

21 Average # of men on base
Sum of products = n*p = 0.63, so with n = 3, p = 0.21

22 Using the binomial: k = men on base, n = # of trials
• $P(k=0) = \frac{3!}{0!\,3!} (0.21)^0 (0.79)^3 = 0.493$
• $P(k=1) = \frac{3!}{1!\,2!} (0.21)^1 (0.79)^2 = 0.393$
• $P(k=2) = \frac{3!}{2!\,1!} (0.21)^2 (0.79)^1 = 0.105$
• $P(k=3) = \frac{3!}{3!\,0!} (0.21)^3 (0.79)^0 = 0.009$

23 Assuming the binomial
• The probability of zero men on base is 0.493
• The total number of observations is 765
• So the expected number of observations for zero men on base is 0.493*765 = 377.1
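The binomial probabilities and the expected counts for all four categories can be reproduced with a short sketch. It uses only numbers from the slides (n = 3, p = 0.21, 765 observations); the variable names are illustrative. The slide multiplies the rounded probability 0.493 by 765, so the unrounded expected count printed here differs slightly in the first decimal.

```python
from math import comb

n_trials, p = 3, 0.21   # from the slides: n*p = 0.63
n_obs = 765             # total number of home runs observed

# Binomial probability of k men on base, k = 0..3.
probs = [comb(n_trials, k) * p ** k * (1 - p) ** (n_trials - k)
         for k in range(n_trials + 1)]

# Expected number of observations in each category under the binomial.
expected = [n_obs * q for q in probs]

for k, (q, e) in enumerate(zip(probs, expected)):
    print(f"k={k}: P={q:.3f}, expected count={e:.1f}")
```

These expected counts are what enter the (observed − expected)^2 / expected terms of the goodness-of-fit statistic.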

24 Goodness of Fit

25 Chi Square, 3 degrees of freedom: 5% critical value 7.81 [figure]

26 Conjecture: Poisson where $\mu = np = 0.63$
• $P(k=0) = e^{-\mu} \mu^k / k! = e^{-0.63} (0.63)^0 / 0! = 0.5326$
• $P(k=1) = e^{-\mu} \mu^k / k! = e^{-0.63} (0.63)^1 / 1! = 0.3355$
• $P(k=2) = e^{-\mu} \mu^k / k! = e^{-0.63} (0.63)^2 / 2! = 0.1057$
• $P(k=3) = 1 - P(k=2) - P(k=1) - P(k=0) = 0.0262$
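A minimal sketch of the Poisson calculation, assuming the mean 0.63 and the 765 observations given earlier in the deck. As on the slide, the last category is computed as the remainder, so it absorbs all of the Poisson probability for three or more.

```python
from math import exp, factorial

mu = 0.63     # mean number of men on base (n*p from the slides)
n_obs = 765   # total observations, as in the binomial case

# Poisson probabilities for k = 0, 1, 2.
poisson = [exp(-mu) * mu ** k / factorial(k) for k in range(3)]

# The slide assigns the remaining probability to k = 3.
poisson.append(1.0 - sum(poisson))

expected = [n_obs * q for q in poisson]
for k, (q, e) in enumerate(zip(poisson, expected)):
    print(f"k={k}: P={q:.4f}, expected count={e:.1f}")
```

Comparing these expected counts with the binomial ones, through the same chi-square statistic, is how the two candidate models are ranked.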

27 Average # of men on base
Sum of products = n*p = 0.63

28 Conjecture: Poisson where $\mu = np = 0.63$
• $P(k=0) = e^{-\mu} \mu^k / k! = e^{-0.63} (0.63)^0 / 0! = 0.5326$
• $P(k=1) = e^{-\mu} \mu^k / k! = e^{-0.63} (0.63)^1 / 1! = 0.3355$
• $P(k=2) = e^{-\mu} \mu^k / k! = e^{-0.63} (0.63)^2 / 2! = 0.1057$
• $P(k=3) = 1 - P(k=2) - P(k=1) - P(k=0) = 0.0262$

29 Goodness of Fit

30 Chi Square, 3 degrees of freedom: 5% critical value 7.81 [figure]

31 Likelihood Functions
• Review OLS Likelihood
• Proceed in a similar fashion for the probit

32 Likelihood function
• The joint density of the estimated residuals can be written as:
• If the sample of observations on the dependent variable, y, and the independent variable, x, is random, then the observations are independent of one another. If the errors are also identically distributed with density f, i.e. i.i.d., then

33 Likelihood function
• Continued: If i.i.d., then
• If the residuals are normally distributed:
• This is one of the assumptions of linear regression: errors are i.i.d. normal
• Then the joint distribution or likelihood function, L, can be written as:

34 Likelihood function
• And taking natural logarithms of both sides, where the logarithm is a monotonically increasing function so that if lnL is maximized, so is L:
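The steps described on these likelihood slides follow the standard derivation, sketched here in LaTeX. The regression equation $y_i = a + b x_i + e_i$ and the error variance $\sigma^2$ are notational assumptions; the transcript does not preserve the slides' own symbols.

```latex
% Assumed model: y_i = a + b x_i + e_i, with e_i i.i.d. N(0, \sigma^2)
\begin{align*}
f(e_1, \dots, e_n) &= \prod_{i=1}^{n} f(e_i)
  && \text{(independent, identically distributed)} \\
f(e_i) &= \frac{1}{\sqrt{2\pi\sigma^2}}
  \exp\!\left(-\frac{e_i^2}{2\sigma^2}\right)
  && \text{(normal errors)} \\
L &= \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}}
  \exp\!\left(-\frac{(y_i - a - b x_i)^2}{2\sigma^2}\right) \\
\ln L &= -\frac{n}{2}\ln(2\pi) - \frac{n}{2}\ln\sigma^2
  - \frac{1}{2\sigma^2}\sum_{i=1}^{n} (y_i - a - b x_i)^2
\end{align*}
```

Only the last term of $\ln L$ depends on a and b, and it enters with a negative sign, so maximizing $\ln L$ over a and b is the same as minimizing the sum of squared residuals.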

35 Log-Likelihood
• Taking the derivative of lnL with respect to either a-hat or b-hat and setting it to zero yields the same estimators for the parameters a and b as ordinary least squares, except that the errors are now assumed to be normally distributed.
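A small numerical check of this claim, under stated assumptions: the data are made up, the error standard deviation is fixed at 1 for simplicity, and the function names are illustrative. The OLS estimates are computed from the usual closed-form expressions, and the normal log-likelihood is then evaluated at those estimates and at nearby parameter values.

```python
import math

def log_likelihood(a, b, sigma, xs, ys):
    """Normal log-likelihood for y_i = a + b*x_i + e_i, e_i ~ N(0, sigma^2)."""
    n = len(xs)
    sse = sum((y - a - b * x) ** 2 for x, y in zip(xs, ys))
    return -0.5 * n * math.log(2 * math.pi) - n * math.log(sigma) - sse / (2 * sigma ** 2)

def ols(xs, ys):
    """Closed-form ordinary least squares intercept and slope."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    b = sxy / sxx
    return ybar - b * xbar, b

# Made-up data for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

a_hat, b_hat = ols(xs, ys)
best = log_likelihood(a_hat, b_hat, 1.0, xs, ys)

# Log-likelihood at the eight neighbouring points on a small grid around the OLS estimates.
nearby = [log_likelihood(a_hat + da, b_hat + db, 1.0, xs, ys)
          for da in (-0.1, 0.0, 0.1)
          for db in (-0.1, 0.0, 0.1)
          if (da, db) != (0.0, 0.0)]
print(a_hat, b_hat, best, max(nearby))
```

Every perturbed point gives a lower log-likelihood than the OLS estimates, which is what the first-order conditions imply.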

36 Probit
• Example: expenditures on lottery as a % of household income
• $lottery_i = a + b \cdot income_i + e_i$
• if $lottery_i > 0$, i.e. $a + b \cdot income_i + e_i > 0$, then $Bern_i$, the yes-no indicator variable, is equal to one and $e_i > -a - b \cdot income_i$
• this determines a threshold for observation i in the distribution of the error $e_i$
• assume

37 [Figure: threshold for observation i in the distribution of the error $e_i$]

38 [Figure] Area above the threshold is the probability of playing the lottery for observation i, $P_{yes}$

39 [Figure] Area above the threshold is the probability of playing the lottery for observation i, $P_{yes}$; the remaining area is $P_{no}$ for observation i

40 Probit
• Likelihood function for the observed sample
• Log likelihood:
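For a sample of independent yes-no outcomes, the likelihood and log likelihood take the standard Bernoulli form sketched below. The subscripted notation $P_{yes,i}$ and $P_{no,i} = 1 - P_{yes,i}$ is an assumption that extends the slide's $P_{yes}$ and $P_{no}$ to each observation.

```latex
\begin{align*}
L &= \prod_{i=1}^{n} \left(P_{yes,i}\right)^{Bern_i}
     \left(P_{no,i}\right)^{1 - Bern_i} \\
\ln L &= \sum_{i=1}^{n} \left[ Bern_i \ln P_{yes,i}
     + (1 - Bern_i) \ln P_{no,i} \right]
\end{align*}
```

Each observation contributes $\ln P_{yes,i}$ if the household plays the lottery and $\ln P_{no,i}$ if it does not.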

41

42 [Figure] Area above the threshold is the probability of playing the lottery for observation i, $P_{yes}$; the remaining area is $P_{no}$ for observation i

43 Probit
• Substituting these expressions for $P_{no}$ and $P_{yes}$ in the ln Likelihood function gives the complete expression.
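The complete expression can be sketched in Python under one explicit assumption that the transcript does not preserve: the error $e_i$ is standard normal, the usual probit normalization. Then $P_{yes,i} = P(e_i > -a - b \cdot income_i) = \Phi(a + b \cdot income_i)$, where $\Phi$ is the standard normal CDF. The data and function names are hypothetical.

```python
import math

def std_normal_cdf(z):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def probit_log_likelihood(a, b, incomes, played):
    """Sum of ln P_yes for players and ln P_no for non-players.

    Assumes e_i ~ N(0, 1), so P_yes = P(e_i > -a - b*income_i) = Phi(a + b*income_i).
    """
    ll = 0.0
    for income, bern in zip(incomes, played):
        p_yes = std_normal_cdf(a + b * income)
        p_no = 1.0 - p_yes
        ll += math.log(p_yes) if bern == 1 else math.log(p_no)
    return ll

# Hypothetical data: household income and a yes-no lottery indicator.
incomes = [20.0, 35.0, 50.0, 80.0]
played = [1, 1, 0, 0]

ll_flat = probit_log_likelihood(0.0, 0.0, incomes, played)     # P_yes = 0.5 for everyone
ll_slope = probit_log_likelihood(1.0, -0.02, incomes, played)  # P_yes falls with income
print(ll_flat, ll_slope)
```

Estimation would search over a and b for the pair that maximizes this function; the sketch only evaluates it at two trial points.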

44 Probit
• Likelihood function for the observed sample
• Log likelihood:

45

46 Outline
• I. Projects
• II. Goodness of Fit & Chi Square
• III. Contingency Tables

47 Part I: Projects
• Teams
• Assignments
• Presentations
• Data Sources
• Grades

48 Team One
• Project choice
• Data Retrieval
• Statistical Analysis
• PowerPoint Presentation
• Executive Summary
• Technical Appendix
• Graphics (Excel, Eviews, other)

49 Assignments
• 1. Project choice: Markus Ansmann
• 2. Data Retrieval: Theodore Ehlert
• 3. Statistical Analysis: David Sheehan
• 4. PowerPoint Presentation: Qun Luo
• 5. Executive Summary: Steven Comstock
• 6. Technical Appendix: Alan Weinberg
• 7. Graphics: Gregory Adams

50 PowerPoint Presentations: Member 4
• 1. Introduction: Members 1, 2, 3
  – What
  – Why
  – How
• 2. Executive Summary: Member 5
• 3. Exploratory Data Analysis: Members 3, 7
• 4. Descriptive Statistics: Members 3, 7
• 5. Statistical Analysis: Member 3
• 6. Conclusions: Members 3 & 5
• 7. Technical Appendix: Table of Contents, Member 6

51 Executive Summary and Technical Appendix

52

53 Grades

54 Data Sources
• FRED: Federal Reserve Bank of St. Louis
  – Business/Fiscal
    · Index of Consumer Sentiment, Monthly (1952:11)
    · Light Weight Vehicle Sales, Auto and Light Truck, Monthly ( )
• Economagic
• U S Dept. of Commerce
  – Population
  – Economic Analysis

55 Data Sources (Cont.)
• Bureau of Labor Statistics
• California Dept of Finance