Power 14: Goodness of Fit & Contingency Tables

Presentation transcript:

1 Power 14 Goodness of Fit & Contingency Tables

2 Outline
• I. Projects
• II. Goodness of Fit & Chi Square
• III. Contingency Tables

3 Part I: Projects
• Teams
• Assignments
• Presentations
• Data Sources
• Grades

4 Team One
• Catherine Wohletz: Project choice
• Joshua Friedberg: Data Retrieval
• Julio Urenda: Statistical Analysis
• Daniel Grund: PowerPoint Presentation
• Takuro Hatanaka: Executive Summary
• Sylvia Salinas: Technical Appendix

5 Assignments
• 1. Project choice
• 2. Data Retrieval
• 3. Statistical Analysis
• 4. PowerPoint Presentation
• 5. Executive Summary
• 6. Technical Appendix

6 PowerPoint Presentations: Member 4
• 1. Introduction: Members 1, 2, 3
  – What
  – Why
  – How
• 2. Executive Summary: Member 5
• 3. Exploratory Data Analysis: Member 3
• 4. Descriptive Statistics: Member 3
• 5. Statistical Analysis: Member 3
• 6. Conclusions: Members 3 & 5
• 7. Technical Appendix: Table of Contents, Member 6

7 Executive Summary and Technical Appendix

8

9 Grades

10 Data Sources
• FRED: Federal Reserve Bank of St. Louis
  – Business/Fiscal
    · Index of Consumer Sentiment, Monthly (1952:11)
    · Light Weight Vehicle Sales, Auto and Light Truck, Monthly ( )
• Economagic
• U.S. Dept. of Commerce
  – Population
  – Economic Analysis

11 Data Sources (Cont.)
• Bureau of Labor Statistics
• California Dept. of Finance

12 II. Goodness of Fit & Chi Square
• Rolling a Fair Die
• The Multinomial Distribution
• Experiment: 600 Tosses

13 The Expected Frequencies

14 The Expected Frequencies & Empirical Frequencies

15 Hypothesis Test
• Null H_0: distribution is multinomial
• Statistic: $\sum_i (O_i - E_i)^2 / E_i$, the sum over categories of observed minus expected, squared, divided by expected
• Set the Type I error probability, 5% for example
• Distribution of the statistic is chi square
One throw, side one comes up (multinomial distribution):
$P(n_1=1, n_2=0, n_3=0, n_4=0, n_5=0, n_6=0) = \frac{1!}{1!\,0!\,0!\,0!\,0!\,0!} (1/6)^1 (1/6)^0 (1/6)^0 (1/6)^0 (1/6)^0 (1/6)^0 = 1/6$

16 Chi Square: $\chi^2 = \sum_i (O_i - E_i)^2 / E_i = 6.15$
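The statistic on slides 15 and 16 can be sketched in a few lines of Python. The expected counts follow from the 600-toss experiment (100 per face); the observed counts below are hypothetical, invented only to exercise the function, since the transcript does not preserve the empirical frequencies.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O_i - E_i)^2 / E_i over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# 600 tosses of a fair die: each face is expected 600 * (1/6) = 100 times.
n_tosses = 600
expected = [n_tosses / 6] * 6

# HYPOTHETICAL counts, invented for illustration; not the data from the slides.
hypothetical_observed = [95, 110, 90, 105, 98, 102]

stat = chi_square_statistic(hypothetical_observed, expected)
print(f"chi-square = {stat:.2f}")
```

With six categories and no estimated parameters, the statistic is compared with a chi-square distribution on 6 - 1 = 5 degrees of freedom.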

17 Chi Square Density for 5 degrees of freedom [figure with the 5% upper tail marked]
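To relate the reported statistic of 6.15 to the 5-degree-of-freedom density, the upper-tail probability can be computed directly. The sketch below uses only the standard library and a power series for the incomplete gamma function; the function name `chi2_sf` and the value 11.07 (the commonly tabulated 5% critical value for 5 degrees of freedom) are assumptions added here, not taken from the transcript.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with df degrees of freedom.

    Uses the power series for the regularized lower incomplete gamma function.
    """
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = total = 1.0
    k = 1
    while term > 1e-15 * total:
        term *= z / (a + k)
        total += term
        k += 1
    lower = math.exp(-z + a * math.log(z) - math.lgamma(a + 1.0)) * total
    return 1.0 - lower


p_value = chi2_sf(6.15, 5)          # statistic reported on slide 16
tail_at_11_07 = chi2_sf(11.07, 5)   # assumed tabulated 5% critical value
print(f"P(chi2_5 > 6.15)  = {p_value:.3f}")
print(f"P(chi2_5 > 11.07) = {tail_at_11_07:.3f}")
```

Under these assumptions the tail probability for 6.15 is well above 5%, so the fair-die null would not be rejected at the 5% level.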

18 Contingency Table Analysis
• Tests for Association vs. Independence for Qualitative Variables

19 Does Consumer Knowledge Affect Purchases? Frost Free Refrigerators Use More Electricity

20 Marginal Counts

21 Marginal Distributions, f(x) & f(y)

22 Joint Distribution Under Independence: f(x,y) = f(x)*f(y)

23 Expected Cell Frequencies Under Independence

24 Observed Cell Counts

25 Contribution to Chi Square: (observed - expected)^2 / expected
• Upper left cell: (observed - expected)^2 / 324 = 100/324 = 0.31
• Chi square = 3.09, the sum of the cell contributions
• (m-1)*(n-1) = 1*1 = 1 degree of freedom
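The steps on slides 20 through 25 (marginal counts, expected cell frequencies under independence, and each cell's contribution to chi square) can be sketched as follows. The 2x2 table is hypothetical, invented for illustration, because the refrigerator counts did not survive in the transcript; the function name `expected_counts` is likewise an assumption.

```python
def expected_counts(observed):
    """Expected cell counts under independence: row total * column total / N."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# HYPOTHETICAL 2x2 table of counts, invented for illustration only.
hypothetical_table = [[30, 20],
                      [10, 40]]

expected = expected_counts(hypothetical_table)

# Each cell contributes (observed - expected)^2 / expected.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(hypothetical_table, expected)
    for o, e in zip(obs_row, exp_row)
)

# Degrees of freedom: (rows - 1) * (columns - 1).
df = (len(hypothetical_table) - 1) * (len(hypothetical_table[0]) - 1)
print(expected, round(chi_square, 2), df)
```

Dividing each expected count by N recovers the product of the marginal distributions, f(x)*f(y), which is the independence assumption stated on slide 22.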

5% 5.02

27 Conclusion
• No association between consumer knowledge about electricity use and consumer choice of a frost-free refrigerator
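The conclusion can be checked numerically. For 1 degree of freedom the chi-square variable is the square of a standard normal, so the upper-tail probability has a closed form. In the sketch below, 3.09 and 5.02 come from the slides; 3.84 is the commonly tabulated upper 5% point for 1 degree of freedom and is an addition here. Note that 5.02 corresponds to roughly the upper 2.5% tail; under either cutoff, 3.09 falls below it.

```python
import math


def chi2_sf_1df(x):
    """P(X > x) for chi-square with 1 df: X = Z^2, so P(X > x) = erfc(sqrt(x / 2))."""
    return math.erfc(math.sqrt(x / 2.0)) if x > 0 else 1.0


for value in (3.09, 3.84, 5.02):
    print(f"P(chi2_1 > {value}) = {chi2_sf_1df(value):.3f}")
```

Since the tail probability at 3.09 exceeds 5%, independence is not rejected at that level, which matches the stated conclusion.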

28 Using Goodness of Fit to Choose Between Competing Probability Models
• Men on base when a home run is hit

29 Men on base when a home run is hit

30 Conjecture
• Distribution is binomial

31 Average # of men on base: sum of products = n*p = 0.63

32 Using the binomial, k = men on base, n = # of trials
• $P(k=0) = \frac{3!}{0!\,3!} (0.21)^0 (0.79)^3 = 0.493$
• $P(k=1) = \frac{3!}{1!\,2!} (0.21)^1 (0.79)^2 = 0.393$
• $P(k=2) = \frac{3!}{2!\,1!} (0.21)^2 (0.79)^1 = 0.105$
• $P(k=3) = \frac{3!}{3!\,0!} (0.21)^3 (0.79)^0 = 0.009$

33 Assuming the binomial
• The probability of zero men on base is 0.493
• The total number of observations is 765
• So the expected number of observations for zero men on base is 0.493*765 = 377.1
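The binomial probabilities and the expected counts they imply can be reproduced with a short script. The values n = 3, p = 0.21, and 765 observations come from the slides; the names used in the code are illustrative.

```python
from math import comb

N_TRIALS = 3          # n on the slides (at most three men on base)
P_SUCCESS = 0.21      # p on the slides, consistent with n*p = 0.63
N_OBSERVATIONS = 765  # total number of observations


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


probabilities = [binomial_pmf(k, N_TRIALS, P_SUCCESS) for k in range(N_TRIALS + 1)]
expected_counts = [N_OBSERVATIONS * p for p in probabilities]

for k, (p, e) in enumerate(zip(probabilities, expected_counts)):
    print(f"k={k}: P={p:.3f}, expected={e:.1f}")
```

Using the unrounded probability gives an expected count near 377.2 for zero men on base; the slide's 377.1 comes from multiplying the rounded 0.493 by 765.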

34 Goodness of Fit

Chi Square, 3 degrees of freedom 5% 7.81

36 Conjecture: Poisson where $\mu = np = 0.63$
• $P(k=0) = e^{-\mu}\mu^k/k! = e^{-0.63}(0.63)^0/0! = 0.533$
• $P(k=1) = e^{-\mu}\mu^k/k! = e^{-0.63}(0.63)^1/1! = 0.336$
• $P(k=2) = e^{-\mu}\mu^k/k! = e^{-0.63}(0.63)^2/2! = 0.106$
• $P(k=3) = 1 - P(k=2) - P(k=1) - P(k=0) = 0.026$
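The Poisson conjecture can be sketched the same way. The mean 0.63 and the rule that P(k=3) takes whatever probability remains are from the slide; the function and variable names are illustrative.

```python
import math

MEAN = 0.63  # Poisson mean, set equal to n*p from the slides


def poisson_pmf(k, mean):
    """Probability of exactly k events for a Poisson variable with the given mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


# P(k=0), P(k=1), P(k=2) from the Poisson formula.
probabilities = [poisson_pmf(k, MEAN) for k in range(3)]
# P(k=3) is defined as the remainder so the four probabilities sum to one.
probabilities.append(1.0 - sum(probabilities))

for k, p in enumerate(probabilities):
    print(f"k={k}: P={p:.3f}")
```

Assigning the remainder to k = 3 folds the Poisson's unbounded upper tail into the last category, which keeps the comparison to four cells, the same number as in the binomial fit.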

37 Average # of men on base: sum of products = n*p = 0.63

38 Conjecture: Poisson where $\mu = np = 0.63$
• $P(k=0) = e^{-\mu}\mu^k/k! = e^{-0.63}(0.63)^0/0! = 0.533$
• $P(k=1) = e^{-\mu}\mu^k/k! = e^{-0.63}(0.63)^1/1! = 0.336$
• $P(k=2) = e^{-\mu}\mu^k/k! = e^{-0.63}(0.63)^2/2! = 0.106$
• $P(k=3) = 1 - P(k=2) - P(k=1) - P(k=0) = 0.026$

39 Goodness of Fit

Chi Square, 3 degrees of freedom 5% 7.81

41 Likelihood Functions
• Review OLS likelihood
• Proceed in a similar fashion for the probit

42 Likelihood function
• The joint density of the estimated residuals can be written as:
• If the sample of observations on the dependent variable, y, and the independent variable, x, is random, then the observations are independent of one another. If the errors are also identically distributed, f, i.e. i.i.d., then

43 Likelihood function
• Continued: if i.i.d., then
• If the residuals are normally distributed:
• This is one of the assumptions of linear regression: errors are i.i.d. normal
• Then the joint distribution or likelihood function, L, can be written as:

44 Likelihood function
• And taking natural logarithms of both sides, where the logarithm is a monotonically increasing function so that if lnL is maximized, so is L:

45 Log-Likelihood
• Taking the derivative of lnL with respect to a-hat and b-hat and setting each to zero yields the same estimators for the parameters a and b as ordinary least squares, now under the added assumption that the errors are normally distributed.
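The equations on slides 42 through 45 did not survive in the transcript. A standard version of the derivation, consistent with the assumptions stated there, is sketched below. The notation is an assumption: $e_i = y_i - a - b x_i$ for the errors, and $\sigma^2$ for their common variance with mean zero.

```latex
% Joint density under independence and identical distribution
f(e_1, e_2, \dots, e_n) = \prod_{i=1}^{n} f(e_i)

% Normal errors with mean zero and variance \sigma^2
f(e_i) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\frac{e_i^2}{2\sigma^2}\right)

% Likelihood function
L = \prod_{i=1}^{n} \frac{1}{\sigma\sqrt{2\pi}}
    \exp\!\left(-\frac{(y_i - a - b x_i)^2}{2\sigma^2}\right)

% Log likelihood
\ln L = -\frac{n}{2}\ln(2\pi) - \frac{n}{2}\ln\sigma^2
        - \frac{1}{2\sigma^2}\sum_{i=1}^{n} (y_i - a - b x_i)^2

% First-order conditions: the OLS normal equations
\frac{\partial \ln L}{\partial a}
  = \frac{1}{\sigma^2}\sum_{i=1}^{n} (y_i - a - b x_i) = 0,
\qquad
\frac{\partial \ln L}{\partial b}
  = \frac{1}{\sigma^2}\sum_{i=1}^{n} x_i (y_i - a - b x_i) = 0
```

Because $a$ and $b$ enter $\ln L$ only through the sum of squared errors, maximizing $\ln L$ over them is the same as minimizing that sum, which is why the estimators coincide with ordinary least squares.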

46 Probit
• Example: expenditures on lottery as a % of household income
• lottery_i = a + b*income_i + e_i
• If lottery_i > 0, i.e. a + b*income_i + e_i > 0, then Bern_i, the yes-no indicator variable, is equal to one and e_i > -a - b*income_i
• This determines a threshold for observation i in the distribution of the error e_i
• assume

47-49 [Figure: density of the error e_i with the threshold for observation i marked. The area above the threshold is P_yes, the probability of playing the lottery for observation i; the remaining area is P_no for observation i.]

50 Probit
• Likelihood function for the observed sample
• Log likelihood:

51

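52 [Figure: density of the error e_i with the threshold for observation i marked. The area above the threshold is P_yes, the probability of playing the lottery for observation i; the remaining area is P_no for observation i.]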

53 Probit
• Substituting these expressions for P_no and P_yes in the ln Likelihood function gives the complete expression.
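The probit log likelihood can be sketched in code. The transcript does not preserve the distribution assumed for e_i or the likelihood expressions, so the following is an assumption: e_i is normal with mean zero and standard deviation sigma, giving P_yes = Phi((a + b*income_i)/sigma), P_no = 1 - P_yes, and ln L equal to the sum over observations of Bern_i*ln(P_yes) + (1 - Bern_i)*ln(P_no). The data and function names are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probit_log_likelihood(a, b, income, bern, sigma=1.0):
    """ln L = sum of Bern_i * ln(P_yes_i) + (1 - Bern_i) * ln(P_no_i).

    ASSUMPTION: e_i is normal with mean 0 and standard deviation sigma, so
    P_yes_i = P(e_i > -a - b*income_i) = Phi((a + b*income_i) / sigma).
    """
    total = 0.0
    for x_i, y_i in zip(income, bern):
        p_yes = normal_cdf((a + b * x_i) / sigma)
        p_no = 1.0 - p_yes
        total += math.log(p_yes) if y_i == 1 else math.log(p_no)
    return total


# HYPOTHETICAL data, invented for illustration only.
income = [20.0, 35.0, 50.0, 80.0]
bern = [1, 1, 0, 0]

print(probit_log_likelihood(0.0, 0.0, income, bern))
print(probit_log_likelihood(1.0, -0.03, income, bern))
```

In practice the parameters are chosen by numerical maximization of this function, and sigma is usually fixed at one because only the ratios a/sigma and b/sigma are identified from yes-no data.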

54 Probit
• Likelihood function for the observed sample
• Log likelihood: