Part 24: Hypothesis Tests 24-1/33 Statistics and Data Analysis Professor William Greene Stern School of Business IOMS Department Department of Economics.

Presentation transcript:


Part 24: Hypothesis Tests 24-2/33 Statistics and Data Analysis Part 24 – Hypothesis Tests

Part 24: Hypothesis Tests 24-3/33 Hypothesis Tests: (1) Hypothesis Tests in the Regression Model; (2) Tests of Independence of Random Variables

Part 24: Hypothesis Tests 24-4/33 Application: Monet Paintings. Does the size of the painting really explain the sale prices of Monet's paintings? Investigate: Compute the regression. Hypothesis: The slope is actually zero. Rejection region: Slope estimates that are very far from zero. The hypothesis that β = 0 is rejected.

Part 24: Hypothesis Tests 24-5/33 Regression Analysis. Investigate: Is the coefficient in a regression model really nonzero? Testing procedure: Model: y = α + βx + ε. Hypothesis: H0: β = 0. Rejection region: Least squares coefficient is far from zero. Test: α level for the test = 0.05 as usual. Compute t = b/StandardError. Reject H0 if |t| is above the critical value: 1.96 for a large sample, or the value from the t table for a small sample. Equivalently, reject H0 if the reported P value is less than the α level. Degrees of freedom for the t statistic: N-2.
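The testing procedure on this slide can be sketched in a few lines of plain Python. This is a minimal illustration, not the software used in the lecture: the function name `slope_t_test` and the data are hypothetical, and the formulas are the usual least squares ones for the model y = α + βx + ε.

```python
import math

def slope_t_test(x, y):
    """Fit y = a + b*x by least squares; return (b, se_b, t, df) for H0: beta = 0."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx                      # least squares slope
    a = mean_y - b * mean_x            # least squares intercept
    sse = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    s2 = sse / (n - 2)                 # residual variance, N-2 degrees of freedom
    se_b = math.sqrt(s2 / sxx)         # standard error of the slope
    return b, se_b, b / se_b, n - 2

# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.2, 1.9, 3.2, 3.8, 5.1, 6.1]
b, se_b, t, df = slope_t_test(x, y)
print(f"b = {b:.4f}, SE = {se_b:.4f}, t = {t:.2f}, df = {df}")
# With only 6 observations, the critical value should come from the t table
# with df = 4 rather than the large-sample 1.96.
```

The returned t is compared in absolute value with the critical value, because slope estimates far from zero in either direction count against H0.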

Part 24: Hypothesis Tests 24-6/33 An Equivalent Test. Is there a relationship? H0: No correlation. Rejection region: Large R². Test: F = (N-2)R²/(1 - R²). Reject H0 if F > 4. Math result: F = t². Degrees of freedom for the F statistic are 1 and N-2.
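The math result F = t² can be checked numerically. The sketch below assumes the usual simple-regression form F = (N-2)R²/(1 - R²); the function name `t_and_f` and the data are hypothetical.

```python
import math

def t_and_f(x, y):
    """Return the slope t statistic and the R-squared based F statistic."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx                       # least squares slope
    sse = syy - b * sxy                 # residual sum of squares
    r2 = 1.0 - sse / syy                # R-squared
    t = b / math.sqrt((sse / (n - 2)) / sxx)
    f = (n - 2) * r2 / (1.0 - r2)       # F with 1 and N-2 degrees of freedom
    return t, f

# Hypothetical data, for illustration only.
t_stat, f_stat = t_and_f([1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 7.8, 10.1])
print(f"t squared = {t_stat ** 2:.6f}, F = {f_stat:.6f}")
```

The rule of thumb "reject if F > 4" matches the t rule, since the large-sample critical value 1.96 squared is about 3.84.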

Part 24: Hypothesis Tests 24-7/33 Partial Effect. Hypothesis: If we include the signature effect, size does not explain the sale prices of Monet paintings. Test: Compute the multiple regression; then test H0: β1 = 0. α level for the test = 0.05 as usual. Rejection region: Large value of b1 (coefficient). Test based on t = b1/StandardError. [Regression output: ln (US$) versus ln (SurfaceArea), Signed. Predictors: Constant, ln (SurfaceArea), Signed. R-Sq = 46.2%, R-Sq(adj) = 46.0%.] Reject H0. Degrees of freedom for the t statistic: N-3 = N - number of predictors - 1.

Part 24: Hypothesis Tests 24-8/33 Testing the Regression. Degrees of freedom for the F statistic are K and N-K-1.

Part 24: Hypothesis Tests 24-9/33 n1 = number of predictors; n2 = sample size - number of predictors - 1.
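The formula on the "Testing the Regression" slide did not survive in the transcript. A minimal sketch, assuming the standard textbook form F = [R²/K] / [(1 - R²)/(N-K-1)] with the degrees of freedom stated above, looks like this. The function name `overall_f` and the sample size in the example are hypothetical.

```python
def overall_f(r2, k, n):
    """F statistic for H0: all K slope coefficients are zero.

    r2: R-squared of the regression
    k:  number of predictors (numerator degrees of freedom, n1)
    n:  sample size (denominator degrees of freedom n2 = n - k - 1)
    """
    n1 = k
    n2 = n - k - 1
    return (r2 / n1) / ((1.0 - r2) / n2)

# R-squared of 46.2% with 2 predictors, as in the Monet regression;
# the sample size of 100 is an assumption made only for this example.
print(f"F = {overall_f(0.462, 2, 100):.2f}")
```

The result is compared with the critical value from the F table with n1 and n2 degrees of freedom.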

Part 24: Hypothesis Tests 24-10/33 Cost Function Regression. The regression is significant. F is huge. Which variables are significant? Which variables are not significant?

Part 24: Hypothesis Tests 24-11/33 Application: Part of a Regression Model. The regression model includes variables x1, x2, … that I am sure of, and maybe variables z1, z2, … that I am not sure of. Model: y = α + β1 x1 + β2 x2 + δ1 z1 + δ2 z2 + ε. Hypothesis: δ1 = 0 and δ2 = 0. Strategy: Start with the model including x1 and x2 and compute R². Then compute the new model that also includes z1 and z2. Rejection region: R² increases a lot.

Part 24: Hypothesis Tests 24-12/33 Test Statistic

Part 24: Hypothesis Tests 24-13/33 Gasoline Market

Part 24: Hypothesis Tests 24-14/33 Gasoline Market. [Regression output: logG versus logIncome, logPG. Predictors: Constant, logIncome, logPG. R-Sq = 93.6%, R-Sq(adj) = 93.4%. Analysis of variance table with rows Regression, Residual Error, Total.] R² is the ratio of the regression sum of squares to the total sum of squares.

Part 24: Hypothesis Tests 24-15/33 Gasoline Market. [Regression output: logG versus logIncome, logPG, logPNC, logPUC, logPPT. Predictors: Constant, logIncome, logPG, logPNC, logPUC, logPPT. R-Sq = 96.0%, R-Sq(adj) = 95.6%. Analysis of variance table with rows Regression, Residual Error, Total.] Now, R² = 96.0%. Previously, R² = 93.6%.

Part 24: Hypothesis Tests 24-16/33 Improvement in R². [Inverse cumulative distribution function: F distribution with 3 DF in numerator and 46 DF in denominator, P(X <= x) = 0.95.] The null hypothesis is rejected. Notice that none of the three individual variables is significant, but the three of them together are.
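The test statistic itself is missing from the transcript. A sketch under the assumption that it is the standard F statistic for an increase in R², F = [(R²new - R²old)/J] / [(1 - R²new)/DF], where J is the number of added variables and DF is the denominator degrees of freedom, can be applied to the rounded figures reported above (93.6% and 96.0%, with 3 and 46 degrees of freedom). The function name `r2_improvement_f` is hypothetical, and because the R² values are rounded the result is only approximate.

```python
def r2_improvement_f(r2_old, r2_new, j, df_denominator):
    """F statistic for H0: the J added coefficients are all zero."""
    numerator = (r2_new - r2_old) / j
    denominator = (1.0 - r2_new) / df_denominator
    return numerator / denominator

# Rounded R-squared values from the two gasoline market regressions.
f_stat = r2_improvement_f(0.936, 0.960, 3, 46)
print(f"F = {f_stat:.2f}")  # compare with the 0.95 point of F(3, 46)
```

A large F here is consistent with the slide's conclusion that the three added price variables matter jointly even though none is individually significant.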

Part 24: Hypothesis Tests 24-17/33 Application. Health satisfaction depends on many factors: Age, Income, Children, Education, Marital Status. Do these factors figure differently in a model for women compared to one for men? Investigation: Multiple regression. Null hypothesis: The regressions are the same. Rejection region: Estimated regressions that are very different.

Part 24: Hypothesis Tests 24-18/33 Equal Regressions. Setting: Two groups of observations (men/women, countries, two different periods, firms, etc.). Regression model: y = α + β1 x1 + β2 x2 + … + ε. Hypothesis: The same model applies to both groups. Rejection region: Large values of F.

Part 24: Hypothesis Tests 24-19/33 Procedure: Equal Regressions. There are N1 observations in Group 1 and N2 in Group 2. There are K variables and the constant term in the model. This test requires you to compute three regressions and retain the sum of squared residuals from each:
SS1 = sum of squares from the N1 observations in group 1
SS2 = sum of squares from the N2 observations in group 2
SSALL = sum of squares from the NALL = N1+N2 observations when the two groups are pooled
The hypothesis of equal regressions is rejected if F is larger than the critical value from the F table (K numerator and NALL-2K-2 denominator degrees of freedom).
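The F formula for this procedure is not in the transcript. The sketch below assumes the usual form for this kind of test: the increase in the sum of squared residuals from pooling, divided by the numerator degrees of freedom, over the unpooled sum of squared residuals divided by NALL-2K-2. The numerator degrees of freedom are passed explicitly: the slide states K, while treatments that also count the constant term as restricted use K+1. The function name `equal_regressions_f` and all numbers in the example are hypothetical.

```python
def equal_regressions_f(ss1, ss2, ss_all, n_all, k, numerator_df):
    """F statistic for H0: the same regression applies to both groups.

    ss1, ss2:      sums of squared residuals from the two group regressions
    ss_all:        sum of squared residuals from the pooled regression
    n_all:         N1 + N2
    k:             number of variables (the constant is not counted)
    numerator_df:  K as stated on the slide (K + 1 if the constant is counted)
    """
    denominator_df = n_all - 2 * k - 2
    numerator = (ss_all - ss1 - ss2) / numerator_df
    denominator = (ss1 + ss2) / denominator_df
    return numerator / denominator

# Hypothetical sums of squares, for illustration only.
f_stat = equal_regressions_f(ss1=40.0, ss2=50.0, ss_all=100.0,
                             n_all=112, k=5, numerator_df=5)
print(f"F = {f_stat:.4f}")
```

Pooling can never reduce the sum of squared residuals below SS1 + SS2, so the numerator is nonnegative and large values of F point to different regressions.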

Part 24: Hypothesis Tests 24-20/33 Health Satisfaction Models: Men vs. Women. German survey data over 7 years, 1984 to 1991 (with a gap). 27,326 observations on Health Satisfaction and several covariates. [Regression output for three samples: Women (NW = 13083), Men (NM = 14243), Both (NALL = 27326). Columns: Variable, Coefficient, Standard Error, T, P value, Mean of X. Rows: Constant, AGE, EDUC, HHNINC, HHKIDS, MARRIED.]

Part 24: Hypothesis Tests 24-21/33 Computing the F Statistic. [Summary output with columns Women, Men, All. Rows: HEALTH mean, standard deviation, number of observations, number of parameters, degrees of freedom, residual sum of squares, standard error of e, R-squared, model test F (P value). Reported P values for the model test: (.000), (.000), (.0000).]

Part 24: Hypothesis Tests 24-22/33 A Test of Independence. In the credit card example, are Own/Rent and Accept/Reject independent? Hypothesis: Ownership and Acceptance are independent. Formal hypothesis, based only on the laws of probability: Prob(Own,Accept) = Prob(Own)Prob(Accept) (and likewise for the other three possibilities). Rejection region: Joint frequencies that do not look like the products of the marginal frequencies.

Part 24: Hypothesis Tests 24-23/33 A Contingency Table Analysis

Part 24: Hypothesis Tests 24-24/33 Independence Test. Step 2: Expected proportions assuming independence. If the factors are independent, then the joint proportions should equal the product of the marginal proportions. For each of the four cells, [Rent,Reject], [Rent,Accept], [Own,Reject], and [Own,Accept], the expected proportion is the row proportion times the column proportion.
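The cell values from this slide are missing, so the following sketch uses hypothetical counts to show the computation: expected counts are the products of the marginal proportions times the sample size, and the chi-squared statistic sums (actual - expected)²/expected over the cells. The function name `independence_chi_squared` is an assumption for illustration.

```python
def independence_chi_squared(table):
    """Return (expected counts, chi-squared statistic) for an R by C table."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    # Expected count = N * (row proportion) * (column proportion).
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi2 = sum(
        (table[i][j] - expected[i][j]) ** 2 / expected[i][j]
        for i in range(len(table))
        for j in range(len(table[0]))
    )
    return expected, chi2

# Hypothetical counts: rows = Rent, Own; columns = Reject, Accept.
counts = [[10, 20],
          [30, 40]]
expected, chi2 = independence_chi_squared(counts)
print(expected)
print(f"chi-squared = {chi2:.4f}")
```

The expected counts have the same row and column totals as the actual table, so the statistic measures only how the joint frequencies depart from the products of the marginals.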

Part 24: Hypothesis Tests 24-25/33 Comparing Actual to Expected

Part 24: Hypothesis Tests 24-26/33 When is Chi Squared Large? For a 2x2 table, the critical chi squared value for α = 0.05 is 3.84. (Not a coincidence: 3.84 = 1.96².) Our chi squared is large, so the hypothesis of independence between the acceptance decision and the own/rent status is rejected.
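The link between the chi-squared critical value for a 2x2 table and the normal critical value can be verified with the Python standard library. A chi-squared variable with 1 degree of freedom is the square of a standard normal variable, so its 0.95 point is the square of the normal 0.975 point. This is a small check, not part of the original lecture.

```python
from statistics import NormalDist

# Two-sided 5% critical value of the standard normal distribution.
z = NormalDist().inv_cdf(0.975)

# Square it to get the 5% critical value of chi-squared with 1 degree of freedom.
chi2_critical = z ** 2

print(f"z = {z:.4f}, z squared = {chi2_critical:.4f}")
```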

Part 24: Hypothesis Tests 24-27/33 Computing the Critical Value. Menu path: Calc > Probability Distributions > Chi-Square. For an R by C table, D.F. = (R-1)(C-1).

Part 24: Hypothesis Tests 24-28/33 Analyzing Default. Do renters default more often (at a different rate) than owners? To investigate, we study the cardholders (only). We have the raw observations in the data set. [Cross-tabulation of DEFAULT (0, 1) by OWNRENT (0, 1), with totals.]

Part 24: Hypothesis Tests 24-29/33 Hypothesis Test

Part 24: Hypothesis Tests 24-30/33 Treatment Effects in Clinical Trials. Does Phenogyrabluthefentanoel (Zorgrab) work? Investigate: Carry out a clinical trial. N+0 = the placebo effect. N+T - N+0 = the treatment effect. Is N+T > N+0 (significantly)?
                   Placebo   Drug Treatment
No Effect          N00       N0T
Positive Effect    N+0       N+T
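One way to ask whether N+T exceeds N+0 "significantly" is a one-sided test comparing the proportions of positive outcomes in the two arms. The sketch below uses a pooled two-proportion z statistic, which is a common choice but an assumption here: the transcript does not show which test the lecture applies. The function name `treatment_z` and the counts are hypothetical.

```python
import math
from statistics import NormalDist

def treatment_z(n00, n0t, n_pos_0, n_pos_t):
    """z statistic for H0: same positive-effect rate under placebo and drug."""
    n_placebo = n00 + n_pos_0           # placebo arm size
    n_drug = n0t + n_pos_t              # drug arm size
    p_placebo = n_pos_0 / n_placebo     # placebo positive rate
    p_drug = n_pos_t / n_drug           # drug positive rate
    p_pooled = (n_pos_0 + n_pos_t) / (n_placebo + n_drug)
    se = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n_placebo + 1 / n_drug))
    return (p_drug - p_placebo) / se

# Hypothetical trial: 20 of 100 improve on placebo, 40 of 100 on the drug.
z = treatment_z(n00=80, n0t=60, n_pos_0=20, n_pos_t=40)
critical = NormalDist().inv_cdf(0.95)   # one-sided 5% critical value
print(f"z = {z:.3f}, critical value = {critical:.3f}, reject H0: {z > critical}")
```

Comparing rates rather than raw counts keeps the comparison fair when the placebo and treatment arms differ in size.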

Part 24: Hypothesis Tests 24-31/33

Part 24: Hypothesis Tests 24-32/33 Confounding Effects

Part 24: Hypothesis Tests 24-33/33 What About Confounding Effects? [Table classifying subjects as Normal Weight or Obese and as Nonsmoker or Smoker.] Age and Sex are usually relevant as well. How can all these factors be accounted for at the same time?