Linear Regression Inference


AP Statistics: Linear Regression Inference

Hypothesis Tests: Slopes
Given: Observed slope relating Education to Job Prestige = 2.47
Question: Can we generalize this to the population of all Americans? How likely is it that this observed slope was actually drawn from a population with slope = 0?
Solution: Conduct a hypothesis test
Notation: sample slope = b, population slope = β
H0: Population slope β = 0
H1: Population slope β ≠ 0 (two-tailed test)

Review: Slope Hypothesis Tests
What information lets us do a hypothesis test?
Answer: Estimates of a slope (b) have a sampling distribution, like any other statistic.
It is the distribution of the slope values obtained from all possible samples (of size N).
If certain assumptions are met, the sampling distribution approximates the t-distribution.
Thus, we can assess the probability that a given value of b would be observed if β = 0.
If that probability is low (below alpha), we reject H0.

Review: Slope Hypothesis Tests
Visually: If the population slope (β) is zero, then the sampling distribution would center at zero.
Since the sampling distribution is a probability distribution, we can identify the likely values of b if the population slope is zero.
If β = 0, observed slopes should commonly fall near zero, too.
[Figure: sampling distribution of the slope b, centered at 0]
If the observed slope falls very far from 0, it is improbable that β is really equal to zero. Thus, we can reject H0.
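The picture on this slide can be approximated by simulation. The sketch below is only an illustration, not part of the original slides: it draws many samples from a made-up population in which Y is unrelated to X (so β = 0), fits a least-squares slope to each, and looks at how the slopes spread around zero. The population values (X uniform on 0 to 20, Y with mean 40 and SD 10), the sample size and the function names are all assumptions chosen for the example.

```python
import random
import statistics


def ols_slope(xs, ys):
    """Least-squares slope: b = sum((x - x_bar)(y - y_bar)) / sum((x - x_bar)^2)."""
    x_bar = statistics.fmean(xs)
    y_bar = statistics.fmean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


def simulate_null_slopes(n=50, reps=2000, seed=0):
    """Slopes from `reps` samples of size n, drawn from a population with beta = 0."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(reps):
        xs = [rng.uniform(0, 20) for _ in range(n)]      # hypothetical X values
        ys = [40 + rng.gauss(0, 10) for _ in range(n)]   # Y does not depend on X
        slopes.append(ols_slope(xs, ys))
    return slopes


slopes = simulate_null_slopes()
mean_slope = statistics.fmean(slopes)    # center of the simulated sampling distribution
sd_slope = statistics.stdev(slopes)      # its spread, i.e. an empirical standard error
# Share of simulated slopes at least as far from 0 as the observed 2.47
share_extreme = sum(abs(b) >= 2.47 for b in slopes) / len(slopes)

print(f"mean of slopes: {mean_slope:.3f}")
print(f"SD of slopes:   {sd_slope:.3f}")
print(f"share with |b| >= 2.47: {share_extreme}")
```

In this made-up population the simulated slopes cluster tightly around zero, so a slope as large as 2.47 essentially never occurs by chance. With real data the spread depends on the actual error variance and the spread of X, which is what the standard error of the slope measures.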

Bivariate Regression Assumptions
Assumptions for bivariate regression hypothesis tests:
1. Random sample
Ideally N > 20, but different rules of thumb exist (10, 30, etc.)
2. Variables are linearly related
i.e., the mean of Y changes linearly with X
Check the scatter plot for a general linear trend
Watch out for non-linear relationships (e.g., U-shaped)

Bivariate Regression Assumptions
3. Y is normally distributed at every value of X in the population
"Conditional normality"
Ex: X = Years of Education, Y = Job Prestige
Suppose we look only at a sub-sample: X = 12 years of education. Is a histogram of Job Prestige approximately normal?
What about for people with X = 4? X = 16?
If all are roughly normal, the assumption is met.

Bivariate Regression Assumptions
Examine sub-samples at different values of X. Make histograms and check for normality.
[Figure: two example histograms, one labeled "Good" (approximately normal) and one labeled "Not very good"]

Bivariate Regression Assumptions
4. The variances of prediction errors are identical at different values of X
Recall: Error is the deviation from the regression line
Is the dispersion of error consistent across values of X?
Definition: "homoskedasticity" = error dispersion is consistent across values of X
Opposite: "heteroskedasticity" = error dispersion varies with X
Test: Compare errors for X = 12 years of education with errors for X = 2, X = 8, etc. Are the errors around the line similar, or different?

Bivariate Regression Assumptions
Homoskedasticity: Equal Error Variance
Examine error at different values of X. Is it roughly equal?
[Figure: scatter plot with regression line; spread of points around the line is similar across X]
Here, things look pretty good.

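Bivariate Regression Assumptions
Heteroskedasticity: Unequal Error Variance
[Figure: scatter plot with regression line; spread of points around the line widens as X increases]
At higher values of X, error variance increases a lot. This looks pretty bad.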
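The visual check on these two slides can be complemented with a rough numeric one. The sketch below is an informal comparison, not a formal test and not something the slides prescribe: it fits a line, splits the residuals at a cut-off on X, and compares their standard deviations. The two data sets are simulated, and the cut-off, error sizes and function names are assumptions made for illustration.

```python
import random
import statistics


def fit_line(xs, ys):
    """Return (intercept a, slope b) of the least-squares line."""
    x_bar = statistics.fmean(xs)
    y_bar = statistics.fmean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    return y_bar - b * x_bar, b


def residual_sd_ratio(xs, ys, cut):
    """SD of residuals for X >= cut divided by SD of residuals for X < cut."""
    a, b = fit_line(xs, ys)
    low = [y - (a + b * x) for x, y in zip(xs, ys) if x < cut]
    high = [y - (a + b * x) for x, y in zip(xs, ys) if x >= cut]
    return statistics.stdev(high) / statistics.stdev(low)


rng = random.Random(1)
xs = [rng.uniform(0, 20) for _ in range(400)]

# Simulated homoskedastic data: error SD is 5 at every X
ys_equal = [10 + 2 * x + rng.gauss(0, 5) for x in xs]
# Simulated heteroskedastic data: error SD grows with X
ys_fan = [10 + 2 * x + rng.gauss(0, 0.5 + 0.5 * x) for x in xs]

ratio_equal = residual_sd_ratio(xs, ys_equal, cut=10)
ratio_fan = residual_sd_ratio(xs, ys_fan, cut=10)

print(f"equal-variance data: ratio = {ratio_equal:.2f}")
print(f"fan-shaped data:     ratio = {ratio_fan:.2f}")
```

A ratio near 1 is consistent with homoskedasticity; a ratio well above or below 1 suggests the error spread changes with X. How far from 1 counts as a problem is a judgment call, which is why a residual plot remains the first thing to look at.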

Bivariate Regression Assumptions
Notes/Comments:
1. Overall, regression is robust to violations of assumptions
It often gives fairly reasonable results, even when assumptions aren't perfectly met
2. Variations of regression can handle situations where assumptions aren't met
3. But there are also further diagnostics to help ensure that results are meaningful

Regression Hypothesis Tests
If assumptions are met, the sampling distribution of the slope (b) approximates a t-distribution.
The standard deviation of the sampling distribution is called the standard error of the slope ($\sigma_b$).
Population formula of the standard error:
$$\sigma_b = \sqrt{\frac{\sigma_e^2}{\sum_{i=1}^{N} (X_i - \bar{X})^2}}$$
where $\sigma_e^2$ is the variance of the regression error.

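Regression Hypothesis Tests
Estimating the error variance $\sigma_e^2$ from the sample lets us estimate the standard error:
$$s_e^2 = \frac{\sum_{i=1}^{N} e_i^2}{N - 2} = \frac{SS_{error}}{N - 2}$$
Now we can estimate the S.E. of the slope:
$$s_b = \sqrt{\frac{s_e^2}{\sum_{i=1}^{N} (X_i - \bar{X})^2}}$$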

Regression Hypothesis Tests
Finally: A t-value can be calculated. It is the slope divided by the standard error:
$$t = \frac{b}{s_b}$$
where $s_b$ is the sample point estimate of the standard error.
The t-value is based on N-2 degrees of freedom.
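The chain of calculations on the last three slides (error variance, standard error of the slope, t-value) can be sketched in a few lines. This is a minimal illustration using the standard least-squares formulas; the five data points are a made-up toy example, not data from the slides, and the function name is hypothetical.

```python
import math
import statistics


def slope_t_test(xs, ys):
    """Return (b, s_b, t, df) for the test of H0: population slope = 0."""
    n = len(xs)
    x_bar = statistics.fmean(xs)
    y_bar = statistics.fmean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

    b = sxy / sxx                 # sample slope
    a = y_bar - b * x_bar         # sample intercept

    # Sum of squared prediction errors around the regression line
    ss_error = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    s_e2 = ss_error / (n - 2)     # estimated error variance
    s_b = math.sqrt(s_e2 / sxx)   # estimated standard error of the slope
    t = b / s_b                   # t-value with n - 2 degrees of freedom
    return b, s_b, t, n - 2


# Toy data (made up for illustration)
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b, s_b, t, df = slope_t_test(xs, ys)
print(f"b = {b:.3f}, s_b = {s_b:.3f}, t = {t:.3f}, df = {df}")
```

For this toy sample the slope is 0.6 with a standard error of about 0.283, giving t of about 2.12 on 3 degrees of freedom. The two-tailed 5% critical value for 3 d.f. is 3.182, so with only five points this slope would not lead to rejecting H0.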

Regression Confidence Intervals
You can also use the standard error of the slope to estimate confidence intervals:
$$\text{C.I.} = b \pm t_{N-2} \, s_b$$
where $t_{N-2}$ is the t-value for a two-tailed test given a desired α-level.
Example: Observed slope = 2.5, S.E. = .10
95% t-value for 102 d.f. is approximately 2
95% C.I. = 2.5 +/- 2(.10)
Confidence Interval: 2.3 to 2.7
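The arithmetic in this example is easy to reproduce. The helper below is a small sketch (the function name is made up) that takes the slope, its standard error and a critical t-value that the caller looks up separately; it uses the slide's rounded value of 2 for 102 degrees of freedom.

```python
def slope_confidence_interval(b, s_b, t_crit):
    """Confidence interval for the slope: b +/- t_crit * s_b."""
    margin = t_crit * s_b
    return b - margin, b + margin


# Values from the slide: slope 2.5, S.E. 0.10, t approximately 2
low, high = slope_confidence_interval(b=2.5, s_b=0.10, t_crit=2.0)
print(f"95% C.I.: {low:.1f} to {high:.1f}")

# If the interval excludes 0, the two-tailed test at the same
# alpha level would reject H0: population slope = 0.
excludes_zero = not (low <= 0 <= high)
print(f"interval excludes 0: {excludes_zero}")
```

Because the interval from 2.3 to 2.7 lies entirely above zero, it leads to the same conclusion as a two-tailed hypothesis test at the 5% level: the population slope is unlikely to be zero.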