Announcements
Homework 10:
– Due next Thursday (4/25)
– The assignment will be on the web by tomorrow night.

[Figure: scatterplot of Burn Time versus Fabric, with each fabric's points circled by an oval. The vertical spread of the data points within each oval is one type of variability; the vertical spread of the ovals themselves is another type of variability.]

Suppose there are k treatments and n data points. ANOVA table:

Source of Variation   df    Sum of Squares   Mean Square        F         P
Treatment             k-1   SST              MST = SST/(k-1)    MST/MSE   p-value
Error                 n-k   SSE              MSE = SSE/(n-k)
Total                 n-1   total SS

– MST is the estimate of the "across fabric type" variability; MSE is the estimate of the "within fabric type" variability.
– A "sum of squares" is what goes into the numerator of s²: (X1 − X̄)² + … + (Xn − X̄)².
– P is the p-value for the test that all means are equal (reject if it is less than α).

One-way ANOVA: Burn Time versus Fabric

Analysis of Variance for Burn Time
Source   DF      SS      MS      F      P
Fabric    3  109.95   36.65  27.15  0.000
Error    12   16.20    1.35
Total    15  126.15

Explaining why ANOVA is an analysis of variance:
– MST = 109.95 / 3 = 36.65. Sqrt(MST) describes the standard deviation among the fabrics.
– MSE = 16.20 / 12 = 1.35. Sqrt(MSE) describes the standard deviation of burn time within each fabric type. (MSE is an estimate of the variance of each burn time.)
– F = MST / MSE = 27.15. It makes sense that this is large, and that the p-value = Pr(F(4−1, 16−4) > 27.15) ≈ 0 is small, because the variance "among treatments" is much larger than the variance within the units that get each treatment. (Note that the F test assumes the burn times are independent and normal with the same variance.)

This is the test of H0: μ1 = μ2 = μ3 = μ4.
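The arithmetic behind this table can be sketched in a few lines of Python. The burn-time data themselves are not reproduced in these slides, so the groups below are hypothetical:

```python
# Minimal sketch of the one-way ANOVA arithmetic (hypothetical groups,
# not the actual burn-time data).

def one_way_anova(groups):
    """Return (SST, SSE, MST, MSE, F) for a list of treatment groups."""
    n = sum(len(g) for g in groups)          # total number of data points
    k = len(groups)                          # number of treatments
    grand_mean = sum(sum(g) for g in groups) / n
    group_means = [sum(g) / len(g) for g in groups]
    # "Across treatments": spread of the group means around the grand mean
    sst = sum(len(g) * (m - grand_mean) ** 2
              for g, m in zip(groups, group_means))
    # "Within treatments": spread of each point around its own group mean
    sse = sum((x - m) ** 2
              for g, m in zip(groups, group_means) for x in g)
    mst = sst / (k - 1)
    mse = sse / (n - k)
    return sst, sse, mst, mse, mst / mse

sst, sse, mst, mse, f = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
```

Note that SST + SSE equals the total sum of squares around the grand mean, which is exactly the decomposition the ANOVA table records.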

It turns out that ANOVA is a special case of regression; we'll come back to that in a class or two. First, let's learn about regression (chapters 12 and 13).

Simple linear regression example: Ingrid is a small business owner who wants to buy a fleet of Mitsubishi Sigmas. To save money she decides to buy second-hand cars, and she wants to estimate how much to pay. To do this, she asks one of her employees to collect data on how much people have paid for these cars recently. (From Matt Wand)

[Regression plot: Price ($) versus Age (years). Data: each point is a car.]

The plot suggests a simple model:

Price of car = intercept + slope × car's age + error,

or y_i = β0 + β1 x_i + ε_i, i = 1, …, 39. We estimate β0 and β1.

Outline for regression:
1. Estimating the regression parameters, and ANOVA tables for regression
2. Testing and confidence intervals
3. Multiple regression models & ANOVA
4. Regression diagnostics

The plot suggests a model:

Price of car = intercept + slope × car's age + error,

or y_i = β0 + β1 x_i + ε_i, i = 1, …, 39.

We estimate β0 and β1 with b0 and b1, found by "least squares". In other words, find b0 and b1 to minimize the sum of squared errors:

SSE = {y1 − (b0 + b1 x1)}² + … + {yn − (b0 + b1 xn)}²

See the green line on the next page. Each term is the squared difference between an observed y and the regression line (b0 + b1 x_i).
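The minimization above has a closed-form solution: b1 = Sxy/Sxx, with b0 chosen so the line passes through (x̄, ȳ). A minimal sketch, using hypothetical data rather than the 39 car prices:

```python
# Sketch of the closed-form least-squares estimates b0 and b1
# (hypothetical data, not the car-price data).

def least_squares(x, y):
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    # Sxx: spread of x; Sxy: co-movement of x and y
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx               # slope
    b0 = ybar - b1 * xbar        # intercept: line passes through (xbar, ybar)
    return b0, b1

b0, b1 = least_squares([1, 2, 3], [2, 4, 6])
```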

[Regression plot: Price versus Age with the fitted line Price = b0 + b1·Age; R-Sq = 43.8%, R-Sq(adj) = 42.2%. A vertical green line drawn from one data point to the fitted line has length y_i − b0 − b1 x_i for some i; its squared length contributes one term to the Sum of Squared Errors (SSE).]

[Regression plot: Price ($) versus Age (years); R-Sq = 43.8%, R-Sq(adj) = 42.2%. General model: Price = β0 + β1·Age + error. Fitted model: Price = b0 + b1·Age.]

Do Minitab example.

The regression parameter estimates b0 and b1 minimize

SSE = {y1 − (b0 + b1 x1)}² + … + {yn − (b0 + b1 xn)}²

The full model is y_i = β0 + β1 x_i + ε_i. Suppose the errors (the ε_i's) are independent N(0, σ²). What do you think a good estimate of σ² is?

MSE = SSE/(n − 2) is an estimate of σ². Note how SSE looks like the numerator in s².
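As a sketch, MSE can be computed directly from the fitted line's residuals. The data and coefficients below are hypothetical, not the fitted car-price model:

```python
# Sketch: estimating sigma^2 by MSE = SSE / (n - 2)
# (hypothetical data and coefficients).

def mse_of_fit(x, y, b0, b1):
    # Residuals: vertical distances from each point to the fitted line
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    sse = sum(r * r for r in residuals)
    # Divide by n - 2 because two parameters (b0 and b1) were estimated
    return sse / (len(x) - 2)

mse = mse_of_fit([1, 2, 3, 4], [2, 4, 6, 9], -0.5, 2.3)
```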

(I divided price by $1000. Think about why this doesn't matter.)

Source           DF   SS   MS   F   P
Regression
Residual Error
Total

Sum of Squares Total = {y1 − mean(y)}² + … + {y39 − mean(y)}²
Sum of Squared Errors = {y1 − (b0 + b1 x1)}² + … + {y39 − (b0 + b1 x39)}²
Sum of Squares for Regression = SSTotal − SSE

What do these mean?

[Regression plot: Price versus Age showing both the fitted regression line and a horizontal line at the overall mean price of $3,656; R-Sq = 43.8%, R-Sq(adj) = 42.2%.]

(I divided price by $1000. Think about why this doesn't really matter.)

Source           DF
Regression       1 = p − 1
Residual Error   37 = n − p
Total            38 = n − 1

Here p is the number of regression parameters (2 for now).

– SSTotal = {y1 − mean(y)}² + … + {y39 − mean(y)}². SSTotal / 38 is an estimate of the variance around the overall mean (i.e., the variance in the data without doing regression).
– SSE = {y1 − (b0 + b1 x1)}² + … + {y39 − (b0 + b1 x39)}². MSE = SSE / 37 is an estimate of the variance around the line (i.e., the variance that is not explained by the regression).
– SSR = SSTotal − SSE. MSR = SSR / 1 is the variance in the data that is "explained by the regression."

(I divided price by $1000. Think about why this doesn't really matter.)

Source           DF
Regression       1 = p − 1
Residual Error   37 = n − p
Total            38 = n − 1

Here p is the number of regression parameters.

The F statistic gives a test of H0: β1 = 0 versus HA: β1 ≠ 0. Reject if the variance explained by the regression is high compared to the unexplained variability in the data; that is, reject if F = MSR / MSE is large. The p-value is Pr(F(p−1, n−p) > MSR / MSE); reject H0 for any α less than the p-value. (This assumes the errors are independent and normal.)
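The decomposition and F statistic above can be sketched as follows, again with hypothetical data and coefficients rather than the car-price data:

```python
# Sketch of the regression ANOVA decomposition SSTotal = SSR + SSE
# and the F statistic MSR / MSE (hypothetical data).

def regression_anova(x, y, b0, b1):
    n = len(y)
    ybar = sum(y) / n
    ss_total = sum((yi - ybar) ** 2 for yi in y)                    # around the mean
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))   # around the line
    ssr = ss_total - sse                                            # "explained" part
    msr = ssr / 1            # df = p - 1 = 1 for simple regression
    mse = sse / (n - 2)      # df = n - p
    return ssr, sse, ss_total, msr / mse

ssr, sse, ss_total, f = regression_anova([1, 2, 3, 4], [2, 4, 6, 9], -0.5, 2.3)
```

A large F means the line explains far more variation than it leaves behind, which is exactly the rejection rule described above.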

R²

Another summary of a regression is:

R² = Sum of Squares for Regression / Sum of Squares Total

with 0 ≤ R² ≤ 1. This is the proportion of the variation in the data that is described by the regression.
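As a one-line sketch, with hypothetical sums of squares (an R-Sq of 43.8% corresponds to a ratio of 0.438):

```python
# Sketch of R^2 as a ratio of sums of squares (hypothetical inputs).

def r_squared(ss_regression, ss_total):
    # Proportion of the total variation described by the regression; in [0, 1]
    return ss_regression / ss_total

r2 = r_squared(43.8, 100.0)
```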

Two different ways to assess the "worth" of a regression. Intuitively:
1. Absolute size of the slope: bigger = better
2. Size of the error variance: smaller = better
In terms of the ANOVA summaries:
1. R² close to one
2. A large F statistic