Presentation transcript:

1 More Regression Information

2 [Excel regression output for the pizza sales example, discussed on the next slide.]

3 On the previous slide I have an Excel regression output. The example is the pizza sales we saw before. The first thing I look at is the coefficients. See that cell B28 has the word coefficients. We take the information below it and write the equation as y hat = 60 + 5x. This is the estimated regression equation. The intercept is 60 and the slope is 5. Remember x = population of students and y = sales.
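As a side note, if you would rather work with the estimated equation in code than in Excel, here is a small Python sketch. It uses only the intercept and slope from the output; the student population values in it are made up just for illustration.

# Using the estimated equation y hat = 60 + 5x from the Excel output
# to predict sales for a few hypothetical student populations.
b0, b1 = 60, 5  # intercept and slope from the regression output

def predict_sales(x_students):
    # Return predicted sales (y hat) for a student population x.
    return b0 + b1 * x_students

for x in [2, 10, 26]:  # made-up student populations (in thousands)
    print(f"x = {x}  ->  predicted sales = {predict_sales(x)}")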

4 Hypothesis test about the population slope B1. Remember, we have taken a sample of data and used it to estimate the unknown population regression. Our real point in a study like this is to see whether a relationship exists between the two variables in the population. If the slope is not zero in the population, then the x variable has an influence on the outcome of y. Now, in a sample, the estimated slope may or may not be zero, but the sample provides a basis for a test of whether the true unknown population slope is zero. For the test we will use the t distribution.

5 The t-distribution. At this stage of the game I am going to have you accept some of the following without much proof. The t-distribution is like the normal except for two notable features. 1) t-distributions tend to be wider (show more variability) than the z distribution. 2) The t-distribution does not have one standard form like the normal distribution; each t-distribution is unique, based on its degrees of freedom. Admittedly, degrees of freedom is a term without much meaning to you yet, but in the context of simple regression it equals the sample size minus 2.
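If you want to see this "wider" idea in numbers, here is a small Python sketch (it assumes the scipy library is installed). It compares the upper tail area beyond 1.96 for the standard normal and for a t-distribution with 8 degrees of freedom.

# Tail area beyond 1.96: about .025 for the normal, but noticeably more
# for a t-distribution with 8 degrees of freedom (fatter tails).
from scipy.stats import norm, t

print(norm.sf(1.96))      # roughly 0.025
print(t.sf(1.96, df=8))   # larger than 0.025, because the t is wider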

6 Many books have t-tables, or you could do a Google search. Go to the column for an upper tail area of .025. If you run down the column with your finger you will notice at the bottom the number 1.96. So, when the degrees of freedom is really large, the t is like the z. But with other degrees of freedom on the t-distribution, you have to go out farther than 1.96 to get to .025 in the upper tail. This is what I mean by t-distributions being wider. The t-values in this table are critical values for tests of hypotheses. Back to our hypothesis test about the slope. The null hypothesis is that B1 = 0, and the alternative is that B1 is not equal to zero. Since the alternative is not equal to zero, we have a two-tailed test. Our example has a sample size of 10, so the degrees of freedom is 8. A level of significance of .05 means we want .025 on each side for a two-tailed test. From the t-table the critical t is 2.306.
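Instead of a printed t-table, you could also let software look up the critical value. Here is a sketch of that in Python, again assuming scipy is available, with the numbers from our example.

# Critical t value for a two-tailed test at the .05 level with n = 10.
from scipy.stats import t

n = 10
df = n - 2                         # degrees of freedom in simple regression
t_crit = t.ppf(1 - 0.05 / 2, df)   # upper tail area of .025
print(round(t_crit, 3))            # about 2.306

# With a huge number of degrees of freedom the t approaches the z value.
print(round(t.ppf(0.975, 100000), 3))   # about 1.96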

7 Back on the computer output we see the calculated t in cell D30. The t stat from the sample is the slope divided by its standard error: 5/.5803, or about 8.62. Since this is bigger than the critical t of 2.306, we reject the null and conclude the slope is not zero in the population. Thus, in the population of all company stores, sales are influenced by the population of students in the college towns. Excel also prints the p-value for the test. For the slope we have 2.55E-05. E notation of the form E-05 means move the decimal in the number 5 places to the left, so our p-value is .0000255. This is a two-tailed p-value. Since it is less than .05, it gives us an alternative way to reject the null hypothesis, one that can be used without looking at the t-table.
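Here is a Python sketch (assuming scipy) that reproduces the test statistic and the two-tailed p-value Excel reports, using the slope and standard error from the output.

# t stat = slope / standard error, and the two-tailed p-value.
from scipy.stats import t

b1 = 5.0        # estimated slope from the output
se_b1 = 0.5803  # standard error of the slope from the output
df = 10 - 2

t_stat = b1 / se_b1                  # about 8.62
p_value = 2 * t.sf(abs(t_stat), df)  # about 2.55E-05
print(t_stat, p_value)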

8 In cells F30 and G30 you have the 95% confidence interval for the slope. The interval is (3.6619, 6.3381). So you can be 95% confident the true unknown population slope is in this interval. A few slides back I wrote, "From the t-table the critical t is 2.306." The margin of error in the confidence interval is the critical value times the standard error: (2.306)(.5803) ≈ 1.338, so the interval for the slope runs from about 5 – 1.338 to 5 + 1.338. In cell B17 you see the R square value. Sometimes this is called r2, and its real name is the coefficient of determination. The coefficient of determination is a statistic used to see how well the data points "hug" the regression line. The value can be anywhere from 0 to 1. If all the data points actually touch the line then R square would be 1. If the value is 0 the points are not close to the line at all.
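The confidence interval itself is just the estimate plus or minus the margin of error, so it is easy to rebuild in code. A sketch (assuming scipy), with the numbers from the output:

# 95% confidence interval for the slope: estimate +/- t_crit * standard error.
from scipy.stats import t

b1, se_b1, df = 5.0, 0.5803, 8
t_crit = t.ppf(0.975, df)        # about 2.306

margin = t_crit * se_b1          # about 1.338
print(b1 - margin, b1 + margin)  # roughly 3.662 to 6.338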

9 The square root of the coefficient of determination is the correlation coefficient (called r), with the same sign as the slope. Remember, the correlation coefficient was an indicator of the direction and strength of the relationship between two variables. The correlation coefficient could be anywhere from minus 1 to 1. Negative values meant a negative relationship and positive values meant a positive relationship. There we said the closer to 1 or minus 1, the stronger the relationship. If R square = 1, r = 1 and the relationship is as strong as you can get. If R square = .9, r ≈ .95 and you still have a pretty strong relationship. If R square = .5, r ≈ .71 and you would still be in the strong relationship neighborhood.
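The link between R square and r is simple enough to check in a line or two of Python. In the sketch below, the R square values are just the illustrative ones from this slide, and the sign is taken from the slope.

# r is the square root of R square, signed to match the slope's direction.
import math

def corr_from_r2(r_squared, slope):
    # Return r given R square, with the same sign as the slope.
    return math.copysign(math.sqrt(r_squared), slope)

for r2 in (1.0, 0.9, 0.5):
    print(r2, round(corr_from_r2(r2, slope=5), 2))   # 1.0, 0.95, 0.71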

10 Well, in this section I have tried to go over some of the basic regression ideas. The point, again, is that we are studying two variables together and trying to establish whether the two variables are related or not. Why should we care if two variables are related? For a person in business, it might help the bottom line. For example, say it can be established that the size of the advertising budget has an impact on sales. This could help us determine the right size of budget. I have a claim that one day I will try to back up by using regression: I claim that recycling of paper leaves states in the country with fewer trees. Each state probably recycles a different number of pounds of paper and has a certain amount of tree population growth or destruction. With tree population as the dependent variable, I would expect the slope coefficient on pounds of paper recycled to be negative. In other words, the more recycling, the fewer trees. (It's an econ story, but anyway.) Regression can be used in social policy analysis. Anyway, that's all for now.