1
Simple Linear Regression
Chapter 9: Simple Linear Regression
9.1 Simple Linear Regression
9.2 Scatter Diagram
9.3 Graphical Method for Determining Regression
9.4 Least Square Method
9.5 Correlation Coefficient and Coefficient of Determination
9.6 Test of Significance
2
9.0 Introduction to Regression
Regression is a statistical procedure for establishing the relationship between two or more variables. In linear regression, this is done by fitting a linear equation to the observed data. The researcher then uses the regression line to see the trend in the data and to predict values. There are two types of relationship:
Simple (2 variables)
Multiple (more than 2 variables)
3
9.1 Simple Linear Regression
It involves analysis of the relationship between two variables (one independent variable and one dependent variable). Its model uses an equation that describes a dependent variable (Y) in terms of an independent variable (X) plus a random error ε:
Y = β0 + β1X + ε
where
β0 = intercept of the line with the Y-axis
β1 = slope of the line
ε = random error
The random error is the difference between a data point and its deterministic value, β0 + β1X. The regression line is estimated from the collected data by fitting a straight line to the data set and obtaining the equation of that line.
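As a rough illustration of the model Y = β0 + β1X + ε, the sketch below generates data points from a chosen line and adds a random error to each one. The function name `simulate_points`, the parameter values, and the choice of normally distributed errors are assumptions made for this example only.

```python
import random


def simulate_points(beta0, beta1, xs, sigma, seed=0):
    """Return (x, y, eps) triples with y = beta0 + beta1 * x + eps."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    points = []
    for x in xs:
        eps = rng.gauss(0.0, sigma)  # random error (assumed normal here)
        y = beta0 + beta1 * x + eps  # deterministic part plus random error
        points.append((x, y, eps))
    return points


# Hypothetical parameters: intercept 2.0, slope 0.5, error spread 1.0
points = simulate_points(2.0, 0.5, [1, 2, 3, 4, 5], sigma=1.0)
for x, y, eps in points:
    deterministic = 2.0 + 0.5 * x
    print(f"x={x}  deterministic={deterministic:.2f}  error={eps:+.2f}  y={y:.2f}")
```

Each printed y differs from its deterministic value by exactly the random error, which is the quantity the regression line cannot explain.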
4
9.1 Simple Linear Regression
Examples of independent and dependent variables:
1) A nutritionist studying weight loss programs might want to find out whether reducing carbohydrate intake can help a person lose weight. X is the carbohydrate intake (independent variable). Y is the weight (dependent variable).
2) An entrepreneur might want to know whether increasing the cost of packaging his new product will have an effect on the sales volume. X is the cost (independent variable). Y is the sales volume (dependent variable).
5
9.2 Scatter Diagram
A scatter plot is a graph of ordered pairs (x, y).
The purpose of a scatter plot is to describe visually the nature of the relationship between the independent variable, X, and the dependent variable, Y. The independent variable, x, is plotted on the horizontal axis and the dependent variable, y, is plotted on the vertical axis.
6
9.2 Scatter Diagram
A linear regression line can be developed from a freehand plot of the data.
Example: The given table contains values for 2 variables, X and Y. Plot the given data and draw a freehand estimated regression line.
X -3 -2 -1 1 2 3
Y 5 8 11 12
9
9.4 Least Square Method
The method produces the straight line that minimizes the sum of squared differences between the points and the line; that is, it determines the values of β0 and β1 that give the best fit of the estimated regression line to the sample data points. It involves some calculation. However, as we are using SPSS output, no manual calculation is necessary.
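For readers who want to see the kind of calculation that statistical software carries out, here is a minimal sketch of the standard least squares formulas: slope = Sxy / Sxx and intercept = (mean of y) - slope × (mean of x). The data and the function names `least_squares` and `predict` are hypothetical, chosen only for illustration.

```python
def least_squares(xs, ys):
    """Return (b0, b1), the intercept and slope of the least squares line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of cross-products and sum of squares about the means
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b1 = s_xy / s_xx           # estimated slope
    b0 = y_bar - b1 * x_bar    # estimated intercept
    return b0, b1


def predict(b0, b1, x):
    """Predicted value of y at x on the fitted line."""
    return b0 + b1 * x


# Hypothetical data, not taken from the chapter
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b0, b1 = least_squares(xs, ys)
print(f"fitted line: y = {b0:.1f} + {b1:.1f}x")            # y = 2.2 + 0.6x
print(f"prediction at x = 6: {predict(b0, b1, 6):.1f}")    # 5.8
```

Once b0 and b1 are known, a prediction is obtained by substituting the chosen x into the fitted equation.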
11
9.4 Least Square Method
Conducting simple linear regression through SPSS.
The data below represent scores obtained by ten primary school students before and after they were taken on a tour of the museum (which is supposed to increase their interest in history).
1) Fit a linear regression model with “before” as the explanatory variable and “after” as the dependent variable.
2) Predict the score a student would obtain “after” if he scored 60 marks “before”.
Before, x 65 63 76 46 68 72 57 36 96
After, y 66 86 48 71 42 87
12
9.4 Least Square Method
Analysis (SPSS steps 1, 2, 3)
13
9.4 Least Square Method
Output
If a student scored 60 marks in “before”, he would obtain in “after”
14
9.5 Correlation Coefficient and Coefficient of Determination
Correlation measures the strength of the linear relationship between two variables. The measure is also known as Pearson’s product moment coefficient of correlation. The symbol for the sample coefficient of correlation is r, and its value lies between -1 and 1. Values of r:
close to 1: strong positive linear relationship between x and y.
close to -1: strong negative linear relationship between x and y.
close to 0: little or no linear relationship between x and y.
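The sample correlation coefficient can be computed as r = Sxy / sqrt(Sxx × Syy). The sketch below applies that formula to small hypothetical data sets; the function name `pearson_r` and all data values are illustrative assumptions.

```python
import math


def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical data sets
r_up = pearson_r([1, 2, 3], [2, 4, 6])                   # points on a rising line
r_down = pearson_r([1, 2, 3], [6, 4, 2])                 # points on a falling line
r_mixed = pearson_r([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])    # scattered points

print(f"{r_up:.3f} {r_down:.3f} {r_mixed:.3f}")  # 1.000 -1.000 0.775
```

Points lying exactly on a straight line give r = 1 or r = -1, while scattered points give a value in between.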
15
9.5 Correlation Coefficient and Coefficient of Determination
Positive linear relationship: in the plot of E(y) against x, the regression line has intercept b0 and a positive slope b1.
16
9.5 Correlation Coefficient and Coefficient of Determination
Negative linear relationship: in the plot of E(y) against x, the regression line has intercept b0 and a negative slope b1.
17
9.5 Correlation Coefficient and Coefficient of Determination
No relationship: in the plot of E(y) against x, the regression line has intercept b0 and a slope b1 of 0.
18
9.5 Correlation Coefficient and Coefficient of Determination
The coefficient of determination, r², measures the proportion of the variation in the dependent variable (Y) that is explained by the regression line and the independent variable (X). If r = 0.90, then r² = 0.81. It means that 81% of the variation in the dependent variable (Y) is accounted for by the variation in the independent variable (X). The rest of the variation, 0.19 or 19%, is unexplained and is called the coefficient of nondetermination. The formula for the coefficient of nondetermination is 1 - r².
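The arithmetic for r = 0.90 can be checked with a few lines of code. The function name `determination` is a hypothetical helper introduced for this sketch.

```python
def determination(r):
    """Return (r squared, 1 - r squared) for a correlation coefficient r."""
    r_squared = r ** 2
    return r_squared, 1 - r_squared


explained, unexplained = determination(0.90)
print(f"coefficient of determination:    {explained:.2f}")    # 0.81
print(f"coefficient of nondetermination: {unexplained:.2f}")  # 0.19
```

The two values always sum to 1, because every part of the variation in Y is either explained by X or left unexplained.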
19
9.5 Correlation Coefficient and Coefficient of Determination
Output
Strong positive linear relationship.
88.7% of the variation in Y is explained by X.
20
9.6 Test of Significance
Simple linear regression involves two estimated parameters, β0 and β1. A hypothesis test is used to determine whether the independent variable is significant to the dependent variable (that is, whether X provides information for predicting Y). Two tests are commonly used to test the significance of the regression:
t-test
F-test (ANOVA)
21
9.6 Test of Significance
t-test
H0: β1 = 0 (NO RELATIONSHIP)
H1: β1 ≠ 0 (THERE IS RELATIONSHIP)
Compare the P-value (refer to the Coefficients table) with α.
Reject H0 if P-value < α.
If we reject H0, there is a significant relationship between variables X and Y.
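For a test of H0: β1 = 0 against H1: β1 ≠ 0, the t statistic reported by the software is the estimated slope divided by its standard error, with n - 2 degrees of freedom. The sketch below computes it for hypothetical data. Because the Python standard library has no t distribution, the sketch compares |t| with a critical value from a t table instead of computing a P-value; the two decision rules are equivalent. The function name `slope_t_statistic` and the data are assumptions for illustration.

```python
import math


def slope_t_statistic(xs, ys):
    """Return (t, df) for testing whether the slope is zero."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    # Error sum of squares and mean square error with n - 2 degrees of freedom
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    mse = sse / (n - 2)
    se_b1 = math.sqrt(mse / s_xx)  # standard error of the slope
    return b1 / se_b1, n - 2


# Hypothetical data, not taken from the chapter
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

t, df = slope_t_statistic(xs, ys)
T_CRITICAL = 3.182  # two-sided critical value for alpha = 0.05 and df = 3, from a t table
reject_h0 = abs(t) > T_CRITICAL
print(f"t = {t:.3f}, df = {df}, reject H0: {reject_h0}")  # t = 2.121, df = 3, reject H0: False
```

Here |t| is below the critical value, so H0 is not rejected at α = 0.05; the corresponding P-value would be larger than 0.05.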
22
9.6 Test of Significance
F-test (ANOVA)
H0: β1 = 0 (NO RELATIONSHIP)
H1: β1 ≠ 0 (THERE IS RELATIONSHIP)
Compare the P-value (refer to the ANOVA table) with α.
Reject H0 if P-value < α.
If we reject H0, there is a significant relationship between variables X and Y.
23
9.6 Test of Significance
Construction of ANOVA table

Source of variation   Sum of squares   Degrees of freedom   Mean square         f test
Regression            SSR              1                    MSR = SSR/1         f = MSR/MSE
Error                 SSE              n - 2                MSE = SSE/(n - 2)
Total                 SST              n - 1
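The entries of the ANOVA table can be computed directly from the fitted line. The following sketch does so for hypothetical data, using SSR as the sum of squared deviations of the fitted values from the mean of y, SSE as the sum of squared residuals, and SST as the sum of squared deviations of y from its mean. The function name `anova_table` and the data are assumptions for illustration.

```python
def anova_table(xs, ys):
    """Return the ANOVA quantities for a simple linear regression."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    fitted = [b0 + b1 * x for x in xs]

    ssr = sum((f - y_bar) ** 2 for f in fitted)          # explained by regression
    sse = sum((y - f) ** 2 for y, f in zip(ys, fitted))  # unexplained (error)
    sst = sum((y - y_bar) ** 2 for y in ys)              # total variation
    msr = ssr / 1
    mse = sse / (n - 2)
    return {"SSR": ssr, "SSE": sse, "SST": sst,
            "MSR": msr, "MSE": mse, "f": msr / mse}


# Hypothetical data, not taken from the chapter
table = anova_table([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
for name, value in table.items():
    print(f"{name}: {value:.2f}")
```

For these data SSR + SSE equals SST, and the f value is the square of the t statistic for the slope, which is why the two tests agree in simple linear regression.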
24
Exercise
The following table gives the midterm, X, and final exam, Y, scores for seven students in a statistics class.
1) Find the least squares regression line.
2) Explain the values of r and r².
3) Predict the final exam score a student will get if he/she got 60 marks in the midterm test.
4) Do the data support the existence of a linear relationship between midterm and final exam scores? Test using α = 0.05.
X 79 95 81 66 87 94 59
Y 85 97 78 76 84 67
25
Exercise
The manufacturer of Cardio Glide exercise equipment wants to study the relationship between the number of months since the glide was purchased and the length of time the equipment was used last week.
1) Determine the regression equation.
2) At , test whether there is a linear relationship between the variables.