Chapter 5: Introductory Linear Regression
Chapter Outline
5.1 Simple Linear Regression Model
5.2 Inferences About Estimated Parameters
5.3 Adequacy of the Model: Coefficient of Determination
5.4 Pearson Product Moment Correlation Coefficient
5.5 Test for Linearity of Regression
INTRODUCTION TO LINEAR REGRESSION
Regression is a statistical procedure for establishing the relationship between two or more variables. This is done by fitting a linear equation to the observed data. The researcher uses the regression line to see the trend and to predict values for the data. There are two types of relationship: simple (2 variables) and multiple (more than 2 variables).
Many problems in science and engineering involve exploring the relationship between two or more variables. Two statistical techniques are used: regression analysis and computing the correlation coefficient (r). Linear regression is the study of the linear relationship between two or more variables. This is done by fitting a linear equation to the observed data. The linear equation is then used to predict values for the data.
In simple linear regression only two variables are involved:
X is the independent variable. Y is the dependent variable. The correlation coefficient (r) tells us how strongly the two variables are linearly related.
Example 5.1:
1) A nutritionist studying weight loss programs might want to find out whether reducing carbohydrate intake can help a person lose weight.
a) X is the carbohydrate intake (independent variable).
b) Y is the weight (dependent variable).
2) An entrepreneur might want to know whether increasing the cost of packaging his new product will have an effect on the sales volume.
a) X is the cost of packaging (independent variable).
b) Y is the sales volume (dependent variable).
5.1 SIMPLE LINEAR REGRESSION MODEL
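A linear regression model is a model that expresses the linear relationship between two variables. The simple linear regression model is written as $Y = \beta_0 + \beta_1 X + \varepsilon$, where $\beta_0$ is the y-intercept, $\beta_1$ is the slope of the line, and $\varepsilon$ is the random error term.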
5.2 INFERENCES ABOUT ESTIMATED PARAMETERS
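LEAST SQUARES METHOD
The least squares method is the method most commonly used for estimating the regression coefficients. The straight line fitted to the data set is the line $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$, where $\hat{y}$ is the estimated value of y for a given value of x, and $\hat{\beta}_0$ and $\hat{\beta}_1$ are the estimated intercept and slope.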
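i) y-Intercept for the Estimated Regression Equation: $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$, where $\bar{x}$ and $\bar{y}$ are the sample means of x and y.
ii) Slope for the Estimated Regression Equation: $\hat{\beta}_1 = \dfrac{S_{xy}}{S_{xx}}$, where $S_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y})$ and $S_{xx} = \sum (x_i - \bar{x})^2$.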
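The intercept and slope can be computed directly from the data. The following is a minimal sketch, assuming the usual least squares formulas (slope = Sxy / Sxx, intercept = mean of y minus slope times mean of x). The function name `fit_line` and the small data set are hypothetical illustrations, not taken from the chapter.

```python
def fit_line(x, y):
    """Return (b0, b1) for the least squares line y_hat = b0 + b1 * x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums of cross-products and squares about the means
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    if s_xx == 0:
        raise ValueError("all x values are equal; the slope is undefined")
    b1 = s_xy / s_xx          # slope
    b0 = y_bar - b1 * x_bar   # y-intercept
    return b0, b1


# Hypothetical illustration data (assumption, not from the chapter)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b0, b1 = fit_line(x, y)
y_hat_at_6 = b0 + b1 * 6  # predicted y for a new value x = 6
print(f"y_hat = {b0:.2f} + {b1:.2f} x")
print(f"prediction at x = 6: {y_hat_at_6:.2f}")
```

The same two steps (fit, then substitute the new x into the fitted line) apply to any prediction question of the kind asked in the examples.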
Example 5.2: Students' Scores in Mathematics
The data below represent the scores obtained by ten students in Mathematics 1 and Mathematics 2.
Math 1, x: 65 63 76 46 68 72 57 36 96
Math 2, y: 66 86 48 71 42 87
Develop an estimated linear regression model with "Math 1" as the independent variable and "Math 2" as the dependent variable. Predict the score a student would obtain in "Math 2" if he scored 60 marks in "Math 1".
5.3 ADEQUACY OF THE MODEL: COEFFICIENT OF DETERMINATION ($r^2$)
The coefficient of determination measures the proportion of the variation in the dependent variable (Y) that is explained by the regression line and the independent variable (X). The symbol for the coefficient of determination is $r^2$ or $R^2$. Example: if $r = 0.90$, then $r^2 = 0.81$. This means that 81% of the variation in the dependent variable (Y) is accounted for by the variation in the independent variable (X).
The rest of the variation, 0.19 or 19%, is unexplained; this proportion is called the coefficient of non-determination. The formula for the coefficient of non-determination is $1 - r^2$.
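The coefficient of determination is: $r^2 = \dfrac{(S_{xy})^2}{S_{xx}\, S_{yy}}$, where $S_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y})$, $S_{xx} = \sum (x_i - \bar{x})^2$ and $S_{yy} = \sum (y_i - \bar{y})^2$.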
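To make the idea of "explained variation" concrete, the sketch below computes the coefficient of determination as one minus the ratio of the unexplained variation (sum of squared residuals) to the total variation in y. For a simple linear regression this equals the square of the correlation coefficient. The function name and the data are hypothetical illustrations, not taken from the chapter.

```python
def coefficient_of_determination(x, y):
    """Return r^2 = 1 - SSE / SST for the least squares line through (x, y)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    # Total variation in y about its mean
    sst = sum((yi - y_bar) ** 2 for yi in y)
    # Variation left unexplained by the fitted line
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    return 1 - sse / sst


# Hypothetical illustration data (assumption, not from the chapter)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r_squared = coefficient_of_determination(x, y)
non_determination = 1 - r_squared  # proportion of variation left unexplained
print(f"r^2 = {r_squared:.2f}, unexplained = {non_determination:.2f}")
```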
5.4 PEARSON PRODUCT MOMENT CORRELATION COEFFICIENT (r)
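Correlation measures the strength of the linear relationship between two variables. It is also known as Pearson's product moment coefficient of correlation. The symbol for the sample coefficient of correlation is r. Formula: $r = \dfrac{S_{xy}}{\sqrt{S_{xx}\, S_{yy}}}$, where $S_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y})$, $S_{xx} = \sum (x_i - \bar{x})^2$ and $S_{yy} = \sum (y_i - \bar{y})^2$.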
Properties of r:
Values of r close to 1 imply a strong positive linear relationship between x and y.
Values of r close to -1 imply a strong negative linear relationship between x and y.
Values of r close to 0 imply little or no linear relationship between x and y.
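These properties can be checked numerically. The sketch below assumes the standard sample formula r = Sxy / sqrt(Sxx * Syy). The function name `pearson_r` and all data sets are hypothetical illustrations.

```python
import math


def pearson_r(x, y):
    """Return the sample Pearson correlation coefficient of x and y."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("r is undefined when x or y has no variation")
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical illustration data (assumption, not from the chapter)
r_positive = pearson_r([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])  # moderately strong, positive
r_perfect = pearson_r([1, 2, 3], [2, 4, 6])               # points exactly on a rising line
r_negative = pearson_r([1, 2, 3], [6, 4, 2])              # points exactly on a falling line

print(f"{r_positive:.4f} {r_perfect:.1f} {r_negative:.1f}")
```

The sign of r always matches the sign of the fitted slope, since both share the numerator Sxy.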
Example 5.4: Refer to the previous Example 5.2.
Calculate the value of r and interpret its meaning. Solution: Thus, there is a strong positive linear relationship between the scores obtained in Math 1 (x) and Math 2 (y).
Exercise 5.2: Refer to the previous Exercise 3.1 and Exercise 3.2. Calculate the correlation coefficient and interpret the results.
5.5 TEST FOR LINEARITY OF REGRESSION
To test for the existence of a linear relationship between two variables x and y, we carry out a hypothesis test. The test commonly used is the t-test.
t-Test
1. Determine the hypotheses.
$H_0: \beta_1 = 0$ (no linear relationship)
$H_1: \beta_1 \neq 0$ (a linear relationship exists)
2. Determine the critical value for the chosen level of significance.
3. Compute the test statistic.
4. Determine the rejection rule.
Reject $H_0$ if p-value < $\alpha$.
5. Conclusion. If $H_0$ is rejected, there is a significant linear relationship between the variables X and Y.
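The five steps can be sketched in code. One common form of the test statistic for a zero slope, assumed here, is t = b1 / SE(b1) with n - 2 degrees of freedom, where SE(b1) = sqrt(s^2 / Sxx) and s^2 = SSE / (n - 2). To stay within the standard library, the sketch uses the critical-value form of the rule (reject when |t| exceeds the two-sided critical value), which is equivalent to p-value < α for a two-sided test. The function name, the data, and the choice of α are hypothetical; the critical value 3.182 is the t-table entry for 3 degrees of freedom at α = 0.05 (two-sided).

```python
import math


def slope_t_statistic(x, y):
    """Return (t, df) for testing H0: slope = 0 in simple linear regression."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least 3 paired observations")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    # Residual variance estimate with n - 2 degrees of freedom
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    if sse == 0:
        raise ValueError("perfect fit: the t statistic is undefined")
    se_b1 = math.sqrt((sse / (n - 2)) / s_xx)
    return b1 / se_b1, n - 2


# Hypothetical illustration data (assumption, not from the chapter)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

t_stat, df = slope_t_statistic(x, y)
t_critical = 3.182  # t-table value for df = 3, alpha = 0.05, two-sided
reject_h0 = abs(t_stat) > t_critical
print(f"t = {t_stat:.4f}, df = {df}, reject H0: {reject_h0}")
```

With only five hypothetical points the slope is not significant at the 5% level, even though r is fairly large; small samples need a very strong pattern before the null hypothesis is rejected.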
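Example 5.5: Refer to the previous Example 5.3.
Test to determine whether the scores before and after the trip are related. Use $\alpha = 0.05$.
Solution:
1) $H_0: \beta_1 = 0$ (no linear relationship)
$H_1: \beta_1 \neq 0$ (a linear relationship exists)
2) Level of significance: $\alpha = 0.05$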
3) 4) Rejection Rule: 5) Conclusion: Thus, we reject H0. The score before the trip (x) is linearly related to the score after the trip (y).
EXERCISE 5.1: The owner of a small factory that produces working gloves is concerned about the high cost of air conditioning in the summer. Keeping the temperature in the factory higher may lower productivity. During the summer, he conducted an experiment with temperature settings from 68 to 81 degrees Fahrenheit and measured each day's productivity, which produced the following table:
Temperature: 72 71 78 75 81 77 68 76
Number of pairs of gloves (in hundreds): 37 32 36 33 35 39 34
a) Find the regression model.
b) Predict the number of pairs of gloves produced if x = 74.
c) Compute the Pearson correlation coefficient. What can you say about the relationship between the two variables?
d) Can you conclude that the temperature is linearly related to the number of pairs of gloves produced? Use α = 0.05.
EXERCISE 5.2: An agricultural scientist planted alfalfa on several plots of land, identical except for the soil pH. Following are the dry matter yields (in pounds per acre) for each plot.
pH     Yield
4.6    1056
4.8    1833
5.2    1629
5.4    1852
5.6    1783
5.8    2647
6.0    2131
a) Compute the estimated regression line for predicting Yield from pH.
b) If the pH is increased by 0.1, by how much would you predict the yield to increase or decrease?
c) For what pH would you predict a yield of 1500 pounds per acre?
d) Calculate the correlation coefficient and interpret the results.
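One way to check answers to this exercise is sketched below, using the pH and yield values from the table. It assumes the usual least squares formulas. The predicted change for a 0.1 increase in pH is the slope times 0.1, and the pH for a target yield comes from solving the fitted line for x. All variable names are hypothetical, and this is not the presentation's own worked answer.

```python
# Data from the exercise table: soil pH and dry matter yield (pounds per acre)
ph = [4.6, 4.8, 5.2, 5.4, 5.6, 5.8, 6.0]
yield_lb = [1056, 1833, 1629, 1852, 1783, 2647, 2131]

n = len(ph)
x_bar = sum(ph) / n
y_bar = sum(yield_lb) / n
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(ph, yield_lb))
s_xx = sum((x - x_bar) ** 2 for x in ph)

b1 = s_xy / s_xx         # estimated slope (pounds per acre per unit of pH)
b0 = y_bar - b1 * x_bar  # estimated intercept

# A change of 0.1 in pH shifts the predicted yield by 0.1 * slope
change_per_tenth = 0.1 * b1

# Invert y_hat = b0 + b1 * x to find the pH that predicts a target yield
target_yield = 1500
ph_for_target = (target_yield - b0) / b1

print(f"yield_hat = {b0:.1f} + {b1:.1f} * pH")
print(f"change per 0.1 pH: {change_per_tenth:.2f}")
print(f"pH for a yield of {target_yield}: {ph_for_target:.2f}")
```

Inverting the line in this way is only reasonable when the target yield lies within the range of yields observed in the data, as it does here.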