1/30 Ch11 Curve Fitting
Dr. Deshi Ye (yedeshi@zju.edu.cn)
2/30 Outline
- The Method of Least Squares
- Inferences Based on the Least Squares Estimators
- Curvilinear Regression
- Multiple Regression
3/30 11.1 The Method of Least Squares
We study the case where a dependent variable is to be predicted in terms of a single independent variable: the random variable Y depends on a variable x. The regression curve of Y on x gives, for each value of x, the mean of the corresponding distribution of Y.
4/30 Linear regression (scatter plot of data with a fitted straight line; figure omitted)
5/30 Linear regression
Linear regression: for any x, the mean of the distribution of the Y's is given by

    μ_{Y|x} = α + βx

In general, an observed Y will differ from this mean; we denote the difference by ε, so that

    Y = α + βx + ε

Here ε is a random variable, and we can choose α so that the mean of the distribution of ε is equal to zero.
6/30 Example
    x:   1   2   3   4   5   6   7    8    9   10   11   12
    y:  16  35  45  64  86  96  106  124  134  156  164  182
7/30 Analysis
We want to choose the line so that the vertical deviations of the observed points from it, y_i − (a + b x_i), are collectively as close as possible to zero.
8/30 Principle of least squares
Choose a and b so that

    Σ_{i=1}^{n} [y_i − (a + b x_i)]²

is a minimum. The procedure of finding the equation of the line that best fits a given set of paired data is called the method of least squares. Some notation:

    S_xx = Σ(x_i − x̄)²          = Σx_i² − (Σx_i)²/n
    S_yy = Σ(y_i − ȳ)²          = Σy_i² − (Σy_i)²/n
    S_xy = Σ(x_i − x̄)(y_i − ȳ)  = Σx_i y_i − (Σx_i)(Σy_i)/n
9/30 Least squares estimators
Fitted (or estimated) regression line: ŷ = a + b x.
Residuals: observation − fitted value = y_i − ŷ_i = y_i − (a + b x_i).
The minimum value of the sum of squares, Σ(y_i − ŷ_i)², is called the residual sum of squares or error sum of squares (SSE). We will show that

    SSE = S_yy − S_xy²/S_xx
10/30 Example, solution
For the data of slide 6: S_xx = 143 and S_xy = 2119, so b = 2119/143 ≈ 14.82 and a = ȳ − b x̄ ≈ 4.35. The fitted line is

    ŷ = 4.35 + 14.82 x
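As a check, the slope and intercept above can be recomputed directly from the slide-6 data; a minimal pure-Python sketch using the S_xx, S_xy notation from the slides:

```python
# Least-squares fit of the example data (x = 1..12) from slide 6,
# using the S_xx, S_xy notation introduced above.
x = list(range(1, 13))
y = [16, 35, 45, 64, 86, 96, 106, 124, 134, 156, 164, 182]
n = len(x)

Sxx = sum(xi * xi for xi in x) - sum(x) ** 2 / n
Sxy = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n

b = Sxy / Sxx                       # slope
a = sum(y) / n - b * sum(x) / n     # intercept, a = ybar - b * xbar

print(round(a, 2), round(b, 2))     # 4.35 14.82
```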
11/30 X and Y terminology
    X-axis        Y-axis
    independent   dependent
    predictor     predicted
    carrier       response
    input         output
12/30 Example
You're a marketing analyst for Hasbro Toys. You gather the following data:

    Ad $ (x):       1  2  3  4  5
    Sales (units):  1  1  2  2  4

What is the relationship between sales and advertising?
13/30 Scattergram: Sales vs. Advertising (scatter plot; figure omitted)
14/30 The least squares estimators
Minimizing the sum of squared deviations gives the estimators

    b = S_xy / S_xx,    a = ȳ − b x̄
15/30 11.2 Inferences Based on the Least Squares Estimators
We assume that the regression is linear in x and, furthermore, that the n random variables Y_i are independently normally distributed with means α + β x_i and common variance σ². Statistical model for straight-line regression:

    Y_i = α + β x_i + ε_i,    i = 1, 2, ..., n

where the ε_i are independent, normally distributed random variables having zero means and the common variance σ².
16/30 Standard error of estimate
The i-th deviation (residual) is e_i = y_i − (a + b x_i), and the estimate of σ² is

    s_e² = Σ e_i² / (n − 2)

The estimate of σ² can also be written as

    s_e² = (S_yy − b S_xy) / (n − 2) = (S_xx S_yy − S_xy²) / ((n − 2) S_xx)

s_e is called the standard error of estimate.
17/30 Statistics for inferences
Based on the assumptions made concerning the distribution of the values of Y, the following theorem holds.

Theorem. The statistics

    t = (a − α) / (s_e √(1/n + x̄²/S_xx))    and    t = (b − β) √S_xx / s_e

are values of random variables having the t distribution with n − 2 degrees of freedom.

Confidence intervals:

    for α:  a ± t_{α/2} · s_e √(1/n + x̄²/S_xx)
    for β:  b ± t_{α/2} · s_e / √S_xx
18/30 Example
The following data pertain to the number of computer jobs per day and the central processing unit (CPU) time required:

    Number of jobs, x:  1  2  3  4   5
    CPU time, y:        2  5  4  9  10
19/30 Example (continued)
1) Obtain a least squares fit of a line to the observations on CPU time.
Solution: n = 5, x̄ = 3, ȳ = 6, S_xx = 10, S_xy = 20, so b = 20/10 = 2 and a = 6 − 2·3 = 0. The fitted line is ŷ = 2x.
20/30 Example (continued)
2) Construct a 95% confidence interval for α.
With n − 2 = 3 degrees of freedom, t_{0.025} = 3.182, and s_e² = (S_yy − b S_xy)/(n − 2) = (46 − 40)/3 = 2. The 95% confidence interval for α is

    a ± t_{0.025} · s_e √(1/n + x̄²/S_xx) = 0 ± 3.182 · √2 · √(1/5 + 9/10)

that is, −4.72 < α < 4.72.
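The confidence-interval arithmetic for the CPU-time example can be checked numerically; a pure-Python sketch using the formulas from slides 16 and 17:

```python
import math

# CPU-time example: x = number of jobs, y = CPU time (slide 18).
x = [1, 2, 3, 4, 5]
y = [2, 5, 4, 9, 10]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n

Sxx = sum(xi * xi for xi in x) - n * xbar ** 2
Sxy = sum(xi * yi for xi, yi in zip(x, y)) - n * xbar * ybar
Syy = sum(yi * yi for yi in y) - n * ybar ** 2

b = Sxy / Sxx
a = ybar - b * xbar
se = math.sqrt((Syy - b * Sxy) / (n - 2))     # standard error of estimate

t = 3.182                                      # t_{0.025}, n - 2 = 3 d.f.
half = t * se * math.sqrt(1 / n + xbar ** 2 / Sxx)
print(round(a - half, 2), round(a + half, 2))  # -4.72 4.72
```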
21/30 Example (continued)
3) Test the null hypothesis H₀: β = β₀ against the alternative hypothesis H₁: β ≠ β₀ at the 0.05 level of significance.
Solution: the t statistic is given by

    t = (b − β₀) √S_xx / s_e

Criterion: reject H₀ if t < −3.182 or t > 3.182, where 3.182 is the value of t_{0.025} for 3 degrees of freedom.
Decision: we cannot reject the null hypothesis.
22/30 11.3 Curvilinear Regression
The regression curve may be nonlinear. If the regression of Y on x is exponential, the mean of the distribution of values of Y is given by

    μ_{Y|x} = α · β^x

Taking logarithms, we have

    log μ_{Y|x} = log α + x · log β

Thus we can estimate log α and log β by fitting a straight line to the pairs of values (x_i, log y_i).
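A small sketch of this log-transform trick. The data here are hypothetical (not from the slides), chosen to grow roughly like 2·3^x, so the fit should recover α ≈ 2 and β ≈ 3:

```python
import math

# Hypothetical data growing roughly like 2 * 3**x (not from the slides).
x = [1, 2, 3, 4, 5]
y = [6.1, 17.8, 54.5, 162.0, 486.9]

# Fit a straight line to (x, log10 y), as the slide suggests.
ly = [math.log10(yi) for yi in y]
n = len(x)
Sxx = sum(xi * xi for xi in x) - sum(x) ** 2 / n
Sxy = sum(xi * li for xi, li in zip(x, ly)) - sum(x) * sum(ly) / n

B = Sxy / Sxx                    # estimate of log10(beta)
A = sum(ly) / n - B * sum(x) / n  # estimate of log10(alpha)

alpha, beta = 10 ** A, 10 ** B
print(alpha, beta)               # close to 2 and 3
```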
23/30 Polynomial regression
If there is no clear indication about the functional form of the regression of Y on x, we often assume a polynomial regression:

    μ_{Y|x} = β₀ + β₁ x + β₂ x² + ... + β_p x^p
24/30 Polynomial fitting
This is really just a generalization of the straight-line case: the normal equations are still linear in the coefficients, so there is an exact solution; it just involves bigger matrices.
25/30 11.4 Multiple Regression
The mean of Y is now given in terms of several predictors:

    μ_{Y|x₁,...,x_r} = b₀ + b₁ x₁ + ... + b_r x_r

Minimize

    Σ_i [y_i − (b₀ + b₁ x_{i1} + ... + b_r x_{ir})]²

For r = 2, the minimizing b₀, b₁, b₂ can be found by solving the normal equations

    Σ y_i       = n b₀ + b₁ Σ x_{i1} + b₂ Σ x_{i2}
    Σ x_{i1} y_i = b₀ Σ x_{i1} + b₁ Σ x_{i1}² + b₂ Σ x_{i1} x_{i2}
    Σ x_{i2} y_i = b₀ Σ x_{i2} + b₁ Σ x_{i1} x_{i2} + b₂ Σ x_{i2}²
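The three normal equations for r = 2 can be written as a 3x3 linear system and solved directly; a sketch with hypothetical data (NumPy assumed available):

```python
import numpy as np

# Hypothetical data: y depends exactly on two predictors (not from the slides),
# with true coefficients b0 = 3, b1 = 2, b2 = -1.
x1 = np.array([1., 2., 3., 4., 5., 6.])
x2 = np.array([2., 1., 4., 3., 6., 5.])
y = 3.0 + 2.0 * x1 - 1.0 * x2

# The r = 2 normal equations in matrix form A b = c.
n = len(y)
A = np.array([[n,         x1.sum(),        x2.sum()],
              [x1.sum(), (x1 * x1).sum(), (x1 * x2).sum()],
              [x2.sum(), (x1 * x2).sum(), (x2 * x2).sum()]])
c = np.array([y.sum(), (x1 * y).sum(), (x2 * y).sum()])

b0, b1, b2 = np.linalg.solve(A, c)
print(b0, b1, b2)    # approximately 3.0, 2.0, -1.0
```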
26/30 Example: see the worked example on p. 365 of the textbook.
27/30 Multiple Linear Fitting
X₁(x), ..., X_M(x) are arbitrary fixed functions of x (they can be nonlinear in x), called the basis functions. Substituting the model y(x) = Σ_k a_k X_k(x) into the least squares criterion yields the normal equations of the least squares problem, which can be put in matrix form and solved.
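A sketch of the matrix formulation, assuming NumPy. The basis functions 1, sin x, cos x and the data are illustrative assumptions, not from the slides; `np.linalg.lstsq` solves the least-squares problem for the design matrix:

```python
import numpy as np

# Illustrative basis functions: X1(x) = 1, X2(x) = sin x, X3(x) = cos x.
x = np.linspace(0.0, 6.0, 20)
y = 1.5 + 2.0 * np.sin(x) - 0.5 * np.cos(x)   # true coefficients known

# Design matrix: one column per basis function evaluated at each x.
A = np.column_stack([np.ones_like(x), np.sin(x), np.cos(x)])

# Least-squares solution of A @ coef ~ y.
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
print(coef)    # approximately [1.5, 2.0, -0.5]
```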
28/30 Correlation Models
1. How strong is the linear relationship between two variables?
2. A coefficient of correlation is used:
   - the population correlation coefficient is denoted ρ
   - values range from −1 to +1
29/30 Correlation
Standardized observations: (x_i − x̄)/s_x and (y_i − ȳ)/s_y.
The sample correlation coefficient is

    r = S_xy / √(S_xx · S_yy)
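The sample correlation coefficient can be computed directly from the S-notation; this sketch reuses the CPU-time data from slide 18:

```python
import math

# CPU-time data from slide 18.
x = [1, 2, 3, 4, 5]
y = [2, 5, 4, 9, 10]
n = len(x)

Sxx = sum(a * a for a in x) - sum(x) ** 2 / n
Syy = sum(b * b for b in y) - sum(y) ** 2 / n
Sxy = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n

# Sample correlation coefficient r = Sxy / sqrt(Sxx * Syy).
r = Sxy / math.sqrt(Sxx * Syy)
print(round(r, 3))    # 0.933, a strong positive linear relationship
```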
30/30 Coefficient of Correlation: values
Values range from −1.0 to +1.0. A value of 0 means no correlation; values from 0 toward −1.0 indicate an increasing degree of negative correlation, and values from 0 toward +1.0 indicate an increasing degree of positive correlation.