1 Chapter 1 Introduction Ray-Bing Chen Institute of Statistics National University of Kaohsiung.


1 1 Chapter 1 Introduction Ray-Bing Chen Institute of Statistics National University of Kaohsiung

2 2 1.1 Regression and Model Building
Regression analysis: a statistical technique for investigating and modeling the relationship between variables.
Applications: engineering, the physical and chemical sciences, economics, management, the life and biological sciences, and the social sciences.
Regression analysis may be the most widely used statistical technique.

3 3 Example: delivery time vs. delivery volume
–We suspect that the time required by a route deliveryman to load and service a machine is related to the number of cases of product delivered.
–Sample: 25 randomly chosen retail outlets
–Observed: the in-outlet delivery time and the volume of product delivered
–Scatter diagram: displays the relationship between delivery time and delivery volume

6 6 y: delivery time, x: delivery volume
$y = \beta_0 + \beta_1 x$
Error, $\varepsilon$:
–The difference between y and $\beta_0 + \beta_1 x$
–A statistical error, i.e. a random variable
–Accounts for the effects of the other variables on delivery time, measurement errors, …

7 7 Simple linear regression model: $y = \beta_0 + \beta_1 x + \varepsilon$
–x: independent (predictor, regressor) variable
–y: dependent (response) variable
–$\varepsilon$: error
If x is fixed, y is determined by $\varepsilon$.
Suppose that $E(\varepsilon) = 0$ and $\mathrm{Var}(\varepsilon) = \sigma^2$. Then
$E(y|x) = E(\beta_0 + \beta_1 x + \varepsilon) = \beta_0 + \beta_1 x$
$\mathrm{Var}(y|x) = \mathrm{Var}(\beta_0 + \beta_1 x + \varepsilon) = \sigma^2$
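The two moment results on this slide can be checked numerically. The sketch below is only an illustration: the parameter values, the sample size, and the choice of a normal error distribution are assumptions made for the demonstration (the slide itself only requires the error to have mean 0 and variance sigma^2).

```python
import random
import statistics

def simulate_y(x, beta0, beta1, sigma, n, seed=0):
    """Draw n values of y = beta0 + beta1*x + eps at one fixed x.

    eps is drawn from N(0, sigma^2); normality is an assumption of this
    sketch, not a requirement of the moment results.
    """
    rng = random.Random(seed)
    return [beta0 + beta1 * x + rng.gauss(0.0, sigma) for _ in range(n)]

# Hypothetical values chosen only for illustration.
BETA0, BETA1, SIGMA, X = 1.0, 0.5, 2.0, 4.0

ys = simulate_y(X, BETA0, BETA1, SIGMA, n=50_000)
sample_mean = statistics.fmean(ys)
sample_var = statistics.variance(ys)

print(f"sample mean     {sample_mean:.3f}  vs  beta0 + beta1*x = {BETA0 + BETA1 * X:.3f}")
print(f"sample variance {sample_var:.3f}  vs  sigma^2         = {SIGMA ** 2:.3f}")
```

Because x is held fixed, all of the spread in the simulated y values comes from the error term, which is why the sample variance approaches sigma^2 and does not depend on x.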

8 8 The true regression line is a line of mean values: the height of the regression line at any x is the expected value of y for that x.
The slope, $\beta_1$: the change in the mean of y for a unit change in x.
The variability of y at x is determined by the variance of the error.

9 9 Example:
–$E(y|x) = 3.5 + 2x$, and $\mathrm{Var}(y|x) = 2$
–$y|x \sim N(\beta_0 + \beta_1 x, \sigma^2)$
–$\sigma^2$ small: the observed values will fall close to the line.
–$\sigma^2$ large: the observed values may deviate considerably from the line.
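The effect of the error variance on the scatter around the line can be sketched with a small simulation. The line 3.5 + 2x and the variance 2 come from the example above; the "small" value 0.1, the "large" value 10, the range of x, and the sample size are assumptions picked for illustration.

```python
import random

def mean_abs_deviation(sigma2, n=20_000, seed=1):
    """Average |y - (3.5 + 2x)| when y|x ~ N(3.5 + 2x, sigma2).

    x is drawn uniformly from [0, 10]; that range is a hypothetical choice.
    """
    rng = random.Random(seed)
    sigma = sigma2 ** 0.5
    total = 0.0
    for _ in range(n):
        x = rng.uniform(0.0, 10.0)
        line = 3.5 + 2.0 * x          # E(y|x), the true regression line
        y = rng.gauss(line, sigma)    # one observation at this x
        total += abs(y - line)
    return total / n

mad_small = mean_abs_deviation(0.1)   # assumed "small" variance
mad_slide = mean_abs_deviation(2.0)   # variance used in the slide example
mad_large = mean_abs_deviation(10.0)  # assumed "large" variance

print(f"typical distance from the line: {mad_small:.2f}, {mad_slide:.2f}, {mad_large:.2f}")
```

The typical distance of an observation from the line grows with the error variance, which is the behavior the last two bullets describe.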

11 11 The regression equation is only an approximation to the true functional relationship between the variables. Regression model: Empirical model

13 13 Valid only over the region of the regressor variables contained in the observed data!
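One way to respect this restriction in practice is to refuse to predict outside the observed range of the regressor. The helper below is a hypothetical sketch: the function name, the coefficients, and the choice to raise an error (a warning would be an equally reasonable design) are all assumptions.

```python
def predict_within_range(x_new, xs_observed, b0, b1):
    """Return b0 + b1*x_new only if x_new lies inside the observed x range."""
    lo, hi = min(xs_observed), max(xs_observed)
    if not lo <= x_new <= hi:
        raise ValueError(
            f"x = {x_new} is outside the observed range [{lo}, {hi}]; "
            "the fitted model is not supported by data there"
        )
    return b0 + b1 * x_new

# Hypothetical observed regressor values and fitted coefficients.
xs_observed = [1.0, 2.0, 3.0, 4.0, 5.0]
inside = predict_within_range(3.0, xs_observed, b0=3.5, b1=2.0)
print(inside)

try:
    predict_within_range(8.0, xs_observed, b0=3.5, b1=2.0)
    extrapolation_blocked = False
except ValueError:
    extrapolation_blocked = True
```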

14 14 Multiple linear regression model: $y = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k + \varepsilon$
Linear: the model is called linear because it is linear in the parameters $\beta_0, \beta_1, \ldots, \beta_k$, not because y is a linear function of the x's.
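The distinction between "linear in the parameters" and "linear in x" can be made concrete with a quadratic model, which is a common illustration and not one taken from the slides. In the sketch below the regressors are 1, x, and x^2; the coefficient values are arbitrary assumptions.

```python
import math

def poly_features(x):
    """Regressors for the hypothetical model y = b0 + b1*x + b2*x^2."""
    return [1.0, x, x * x]

def predict(beta, features):
    """Mean response: a weighted sum of the regressors, linear in beta."""
    return sum(b * f for b, f in zip(beta, features))

beta_a = [1.0, 2.0, -0.5]   # arbitrary illustrative coefficients
beta_b = [0.0, 1.0, 3.0]
c, x = 4.0, 1.5

# Linearity in the parameters: f(x; c*a + b) == c*f(x; a) + f(x; b)
combined = [c * a + b for a, b in zip(beta_a, beta_b)]
lhs = predict(combined, poly_features(x))
rhs = c * predict(beta_a, poly_features(x)) + predict(beta_b, poly_features(x))
linear_in_beta = math.isclose(lhs, rhs)

# Not linear in x: doubling x does not double the mean response
f_2x = predict(beta_a, poly_features(2 * x))
two_f_x = 2 * predict(beta_a, poly_features(x))
linear_in_x = math.isclose(f_2x, two_f_x)

print(linear_in_beta, linear_in_x)
```

The mean response is a curve in x, yet it still counts as a linear regression model because the parameters enter only as multipliers of known regressors.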

15 15 Two important objectives:
–Estimate the unknown parameters (fitting the model to the data): the method of least squares.
–Model adequacy checking: an iterative procedure to choose an appropriate regression model to describe the data.
Remarks:
–A regression model does not imply a cause-effect relationship between the variables.
–Regression can aid in confirming a cause-effect relationship, but it cannot be the sole basis!
–Regression analysis is part of a broader data-analysis approach.
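For the simple linear regression model, the method of least squares has a standard closed form: the slope estimate is S_xy / S_xx and the intercept estimate is y-bar minus the slope times x-bar. The sketch below applies it to made-up, noise-free data generated from the line 3.5 + 2x, so the data are hypothetical and are not the delivery-time data from the example. The function name is also an assumption.

```python
def fit_simple_ols(xs, ys):
    """Least-squares estimates (b0, b1) for y = b0 + b1*x + error."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = s_xy / s_xx            # slope estimate
    b0 = y_bar - b1 * x_bar     # intercept estimate
    return b0, b1

# Hypothetical noise-free data lying exactly on y = 3.5 + 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.5 + 2.0 * x for x in xs]

b0, b1 = fit_simple_ols(xs, ys)

# Residuals are the starting point for model adequacy checking.
residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
print(b0, b1, max(abs(r) for r in residuals))
```

With real data the residuals would not vanish, and inspecting them is a first step in the adequacy checking mentioned in the second objective.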

16 16 1.2 Data Collection Three basic methods for collecting data: –A retrospective study based on historical data –An observational study –A designed experiment (BEST)

17 17 1.3 Use of Regression
Several purposes:
–Data description
–Parameter estimation
–Prediction and estimation
–Control

18 18 1.4 Role of the Computer Regression analysis requires the intelligent and artful use of the computer. SAS, SPSS, S-plus, R, MATLAB, …

