STA302/1001 Week 1: Regression Models and Simple Linear Regression (presentation transcript)

1 Regression Models - Introduction
In regression models, two types of variables are studied:
 A dependent variable, Y, also called the response variable. It is modeled as random.
 An independent variable, X, also called the predictor variable or explanatory variable. It is sometimes modeled as random and sometimes has a fixed value for each observation.
In regression models we fit a statistical model to data. We generally use regression to predict the value of one variable given the values of others.

2 Simple Linear Regression - Introduction
Simple linear regression studies the relationship between a quantitative response variable Y and a single explanatory variable X.
Idea of a statistical model: actual observed value of Y = value predicted by the model + random error.
Box (a well-known statistician) claimed: "All models are wrong, some are useful." 'Useful' means that they describe the data well and can be used for predictions and inferences.
Recall: parameters are constants in a statistical model; we usually do not know them but use data to estimate them.

3 Simple Linear Regression Models
The statistical model for simple linear regression is a straight-line model of the form Y = β0 + β1X + ε, where β0 is the intercept, β1 is the slope, and ε is a random error term. For particular points it is written Y_i = β0 + β1X_i + ε_i, i = 1, …, n.
We expect that different values of X will produce different mean responses. In particular, for each value of X, the possible values of Y follow a distribution whose mean is β0 + β1X. Formally, this means that E(Y | X = x) = β0 + β1x.
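
As an illustration (not from the slides), here is a minimal Python sketch that simulates data from this model; the values β0 = 2, β1 = 0.5 and σ = 1 are arbitrary choices for the example:

    import numpy as np

    rng = np.random.default_rng(0)
    beta0, beta1, sigma = 2.0, 0.5, 1.0        # hypothetical true parameters
    x = np.linspace(0, 10, 50)                 # fixed values of the predictor X
    eps = rng.normal(0.0, sigma, size=x.size)  # random errors with mean 0, variance sigma^2
    y = beta0 + beta1 * x + eps                # responses scattered around the line

    # For each x, E(Y | X = x) = beta0 + beta1 * x; eps produces the spread around that mean.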

4 Estimation – Least Squares Method
Estimates of the unknown parameters β0 and β1 based on our observed data are usually denoted by b0 and b1. For each observed value x_i of X, the fitted value of Y is ŷ_i = b0 + b1x_i. This is an equation of a straight line.
The deviations from the line in the vertical direction are the errors in the prediction of Y and are called "residuals". They are defined as e_i = y_i - ŷ_i = y_i - (b0 + b1x_i).
The estimates b0 and b1 are found by the method of least squares, which is based on minimizing the sum of squared residuals. Note, the least-squares estimates are found without making any statistical assumptions about the data.
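
A minimal Python sketch of these quantities, using the closed-form least-squares solution derived on the next slide (the data values are made up for illustration):

    import numpy as np

    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])  # hypothetical observed predictor values
    y = np.array([2.1, 2.9, 3.8, 5.2, 5.9])  # hypothetical observed responses

    b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    b0 = y.mean() - b1 * x.mean()

    y_hat = b0 + b1 * x       # fitted values on the least-squares line
    resid = y - y_hat         # residuals e_i = y_i - y_hat_i
    rss = np.sum(resid ** 2)  # the sum of squared residuals that least squares minimizes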

5 Derivation of Least-Squares Estimates
Let RSS(b0, b1) = Σ (y_i - b0 - b1x_i)², the residual sum of squares. We want to find the b0 and b1 that minimize RSS. Use calculus: set the partial derivatives of RSS with respect to b0 and b1 equal to zero and solve the resulting normal equations, as sketched below.
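
The calculus step, sketched in LaTeX (the standard derivation, not transcribed from the slide):

    \frac{\partial \text{RSS}}{\partial b_0} = -2 \sum_{i=1}^{n} (y_i - b_0 - b_1 x_i) = 0,
    \qquad
    \frac{\partial \text{RSS}}{\partial b_1} = -2 \sum_{i=1}^{n} x_i (y_i - b_0 - b_1 x_i) = 0.

Solving these two normal equations simultaneously gives

    b_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
    \qquad
    b_0 = \bar{y} - b_1 \bar{x}.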

6 Properties of Fitted Line
The fitted least-squares line has the following standard properties:
 The residuals sum to zero: Σ e_i = 0.
 The residuals are orthogonal to the predictor values: Σ x_i e_i = 0.
 The residuals are orthogonal to the fitted values: Σ ŷ_i e_i = 0.
 The fitted line passes through the point of means (x̄, ȳ).
Note: you need to know how to prove the above properties.
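
A quick numeric check of these properties in Python (same made-up data as the earlier sketch):

    import numpy as np

    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    y = np.array([2.1, 2.9, 3.8, 5.2, 5.9])
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    b0 = y.mean() - b1 * x.mean()
    e = y - (b0 + b1 * x)                               # residuals

    print(np.isclose(e.sum(), 0.0))                     # residuals sum to zero
    print(np.isclose((x * e).sum(), 0.0))               # orthogonal to the x's
    print(np.isclose(((b0 + b1 * x) * e).sum(), 0.0))   # orthogonal to the fitted values
    print(np.isclose(b0 + b1 * x.mean(), y.mean()))     # line passes through (x-bar, y-bar)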

7 Statistical Assumptions for SLR
 Recall, the simple linear regression model is Y_i = β0 + β1X_i + ε_i, where i = 1, …, n.
 The assumptions for the simple linear regression model are:
1) E(ε_i) = 0
2) Var(ε_i) = σ²
3) the ε_i's are uncorrelated.
These assumptions are also called the Gauss-Markov conditions.
The above assumptions can be stated in terms of the Y's: E(Y_i) = β0 + β1X_i, Var(Y_i) = σ², and the Y_i's are uncorrelated.

8 Gauss-Markov Theorem
The least-squares estimates are BLUE (Best Linear Unbiased Estimators). The least-squares estimates are linear in the y's: each of b0 and b1 can be written as a linear combination Σ c_i y_i whose coefficients c_i depend only on the x's, as shown below. Of all the possible linear, unbiased estimators of β0 and β1, the least-squares estimates have the smallest variance.
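
One standard way to exhibit this linearity (not transcribed from the slide), in LaTeX:

    b_1 = \sum_{i=1}^{n} k_i y_i, \quad \text{where } k_i = \frac{x_i - \bar{x}}{\sum_{j=1}^{n} (x_j - \bar{x})^2},
    \qquad
    b_0 = \sum_{i=1}^{n} \left( \tfrac{1}{n} - \bar{x} k_i \right) y_i.

The weights k_i depend only on the x's, so both estimators are linear combinations of the y_i's.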

9 Properties of Least Squares Estimates
An estimate of β0 or β1 is a function of the data that can be calculated numerically for a given data set. An estimator of β0 or β1 is a function of the underlying random variables.
Recall: the least-squares estimators are b1 = Σ (x_i - x̄)(Y_i - Ȳ) / Σ (x_i - x̄)² and b0 = Ȳ - b1x̄.
Claim: the least-squares estimators are unbiased estimators of β0 and β1. Proof: sketched below.
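
A sketch of the unbiasedness argument in LaTeX, using the weights k_i = (x_i - \bar{x}) / \sum_j (x_j - \bar{x})^2 from the previous slide, which satisfy \sum_i k_i = 0 and \sum_i k_i x_i = 1:

    E(b_1) = \sum_{i=1}^{n} k_i \, E(Y_i) = \sum_{i=1}^{n} k_i (\beta_0 + \beta_1 x_i)
           = \beta_0 \sum_i k_i + \beta_1 \sum_i k_i x_i = \beta_1,

    E(b_0) = E(\bar{Y} - b_1 \bar{x}) = (\beta_0 + \beta_1 \bar{x}) - \beta_1 \bar{x} = \beta_0.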

10 Estimation of the Error Term Variance σ²
The variance σ² of the error terms ε_i needs to be estimated to obtain an indication of the variability of the probability distribution of Y. Further, a variety of inferences concerning the regression function and the prediction of Y require an estimate of σ².
Recall: for a random variable Z, the estimates of the mean and variance of Z based on n realizations of Z are Z̄ = (1/n) Σ Z_i and s² = Σ (Z_i - Z̄)² / (n - 1).
Similarly, the estimate of σ² is s² = Σ e_i² / (n - 2) = RSS / (n - 2); the divisor is n - 2 because two parameters were estimated. s² is called the MSE (Mean Square Error); it is an unbiased estimator of σ² (proof later on).
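
A minimal Python sketch of this estimate (made-up data; np.polyfit is used here as a shortcut for the least-squares fit):

    import numpy as np

    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    y = np.array([2.1, 2.9, 3.8, 5.2, 5.9])
    n = x.size

    b1, b0 = np.polyfit(x, y, 1)          # least-squares slope and intercept
    resid = y - (b0 + b1 * x)
    mse = np.sum(resid ** 2) / (n - 2)    # divide by n - 2: two parameters were estimated
    sigma_hat = np.sqrt(mse)              # estimate of the error standard deviation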

11 Normal Error Regression Model
In order to make inferences we need one more assumption about the ε_i's. We assume that the ε_i's have a Normal distribution, that is, ε_i ~ N(0, σ²). The Normality assumption implies that the errors ε_i are independent (since they are uncorrelated).
Under the Normality assumption on the errors, the least-squares estimates of β0 and β1 are equivalent to their maximum likelihood estimators. This brings the additional nice properties of MLEs: they are consistent, sufficient, and MVUE (minimum variance unbiased estimators).
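
A sketch of why the estimators coincide (the standard argument, in LaTeX): under normal errors the log-likelihood is

    \ell(\beta_0, \beta_1, \sigma^2)
      = -\frac{n}{2} \log(2\pi\sigma^2)
        - \frac{1}{2\sigma^2} \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2,

and for any fixed \sigma^2, maximizing \ell over (\beta_0, \beta_1) amounts to minimizing \sum_i (y_i - \beta_0 - \beta_1 x_i)^2, which is exactly the least-squares criterion.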

