Stat 112: Notes 2 This class: Start Section 3.3. Thursday’s class: Finish Section 3.3. I will post the first homework on the web site tonight. It will be due next Thursday.
Father and Son’s Heights Francis Galton was interested in the relationship between –Y = son’s height –X = father’s height. Galton surveyed 952 father-son pairs in 19th-century England. The data are in Galton.JMP.
Simple Linear Regression Model
Sample vs. Population We can view the data, the observed (X, Y) pairs, as a sample from a population. Our goal is to learn about the relationship between X and Y in the population: –We don’t care about how fathers’ heights and sons’ heights are related in the particular 952 pairs sampled, but about how they are related among all fathers and sons. –From Notes 1, we don’t care about the relationship between tracks counted and the density of deer for the particular sample, but about the relationship in the population of all tracks; this enables us to predict the density of deer in the future from the number of tracks counted.
Simple Linear Regression Model
Checking the Assumptions
Residual Plot
Checking Linearity Assumption
Violation of Linearity
Checking Constant Variance
Checking Normality
Inferences
Sampling Distribution of b0, b1 Utopia.JMP contains simulations of pairs (x, y) and (x, y*) from a single simple linear regression model. Notice the difference in the estimated coefficients from the y’s and y*’s. The sampling distribution of b0, b1 describes the probability distribution of the estimates over repeated samples from the simple linear regression model with the model’s parameters held fixed.
Utopia
Linear Fit of y on x. Parameter Estimates (columns: Term, Estimate, Std Error, t Ratio, Prob>|t|): Intercept, Prob>|t| <.0001; x, Prob>|t| <.0001.
Linear Fit of y* on x. Parameter Estimates (same columns): Intercept; x, Prob>|t| <.0001.
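The idea of a sampling distribution can be made concrete with a small simulation. The sketch below is not the Utopia.JMP data: the parameter values, the x grid, and the names `ols_fit` and `simulate_estimates` are all hypothetical choices made for illustration. It draws many samples of y at the same fixed x values, refits the least squares line each time, and summarizes how b0 and b1 vary from sample to sample.

```python
import numpy as np

def ols_fit(x, y):
    """Return the least squares estimates (b0, b1) for y = b0 + b1 * x."""
    x_bar, y_bar = x.mean(), y.mean()
    b1 = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
    b0 = y_bar - b1 * x_bar
    return b0, b1

def simulate_estimates(x, beta0, beta1, sigma, n_reps, rng):
    """Draw n_reps samples of y at the fixed x values and refit each one."""
    estimates = np.empty((n_reps, 2))
    for r in range(n_reps):
        errors = rng.normal(0.0, sigma, size=x.size)
        y = beta0 + beta1 * x + errors
        estimates[r] = ols_fit(x, y)
    return estimates

# Hypothetical population values (assumptions, not taken from Utopia.JMP).
BETA0, BETA1, SIGMA = 1.0, 2.0, 1.0

rng = np.random.default_rng(112)
x = np.linspace(0.0, 10.0, 40)  # x values stay fixed across samples
estimates = simulate_estimates(x, BETA0, BETA1, SIGMA, n_reps=5000, rng=rng)

print("mean of (b0, b1) over samples:", estimates.mean(axis=0))
print("sd of (b0, b1) over samples:  ", estimates.std(axis=0, ddof=1))
```

Each row of `estimates` plays the role of one fit such as the y fit or the y* fit: same model, different random errors, so different estimated coefficients.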
Sampling distributions Sampling distribution of b0: –The sampling distribution is normal. Sampling distribution of b1: –The sampling distribution is normal. –Even if the normality assumption fails and the errors e are not normal, the sampling distributions of b0 and b1 are still approximately normal if n > 30.
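The claim about non-normal errors can be checked informally by simulation. The sketch below assumes strongly skewed errors (an exponential distribution shifted to have mean 0), a sample size of n = 50, and hypothetical parameter values; none of these come from the notes. It compares the skewness of the errors with the skewness of the simulated slope estimates, where a value near 0 is what a normal distribution would give.

```python
import numpy as np

def slope(x, y):
    """Least squares slope b1 for the line y = b0 + b1 * x."""
    xc = x - x.mean()
    return np.sum(xc * (y - y.mean())) / np.sum(xc ** 2)

def skewness(v):
    """Sample skewness; 0 for a symmetric distribution such as the normal."""
    z = (v - v.mean()) / v.std()
    return float(np.mean(z ** 3))

# Hypothetical settings chosen for illustration only.
BETA0, BETA1 = 1.0, 2.0
N, N_REPS = 50, 5000

rng = np.random.default_rng(112)
x = rng.uniform(0.0, 10.0, size=N)  # drawn once, then held fixed

b1_draws = np.empty(N_REPS)
for r in range(N_REPS):
    # Exponential(1) minus 1 has mean 0 but is strongly right-skewed.
    errors = rng.exponential(1.0, size=N) - 1.0
    b1_draws[r] = slope(x, BETA0 + BETA1 * x + errors)

error_skew = skewness(rng.exponential(1.0, size=100_000) - 1.0)
slope_skew = skewness(b1_draws)

print("skewness of the errors:      ", round(error_skew, 2))
print("skewness of the b1 estimates:", round(slope_skew, 2))
```

A histogram of `b1_draws` would be the natural next look; skewness is used here only because it gives a single number to compare.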
Properties of b0 and b1 as estimators of β0 and β1 Unbiased Estimators: The mean of the sampling distribution is equal to the population parameter being estimated. Consistent Estimators: As the sample size n increases, the probability that the estimator is within any distance you specify of the true parameter converges to 1. Minimum Variance Estimator: The variance of the estimator is smaller than the variance of any other linear unbiased estimator of the same parameter, say β1.
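Consistency can be illustrated with a short simulation. This is a sketch under assumed values: the parameters, the tolerance of 0.05, the sample sizes, and the helper `slope_draws` are hypothetical and not part of the notes. For each sample size it estimates the probability that b1 lands within the tolerance of the true slope; consistency says this probability should climb toward 1 as n grows.

```python
import numpy as np

def slope_draws(n, n_reps, beta0, beta1, sigma, rng):
    """Simulate n_reps least squares slopes from samples of size n."""
    x = np.linspace(0.0, 10.0, n)          # fixed design for this n
    xc = x - x.mean()
    errors = rng.normal(0.0, sigma, size=(n_reps, n))
    y = beta0 + beta1 * x + errors         # one simulated sample per row
    y_centered = y - y.mean(axis=1, keepdims=True)
    return y_centered @ xc / np.sum(xc ** 2)

# Hypothetical population values and tolerance (assumptions for illustration).
BETA0, BETA1, SIGMA = 1.0, 2.0, 1.0
TOLERANCE = 0.05

rng = np.random.default_rng(112)
coverage = {}
for n in (20, 80, 320):
    b1 = slope_draws(n, 4000, BETA0, BETA1, SIGMA, rng)
    # Share of simulated estimates within TOLERANCE of the true slope.
    coverage[n] = float(np.mean(np.abs(b1 - BETA1) < TOLERANCE))

for n, share in coverage.items():
    print(f"n = {n:3d}: share of b1 within {TOLERANCE} of beta1 = {share:.3f}")
```

Shrinking `TOLERANCE` lowers every share, but the pattern of shares rising with n should remain, which is the "as close as you specify" part of the definition.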