

Lecture 15 – Tues., Oct. 28
Review example of one-way layout
Simple Linear Regression:
–Simple Linear Regression Model, 7.2
–Least Squares Regression Estimation
–Causation
Next time: Inference for simple linear regression, 7.3.3, 7.3.5, 7.4.

Review of One-way Layout
Assumptions of the ideal model:
–All populations have the same standard deviation σ.
–Each population is normal.
–Observations are independent.
Planned comparisons: usual t-test, but use all groups to estimate σ. If there are many planned comparisons, use Bonferroni to adjust for multiple comparisons.
Test of H0: μ1 = μ2 = … = μI vs. the alternative that at least two means differ: one-way ANOVA F-test.
Unplanned comparisons: use the Tukey-Kramer procedure to adjust for multiple comparisons.
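The one-way ANOVA F-statistic can be sketched in a few lines of Python. The group scores below are made up for illustration; the lecture's actual analysis is done in JMP.

```python
# Hypothetical scores for three groups (illustrative values only).
groups = [[24, 27, 22, 25, 26], [30, 31, 28, 33, 29], [23, 25, 24, 26, 22]]

n_total = sum(len(g) for g in groups)
grand_mean = sum(sum(g) for g in groups) / n_total

# Between-group sum of squares, df = (number of groups - 1)
ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
# Within-group sum of squares, df = (n_total - number of groups)
ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)

# F = between-group mean square / within-group mean square;
# a large F is evidence against H0: all group means are equal.
f_stat = (ss_between / (len(groups) - 1)) / (ss_within / (n_total - len(groups)))
print(round(f_stat, 2))
```

Comparing f_stat to an F distribution with (2, 12) degrees of freedom would give the p-value; JMP reports this in the one-way ANOVA table.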

Review Example A developmental psychologist is interested in the extent to which children’s memory for facts improves as children get older. Ten children of ages 4, 6, 8 and 10 are randomly selected to participate in the study. Each child is given a 30 item memory test; the scores are recorded in memorytest.JMP.

Regression for memorytest
Let Y = score, X = age. Each age is a subpopulation. The regression of Y on X is the mean of Y as a function of the subpopulation X, denoted by μ{Y|X}.
Simple linear regression model: μ{Y|X} = β0 + β1X.
β1 = slope = change in mean number of items remembered for each additional year of age.
β0 = intercept = mean number of items remembered at age 0.
Least squares estimates: β̂0 = 4.7, β̂1 = 1.96.

Regression – General Setup
General setup: we have data (y_i, x_i), i = 1, …, n. [Later we will look at the setting where we have multiple x's.] Y is called the response variable; X is called the explanatory variable.
Regression: the mean of Y given X = x, denoted μ{Y|X=x}.
Regression model: an ideal formula to approximate the regression.
Simple linear regression model: μ{Y|X} = β0 + β1X.

Uses of Regression Analysis
Description: describe the association between Y and X. E.g., case study 7.1.1: what is the relationship between the distance from Earth (Y) and the recession velocity of extragalactic nebulae (X)? The relationship can be used to estimate the age of the universe using the theory of the big bang.
Passive prediction: predict y based on x where you do not plan to manipulate x, e.g., predict today's stock price based on yesterday's stock price.
Control: predict what y will be if you change x, e.g., predict what your earnings will be if you obtain different levels of education.

Example (Problem 30)
Studies over the past two decades have shown that activity can affect the reorganization of the human central nervous system. Psychologists used magnetic source imaging (MSI) to measure neuronal activity in the brains of nine string players and six controls when the thumb and fifth finger of the left hand were exposed to mild stimulation. Research hypothesis: string players, who use the fingers of the left hand extensively, should show different brain behavior (in particular, more neuronal activity).

Example Continued
Two-sided t-test: p-value = , CI = (7.51, 18.92); strong evidence that string players have higher neuron activity than controls.
More interesting question: how much does the neuron activity index increase per extra year of playing the instrument?
Y = neuron activity index, X = years playing.
Simple linear regression model: μ{Y|X} = β0 + β1X. What is the interpretation of β0 and β1 here?

Ideal Model
Assumptions of the ideal simple linear regression model:
–There is a normally distributed subpopulation of responses for each value of the explanatory variable.
–The means of the subpopulations fall on a straight-line function of the explanatory variable.
–The subpopulation standard deviations are all equal (to σ).
–The selection of an observation from any of the subpopulations is independent of the selection of any other observation.
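A minimal simulation of these four assumptions. The values 4.7 and 1.96 echo the memorytest fit; σ = 2 and ten observations per age are assumptions for illustration only.

```python
import random

random.seed(1)
beta0, beta1, sigma = 4.7, 1.96, 2.0  # illustrative parameter values

# One normal subpopulation of responses per value of X: each mean falls on
# the line beta0 + beta1 * x, all subpopulations share sd sigma, and the
# draws are independent.
ages = [4, 6, 8, 10] * 10
scores = [random.gauss(beta0 + beta1 * x, sigma) for x in ages]
print(len(scores))
```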

Estimating the Coefficients
We want to make the predictions of Y based on X as good as possible; the best prediction of Y based on X is the estimated mean μ̂{Y|X}.
Least squares method: choose the estimates β̂0, β̂1 to minimize the sum of squared prediction errors (residuals); the solution is on page 182.
Fitted value for observation i is its estimated mean: ŷ_i = β̂0 + β̂1·x_i.
Residual for observation i is the prediction error of using X to predict Y: res_i = y_i − ŷ_i.
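A sketch of the closed-form least squares solution (the one referenced on page 182), using made-up (x, y) pairs:

```python
# Hypothetical data; any paired (x, y) sample works here.
xs = [4, 6, 8, 10]
ys = [15, 18, 24, 25]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Least squares: slope = Sxy / Sxx, intercept = y_bar - slope * x_bar
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
sxx = sum((x - x_bar) ** 2 for x in xs)
beta1_hat = sxy / sxx
beta0_hat = y_bar - beta1_hat * x_bar

# Fitted values and residuals; least squares forces residuals to sum to 0.
fitted = [beta0_hat + beta1_hat * x for x in xs]
residuals = [y - f for y, f in zip(ys, fitted)]
print(beta0_hat, beta1_hat)
```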

Regression Analysis in JMP
Use Analyze, Fit Y by X. Put the response variable in Y and the explanatory variable in X (make sure X is continuous). Click Fit Line under the red triangle next to Bivariate Fit of Y by X.

JMP output for example

The standard deviation σ is the standard deviation in each subpopulation; σ measures the accuracy of predictions from the regression. If the simple linear regression model holds, then approximately
–68% of the observations will fall within σ of the regression line
–95% of the observations will fall within 2σ of the regression line

Estimating σ
Residuals provide the basis for an estimate of σ: σ̂ = sqrt( Σ res_i² / (n − 2) ).
Degrees of freedom for simple linear regression = n − 2.
If the simple linear regression model holds, then approximately
–68% of the observations will fall within σ̂ of the least squares line
–95% of the observations will fall within 2σ̂ of the least squares line
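As a sketch, σ̂ divides the sum of squared residuals by n − 2; the residual values below are hypothetical.

```python
import math

# Hypothetical residuals from a simple linear regression fit on n = 4 points.
residuals = [-0.1, -0.7, 1.7, -0.9]
n = len(residuals)

# sigma-hat = sqrt(SS_res / (n - 2)); two degrees of freedom are lost to the
# estimated intercept and slope. JMP labels this "Root Mean Square Error".
sigma_hat = math.sqrt(sum(r ** 2 for r in residuals) / (n - 2))
print(round(sigma_hat, 3))
```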

JMP Commands
σ̂ is found under Summary of Fit and is labeled "Root Mean Square Error". To look at a plot of residuals versus X, click Plot Residuals under the red triangle next to Linear Fit after fitting the line. To save the residuals or fitted values (predicted values), click Save Residuals or Save Predicteds under the red triangle next to Linear Fit after fitting the line.

Interpolation and Extrapolation
The simple linear regression model makes it possible to draw inferences about the mean response μ{Y|X} at any value of X.
Interpolation: drawing inference about the mean response for X within the range of observed X; a strong advantage of the regression model is its ability to interpolate.
Extrapolation: drawing inference about the mean response for X outside the range of observed X; dangerous. The straight-line model may hold approximately over the region of observed X but not for all X.

Extrapolation in Memory Test
Y = score on a test of 30 items, X = age.
Least squares estimates: β̂0 = 4.7, β̂1 = 1.96.
Predicted mean of Y at age 0: 4.7
Predicted mean of Y at age 20: 43.9
Predicted mean of Y at age 90: 181.1
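These predicted means follow directly from the fitted line; the slope 1.96 is implied by the slide's own numbers, (43.9 − 4.7) / 20.

```python
beta0_hat, beta1_hat = 4.7, 1.96  # memorytest least squares estimates

def predicted_mean(age):
    """Estimated mean score at a given age, read off the fitted line."""
    return beta0_hat + beta1_hat * age

for age in (0, 20, 90):
    print(age, round(predicted_mean(age), 1))

# Ages 4-10 were observed, so age 90 is a severe extrapolation: 181.1
# "items remembered" is impossible on a 30-item test.
```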

Difficulties of extrapolation
Mark Twain: “In the space of one hundred and seventy-six years, the Lower Mississippi has shortened itself two hundred and forty-two miles. That is an average of a trifle over one mile and a third per year. Therefore, any calm person, who is not blind or idiotic, can see that in the old Oolitic Silurian period, just a million years ago next November, the Lower Mississippi River was upwards of one million three hundred thousand miles long, and stuck out over the Gulf of Mexico like a fishing-rod. And by the same token any person can see that seven hundred and forty-two years from now the Lower Mississippi will be only a mile and three-quarters long, and Cairo and New Orleans will have joined their streets together and be plodding comfortably along under a single mayor and a mutual board of aldermen. There is something fascinating about science. One gets such wholesale returns of conjecture out of such a trifling investment of fact.”

Cause and Effect?
The regression summarizes the association between the mean response of Y and the value of the explanatory variable X. No cause-and-effect relationship can be inferred unless X is randomly assigned to units in a random experiment.
Example: a researcher measures the number of television sets per person (X) and the average life expectancy (Y) for the world's nations. The regression line has a positive slope: nations with many TV sets have higher life expectancies. Could we lengthen the lives of people in Rwanda by shipping them TV sets?

Brain Activity in String Players
Y = neuron activity index, X = years playing a string instrument.
Least squares estimates:
Is this a randomized experiment? What is an alternative explanation for the association between Y and X, other than that playing string instruments causes an increase in the neuron activity index?