Regression Review. Evaluation Research (8521), Prof. Jesse Lecy, Lecture 1.


Regression Review: Regression Error and the Standard Error of the Regressors

What should be clear in my mind?
- Variance and standard deviation.
- What a sampling distribution is.
- The difference between the standard deviation and the standard error.
- The role the standard error plays in the confidence interval.
- The definitions of covariance and correlation.
- The “intuitive” regression formula.

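The Road Map: Variance → St. Dev → St. Error → Confidence Intervals
Mean: variance and st. dev (of x), then st. error and confidence interval (of the mean).
Slope: variance and st. dev (of the residual), then st. error and confidence interval (of the slope).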

VARIANCE

[Figure: Kevin and Larry. Who travels the furthest?]

STD. DEVIATION VS. STD. ERROR

What is a standard deviation?

You first need a reference point in order to calculate average distance. You can’t ask “how far have I traveled if I am in Chicago?” without knowing where you started from. Use the mean as the starting point for each distance.

Variance is a measure of average dispersion, so you add up all the distances. The problem is that the distances from the mean always sum to zero. As a result, you must square them first. Divide by N so you have an average squared distance from the mean. (It is actually N-1, for reasons we will not discuss.) This is the variance.

We want to reconcile the units so that the number is meaningful. We squared everything, so we take the square root. This is the standard deviation. Intuitively, it is the “average” amount each point must travel to reach the mean.
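The steps for variance and standard deviation can be sketched in a few lines of Python. This is a minimal illustration, not material from the lecture: the `distances` values are hypothetical and chosen only so the arithmetic is easy to follow.

```python
import math

# Hypothetical distances (illustrative only, not from the lecture).
distances = [2.0, 4.0, 4.0, 6.0, 9.0]

n = len(distances)
mean = sum(distances) / n  # the reference point for every distance

# Deviations from the mean always sum to zero, so they cannot measure spread.
deviations = [x - mean for x in distances]
sum_of_deviations = sum(deviations)

# Square the deviations first, then average them using N - 1.
variance = sum(d ** 2 for d in deviations) / (n - 1)

# Take the square root to get back to the original units.
std_dev = math.sqrt(variance)

print(f"mean={mean}, sum of deviations={sum_of_deviations}")
print(f"variance={variance}, standard deviation={std_dev:.3f}")
```

The printed sum of deviations shows why squaring is needed before averaging.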

What is the standard error? In this case the “error” means how far, on average, a sample mean falls from the true mean. Note! This is the standard error of the mean.
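As a rough sketch, the standard error of the mean is the sample standard deviation divided by the square root of the sample size. The `sample` values below are hypothetical, and the function name `standard_error_of_mean` is only an illustrative label.

```python
import math
import statistics

def standard_error_of_mean(sample):
    """Sample standard deviation (N - 1 version) divided by sqrt(N)."""
    return statistics.stdev(sample) / math.sqrt(len(sample))

# Hypothetical sample (illustrative only).
sample = [2.0, 4.0, 4.0, 6.0, 9.0]

std_dev = statistics.stdev(sample)         # spread of individual points
se_mean = standard_error_of_mean(sample)   # uncertainty about the mean itself

print(f"standard deviation={std_dev:.3f}, standard error of the mean={se_mean:.3f}")
```

The standard deviation describes how far individual points sit from the mean, while the standard error describes how far the sample mean is likely to sit from the true mean, so it shrinks as the sample grows.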

What is the standard error? This is the standard error of the slope.
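The slide's formula did not survive in the transcript, so the sketch below uses the textbook expression for a simple (one-regressor) OLS model: the standard error of the slope is the square root of the residual variance divided by the sum of squared deviations of x. The class-size and test-score numbers are hypothetical.

```python
import math

# Hypothetical data (illustrative only): class size and average test score.
x = [15.0, 20.0, 25.0, 30.0, 35.0]
y = [88.0, 84.0, 79.0, 77.0, 70.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# OLS slope and intercept for a one-regressor model.
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# Residual variance uses N - 2 because two coefficients were estimated.
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
residual_variance = sum(e ** 2 for e in residuals) / (n - 2)

# Standard error of the slope.
se_slope = math.sqrt(residual_variance / sxx)

print(f"slope={b1:.3f}, standard error of the slope={se_slope:.4f}")
```

Note that the road map applies here with the residual playing the role that x played for the mean: its variance feeds the standard error.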

CONFIDENCE INTERVALS

Recall that a distribution can reflect the observed characteristics of a population, but it can also reflect the sampling variation of an estimate of a parameter.

How sure are we about the mean? (CI of the mean)

If we were sure of ourselves we wouldn’t need a margin of error! Because we only have a sample, though, we can’t be certain. The population parameters are never known, so we use t-statistics and the formula for the sample standard error.
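A confidence interval for the mean can be sketched as the sample mean plus or minus a t critical value times the standard error. The scores below are hypothetical, and the critical value 2.262 is taken from a standard t table (two-sided alpha = 0.05, 9 degrees of freedom) rather than computed, to keep the example free of third-party libraries.

```python
import math
import statistics

# Hypothetical sample of 10 test scores (illustrative only).
scores = [72.0, 75.0, 78.0, 80.0, 81.0, 83.0, 85.0, 86.0, 90.0, 90.0]

n = len(scores)
mean = statistics.mean(scores)
se_mean = statistics.stdev(scores) / math.sqrt(n)

# t critical value for a 95% interval with n - 1 = 9 degrees of freedom,
# copied from a t table (an assumption of this sketch, not computed here).
T_CRITICAL = 2.262

margin_of_error = T_CRITICAL * se_mean
ci_lower = mean - margin_of_error
ci_upper = mean + margin_of_error

print(f"mean={mean:.1f}, 95% CI=({ci_lower:.2f}, {ci_upper:.2f})")
```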

How often are we wrong? Choose an alpha level, which determines the size of the confidence interval. This example uses alpha = 0.05, so we would expect five samples in one hundred to produce confidence intervals that do not contain the true mean μ. We see 3 in 50 draws here, which is consistent with expectations.
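The claim about how often intervals miss the true mean can be checked with a small simulation. This is a sketch under stated assumptions: the population is taken to be normal with a made-up mean and standard deviation, the seed is arbitrary, and the critical value 2.045 comes from a t table (two-sided alpha = 0.05, 29 degrees of freedom).

```python
import math
import random
import statistics

random.seed(8521)  # arbitrary seed so the run is repeatable

# Assumed population and design (all values hypothetical).
TRUE_MEAN = 50.0
TRUE_SD = 10.0
SAMPLE_SIZE = 30
T_CRITICAL = 2.045  # t table value for alpha = 0.05, df = 29
N_DRAWS = 2000

misses = 0
for _ in range(N_DRAWS):
    sample = [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(SAMPLE_SIZE)]
    mean = statistics.mean(sample)
    se_mean = statistics.stdev(sample) / math.sqrt(SAMPLE_SIZE)
    lower = mean - T_CRITICAL * se_mean
    upper = mean + T_CRITICAL * se_mean
    # Count the intervals that fail to contain the true mean.
    if not (lower <= TRUE_MEAN <= upper):
        misses += 1

miss_rate = misses / N_DRAWS
print(f"{misses} of {N_DRAWS} intervals missed the true mean ({miss_rate:.3f})")
```

With many draws the miss rate should settle near alpha; with only 50 draws, as on the slide, a count such as 3 is well within ordinary sampling noise.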

COVARIANCE

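What is Covariance?
Example: classroom size and test scores.
Quadrants, labeled by the signs of (X − X̄, Y − Ȳ): (+,+), (+,−), (−,−), (−,+).
Table columns: X, Y, X − X̄, Y − Ȳ, (X − X̄)(Y − Ȳ).
Each row's product takes its sign from the two deviations: (−)(+) = (−), (−)(−) = (+), (+)(−) = (−).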
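The covariance table can be sketched in Python as follows. The class sizes and scores are hypothetical stand-ins, since the values on the slide are not recoverable; the calculation follows the table's columns, with the products averaged using N − 1 as for the sample variance.

```python
# Hypothetical data (illustrative only): class size (X) and test score (Y).
x = [15.0, 20.0, 25.0, 30.0, 35.0]
y = [88.0, 84.0, 79.0, 77.0, 70.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# One row per observation: X - Xbar, Y - Ybar, and their product.
rows = [(xi - x_bar, yi - y_bar, (xi - x_bar) * (yi - y_bar))
        for xi, yi in zip(x, y)]

for dx, dy, product in rows:
    print(f"X-Xbar={dx:+6.1f}  Y-Ybar={dy:+5.1f}  product={product:+7.1f}")

# Sample covariance: sum of the products divided by N - 1.
covariance = sum(product for _, _, product in rows) / (n - 1)
print(f"covariance={covariance:.2f}")
```

Points in the (+,−) and (−,+) quadrants contribute negative products, so data that falls mostly in those quadrants yields a negative covariance.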

Negative Covariance Example: class size and student performance. High negative correlation, large negative b1 in a regression. [Figure: test scores plotted against class size.]

Positive Covariance Example: hours of study and test scores. High positive correlation, large positive b1 in a regression.

Small Covariance: low correlation, small b1 in a regression.

What is Correlation? The correlation is the covariance in standardized units, that is, the covariance divided by the product of the two standard deviations: -1 ≤ cor(x,y) ≤ +1
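A minimal sketch of the relationship between covariance and correlation, using hypothetical data. The helper names `covariance` and `correlation` are illustrative. Rescaling x changes the covariance but leaves the correlation untouched, which is the sense in which correlation is unit-free.

```python
import math

def covariance(x, y):
    """Sample covariance (N - 1 in the denominator)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)

def correlation(x, y):
    """Covariance divided by the product of the two standard deviations."""
    sd_x = math.sqrt(covariance(x, x))  # cov(x, x) is the variance of x
    sd_y = math.sqrt(covariance(y, y))
    return covariance(x, y) / (sd_x * sd_y)

# Hypothetical data (illustrative only).
class_size = [15.0, 20.0, 25.0, 30.0, 35.0]
test_score = [88.0, 84.0, 79.0, 77.0, 70.0]

# The same class sizes expressed in different units (multiplied by 10).
class_size_rescaled = [10.0 * v for v in class_size]

r = correlation(class_size, test_score)
r_rescaled = correlation(class_size_rescaled, test_score)

print(f"cov={covariance(class_size, test_score):.2f}, cor={r:.4f}")
print(f"rescaled cov={covariance(class_size_rescaled, test_score):.2f}, cor={r_rescaled:.4f}")
```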

THE REGRESSION SLOPE

Intuitive Regression Formula: b1 = Cov(x,y) / Var(x). We want to interpret the slope causally, meaning the outcome is going to change by b1 units because of a one-unit change in X.
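The slope formula can be sketched directly: compute the covariance of x and y, divide by the variance of x, and read the result as the expected change in y per one-unit change in x. The data are hypothetical, and the intercept line uses the standard OLS result that the fitted line passes through the point of means.

```python
# Hypothetical data (illustrative only): class size (x) and test score (y).
x = [15.0, 20.0, 25.0, 30.0, 35.0]
y = [88.0, 84.0, 79.0, 77.0, 70.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

cov_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)
var_x = sum((xi - x_bar) ** 2 for xi in x) / (n - 1)

# Intuitive regression formula: slope = Cov(x, y) / Var(x).
b1 = cov_xy / var_x

# The fitted line passes through (x_bar, y_bar), which gives the intercept.
b0 = y_bar - b1 * x_bar

# Predicted change in y for a one-unit increase in x equals the slope.
change_per_unit = (b0 + b1 * 21.0) - (b0 + b1 * 20.0)

print(f"slope b1={b1:.2f}, intercept b0={b0:.1f}")
```

Whether that change can be read causally depends on the research design, not on the arithmetic.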

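The Road Map: Variance → St. Dev → St. Error → Confidence Intervals
Mean: variance and st. dev (of x), then st. error and confidence interval (of the mean).
Slope: variance and st. dev (of the residual), then st. error and confidence interval (of the slope).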

Exercise

Which model is the “right” one? To answer this we need to understand both slopes and standard errors.

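[Figure: five candidate models, three relating x, y, and z and two relating x and y only.]

Example #1 [Figure: model relating x, y, and z.]

Example #2 [Figure: model relating x, y, and z.]

Example #3 [Figure: model relating x, y, and z.]

[Figure: two models relating x and y only.]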