Presentation transcript:

Regression: Correlation Background. The correlation coefficient R describes the linear relationship between two variables X and Y. R ranges from -1 (perfect negative correlation) through 0 (no correlation) to +1 (perfect positive correlation). Example: R = .689

Regression: Correlation Background. R² indicates the reduction in error when X is known and used to predict Y. R² ranges from 0 (no reduction in error) to 1 (complete reduction in error). Example: R² = .474

Regression Examples. Predicting G.P.A. from height: R² = 0 (knowing height does not help predict G.P.A.; the best guess is always the mean G.P.A.). Predicting height in inches from height in cm: R² = 1 (knowing height in cm completely predicts height in inches).

Regression: Real-world examples fall somewhere in between. Predicting weight from height: R² = .36 (knowing height somewhat helps predict weight).
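To make R and R² concrete, here is a minimal Python sketch. The function name pearson_r and all sample numbers are hypothetical illustrations, not data from the slides; only the cm-to-inches idea comes from the examples above.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation R between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: height in cm converts exactly to inches, so R = R² = 1.
heights_cm = [150.0, 160.0, 170.0, 180.0]
heights_in = [cm / 2.54 for cm in heights_cm]
r_perfect = pearson_r(heights_cm, heights_in)

# Hypothetical noisy data: a relationship "somewhere in between".
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r_noisy = pearson_r(xs, ys)
r_squared_noisy = r_noisy ** 2

print(round(r_perfect, 3), round(r_noisy, 3), round(r_squared_noisy, 3))
```

Squaring R gives the proportion of prediction error removed by knowing X, which is why the perfect conversion yields 1 and the noisy data yields a value between 0 and 1.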

Regression But how do we figure out HOW to make that prediction given one of the variables?

Regression: Need the background concept of SLOPE. How much does Y change for a given change in X? Every line pictured has R = 1; only the slope differs.

Regression: Every line pictured has R = -1 (the slopes are negative).

Regression: Need the background concept of INTERCEPT. What is Y when X = 0? Every line pictured has the same slope but a different intercept.

Regression: A unique line is defined by its slope and Y-intercept. Y = bX + a, where b = slope and a = Y-intercept.

Regression: Predicting depression from loneliness. Y = BDI depression score, X = loneliness score. Y = 2X + 2

Regression: Predicted vs. actual. R = 1, R² = 1: no error. It never happens like this in the real world.

Actual scores don’t fit on a line perfectly

Some possible solutions? Error is the sum of (Predicted Y - Actual Y)²
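The error measure can be sketched in a few lines of Python. The scores below are hypothetical (the slide's data points are not in the transcript), and the second candidate line, Y = X + 5, is invented purely for comparison with Y = 2X + 2.

```python
def sum_squared_error(xs, ys, slope, intercept):
    """Sum of (predicted Y - actual Y)² for the line Y = slope * X + intercept."""
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys))

# Hypothetical loneliness (X) and BDI depression (Y) scores.
loneliness = [1, 2, 3, 4, 5]
bdi = [5, 5, 9, 9, 12]

# Compare two candidate lines on the same data.
error_line_a = sum_squared_error(loneliness, bdi, slope=2, intercept=2)  # Y = 2X + 2
error_line_b = sum_squared_error(loneliness, bdi, slope=1, intercept=5)  # Y = X + 5

print(error_line_a, error_line_b)
```

Whichever candidate has the smaller sum fits these points better; squaring keeps positive and negative misses from cancelling out.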

Where is the line with smallest error? Least Squares Regression Line

Calc slope: $b = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sum (X - \bar{X})^2} = 2.13$ with this data

Where is the line with smallest error? Least Squares Regression Line. Calc Y-intercept: $a = \bar{Y} - b\bar{X} = 4$ with this data. So the least squares regression line is Y = 2.13X + 4
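The slope and intercept calculations can be sketched as follows. The function name least_squares_line and the sample scores are hypothetical; the slide's own ten data pairs are not in the transcript, so this will not reproduce 2.13 and 4.

```python
def least_squares_line(xs, ys):
    """Return (b, a) for the least squares line Y = bX + a."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # b = sum of cross-products / sum of squared X deviations.
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    b = numerator / denominator
    # a = mean of Y minus b times mean of X.
    a = mean_y - b * mean_x
    return b, a

# Hypothetical loneliness (X) and BDI depression (Y) scores.
loneliness = [1, 2, 3, 4, 5]
bdi = [5, 5, 9, 9, 12]

b, a = least_squares_line(loneliness, bdi)
predicted_for_6 = b * 6 + a  # prediction for a new loneliness score of 6

print(round(b, 2), round(a, 2), round(predicted_for_6, 2))
```

Because the intercept is defined as the mean of Y minus b times the mean of X, the fitted line always passes through the point (mean X, mean Y).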

How good is our prediction? Error is the sum of (Predicted Y - Actual Y)². Table columns: X score, Actual Y score, Predicted Y score, Squared error.

Can we standardize this for an average error? Yes: the standard error of the estimate. Like a standard deviation, it gives the typical prediction error per score. Standard error of the estimate = SQRT(SS residual / (N pairs - 2)). In this example: SQRT(44.9 / (10 - 2)) = SQRT(44.9 / 8) = 2.37
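A small Python sketch of the formula: the first call repeats the slide's arithmetic with SS residual = 44.9 and 10 pairs. The second part uses hypothetical actual and predicted scores to show where SS residual comes from. The function name is an assumption for illustration.

```python
import math

def standard_error_of_estimate(ss_residual, n_pairs):
    """SQRT(SS residual / (N pairs - 2))."""
    return math.sqrt(ss_residual / (n_pairs - 2))

# Values taken from the slide: SS residual = 44.9, N pairs = 10.
slide_value = standard_error_of_estimate(44.9, 10)

# Hypothetical scores: SS residual is the sum of squared prediction errors.
actual = [5, 5, 9, 9, 12]
predicted = [4.4, 6.2, 8.0, 9.8, 11.6]
ss_residual = sum((p - y) ** 2 for p, y in zip(predicted, actual))
hypothetical_value = standard_error_of_estimate(ss_residual, len(actual))

print(round(slide_value, 3), round(hypothetical_value, 3))
```

Dividing by N - 2 rather than N reflects the two quantities (slope and intercept) estimated from the data.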

Chi-square (χ²): Nonparametric statistical tests. Used for nominal data (categories), ordinal data (ordered categories), and non-normal interval/ratio data.

Goodness of fit χ²: Used with nominal data. Tests a DISTRIBUTION (not a mean): sees whether the observed data FIT an expected distribution. H0: the true frequency distribution is the expected one. H1: the true frequency distribution has some other form.

VEGAS BABY!!! Rolling dice at the Mirage. Lots of snake eyes coming up. Are the dice fixed? Test with goodness of fit: does our distribution FIT the expected distribution?

VEGAS BABY!!! Expected distribution for 120 rolls if fair: each face has a 1/6 chance, so 1/6 × 120 = 20 of each face. Expected distribution = [20, 20, 20, 20, 20, 20]

VEGAS BABY!!! Actual distribution for 120 rolls: [28, 16, 23, 23, 17, 13]. Are these dice fair? Use goodness of fit χ².

VEGAS BABY!!! Determine the critical χ² value: df = number of categories - 1 = 6 - 1 = 5. The critical χ² for df = 5 is read from the table.

Table columns: Cat, O_i, E_i, (O_i - E_i), (O_i - E_i)², (O_i - E_i)² / E_i; the last column is summed (Σ) to give the observed χ². Conclusion: FAIR!!!
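The goodness-of-fit calculation for the 120 rolls can be sketched in Python. The observed counts come from the slides. The critical value 11.07 is an assumption: it is the standard table value for df = 5 at α = .05, and the transcript states neither the value nor the α level.

```python
def chi_square_goodness_of_fit(observed, expected):
    """Sum of (O - E)² / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [28, 16, 23, 23, 17, 13]        # counts for faces 1..6 (from the slides)
total_rolls = sum(observed)
expected = [total_rolls / 6] * 6           # fair dice: 1/6 of the rolls per face

chi_square = chi_square_goodness_of_fit(observed, expected)
df = len(observed) - 1

# Assumption: alpha = .05, so the table value for df = 5 is 11.07.
CRITICAL_DF5_ALPHA_05 = 11.07
dice_look_fixed = chi_square > CRITICAL_DF5_ALPHA_05

print(round(chi_square, 2), df, dice_look_fixed)
```

An observed χ² below the critical value means the rolls are consistent with fair dice; only a value above it would justify rejecting H0.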

Table columns: Cat, O_i, E_i, (O_i - E_i), (O_i - E_i)², (O_i - E_i)² / E_i; the last column is summed (Σ) to give the observed χ². Conclusion: CHEAT!!!

Test of independence χ²: Used with nominal data. Tests whether DISTRIBUTION 1 depends on DISTRIBUTION 2. H0: Distribution 1 is independent of Distribution 2. H1: Distribution 1 is related to Distribution 2.

Example: Are men more likely to have supported the war in Iraq? 100 subjects (50 male, 50 female) were asked a yes-or-no question about supporting the war in Iraq. H0: gender does not affect the likelihood of supporting the war. H1: gender does affect the likelihood of supporting the war.

Determine critical value: df = (R - 1)(C - 1), where R = number of rows and C = number of columns. df = (Category 1 size - 1) × (Category 2 size - 1) = (2 - 1) × (2 - 1) = 1 × 1 = 1. Critical value from A-3 is 3.84

Set up Data: a 2 × 2 table with columns Males, Females, Total and rows Support war, Not support war, Total.

Set up Data (expected counts in parentheses)
                   Males       Females     Total
Support war        32 (26.5)   21 (26.5)   53
Not support war    18 (23.5)   29 (23.5)   47
Total              50          50          100

Calculate observed χ². Table columns: Category, Oi, Ei, (Oi - Ei), (Oi - Ei)², (Oi - Ei)² / Ei, with rows M/S, M/N, F/S, F/N and the sum (Σ) of the last column.

Test observed against critical: observed χ² = 4.86, critical χ² = 3.84. Observed exceeds critical, so we reject the idea that gender does not affect support of the war and conclude that gender DOES affect support of the war.
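The test of independence for the gender-by-support table can be sketched in Python. The observed counts and the critical value 3.84 come from the slides; the function name is hypothetical. Expected counts are computed as row total × column total / grand total, which gives the 26.5 and 23.5 shown in the data table.

```python
def chi_square_independence(table):
    """Return (chi-square, df) for a table of observed counts (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi_square = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi_square += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi_square, df

# Rows: support war, not support war. Columns: males, females.
observed = [[32, 21],
            [18, 29]]

chi_square, df = chi_square_independence(observed)
reject_null = chi_square > 3.84  # critical value for df = 1 from the slides

print(round(chi_square, 2), df, reject_null)
```

The same function works for larger tables, since df is derived from the number of rows and columns rather than hard-coded.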

McNemar test for significance of change: Used with nominal data. Tests whether DISTRIBUTION 1 depends on DISTRIBUTION 2. Same as the test of independence, but uses the SAME people to collect the nominal data before and after some event.

Example: Did support for the Pledge of Allegiance change after the terrorist attacks? 100 subjects were asked "Do you favor the Pledge of Allegiance?" before and after the attacks. H0: the proportion of individuals supporting the pledge before the attacks is the same as after the attacks. H1: the proportion of individuals supporting the pledge before the attacks is different after the attacks.

Determine critical Value Df = 1 for all McNemar tests Critical Value is 3.84

Set up Data: a 2 × 2 table with columns Before Attacks (Yes, No, Total) and rows After Attacks (Yes, No, Total).

Set up Data (expected counts for the two change cells in parentheses)
                      Before: Yes   Before: No   Total
After Attacks: Yes    33            20 (14.5)    53
After Attacks: No     9 (14.5)      38           47
Total                 42            58           100

Calculate observed χ². Table columns: Category, Oi, Ei, (Oi - Ei), (Oi - Ei)², (Oi - Ei)² / Ei, with the sum (Σ) of the last column.

Test observed against critical: observed χ² = (20 - 14.5)² / 14.5 + (9 - 14.5)² / 14.5 = 4.17, critical χ² = 3.84. Observed exceeds critical, so we reject the idea that the proportions are the same. Conclusion: the attacks did change the proportion who support the Pledge of Allegiance.
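A minimal Python sketch of the McNemar calculation, which uses only the two cells where a person's answer changed. The count 9 (Yes before, No after) appears in the data table. The count 20 (No before, Yes after) is an assumption inferred from the expected value: (x + 9) / 2 = 14.5 gives x = 20. The function name is hypothetical, and no continuity correction is applied, matching the slide's expected-count approach.

```python
def mcnemar_chi_square(changed_to_yes, changed_to_no):
    """Chi-square on the two change cells; each is expected to hold half the changers."""
    expected = (changed_to_yes + changed_to_no) / 2
    return ((changed_to_yes - expected) ** 2
            + (changed_to_no - expected) ** 2) / expected

no_to_yes = 20  # assumption: inferred from the expected count of 14.5
yes_to_no = 9   # from the data table

chi_square = mcnemar_chi_square(no_to_yes, yes_to_no)
reject_null = chi_square > 3.84  # df = 1 critical value from the slides

print(round(chi_square, 2), reject_null)
```

People who gave the same answer both times do not enter the statistic, because they carry no information about change.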