Welcome to Week 05 College Statistics

Two Variables So far we have used only one variable measured on each of our sample subjects What if we had more than one?

Two Variables [Table: Subject | Height | Weight. Subject 1 has height 5’8’’; the remaining cell values did not survive extraction.]

Two Variables With two variables measured on each subject, the measurements are “paired” – linked to each other

Two Variables If you calculate means or standard deviations for one variable and then for the other, the pairing is lost, and you lose information (a bad thing…)
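A small sketch of why the pairing matters. The heights and weights below are made-up illustration values, not the table from the slides, and `sum_of_products` is a hypothetical helper name. Scrambling which weight goes with which subject leaves the mean and standard deviation untouched, but a quantity that uses the pairs changes completely.

```python
from statistics import mean, stdev

# Hypothetical paired measurements (not from the slides):
# height in inches and weight in pounds for five subjects.
heights = [62, 65, 68, 71, 74]
weights = [120, 140, 155, 175, 195]

# The same five weights, but attached to the wrong subjects.
scrambled_weights = [155, 195, 120, 140, 175]

def sum_of_products(xs, ys):
    """Sum of (x - mean x) * (y - mean y) over the pairs; it depends on the pairing."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys))

# One-variable summaries cannot tell the two data sets apart.
print(mean(weights), mean(scrambled_weights))
print(stdev(weights), stdev(scrambled_weights))

# A summary that uses the pairs can.
print(sum_of_products(heights, weights))
print(sum_of_products(heights, scrambled_weights))
```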

Relationships Using the pairing, you can determine if there is a relationship between the two variables

Relationships The relationship can be seen by graphing the variables in a scatter graph Our brains are programmed to see visual relationships

Correlation There are all sorts of relationships, but we define a “correlation” to be a linear relationship

Correlation a linear relationship between two variables (x,y)

Francis Galton

Correlation Karl Pearson’s Correlation Coefficient: r

Correlation Correlation coefficient “r”: Measures the strength of the linear relationship between two variables (x,y)

Correlation Correlation coefficient “r”: Measured by the closeness of the data to a straight line of “best fit”
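The slides do not show the formula behind “r”, so the following is a minimal sketch of the standard Pearson calculation, using made-up data. The function name `pearson_r` and the sample values are assumptions for illustration only.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for paired lists xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How x and y vary together, and how each varies on its own.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical heights (in) and weights (lb), not the slide data.
heights = [62, 65, 68, 71, 74]
weights = [120, 140, 155, 175, 195]

r = pearson_r(heights, weights)
print(f"r = {r:.4f}")  # close to +1: the points lie near a rising line
```

The closer the points sit to a straight line, the closer r gets to +1 or -1.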

In real life, you practically never get all of your data points on the line of best fit How close is close enough?

Correlation Strong Weak relationship

Correlation So… a correlation is actually a measure of

Correlation So… a correlation is actually a measure of VARIABILITY!

Which has the strongest linear relationship? Correlation PROJECT QUESTION

Relationships Can be positive…

Relationships

Correlation Positive Correlation: as one variable increases, the other also increases

Correlation Negative correlation: as one variable increases, the other decreases

Correlation Zero correlation: no LINEAR relationship

Correlation Zero correlation: there may be a strong nonlinear relationship
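To illustrate this point, here is a sketch with made-up data where y is completely determined by x (y = x squared), yet the correlation coefficient comes out as zero because the relationship is not linear. The helper `pearson_r` is a hypothetical name for the standard Pearson formula.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for paired lists xs and ys."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# A perfect U-shaped (nonlinear) relationship: y = x squared.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x * x for x in xs]

r = pearson_r(xs, ys)
print(f"r = {r:.4f}")  # the falling left half cancels the rising right half
```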

Correlation Zero correlation: If it is a horizontal line

Positive, negative, or zero? Correlation PROJECT QUESTION

Correlation Correlation coefficients range from -1.0 to +1.0 r=-1.0 r=-0.9 r=-0.5 r=0.0 r=0.5 r=0.9 r=1.0

Correlation Correlation coefficients can’t be bigger than 1 or less than -1: -1 ≤ corr ≤ 1

Greeky Stuff The sample correlation coefficient is called “r” The population coefficient is called “ρ ” Pronounced “rho”

Which has the highest positive correlation? The highest negative correlation? Correlation PROJECT QUESTION

Correlation If you square the correlation coefficient (R² or RSQ), the result tells how well the (x,y) values are “fitted” by the line of best fit

Correlation R² tells you in % how well the trend line fits the data

Is it a good fit? Correlation PROJECT QUESTION

Correlation R² = 0.988 means 98.8% of the variability in the data is “explained” by the fit line
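One way to see what “explained variability” means is to compute it directly. The sketch below uses made-up data and assumes the usual least-squares line; it compares the leftover scatter around the line with the total scatter in y, and checks that the result matches the square of r.

```python
import math

# Hypothetical (x, y) data, not taken from the slides.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(xs)
mean_x, mean_y = sum(xs) / n, sum(ys) / n
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sxx = sum((x - mean_x) ** 2 for x in xs)
syy = sum((y - mean_y) ** 2 for y in ys)  # total variability in y

# Least-squares line of best fit and the trend values it predicts.
m = sxy / sxx
b = mean_y - m * mean_x
trend = [m * x + b for x in xs]

# Variability left over after fitting the line.
ss_res = sum((y - t) ** 2 for y, t in zip(ys, trend))

r_squared = 1 - ss_res / syy
r = sxy / math.sqrt(sxx * syy)

print(f"R^2 = {r_squared:.3f}, i.e. {r_squared:.1%} of the variability is explained")
print(f"r squared directly = {r ** 2:.3f}")
```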

Is it a good fit? Correlation PROJECT QUESTION

Is it a good fit? Correlation PROJECT QUESTION

Is it a good fit? Correlation PROJECT QUESTION

Is it a good fit? Correlation PROJECT QUESTION

Questions?

How to Lie with Statistics #5 Just because two variables have a high correlation coefficient doesn’t mean one causes the other

How to Lie with Statistics #5 They could both be caused by a third variable. Or… it could just be a coincidence

How to Lie with Statistics Sleep studies

How to Lie with Statistics How would you “prove” that getting a certain amount of sleep causes you to live longer?

Correlation vs Causation Causation is hard to prove. In research, causation can only be shown by a carefully controlled experiment

Correlation vs Causation So you can’t show causation by observation or by using a survey

Correlation vs Causation The research must be carefully designed so that the suspected cause can be the only cause of the result

Correlation vs Causation Just because a scientist speaks very persuasively about the results of an experiment does not mean he or she has proven causation

Correlation vs Causation You have to be skeptical!

Correlation vs Causation Spanish influenza video

Who “proved” causation? CAUSATION IN-CLASS PROBLEMS

Questions?

Causation If you know that one variable DOES cause the other, put the variable that is the “causal” or “explanatory” variable on the horizontal “x” axis

Causation The causal or explanatory or predictor variable is called the “independent” variable. Changing its value changes the value of the y variable

Causation The variable affected by the causal variable (the response variable) is called the “dependent” variable. Its value depends on the value of x

CAUSATION IN-CLASS PROBLEMS Which graph has the correct placement for the causative variable?

Questions?

Line of Best Fit The “line of best fit” is created by minimizing the total squared vertical distance from the points to the line (the sum of squared deviations)

Regression The line of best fit is called the “regression” line

Regression Because it is a line, it has an equation: y = b + mx, where m = slope and b = y-intercept
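The slides leave the slope and intercept calculation to Excel. As a rough sketch of what happens behind the scenes, assuming the standard least-squares formulas, the function below (hypothetical name `fit_line`) returns m and b for paired data.

```python
def fit_line(xs, ys):
    """Return (m, b) of the least-squares line y = b + m*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: how x and y vary together, divided by how x varies alone.
    m = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    # The line of best fit always passes through (mean_x, mean_y).
    b = mean_y - m * mean_x
    return m, b

# Made-up points that lie exactly on y = 1 + 2x.
m, b = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(f"y = {b} + {m}x")
```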

Regression The slope “m” and the correlation coefficient “r” will both have the same sign
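A short derivation of why the signs agree, assuming the standard least-squares slope and Pearson formulas (the symbols s_x and s_y for the sample standard deviations are introduced here and do not appear in the slides):

```latex
m = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2},
\qquad
r = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_i (x_i - \bar{x})^2 \, \sum_i (y_i - \bar{y})^2}}
\quad\Longrightarrow\quad
m = r \, \frac{s_y}{s_x}
```

Because s_x and s_y are both positive, m is r multiplied by a positive number, so the two always share a sign.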

Regression R² tells how closely the regression line “fits” the data – “goodness of fit”

Regression As you can imagine, the calculations for correlation and the regression line are scary

Hooray for Excel! Regression

Francis Galton

Questions?

Regression in Excel Edwin Hubble gathered and analyzed data from astronomical objects. He used regression to show that the universe is expanding

Regression in Excel Let’s take a look!

Regression in Excel What do we do first?

Regression in Excel What do we do first? GRAPH THE DATA!

Regression in Excel

Does it look like a straight line would fit the data well?

Regression in Excel Now we’re going to go to: Data/Data Analysis/Regression

Regression in Excel They want “y” first (I HATE this…)

Regression in Excel Let’s use “distance” for “x” and “velocity” for “y”

Regression in Excel Eeek! What’s all this????

Regression in Excel Here’s the RSQ. What is the %?

Regression in Excel For the trend line, you need:

Regression in Excel This (believe it or not) is the equation of the line of best fit!

Regression in Excel Line of best fit: y = mx + b

Regression in Excel Our equation is: Vel = x Dist

Regression in Excel Highlight and copy:

Regression in Excel Paste on the “Hubble” page

Regression in Excel Add a new column heading: Trend

We’re going to calculate our line:

Copy it down…

Oops! That doesn’t look right!

The reference cells are changing for each row We need to make those constant

Go back to the first entry Add a $ before the row numbers you want to keep constant

Now, copy it down!

Much better!

Regression in Excel Create a new graph:

Regression in Excel Make it purty! To make the trend line a line: change Marker Options to “none”; change line color to “solid line”

TAH-DAH!

What Does It Say?

Regression Note: some people use the symbol “ŷ” for the trend data corresponding to the “y” observation data

Regression in Excel Summary of Regression Analysis: go to Data/Data Analysis/Regression; enter “y” first; copy the first two coefficients in the bottom table; the trend line is: =Coeff*Data+Intercept; make a graph; change the trend dots into a line
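For readers without Excel, the same steps can be sketched in Python. The distance and velocity numbers below are made up for illustration and are NOT Hubble's measurements; the names `coeff`, `intercept`, and `trend` simply mirror the Excel formula =Coeff*Data+Intercept.

```python
# Hypothetical (distance, velocity) pairs; NOT Hubble's actual data.
distance = [0.5, 1.0, 1.5, 2.0]   # x: the explanatory variable
velocity = [200, 480, 700, 1010]  # y: the response variable

n = len(distance)
mean_d = sum(distance) / n
mean_v = sum(velocity) / n

# The two numbers Excel reports in its bottom table.
coeff = sum((d - mean_d) * (v - mean_v) for d, v in zip(distance, velocity)) / sum(
    (d - mean_d) ** 2 for d in distance
)
intercept = mean_v - coeff * mean_d

# The "Trend" column: every row reuses the same coeff and intercept,
# which is what the $ signs in the Excel formula make sure of.
trend = [coeff * d + intercept for d in distance]

print(f"Vel = {coeff:.1f} x Dist + ({intercept:.1f})")
for d, v, t in zip(distance, velocity, trend):
    print(f"distance={d:<4} velocity={v:<5} trend={t:.1f}")
```

Plotting `velocity` as dots and `trend` as a line against `distance` gives the same picture as the final Excel graph.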

Questions?