Correlation

When two sets of data are strongly linked together we say they have a High Correlation.

The word Correlation is made of Co- (meaning "together") and Relation. Correlation is Positive when the values increase together, and Correlation is Negative when one value decreases as the other increases.

Correlation can have a value between -1 and 1:
1 is a perfect positive correlation
0 is no correlation (the values don't seem linked at all)
-1 is a perfect negative correlation

The value shows how good the correlation is (not how steep the line is), and whether it is positive or negative.
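To make the point about steepness concrete, here is a minimal Python sketch. The `pearson` helper and the three made-up data sets are illustrative assumptions, not part of the original slides. It shows a shallow rising line and a steep rising line both scoring 1, and a falling line scoring -1.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (illustrative helper)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    a = [x - mean_x for x in xs]          # deviations of x from its mean
    b = [y - mean_y for y in ys]          # deviations of y from its mean
    sum_ab = sum(p * q for p, q in zip(a, b))
    sum_a2 = sum(p * p for p in a)
    sum_b2 = sum(q * q for q in b)
    return sum_ab / math.sqrt(sum_a2 * sum_b2)

# Made-up data: three exact straight lines with very different slopes.
xs = [1, 2, 3, 4, 5]
shallow_up = [0.1 * x for x in xs]        # gentle upward slope
steep_up = [100 * x for x in xs]          # steep upward slope
steady_down = [20 - 3 * x for x in xs]    # downward slope

r_shallow = pearson(xs, shallow_up)
r_steep = pearson(xs, steep_up)
r_down = pearson(xs, steady_down)

print(round(r_shallow, 6), round(r_steep, 6), round(r_down, 6))
```

Both rising lines give a correlation of 1 (up to floating-point rounding) even though one is a thousand times steeper, and the falling line gives -1.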

Example: Ice Cream Sales

The local ice cream shop keeps track of how much ice cream they sell versus the temperature on that day. Here are their figures for the last 12 days:

[Table: Ice Cream Sales vs Temperature. Columns: Temperature (°C), Ice Cream Sales ($). First temperature shown: 14.2°; last sales figure shown: $408. The other values are missing.]

The same data as a Scatter Plot:

We can easily see that warmer weather leads to more sales. The relationship is good but not perfect; see at the end how I calculated the correlation value.

Correlation Is Not Good at Curves

The correlation calculation only works well for relationships that follow a straight line.

Our Ice Cream Example: there has been a heat wave! It gets so hot that people aren't going near the shop, and sales start dropping.

Here is the latest graph:

The correlation is now 0: "No Correlation"!

The calculated value of correlation is 0 (trust me, I worked it out), which says there is "no correlation". But we can see that the data follows a nice curve that reaches a peak around 25 °C; the correlation calculation is just not "smart" enough to see this.

Moral of the story: make a Scatter Plot, and look at it! You may see more than the correlation value says.
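The effect is easy to reproduce with a small sketch. The numbers below are made up (a symmetric curve peaking at 25 °C), not the shop's actual figures, and the `pearson` helper is an illustrative assumption. The full curve scores 0 even though sales clearly depend on temperature, while the rising half on its own scores strongly positive.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (illustrative helper)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    a = [x - mean_x for x in xs]
    b = [y - mean_y for y in ys]
    sum_ab = sum(p * q for p, q in zip(a, b))
    return sum_ab / math.sqrt(sum(p * p for p in a) * sum(q * q for q in b))

# Hypothetical temperatures (°C) and sales ($) on a curve peaking at 25 °C.
temps = [15, 20, 25, 30, 35]
sales = [600 - (t - 25) ** 2 for t in temps]   # 500, 575, 600, 575, 500

r_full_curve = pearson(temps, sales)            # rise and fall cancel out
r_rising_half = pearson(temps[:3], sales[:3])   # only the days up to the peak

print(r_full_curve, round(r_rising_half, 3))
```

The rising and falling halves cancel each other in the sum of a × b, which is why the single number hides a relationship that the scatter plot makes obvious.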

Correlation Is Not Causation

"Correlation Is Not Causation" says that a correlation does not mean that one thing causes the other (there could be other reasons the data has a good correlation).

Example: Sunglasses vs Ice Cream

Our Ice Cream shop finds how many sunglasses were sold by a big store for each day and compares them to their ice cream sales:

How To Calculate

How did I calculate the value at the top? I used "Pearson's Correlation". There is software that can calculate it, such as the CORREL() function in Excel or OpenOffice Calc, but here is how to calculate it yourself.

Let us call the two sets of data "x" and "y" (in our case Temperature is x and Ice Cream Sales is y):

Step 1: Find the mean of x, and the mean of y
Step 2: Subtract the mean of x from every x value (call them "a"), do the same for y (call them "b")
Step 3: Calculate: a × b, a² and b² for every value
Step 4: Sum up a × b, sum up a² and sum up b²
Step 5: Divide the sum of a × b by the square root of [(sum of a²) × (sum of b²)]

Here is how I calculated the first Ice Cream example (values rounded to 1 or 0 decimal places):
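The five steps can also be sketched in Python. The data below is a small made-up set chosen so the arithmetic is easy to check by hand, not the shop's figures, and the variable names are illustrative.

```python
import math

# Made-up data, chosen so every intermediate value is a whole number.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

# Step 1: find the mean of x and the mean of y.
mean_x = sum(x) / len(x)                      # 3.0
mean_y = sum(y) / len(y)                      # 4.0

# Step 2: subtract the means to get "a" and "b".
a = [xi - mean_x for xi in x]                 # -2, -1, 0, 1, 2
b = [yi - mean_y for yi in y]                 # -2,  0, 1, 0, 1

# Step 3: calculate a*b, a squared and b squared for every value.
ab = [ai * bi for ai, bi in zip(a, b)]
a2 = [ai ** 2 for ai in a]
b2 = [bi ** 2 for bi in b]

# Step 4: sum up each column.
sum_ab = sum(ab)                              # 6.0
sum_a2 = sum(a2)                              # 10.0
sum_b2 = sum(b2)                              # 6.0

# Step 5: divide the sum of a*b by sqrt(sum of a^2 * sum of b^2).
r = sum_ab / math.sqrt(sum_a2 * sum_b2)

print(round(r, 4))                            # 6 / sqrt(60), about 0.7746
```

Keeping each step as its own list mirrors the columns of a hand-worked table, which makes it easy to compare against a spreadsheet.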

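As a formula it is:

$$ r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \, \sum (y_i - \bar{y})^2}} $$

Where:
$\sum$ is Sigma, the symbol for "sum up"
$(x_i - \bar{x})$ is each x-value minus the mean of x (called "a" above)
$(y_i - \bar{y})$ is each y-value minus the mean of y (called "b" above)

You probably won't have to calculate it like that, but at least you know it is not "magic", but simply a routine set of calculations.

Note for Programmers

You can calculate it in one pass through the data. Just sum up x, y, x², y² and xy (no need for the a or b calculations above), then use the formula:

$$ r = \frac{n \sum xy - \sum x \sum y}{\sqrt{\left(n \sum x^2 - \left(\sum x\right)^2\right)\left(n \sum y^2 - \left(\sum y\right)^2\right)}} $$

where $n$ is the number of (x, y) pairs.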
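A minimal Python sketch of the one-pass calculation follows. It assumes the standard sum-based form of Pearson's formula, r = (nΣxy − ΣxΣy) / √[(nΣx² − (Σx)²)(nΣy² − (Σy)²)]; the function name `pearson_one_pass` and the sample pairs are illustrative, not from the original slides.

```python
import math

def pearson_one_pass(pairs):
    """Pearson correlation from running sums, reading each (x, y) pair once."""
    n = 0
    sum_x = sum_y = sum_x2 = sum_y2 = sum_xy = 0.0
    for x, y in pairs:
        n += 1
        sum_x += x
        sum_y += y
        sum_x2 += x * x
        sum_y2 += y * y
        sum_xy += x * y
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator

# Made-up pairs; a generator works too, since the data is read only once.
pairs = [(1, 2), (2, 4), (3, 5), (4, 4), (5, 5)]
r_one_pass = pearson_one_pass(pairs)

print(round(r_one_pass, 4))   # 30 / sqrt(1500), about 0.7746
```

Because only five running totals are kept, this works on data that is streamed rather than held in memory. One caveat: when the values are very large compared with their spread, subtracting the big sums can lose floating-point precision, and the two-pass version with means is then the safer choice.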