Measure your handspan and foot length in cm, to the nearest mm. We will record them as bivariate data below. Now we need to plot them: in what kind of graph?


Go on then, and plot them accurately! correlation 1.xls

Before adding a line of best fit, it is sensible to consider whether there should be one in the first place. At GCSE we just looked at the scattergraph and decided visually whether a correlation existed and whether it was weak or strong. However, this is dangerous. Consider the graphs below and state their correlation purely from a visual point of view. corrrelation 2.xls

Because of this, statisticians use a numerical value to assess whether the correlation is strong enough to add a line of best fit. A popular choice is the Product Moment Correlation Coefficient (PMCC). This is often just denoted by "r" and is often squared (although squaring means you no longer know whether it's a positive or negative correlation). It is calculated using the formula below. It is a lot easier on a spreadsheet or graphical calculator, so in exams they often give you some of the "bits". This is the formula we use in practice, but this link explains where it has come from and how it relates to your scattergraph points.

Once you've calculated the PMCC, it needs to be interpreted. Open this spreadsheet and use the graph tool to draw a scattergraph for each data set. Consider which coloured data has the strongest correlation. Now add a linear line of best fit and consider how close the points appear to the line. Do you still agree with your previous answers? Calculate the r value for each set of data. Now add the r² value to each graph. Were you right? Square root these values to find the r value and consider whether it's negative or positive.
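Taking the square root of r² always gives a positive number, so the sign has to come from somewhere else. A minimal sketch, assuming you also know the gradient of the line of best fit (r and the gradient always share a sign); the function name `r_from_r_squared` is made up for this illustration:

```python
from math import sqrt, copysign

def r_from_r_squared(r_squared, gradient):
    """Recover r from r^2, taking the sign from the gradient of the line of best fit."""
    return copysign(sqrt(r_squared), gradient)

# Illustrative values only: an r^2 of 0.81 on a graph whose line slopes downwards.
print(r_from_r_squared(0.81, -2.0))
```

A downward-sloping line gives a negative r, and an upward-sloping line gives a positive one.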

If r is 1 there is perfect positive correlation (the points form a straight line). If r is -1 there is perfect negative correlation. Between -1 and 1 we have strong, weak or no correlation: the closer r is to 1 or -1, the stronger the correlation, and the closer it is to 0, the weaker the correlation, with 0 indicating no correlation between your values. However, the more points you have in your dataset, the further from 1 r will tend to appear, despite a strong correlation. To interpret r correctly we must also consider how many pieces of data were collected. From your earlier datasets, which of the turquoise and orange is strongest according to the r value?

They have almost the same PMCC value. However, the orange dataset has the stronger correlation because it is more difficult to get 10 points near to a straight line than 5 points. This weblink gives a table of values you should refer to when deciding whether a PMCC value is high enough to assume a correlation exists. As long as the r value is larger than the one in the table you can be....% sure there is a correlation. Consider the yellow and blue data sets. Only one piece of data has changed. What is the probability these data sets show a correlation?
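The claim that 5 points fall near a straight line by accident more easily than 10 points can be checked with a small simulation. This is only a sketch under stated assumptions: the x and y values are independent random numbers (so there is no real correlation), the threshold of 0.8 and the number of trials are arbitrary choices, and `chance_of_large_r` is a name invented for this example.

```python
import random
from math import sqrt

def pmcc(xs, ys):
    """Product moment correlation coefficient from S_xx, S_yy and S_xy."""
    n = len(xs)
    s_xx = sum(x * x for x in xs) - sum(xs) ** 2 / n
    s_yy = sum(y * y for y in ys) - sum(ys) ** 2 / n
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n
    return s_xy / sqrt(s_xx * s_yy)

def chance_of_large_r(n, threshold, trials, rng):
    """Proportion of random, unrelated data sets of size n with |r| >= threshold."""
    hits = 0
    for _ in range(trials):
        xs = [rng.random() for _ in range(n)]
        ys = [rng.random() for _ in range(n)]
        if abs(pmcc(xs, ys)) >= threshold:
            hits += 1
    return hits / trials

rng = random.Random(0)  # fixed seed so the run is repeatable
p5 = chance_of_large_r(5, 0.8, 5000, rng)
p10 = chance_of_large_r(10, 0.8, 5000, rng)
print("5 points:", p5, " 10 points:", p10)
```

The proportion for 5 points should come out noticeably larger than for 10 points, which is why a table of values that depends on the number of data points is needed before trusting an r value.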

Try Ex 6A Q 2, 3, 6 and Ex 6C Q1, 2, 4, 5, 7

correlation 1.xls Add a line of best fit (visually) to your graph for the foot and hand span data. Use what you have learned in C1 to calculate an equation for this line. Is this line the same as any of your classmates'? Why do you think this is? Are you happy you have put your line in the right place? Could you move it and still be happy? What made you put it where you did?

Was your line equation the same as the Excel one? Excel calculates this line mathematically rather than by visual judgement. It calculates the vertical distance between each coordinate and the possible line, adds the squares of these distances together, and then adjusts the line's position to minimise this value. Why do you think it squares the value?
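One way to explore the question is to compare two candidate lines on a tiny data set. The sketch below uses four made-up points (not the class data); the line y = -0.5 + 2.2x is the least-squares line for those points, worked out separately, and y = 5 is a deliberately poor horizontal line through the mean. The function names are illustrative only.

```python
def residuals(xs, ys, a, b):
    """Signed vertical distances from each point to the line y = a + b*x."""
    return [y - (a + b * x) for x, y in zip(xs, ys)]

def sum_of_squares(xs, ys, a, b):
    """Sum of the squared vertical distances to the line y = a + b*x."""
    return sum(e * e for e in residuals(xs, ys, a, b))

# Made-up points for illustration.
xs = [1, 2, 3, 4]
ys = [2, 4, 5, 9]

for label, a, b in [("horizontal line", 5.0, 0.0), ("least-squares line", -0.5, 2.2)]:
    plain_total = sum(residuals(xs, ys, a, b))
    squared_total = sum_of_squares(xs, ys, a, b)
    print(label, round(plain_total, 6), round(squared_total, 6))
```

For both lines the unsquared distances cancel out to zero because positive and negative distances offset each other, so they cannot tell a good line from a bad one. The squared totals differ sharply, and nudging the gradient away from 2.2 makes the squared total larger.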

This seems complicated, but there are formulae you can use to do it quickly. To begin with we consider how each coordinate differs from the mean. The formulae can be rewritten into a form that is easier to calculate:

$$S_{xx} = \sum x^2 - \frac{(\sum x)^2}{n}$$

$$S_{yy} = \sum y^2 - \frac{(\sum y)^2}{n}$$

$$S_{xy} = \sum xy - \frac{(\sum x)(\sum y)}{n}$$

The three parts are then put together to produce the Product Moment Correlation Coefficient (PMCC), r, from the Excel graphs we considered earlier:

$$r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}}$$
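The calculation can be sketched in a few lines of Python. The measurements below are hypothetical values in cm, not the class data, and the function names `summary_sums` and `pmcc` are chosen for this illustration; r is taken as S_xy divided by the square root of S_xx times S_yy.

```python
from math import sqrt

def summary_sums(xs, ys):
    """Return (S_xx, S_yy, S_xy) using the easier-to-calculate forms."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    s_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    s_yy = sum(y * y for y in ys) - sum_y ** 2 / n
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    return s_xx, s_yy, s_xy

def pmcc(xs, ys):
    """Product moment correlation coefficient r."""
    s_xx, s_yy, s_xy = summary_sums(xs, ys)
    return s_xy / sqrt(s_xx * s_yy)

# Hypothetical foot lengths and hand spans in cm.
foot = [22.5, 23.1, 24.0, 25.2, 26.4, 27.0]
hand = [18.0, 18.9, 19.5, 20.1, 21.6, 22.0]
r = pmcc(foot, hand)
print(round(r, 3))
```

In an exam some of the sums (such as the totals of x, x² and xy) are usually given, so only the last steps are needed.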

Try Ex 6B Q 1, 2, 3, 4, 5, 9

The formulae for $S_{xx}$ etc. can also be used to find the equation of the line of best fit. A straight line is in the form y = a + bx, where b is the gradient, found using $b = \frac{S_{xy}}{S_{xx}}$. Given the gradient of the line, and knowing that it should pass through the point $(\bar{x}, \bar{y})$, where $\bar{y}$ is the mean of the y data and $\bar{x}$ is the mean of the x data, a can easily be calculated: $a = \bar{y} - b\bar{x}$.
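A minimal sketch of this calculation, taking the gradient as S_xy divided by S_xx and choosing the intercept so the line passes through the point of means. The four data points are made up for illustration, and `regression_y_on_x` is a hypothetical name.

```python
def regression_y_on_x(xs, ys):
    """Return (a, b) for the line y = a + b*x that passes through the means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    s_xx = sum(x * x for x in xs) - sum(xs) ** 2 / n
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n
    b = s_xy / s_xx          # gradient
    a = mean_y - b * mean_x  # intercept from the point of means
    return a, b

# Made-up points for illustration.
a, b = regression_y_on_x([1, 2, 3, 4], [2, 4, 5, 9])
print(round(a, 3), round(b, 3))
```

Substituting the mean of x into the resulting equation gives back the mean of y, which is a quick check that the line has been found correctly.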

From our class data, estimate the hand span of a year 12 SAC student who has a foot length of 30 cm. Estimate the foot length of a student with a hand span of 22 cm. Redraw the data with hand span on the x axis and draw a line of best fit. Is your answer the same? Download the data in Excel and swap the data columns over. What happens to the equation of the line of best fit? Calculate the foot length above using both equations Excel gives you. Comment on your results. correlation 1.xls

You will notice that the formula for b uses $S_{xx}$ but not $S_{yy}$. This is because this formula is only used for a line of best fit required for finding y given a specific x coordinate. It minimises the vertical distance of each point from the line of best fit. If you want to estimate the x value given a specific y value, you should use a different line of best fit, x = a' + b'y, which minimises the horizontal distance from each point to the line. The formula for this line of best fit is only very slightly different:

$$b' = \frac{S_{xy}}{S_{yy}}$$

Use b' and the means of x and y to find a'.
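To see that the two lines really are different, both can be fitted to the same data and used to estimate x from a given y. This is a sketch on made-up points, assuming the second line is written as x = a' + b'y with a' found from the means; the function names are invented for the example.

```python
def sums(xs, ys):
    """Return (S_xx, S_yy, S_xy)."""
    n = len(xs)
    s_xx = sum(x * x for x in xs) - sum(xs) ** 2 / n
    s_yy = sum(y * y for y in ys) - sum(ys) ** 2 / n
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n
    return s_xx, s_yy, s_xy

def y_on_x(xs, ys):
    """Line y = a + b*x, minimising vertical distances."""
    n = len(xs)
    s_xx, _, s_xy = sums(xs, ys)
    b = s_xy / s_xx
    return sum(ys) / n - b * sum(xs) / n, b

def x_on_y(xs, ys):
    """Line x = a' + b'*y, minimising horizontal distances."""
    n = len(xs)
    _, s_yy, s_xy = sums(xs, ys)
    b_dash = s_xy / s_yy
    return sum(xs) / n - b_dash * sum(ys) / n, b_dash

# Made-up points for illustration.
xs, ys = [1, 2, 3, 4], [2, 4, 5, 9]
a, b = y_on_x(xs, ys)
a_dash, b_dash = x_on_y(xs, ys)

y_known = 9
est_direct = a_dash + b_dash * y_known  # using x = a' + b'y
est_rearranged = (y_known - a) / b      # rearranging y = a + bx
print(round(est_direct, 3), round(est_rearranged, 3))
```

The two estimates of x disagree, which mirrors what happens when the columns are swapped in Excel. Both lines pass through the point of means, so they agree only there (unless the correlation is perfect).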

The only time we don't use the x = a' + b'y version for estimating x when we know the y value is when the x data is FIXED. If you collect data from an experiment where one value in the data is pre-set, we call that FIXED; we must plot it on the x axis and then use the y = a + bx line of best fit for any estimating of values. An example of this might be timing an ice cube melting at certain temperatures. The temperatures used are decided beforehand (FIXED), so temperature needs to be on the x axis.

Try Ex 7A Q1, 2, 4, 7, 9

A regression line can be used to estimate the value of the dependent variable for a given value of the independent variable. Interpolation is when you estimate a value within the range of the data using the equation of the regression line (line of best fit). Extrapolation is when you estimate a value outside the range of the data using the equation of the regression line. What do you think are the dangers of either of these techniques, and which one would you view most cautiously? Why?
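The distinction can be made explicit when estimating. A small sketch, assuming a line y = a + bx has already been fitted to data whose x values run from `x_min` to `x_max`; the numbers and the function name `estimate` are illustrative only.

```python
def estimate(a, b, x, x_min, x_max):
    """Return the estimated y and whether it is an interpolation or an extrapolation."""
    kind = "interpolation" if x_min <= x <= x_max else "extrapolation"
    return a + b * x, kind

# Illustrative line y = -0.5 + 2.2x fitted to data with x between 1 and 4.
print(estimate(-0.5, 2.2, 3, 1, 4))
print(estimate(-0.5, 2.2, 10, 1, 4))
```

The second estimate uses an x value well beyond the data, so nothing collected so far confirms that the straight-line pattern continues that far.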

Try Ex 7C Q1, 3, 4, 6, 8