• Two-variable statistics investigates whether two variables are related or linked to each other in some way, e.g.
› Does IQ determine income?
› Is there a link between foot size and the height of a person?
• One variable is independent (x-axis) whilst the other is dependent (y-axis).
• This section of statistics involves drawing conclusions about data that has not been collected by using data that has been collected.
• Hence we can infer or predict certain points based on the data collected. Often this involves sampling, as analysing an entire population can be difficult.

• A scatter plot is a quick way to see whether the variables are related. More formally, we may need to measure:
› (1) Correlation – initially it may be necessary to determine if a relationship exists between two or more variables (Pearson's product-moment correlation coefficient).
› (2) Regression analysis – if a relationship appears to exist, we can then conduct further analysis to determine the type and strength of the relationship (linear regression).

• Correlation refers to the relationship or association between two variables.
• Correlations are classified qualitatively in three ways:
› Direction – positive, negative, none
› Strength – weak, moderate, strong
› Type – linear or non-linear
• They are classified quantitatively by Pearson's product-moment correlation coefficient.
• Outliers must also be considered; they usually appear as isolated points away from the main body (group) of data.
• Exam hint – use this language!
• Video: Dancing Statistics

[Figure: scatter plots illustrating positive linear correlations and negative linear correlations]

• Be careful not to jump to conclusions when you find a strong correlation between two variables. Why?
› It does not mean that a causal relationship exists, i.e. one variable does not necessarily cause the other.
› e.g. there is a strong correlation between arm length and running speed; does that mean that short arms cause a reduction in running speed?
• How Ice Cream Kills! MFGBDo

• r tells how strong the correlation is between two variables.
• There are several formulae for calculating r, but the one given and used in the IB course is:

$$r = \frac{s_{xy}}{s_x s_y}$$

› $s_{xy}$ is the covariance (it will always be given if required). If it is not given, use the calculator method.
› $s_x$ is the standard deviation of the x data values
› $s_y$ is the standard deviation of the y data values
• Exam hint – make sure you know that $s_x$ is $\sigma_x$ and $s_y$ is $\sigma_y$ on your GDC!
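The calculation of r can be sketched in Python. This is a minimal sketch, assuming the IB formula is $r = s_{xy} / (s_x s_y)$ with population standard deviations (dividing by n, matching $\sigma_x$ on the GDC). The function names and the study-hours data are hypothetical and chosen only for illustration.

```python
import math

def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)

def population_std(values):
    """Population standard deviation (divides by n, like sigma on a GDC)."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))

def covariance(xs, ys):
    """Population covariance s_xy of paired data."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def pearson_r(xs, ys):
    """Pearson's r computed as s_xy / (s_x * s_y)."""
    return covariance(xs, ys) / (population_std(xs) * population_std(ys))

# Hypothetical data: hours of study (x) and test score (y)
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 70]

r = pearson_r(hours, scores)
print(round(r, 3))
```

For this made-up data r comes out close to 1, which would be described as a strong positive linear correlation.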

• Read p. 44 in your book and follow along with the example to practise using the GDC to calculate r.
• r lies between -1 and 1:
› The closer to 1 the r-value is, the stronger the (positive) correlation
› The closer to 0 the r-value is, the weaker the correlation
› The closer to -1 the r-value is, the stronger the (negative) correlation

Correlation coefficient value    Description of strength & direction
1                                Perfect positive
0.8 to 1                         Strong positive
0.6 to 0.8                       Moderate positive
0.4 to 0.6                       Weak positive
0 to 0.4                         No correlation
-0.4 to 0                        No correlation
-0.6 to -0.4                     Weak negative
-0.8 to -0.6                     Moderate negative
-1 to -0.8                       Strong negative
-1                               Perfect negative

Note: These are only guideline values; there are no specific division points where the description has to change from strong to moderate, etc.
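The guideline table can be turned into a small lookup. This is only a sketch: the function name `describe_correlation` is hypothetical, and because the table does not say which band a boundary value such as 0.8 belongs to, the code assumes each boundary belongs to the stronger band.

```python
def describe_correlation(r):
    """Describe r using the guideline bands (boundaries are an assumption)."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    size = abs(r)
    if size == 1.0:
        strength = "perfect"
    elif size >= 0.8:
        strength = "strong"
    elif size >= 0.6:
        strength = "moderate"
    elif size >= 0.4:
        strength = "weak"
    else:
        # Anything between -0.4 and 0.4 is treated as no correlation
        return "no correlation"
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction}"

for value in (0.85, -0.7, 0.2):
    print(value, "->", describe_correlation(value))
```

As the note above says, these cut-offs are guidelines only, so the description for a value near a boundary is a judgment call.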

• Use the data in the table below to calculate the r-value, given $s_{xy} = 7.92$:
› Calculate the standard deviation of x
› Calculate the standard deviation of y
› Evaluate r using the IB formula and compare it to your calculator.
[Data table with columns #, x, y]

• The line of best fit is the "quick and easy" way of finding the trend of the data:
› By eye, it should have approximately the same number of data points above the line as below.
› A more accurate method is to calculate the mean of the x data and the mean of the y data and ensure the trend line passes through this point, called the mean point.
• Linear regression is the most accurate process for determining the trend line, as it takes every data point into account via a formula.

• A statistician wants to know if there is a correlation between HSC maths scores and Math Studies IB exam scores. She collected the following data from 10 randomly selected students.
› Is there a correlation? If so, what kind?
› Draw the scatter plot of IB vs HSC.
› Draw the line of best fit by finding the mean of each variable.
› If an HSC score is 77, predict the corresponding IB score.
› If an IB score was a 2, predict the corresponding HSC score.
[Data table with columns #, HSC, IB]

• The line of best fit by the process of linear regression can be found using the given IB formula:

$$y - \bar{y} = \frac{s_{xy}}{s_x^2}\,(x - \bar{x})$$

› $s_{xy}$ is the covariance (it will always be given if required).
› $s_x$ is the standard deviation of the x data values
› $s_x^2 = (s_x)^2$, i.e. the standard deviation of x, squared

• Find the equation of the line of best fit in y = ax + b form using the linear regression formula, if: $s_{xy} = 9.23$, $s_x = 3.46$, $(\bar{x}, \bar{y}) = (14.4, 35.2)$
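One way to check an answer to this exercise is a short Python sketch. It assumes the IB regression formula is $y - \bar{y} = \frac{s_{xy}}{s_x^2}(x - \bar{x})$ and that the point (14.4, 35.2) is the mean point $(\bar{x}, \bar{y})$. The function name `regression_line` is hypothetical.

```python
def regression_line(s_xy, s_x, mean_x, mean_y):
    """Return (a, b) for y = a*x + b from covariance, std dev of x and mean point."""
    a = s_xy / s_x ** 2        # gradient: covariance divided by variance of x
    b = mean_y - a * mean_x    # intercept: the line must pass through the mean point
    return a, b

# Values taken from the exercise above
a, b = regression_line(s_xy=9.23, s_x=3.46, mean_x=14.4, mean_y=35.2)
print(f"y = {a:.3f}x + {b:.1f}")
```

Under these assumptions the line works out to roughly y = 0.771x + 24.1, and substituting x = 14.4 returns 35.2, confirming that it passes through the mean point.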

• Once you have a line of best fit, you can use that equation to infer or predict what would happen to one variable if the other changes.
• If you are predicting values within the range of your current data, then you are said to be "interpolating".
› The accuracy of interpolation depends on the accuracy of your line of best fit and your r-value.
• If you are predicting values outside the range of your data, then you are "extrapolating".
› The accuracy of extrapolation depends not only on the accuracy of your line of best fit but also on whether it is reasonable to assume that the same trend will continue outside your range of data.
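The distinction between interpolating and extrapolating can be made explicit in code. The sketch below is illustrative only: the function `predict`, the line y = 5x + 50 and the data range of 1 to 6 hours are all hypothetical values, not taken from the slides.

```python
def predict(a, b, x, x_min, x_max):
    """Predict y = a*x + b and say whether x lies inside the observed data range."""
    y = a * x + b
    kind = "interpolation" if x_min <= x <= x_max else "extrapolation"
    return y, kind

# Hypothetical line of best fit and observed range of x values
a, b = 5.0, 50.0
x_min, x_max = 1.0, 6.0

inside = predict(a, b, 4.0, x_min, x_max)    # x within the data range
outside = predict(a, b, 10.0, x_min, x_max)  # x beyond the data range

print(inside)
print(outside)
```

The second prediction is flagged as an extrapolation, so it should be trusted only if it is reasonable to assume the trend continues beyond the data that was collected.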