Two-Variable Statistics. Correlation  A relationship between two variables  As one goes up, the other changes in a predictable way (either mostly goes.

Slides:



Advertisements
Similar presentations
2-7 Curve Fitting with Linear Models Warm Up Lesson Presentation
Advertisements

 Objective: To look for relationships between two quantitative variables.
Correlation Data collected from students in Statistics classes included their heights (in inches) and weights (in pounds): Here we see a positive association.
Chapter 15 (Ch. 13 in 2nd Can.) Association Between Variables Measured at the Interval-Ratio Level: Bivariate Correlation and Regression.
Describing the Relation Between Two Variables
Correlation A correlation exists between two variables when one of them is related to the other in some way. A scatterplot is a graph in which the paired.
1 Chapter 10 Correlation and Regression We deal with two variables, x and y. Main goal: Investigate how x and y are related, or correlated; how much they.
Correlation vs. Causation What is the difference?.
The Line of Best Fit Linear Regression. Definition - A Line of Best or a trend line is a straight line on a Scatter plot that comes closest to all of.
Section Copyright © 2014, 2012, 2010 Pearson Education, Inc. Lecture Slides Elementary Statistics Twelfth Edition and the Triola Statistics Series.
Chapter 13 Statistics © 2008 Pearson Addison-Wesley. All rights reserved.
Researchers, such as anthropologists, are often interested in how two measurements are related. The statistical study of the relationship between variables.
Correlation and Regression Paired Data Is there a relationship? Do the numbers appear to increase or decrease together? Does one set increase as the other.
Section 12.1 Scatter Plots and Correlation HAWKES LEARNING SYSTEMS math courseware specialists Copyright © 2008 by Hawkes Learning Systems/Quant Systems,
Section Copyright © 2014, 2012, 2010 Pearson Education, Inc. Lecture Slides Elementary Statistics Twelfth Edition and the Triola Statistics Series.
© 2008 Pearson Addison-Wesley. All rights reserved Chapter 1 Section 13-6 Regression and Correlation.
Scatter Diagrams Objective: Draw and interpret scatter diagrams. Distinguish between linear and nonlinear relations. Use a graphing utility to find the.
 Describe the association between two quantitative variables using a scatterplot’s direction, form, and strength  If the scatterplot’s form is linear,
2-7 Curve Fitting with Linear Models Warm Up Lesson Presentation
Section 6 – 1 Rate of Change and Slope Rate of change = change in the dependent variable change in the independent variable Ex1. Susie was 48’’ tall at.
Scatter Plots, Correlation and Linear Regression.
Linear Regression Day 1 – (pg )
Section Copyright © 2014, 2012, 2010 Pearson Education, Inc. Chapter 10 Correlation and Regression 10-2 Correlation 10-3 Regression.
Scatterplots and Linear Regressions Unit 8. Warm – up!! As you walk in, please pick up your calculator and begin working on your warm – up! 1. Look at.
Scatter Plots and Correlations. Is there a relationship between the amount of gas put in a car and the number of miles that can be driven?
Scatterplots and Correlation Section 3.1 Part 2 of 2 Reference Text: The Practice of Statistics, Fourth Edition. Starnes, Yates, Moore.
Section 1.3 Scatter Plots and Correlation.  Graph a scatter plot and identify the data correlation.  Use a graphing calculator to find the correlation.
Chapter 4. Correlation and Regression Correlation is a technique that measures the strength of the relationship between two continuous variables. For.
Page 1 Introduction to Correlation and Regression ECONOMICS OF ICMAP, ICAP, MA-ECONOMICS, B.COM. FINANCIAL ACCOUNTING OF ICMAP STAGE 1,3,4 ICAP MODULE.
Copyright © 2017, 2014 Pearson Education, Inc. Slide 1 Chapter 4 Regression Analysis: Exploring Associations between Variables.
Estimation.  Remember that we use statistics (information about a sample) to estimate parameters (information about a population.
Objective: to draw and write trend lines; to use a graphing device to compute the line of best fit. Do-Now: Look at the 3 scatterplot below. Match the.
Section Copyright © 2014, 2012, 2010 Pearson Education, Inc. Lecture Slides Elementary Statistics Twelfth Edition and the Triola Statistics Series.
Correlation and Regression
Linear Regression Essentials Line Basics y = mx + b vs. Definitions
Scatter Plots and Correlation
Inference for Regression
Correlation and Association
One-Variable Statistics
Objectives Fit scatter plot data using linear models with and without technology. Use linear models to make predictions.
CHAPTER 7 LINEAR RELATIONSHIPS
Modify—use bio. IB book  IB Biology Topic 1: Statistical Analysis
Splash Screen.
SCATTERPLOTS, ASSOCIATION AND RELATIONSHIPS
Objectives Fit scatter plot data using linear models with and without technology. Use linear models to make predictions.
Section 2.6: Draw Scatter Plots & best-Fitting Lines(Linear Regresion)
Elementary Statistics
EDRS6208 Fundamentals of Education Research 1
Note: In this chapter, we only cover sections 10-1 through 10-3
Exercise 4 Find the value of k such that the line passing through the points (−4, 2k) and (k, −5) has slope −1.
CHAPTER 10 Correlation and Regression (Objectives)
2.5 Scatterplots and Lines of Regression
Scatter Plots and Lines of best fit
Estimation.
Regression.
Correlation vs. Causation
A correlation is a relationship between two variables. 
Lecture Notes The Relation between Two Variables Q Q
The Weather Turbulence
Scatter Plots and Best-Fit Lines
Do Now Create a scatterplot following these directions
Scatterplots and Correlation
Do Now Describe the following relationships. Do Now Describe the following relationships.
Chapter 6 Confidence Intervals
MATH 1311 Section 3.4.
Section 10-1 Correlation © 2012 Pearson Education, Inc. All rights reserved.
Objectives Vocabulary
Honors Statistics Review Chapters 7 & 8
Correlation and Regression
3.2 Correlation Pg
Presentation transcript:

Two-Variable Statistics

Correlation  A relationship between two variables  As one goes up, the other changes in a predictable way (either mostly goes up or mostly goes down)

Positive Correlation  As one variable goes up, so does the other.

 Examples of positive correlations:  cost of using a computer printer has a positive correlation with the number of pages printed

 number of free throws a basketball player makes and number of times the player is fouled.

 time required for a reading assignment and the number of pages assigned  children’s ages and the number of words in their vocabulary

Negative Correlation  The variables move in opposite directions  As one goes up, the other goes down (and vice versa).

 Examples of negative correlations:  farm acres in Iowa planted to corn and acres planted to other crops

 power of a microwave and the time it takes to boil a cup of water  number of checkouts open at Wal-Mart and how long it takes to check out (at a given time of day)

 number of raffle tickets sold and your probability of winning the raffle

In some cases there can be no correlation  college students’ height and their grade point averages  outside temperature and number of problems assigned in a class

Important Just because there is a correlation between two things, it does NOT mean the first thing causes the second

It just means there’s a relationship.  Something else could be affecting both of the things we are studying.  This is called a lurking, confounding, or extraneous variable.

FOR INSTANCE, there is a strong correlation between children’s shoe size and their vocabulary.  BUT … Having big feet doesn’t CAUSE kids to have a bigger vocabulary.

Correlations can be shown using a scatterplot. This is literally a graph of the points (x,y) for the values in the relationship.

Positive Correlation Negative Correlation

Strong Correlation Weak Correlation

In a perfect correlation, all the points would be in a straight line. If there is no correlation, the points would be all over the place.

We are studying linear correlation, which means we are looking at patterns that are close to being in a line.

It is possible to have other patterns that don’t happen to approximate a line.

From our point of view, we would say there is no correlation for these.

We typically measure the strength of a correlation with a statistic called “r”.

 r is the correlation coefficient (or Pearson’s Product Moment).  r is always a number between –1 and 1.  If r = 0, we say there is NO correlation.

 If r = 1 or r = -1, we say there is a PERFECT correlation.  Usually r is a fraction (normally written as a decimal).

 Typically fractions less than ½ are considered weak correlations and fractions above ½ are considered strong correlations.

How to find “r”: On a TI-83 (or 84): FIRST—one time before you do any two-variable problems. 1. Hit 2 nd  0 (CATALOG) 2. Scroll through the catalog until you find “DiagnosticOn”.

3. Hit ENTER twice. For each individual problem: 1. STAT  EDIT 2. Enter the first variable in L1 and the second variable in L2. (It is important that each pair of data be kept paired.)

3. 2 nd  MODE (QUIT) 4. STAT  CALC 5. Choose #4 LinReg(ax+b) 6. One of the statistics on the read-out is “r”.

If you don’t have a calculator that will find “r”, here’s a quick way to find a rough estimate of its value: 1. On graph paper, graph the points corresponding to the data in your problem.

2. Draw a rectangle that surrounds all the points. 3. Measure the length and width of the rectangle. 4. r ≈ 1 – w / l

Significance tests for correlation QUESTION: Is this a significant positive (or negative) correlation?

HYPOTHESES: H 1 = the correlation in the whole population is significantly greater than 0. H 0 = the correlation isn’t significant.

You find the critical value in the “Pearson Product Moment” table on the handout. Use the 1-tail test for the given level of significance.

You can compute “r” (test statistic) with your calculator or by graphing. (In reality, in most cases it is given in the problem.) As always, if the calculated value (absolute value) is more than the critical value, the result is significant.

Coefficient of Determination  r 2 tells what amount of the change in the second variable can be predicted from the change in the first variable.

 For example, if r =.7, then.49 or 49% of the change in “y” can be predicted from the change in “x”.

Linear Regression  using a line to make predictions with 2-variable data  The regression line is essentially the “average” of the data.

 The idea is a line that runs through the center of all the points—as close as possible to every point.

Regression lines are generally written in the form  x-hat and y-hat are the predicted values of x and y.

On graphing calculators, the slope and y-intercept of the regression line are easily calculated. These may be given in either the form “ax + b” or “a + bx” —depending on the calculator.

Many software programs, such as Microsoft Excel ®, will also compute regression lines. In general we will be using, but not computing regression lines in this class.

In a regression line,  Slope (# by x) represents the rate of change in the variables  y-intercept (# by itself) is usually the initial value of the second variable

IMPORTANT: Regression equations will give reasonable (but not exact) estimates for values near the data used in the sample.

 A regression estimate is essentially a point estimate of a parameter.  For values far away from those used in the sample, regression equations are not usually very accurate (and often give silly answers).

Typical Problem: The New York City police noticed that as the number of officers on the street increased, the crime rate decreased.

In a certain precinct, there was a linear correlation with the following regression equation y = 135 – 7x, where x is the number of officers and y is the number of crimes committed each day.

a. How many crimes can be expected with 12 officers on the street?

b. How many officers would be needed to reduce the number of crimes to 20?

c. How many crimes could be expected if there were 25 officers on the street? (Note that this is a very large number of officers—far more than are involved in either previous problem.)

This is a silly answer because the data is nowhere close to the numbers used in the original sample.