ASSOCIATION: CONTINGENCY, CORRELATION, AND REGRESSION Chapter 3.



3.1 The Association between Two Categorical Variables

Response and Explanatory Variables
• Response variable (dependent, y): the outcome variable
• Explanatory variable (independent, x): defines the groups
• Response/Explanatory examples:
  1. Grade on test / Amount of study time
  2. Yield of corn / Amount of rainfall

Association
Association: when a value of one variable is more likely to occur with certain values of the other variable
Data analysis with two variables:
1. Tell whether there is an association
2. Describe that association

Contingency Table
• Displays two categorical variables
• The rows list the categories of one variable; the columns list the categories of the other
• Entries in the table are frequencies
Image: www1.pictures.fp.zimbio.com

Contingency Table
• What is the response (outcome) variable? The explanatory variable?
• What proportion of organic foods contain pesticides? Of conventionally grown foods?
• What proportion of all sampled foods contain pesticides?

Proportions & Conditional Proportions

Proportions & Conditional Proportions
Side-by-side bar charts show conditional proportions and allow for easy comparison

Proportions & Conditional Proportions
• If there were no association, the conditional proportions would be the same
• Since there is an association, the conditional proportions are different
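The comparison of conditional proportions can be sketched in a few lines of Python. The counts below are hypothetical, chosen only for illustration (the table from the slides is not reproduced in this transcript), and the function names are made up for this sketch.

```python
# Hypothetical counts for illustration only (NOT the data from the slides).
# Outer keys: explanatory variable (food type).
# Inner keys: response variable (pesticide status).
table = {
    "organic":      {"present": 30,  "absent": 70},
    "conventional": {"present": 150, "absent": 50},
}

def conditional_proportions(table):
    """Proportion of each response category within each explanatory group."""
    result = {}
    for group, counts in table.items():
        total = sum(counts.values())
        result[group] = {cat: n / total for cat, n in counts.items()}
    return result

def overall_proportion(table, category):
    """Proportion of all observations that fall in one response category."""
    grand_total = sum(sum(counts.values()) for counts in table.values())
    in_category = sum(counts[category] for counts in table.values())
    return in_category / grand_total

cond = conditional_proportions(table)
overall = overall_proportion(table, "present")

for group, props in cond.items():
    print(f"{group}: {props['present']:.2f} contain pesticides")
print(f"all sampled foods: {overall:.2f} contain pesticides")
```

With these made-up counts the two conditional proportions differ (0.30 versus 0.75), which is what an association looks like; equal conditional proportions would suggest no association.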

3.2 The Association between Two Quantitative Variables

Internet Usage & GDP Data Set

Scatterplot
Graph of two quantitative variables:
• Horizontal axis: explanatory variable, x
• Vertical axis: response variable, y

Interpreting Scatterplots
• The overall pattern includes trend, direction, and strength of the relationship
• Trend: linear, curved, clusters, no pattern
• Direction: positive, negative, no direction
• Strength: how closely the points fit the trend
• Also look for outliers from the overall trend

Used-car Dealership
What association would we expect between the age of the car and its mileage?
a) Positive
b) Negative
c) No association

Linear Correlation, r
Measures the strength and direction of the linear association between x and y

Correlation coefficient: Measuring Strength & Direction of a Linear Relationship
• Positive r => positive association
• Negative r => negative association
• r close to +1 or -1 indicates a strong linear association
• r close to 0 indicates a weak linear association
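A minimal sketch of how r can be computed, using the standard formula r = Sxy / sqrt(Sxx · Syy). The data values and the function name are hypothetical and serve only to illustrate the sign and size of r.

```python
from math import sqrt

def correlation(xs, ys):
    """Pearson correlation r between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and of cross-products.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data for illustration only.
xs = [1, 2, 3, 4, 5]
r_perfect_positive = correlation(xs, [2, 4, 6, 8, 10])   # points on a rising line
r_perfect_negative = correlation(xs, [10, 8, 6, 4, 2])   # points on a falling line
r_moderate = correlation(xs, [2, 1, 4, 3, 5])            # rising trend with scatter

print(f"{r_perfect_positive:.2f} {r_perfect_negative:.2f} {r_moderate:.2f}")
```

Because the formula treats x and y symmetrically, swapping the two lists gives the same r.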

3.3 Can We Predict the Outcome of a Variable?

Regression Line
• Predicts y, given x: ŷ = a + bx
• a is the y-intercept and b is the slope
• Only an estimate: actual data vary
• Describes the relationship between x and the estimated means of y
Image: farm4.static.flickr.com

Residuals
• Prediction errors: vertical distance between a data point and the regression line
• A large residual indicates an unusual observation
• Each residual is: y − ŷ
• The sum of the residuals is always zero for the least squares line
• Goal: minimize the distance from the data to the regression line

Least Squares Method
• Residual sum of squares: Σ(y − ŷ)²
• The least squares regression line minimizes the sum of squared vertical distances between the points and their predictions
Image: msenux.redwoods.edu
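The least squares idea can be sketched with the standard closed-form solution, b = Sxy / Sxx and a = ȳ − b·x̄. The data and helper names below are hypothetical; the sketch also checks that the residuals sum to zero and that nudging the line increases the residual sum of squares.

```python
# Hypothetical data for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]

def least_squares_line(xs, ys):
    """Return (a, b) for the line y-hat = a + b*x that minimizes the sum of squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def residual_sum_of_squares(a, b, xs, ys):
    """Sum of squared vertical distances between the data and the line a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

a, b = least_squares_line(xs, ys)
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
best_rss = residual_sum_of_squares(a, b, xs, ys)

print(f"y-hat = {a:.2f} + {b:.2f}x")
print(f"sum of residuals = {sum(residuals):.6f}")
print(f"residual sum of squares = {best_rss:.2f}")
```

Any other intercept or slope, for example `a + 0.1` or `b - 0.1`, gives a larger residual sum of squares on the same data.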

Regression Analysis
Identify response and explanatory variables
• Response variable is y
• Explanatory variable is x

Anthropologists Predict Height Using Remains?
• Regression Equation:
• ŷ is predicted height and x is the length of a femur, the thighbone (cm)
• Predict height for a femur length of 50 cm

Interpreting the y-Intercept and Slope
• y-intercept: predicted y-value when x = 0
• Helps plot the line
• Slope: change in predicted y for a 1-unit increase in x
• A 1 cm increase in femur length means a 2.4 cm increase in predicted height
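A short sketch of the femur prediction. The slope of 2.4 comes from the slide; the intercept is not preserved in this transcript, so the value 61.4 used below is an assumption made for illustration and should be checked against the original slide or textbook. The function name is also made up.

```python
SLOPE_CM_PER_CM = 2.4        # from the slide: 2.4 cm of height per 1 cm of femur
ASSUMED_INTERCEPT_CM = 61.4  # ASSUMPTION: not given in the transcript

def predict_height(femur_cm, a=ASSUMED_INTERCEPT_CM, b=SLOPE_CM_PER_CM):
    """Predicted height (cm) from femur length (cm) using y-hat = a + b*x."""
    return a + b * femur_cm

height_50 = predict_height(50)
# The slope is the change in predicted height per 1 cm of femur length.
change_per_cm = predict_height(51) - predict_height(50)

print(f"predicted height for a 50 cm femur: {height_50:.1f} cm")
print(f"change per extra cm of femur: {change_per_cm:.1f} cm")
```

Under the assumed intercept the prediction is 181.4 cm. The intercept itself is the predicted height at a femur length of 0 cm, which has no physical meaning here; it mainly fixes where the line sits.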

Slope Values: Positive, Negative, Zero

Slope and Correlation
• Correlation, r:
  - Describes strength
  - No units
  - Same if x and y are swapped
• Slope, b:
  - Doesn’t tell strength
  - Has units
  - Changes if x and y are swapped

Squared Correlation, r²
• Proportional reduction in error, r²
• Proportion of the variation in y-values explained by the linear relationship of y to x
• A correlation, r, of 0.9 means r² = 0.81: 81% of the variation in y is explained by x
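The following sketch ties r, the slope, and r² together on hypothetical data. It computes r² as a proportional reduction in error, (total sum of squares − residual sum of squares) / total sum of squares, and compares the slope for predicting y from x with the slope for predicting x from y. All names and numbers are illustrative assumptions.

```python
from math import sqrt

# Hypothetical data for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [4, 2, 8, 6, 10]

def mean(values):
    return sum(values) / len(values)

mean_x, mean_y = mean(xs), mean(ys)
sxx = sum((x - mean_x) ** 2 for x in xs)
syy = sum((y - mean_y) ** 2 for y in ys)   # total variation in y around its mean
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

r = sxy / sqrt(sxx * syy)   # unit-free, same if x and y are swapped
b = sxy / sxx               # slope for predicting y from x (units of y per unit of x)
a = mean_y - b * mean_x
b_swapped = sxy / syy       # slope for predicting x from y

# Error when predicting y with the regression line instead of with the mean of y.
rss = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
r_squared = (syy - rss) / syy

print(f"r = {r:.2f}, r squared = {r_squared:.2f}")
print(f"slope y on x = {b:.2f}, slope x on y = {b_swapped:.2f}")
```

On this data the slope for predicting x from y is not the reciprocal of b; the product of the two slopes equals r², so they are reciprocals only when r is +1 or -1.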

3.4 What Are Some Cautions in Analyzing Associations?

Extrapolation
• Extrapolation: predicting y for x-values outside the range of the data
• Riskier the farther x is from the range of the data
• No guarantee the trend holds
Neil Weiss, Elementary Statistics, 7th Edition

Outliers and Influential Points
• A regression outlier lies far away from the trend of the rest of the data
• An observation is influential if both:
  1. Its x-value is low or high compared to the rest of the data
  2. It is a regression outlier
Image: www2.selu.edu

Correlation Does Not Imply Causation
A strong correlation between x and y:
• Means a strong linear association between the variables
• Does not mean x causes y
Ex. 95.6% of cancer patients have eaten pickles, so do pickles cause cancer?

Lurking Variables & Confounding
1. Ice cream sales & drowning => temperature
2. Reading level & shoe size => age
• Confounding: two explanatory variables are both associated with the response variable and with each other
• Lurking variable: not measured in the study but may confound the association

Simpson’s Paradox Example
Probability of death of smoker = 139/582 = 24%
Probability of death of nonsmoker = 230/732 = 31%
Simpson’s Paradox:
• The direction of an association between two variables reverses after a third variable is included

Simpson’s Paradox Example
Break out data by age

Simpson’s Paradox Example
Associations look quite different after adjusting for a third variable
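A sketch of how such a reversal can arise. The overall proportions use the counts quoted on the slide; the age breakdown is not preserved in this transcript, so the per-age counts below are hypothetical numbers invented only to show the mechanism, not the study's data.

```python
# Overall death proportions from the counts quoted on the slide.
overall_smoker = 139 / 582
overall_nonsmoker = 230 / 732

# HYPOTHETICAL (deaths, total) counts by age group, invented for illustration.
hypothetical = {
    "younger": {"smoker": (10, 400), "nonsmoker": (2, 200)},
    "older":   {"smoker": (60, 100), "nonsmoker": (200, 400)},
}

def rate(deaths, total):
    """Proportion who died."""
    return deaths / total

def pooled_rate(data, status):
    """Death proportion for one smoking status, ignoring age group."""
    deaths = sum(groups[status][0] for groups in data.values())
    total = sum(groups[status][1] for groups in data.values())
    return deaths / total

print(f"slide data: smokers {overall_smoker:.0%}, nonsmokers {overall_nonsmoker:.0%}")

for age, groups in hypothetical.items():
    s = rate(*groups["smoker"])
    n = rate(*groups["nonsmoker"])
    print(f"{age}: smokers {s:.1%}, nonsmokers {n:.1%}")

pooled_s = pooled_rate(hypothetical, "smoker")
pooled_n = pooled_rate(hypothetical, "nonsmoker")
print(f"pooled: smokers {pooled_s:.1%}, nonsmokers {pooled_n:.1%}")
```

In the hypothetical counts, smokers have the higher death proportion within each age group but the lower one when ages are pooled, because the nonsmokers are concentrated in the older group where death rates are higher for everyone.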