Association between 2 variables

Presentation transcript:

Association between 2 variables
We've described the distribution of 1 variable (univariate), but what if 2 variables are measured on the same individual (bivariate)? Examples? How could you describe the association between the two?
Our descriptions will depend upon the types of variables (categorical or quantitative):
categorical vs. categorical - Examples?
categorical vs. quantitative - Examples?
quantitative vs. quantitative - Examples?

Explanatory variable vs. response variable
One common task is to show that one variable (the explanatory variable) can be used to explain variation in the other (the response variable). These are sometimes called the independent and dependent variables.
These associations can be explored both graphically and numerically:
begin your analysis with graphics
find a pattern and look for deviations from the pattern
look for a mathematical model to describe the pattern
But again, how we do the above depends on what types of variables we have. We'll start with quantitative vs. quantitative ...

A scatterplot is the best graph for showing relationships between two quantitative variables.
In a scatterplot, one axis is used to represent each of the variables, and the data are plotted as points on the graph.
[Table: Student, Beers, BAC (16 students)]
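As a rough sketch of how such a scatterplot could be drawn in R: the values below are made up for illustration and are not the slide's data table, and the names `beers` and `bac` are assumptions.

```r
# Hypothetical values for illustration only (not the slide's data table).
beers <- c(1, 2, 3, 4, 5, 6, 7, 8)
bac   <- c(0.01, 0.03, 0.04, 0.06, 0.08, 0.09, 0.11, 0.13)

# One point per student: number of beers on the x axis, BAC on the y axis.
plot(beers, bac,
     xlab = "Number of beers",
     ylab = "Blood alcohol content (BAC)",
     main = "BAC vs. number of beers (made-up data)",
     pch  = 19)
```

Each student contributes exactly one point, so the two vectors must have the same length.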

Explanatory and response variables
A response variable measures or records an outcome of a study. An explanatory variable explains changes in the response variable.
Typically, the explanatory or independent variable is plotted on the x axis, and the response or dependent variable is plotted on the y axis.
Explanatory (independent) variable, x: number of beers
Response (dependent) variable, y: blood alcohol content

Describe the pattern of the relationship between the two variables in a scatterplot by its direction, strength, and form.
direction: positive, negative, or flat (no direction)
strength: strong, weak, moderately strong, etc.
form: linear, curved (non-linear), clusters, no pattern
See example to the right…

Form and direction of an association
[Figure panels: Linear, No relationship, Nonlinear]

Positive association: High values of one variable tend to occur together with high values of the other variable.
Negative association: High values of one variable tend to occur together with low values of the other variable.
The scatterplots below show perfect linear associations.
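A minimal sketch of perfect positive and negative linear associations, using made-up values. The correlation function `cor()` is not introduced on these slides; it is used here only as an assumed numerical check of direction.

```r
# Made-up data: two exact straight-line relationships.
x     <- 1:10
y_pos <- 2 * x + 1    # y rises as x rises: positive association
y_neg <- 20 - 3 * x   # y falls as x rises: negative association

# The sign of the correlation matches the direction of the association.
r_pos <- cor(x, y_pos)
r_neg <- cor(x, y_neg)

# Draw the two scatterplots side by side.
par(mfrow = c(1, 2))
plot(x, y_pos, pch = 19, main = "Positive association")
plot(x, y_neg, pch = 19, main = "Negative association")
par(mfrow = c(1, 1))
```

Because both relationships are exactly linear, `r_pos` and `r_neg` sit at the two extremes of the correlation scale (up to rounding).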

No relationship: X and Y vary independently. Knowing X tells you nothing about Y.
One way to think about this is to imagine a line through the data points. The equation for that line is y = 5; x is not involved.
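The "y = 5" idea can be sketched in R with made-up values chosen to scatter around 5 with no linear trend. Fitting a straight line with `lm()` is an assumption here; regression is not covered on these slides.

```r
# Made-up values scattered around 5, with no upward or downward trend.
x <- 1:8
y <- c(5, 3, 7, 5, 5, 7, 3, 5)

# Fit a straight line y = a + b * x and pull out the two coefficients.
fit       <- lm(y ~ x)
intercept <- unname(coef(fit)[1])
slope     <- unname(coef(fit)[2])

# The fitted line is horizontal at the mean of y, so x plays no part in it.
plot(x, y, pch = 19)
abline(h = mean(y), lty = 2)
```

For these values the fitted slope is zero (up to rounding) and the intercept equals the mean of y, which is 5. A zero slope only rules out a linear trend, so it is still worth looking at the plot for other patterns.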

Strength of the relationship or association ...
This is a weak relationship. For a particular state median household income, you can’t predict the state per capita income very well.
This is a very strong relationship. The daily amount of gas consumed can be predicted quite accurately for a given temperature value.

What if there are categorical variables involved, either as the explanatory variable or as a “lurking variable”?
A scatterplot can sometimes help by indicating the categories of the lurking variable with different plotting symbols or colors...
Often, though, the best way to see the pattern when the explanatory variable is categorical is to draw side-by-side boxplots. Put the categorical variable on the horizontal axis, and draw a boxplot for each category, side by side.
Here are some examples of various explanatory, lurking, and response variables...

Categorical variables in scatterplots
Often, things are not simple and one-dimensional. We need to group the data into categories to reveal trends.
Lurking Variable! What may look like a positive linear relationship is in fact a series of negative linear associations. Plotting different habitats (the lurking variable) in different colors allows us to make that important distinction.
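The effect can be reproduced with a small made-up data set. The habitat labels "A", "B", "C" and all values are hypothetical, and `cor()` is used only as an assumed numerical check.

```r
# Made-up data: three habitats, each with its own negative trend.
habitat <- factor(rep(c("A", "B", "C"), each = 4))
x <- c(1, 2, 3, 4,   4, 5, 6, 7,    7,  8,  9, 10)
y <- c(5, 4, 3, 2,   9, 8, 7, 6,   13, 12, 11, 10)

# Ignoring habitat, the overall association is positive ...
r_overall <- cor(x, y)

# ... but within each habitat the association is negative.
r_within <- sapply(split(data.frame(x, y), habitat),
                   function(d) cor(d$x, d$y))

# Color the points by habitat so the separate trends become visible.
plot(x, y, col = as.integer(habitat), pch = 19)
legend("topleft", legend = levels(habitat),
       col = seq_along(levels(habitat)), pch = 19)
```

Plotted in one color, these points would look like a single upward cloud; the colors reveal three downward-sloping groups.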

Comparison of men and women racing records over time. Each group shows a very strong negative linear relationship that would not be apparent without the gender categorization. Relationship between lean body mass and metabolic rate in men and women. Both men and women follow the same positive linear trend, but women show a stronger association. As a group, males typically have larger values for both variables.

Look at this figure. Note the ordinal scale of the explanatory variable, education level. Are these two variables associated? Why? The next slide is tricky...

Example: Beetles trapped on boards of different colors
Beetles were trapped on sticky boards scattered throughout a field. The sticky boards were of four different colors (categorical explanatory variable). The number of beetles trapped (response variable) is shown on the graph below.
[Figure: number of beetles trapped by board color, shown twice with the colors in different orders: blue, white, green, yellow and yellow, white, green, blue]
What association? What relationship? Describe one category at a time.
When both variables are quantitative, the order of the data points is defined entirely by their value. This is not true for categorical data.
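A sketch of side-by-side boxplots for a categorical explanatory variable. The counts are made up for illustration and are not read from the slide's graph; the data frame name `beetles` and its columns are assumptions.

```r
# Made-up counts for illustration; not the values from the slide's graph.
beetles <- data.frame(
  color = factor(rep(c("blue", "green", "white", "yellow"), each = 3)),
  count = c(12, 15, 14,   30, 35, 33,   18, 20, 16,   45, 50, 48)
)

# One boxplot per board color, side by side.
boxplot(count ~ color, data = beetles,
        xlab = "Board color", ylab = "Beetles trapped")

# The same data with the categories in a different order.
beetles$reordered <- factor(beetles$color,
                            levels = c("yellow", "white", "green", "blue"))
boxplot(count ~ reordered, data = beetles,
        xlab = "Board color", ylab = "Beetles trapped")

# Per-color means are the same whichever order the categories are drawn in.
means_original  <- tapply(beetles$count, beetles$color, mean)
means_reordered <- tapply(beetles$count, beetles$reordered, mean)
```

The two plots can suggest different "trends" from left to right even though the data are identical, which is why each category is described on its own.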

HW: Start reading Notes 2.1 on Bivariate Data with R. Then . . .
1. Load the lean body mass data (lbm.csv) into R using the read.csv function. We are interested in knowing if lean body mass explains metabolic rate.
> # first, save the file on your desktop … then read it into R
> bodymass = read.csv(file=file.choose())
> str(bodymass) # to see the structure of the data frame
> attach(bodymass)
> plot(x,y) # to see a scatterplot of the two variables
> # which variable is x? y?
> # how would you describe the relationship you see?
> # don't forget: direction, strength, and form.
> # is the relationship different for males and females?
2. Bring in bivariate data on two quantitative variables in your field that you can analyze with R - we'll plot it, correlate it, do regression on it… Is one of your variables explanatory while the other is the response? Or not?