Bivariate EDA

Quantitative Bivariate EDA

Bivariate EDA: describe the relationship between pairs of variables
– Graphically
– Numerically
– Model

Figure 1. Plot of the percent female kingfishers observed at different latitudes during the Christmas Bird Count, 1992.
1. What is the name of this plot?
2. What type of variable is latitude?
3. Which variable is considered the response variable?
4. What is the approximate percentage of females at a latitude of 55? At 45? At 35?

Variables & Axes
Response (dependent) variable
– variability is being explained or values predicted
– y-axis
Explanatory (independent, predictor) variable
– used to explain variability or to make predictions
– x-axis

Bivariate EDA -- Description
What four things are described in a bivariate EDA for quantitative data?
Association/Direction – what words are used?
– Positive
– Negative
– None

What Type of Association? [Scatterplot of Y versus X] Negative
What Type of Association? [Scatterplot of Y versus X] Positive
What Type of Association? [Scatterplot of Y versus X] None

Items to Describe in a Bivariate EDA
– Association/Direction
– Form – what two forms will we consider?
  – Linear
  – Non-linear

Items to Describe in a Bivariate EDA
– Association/Direction
– Form
– Outliers
[Scatterplot of Y versus X]

Items to Describe in a Bivariate EDA
– Association/Direction
– Form
– Outliers
– Strength -- how closely the points cluster to the form

Strength? [Scatterplot of Y versus X]
Which Is Stronger?

Correlation Coefficient
1. Standardize both X and Y
2. Multiply the paired standardized values
3. Sum the products
4. Divide by n-1
$$r = \frac{\sum_{i=1}^{n} \left(\frac{x_i - \bar{x}}{s_x}\right)\left(\frac{y_i - \bar{y}}{s_y}\right)}{n-1}$$
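
The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the deck itself points to R's cor() for actual computation, and the function name `correlation_r` and the sample data are invented for this example.

```python
from statistics import mean, stdev

def correlation_r(x, y):
    """Correlation coefficient r, following the four steps on the slide."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    s_x, s_y = stdev(x), stdev(y)  # sample standard deviations (n-1 divisor)
    # 1. Standardize both X and Y
    z_x = [(xi - x_bar) / s_x for xi in x]
    z_y = [(yi - y_bar) / s_y for yi in y]
    # 2. Multiply the paired standardized values
    products = [zx * zy for zx, zy in zip(z_x, z_y)]
    # 3. Sum the products
    total = sum(products)
    # 4. Divide by n-1
    return total / (n - 1)

# Hypothetical data, invented for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = correlation_r(x, y)
print(round(r, 3))
```

Using the sample standard deviation (n-1 divisor) in step 1 is what makes dividing by n-1 in step 4 give a value between -1 and 1.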

Correlation Coefficient: a measure of association/direction

r for Positive Association? [Scatterplot of Y versus X] Standardize both X and Y
r for Positive Association? [Scatterplot of Y versus X] Multiply the paired standardized values
r for Positive Association? [Scatterplot of Y versus X] Sum the products: positive
r for Positive Association? [Scatterplot of Y versus X] Divide by n-1: positive
r for Positive Association? [Scatterplot of Y versus X] Thus, r is positive
r for Negative Association? [Scatterplot of Y versus X] Thus, r is negative
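
The sign argument on these slides can be checked numerically. A point above both means or below both means has a positive product of standardized values; a point above one mean and below the other has a negative product. The sketch below assumes made-up data, and the helper name `standardized_products` is hypothetical.

```python
from statistics import mean, stdev

def standardized_products(x, y):
    """Return the product of standardized values for each (x, y) pair."""
    x_bar, y_bar = mean(x), mean(y)
    s_x, s_y = stdev(x), stdev(y)
    return [((xi - x_bar) / s_x) * ((yi - y_bar) / s_y)
            for xi, yi in zip(x, y)]

# Hypothetical data, invented for illustration only
x = [1, 2, 3, 4, 5, 6]
y_up = [2, 3, 5, 6, 8, 9]    # y tends to increase with x
y_down = [9, 8, 6, 5, 3, 2]  # y tends to decrease with x

for label, y in [("positive", y_up), ("negative", y_down)]:
    products = standardized_products(x, y)
    r = sum(products) / (len(x) - 1)
    # Show the sign of each product, then the resulting r
    signs = "".join("+" if p > 0 else "-" if p < 0 else "0" for p in products)
    print(label, signs, round(r, 3))
```

With these particular data every product has the same sign within each set, so the sum, and therefore r, takes that sign. Real data usually mix signs, and r reflects which sign dominates.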

Correlation Coefficient: a measure of association and strength
[Scale of r: weakest at 0, strongest at +1 (and, by symmetry, at -1)]

Correlation Coefficient
A measure of association and strength of a linear relationship with no outliers
[Figure: r = 0.817]
Moral … PLOT YOUR DATA!!
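
A small numerical sketch of why r should be read alongside a plot. The data are invented for illustration and are not the data behind the slide's figure; `correlation_r` is a hypothetical helper that follows the standardize, multiply, sum, divide-by-n-1 recipe.

```python
from statistics import mean, stdev

def correlation_r(x, y):
    """Correlation coefficient: mean-like sum of standardized products over n-1."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    s_x, s_y = stdev(x), stdev(y)
    return sum(((xi - x_bar) / s_x) * ((yi - y_bar) / s_y)
               for xi, yi in zip(x, y)) / (n - 1)

# Hypothetical data, invented for illustration only
x = [-3, -2, -1, 0, 1, 2, 3]

# A perfect but non-linear relationship: r is essentially 0
y_curve = [xi ** 2 for xi in x]
r_curve = correlation_r(x, y_curve)

# A perfectly linear relationship (y = x + 4): r is 1
y_line = [1, 2, 3, 4, 5, 6, 7]
r_line = correlation_r(x, y_line)

# The same line with the last point replaced by an outlier: r turns negative
y_outlier = y_line[:-1] + [-20]
r_outlier = correlation_r(x, y_outlier)

print(round(r_curve, 3), round(r_line, 3), round(r_outlier, 3))
```

Neither the curved form nor the outlier is visible from r alone, which is why the form and outliers are checked on the scatterplot first.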

Correlation Review
– Variables must be quantitative
– Form must be linear without outliers -- i.e., PLOT
– -1 ≤ r ≤ 1
– No distinction between which variable is on x and which is on y (though the response variable should always be on y)
– r does not depend on the units of x and y
– Correlation is not causation
– We won't compute r by hand -- we must interpret it and identify strength
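
Two of the review points, that r ignores which variable is x and which is y and that r does not depend on units, can be checked with a short sketch. The measurements below are hypothetical, and `correlation_r` is an illustrative helper rather than anything from the course handout.

```python
from statistics import mean, stdev

def correlation_r(x, y):
    """Correlation coefficient: sum of standardized products over n-1."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    s_x, s_y = stdev(x), stdev(y)
    return sum(((xi - x_bar) / s_x) * ((yi - y_bar) / s_y)
               for xi, yi in zip(x, y)) / (n - 1)

# Hypothetical measurements, invented for illustration only
height_in = [60, 62, 65, 68, 70, 72]
weight_lb = [115, 120, 140, 155, 160, 180]

# Same measurements in different units
height_cm = [h * 2.54 for h in height_in]
weight_kg = [w * 0.45359237 for w in weight_lb]

r_original = correlation_r(height_in, weight_lb)
r_converted = correlation_r(height_cm, weight_kg)  # units changed
r_swapped = correlation_r(weight_lb, height_in)    # x and y exchanged

print(r_original, r_converted, r_swapped)
```

All three values agree up to floating-point rounding, because standardizing removes the units and the product of standardized values is the same in either order.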

Perform a bivariate EDA from Figure 1.
Figure 1. Plot of the percent female kingfishers observed at different latitudes during the Christmas Bird Count, 1992. r =

Perform a bivariate EDA from Figure 2.
Figure 2. Plot of the maximum temperature versus herbage yield for grassland headfires in west Texas. r = 0.798

Perform a bivariate EDA from Figure 3.
Figure 3. Plot of the number of pupae per gallery and the density of attacks for the beetle Ips cembrae. r = -0.612

Quantitative Bivariate EDA in R
Examine handout
– plot()
– cor()