Regression and Least Squares
The need for a mathematical construct… Insert fig 3.8
A scatterplot displays the direction, form, and strength of a relationship. As we saw in our example, our eyes are not good at judging strength, so we use correlation.
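3.2 - Correlation
The correlation measures the direction and strength of the linear relationship between two quantitative variables. Correlation is usually written as r. The formula for calculating r is:
r = \frac{1}{n-1} \sum_{i=1}^{n} \left( \frac{x_i - \bar{x}}{s_x} \right) \left( \frac{y_i - \bar{y}}{s_y} \right)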
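The values used in the formula for r are:
n – the number of individuals on which data were collected
(x_i - \bar{x}) / s_x – the standardized value (z-score) of each x, where \bar{x} and s_x are the mean and standard deviation of the x-values
(y_i - \bar{y}) / s_y – the standardized value (z-score) of each y, where \bar{y} and s_y are the mean and standard deviation of the y-values
Notice that correlation is just an average of the product of the standardized values for x and y (the sum of the products is divided by n - 1 rather than n).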
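As a rough illustration of the calculation described above, the following Python sketch standardizes each variable and averages the products of the z-scores. The function name `correlation` and the data values are made up for this example, and dividing by n - 1 is an assumption that matches the usual textbook definition of r.

```python
from statistics import mean, stdev


def correlation(xs, ys):
    """Return r: the sum of the z-score products divided by n - 1."""
    if len(xs) != len(ys):
        raise ValueError("x and y must have the same number of values")
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two individuals")

    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations

    # Standardize every value: z = (value - mean) / standard deviation.
    z_x = [(x - x_bar) / s_x for x in xs]
    z_y = [(y - y_bar) / s_y for y in ys]

    # "Average" the products of the paired z-scores.
    return sum(a * b for a, b in zip(z_x, z_y)) / (n - 1)


# Hypothetical data, chosen only to keep the arithmetic small.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = correlation(x, y)
print(round(r, 3))  # about 0.775
```

For these five points the result is a moderately strong positive r, which agrees with the upward drift of the y-values as x increases.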
Facts about correlation:
1. Correlation makes no distinction between explanatory and response variables. It makes no difference which variable you call x and which you call y when calculating correlation.
2. Correlation requires that both variables be quantitative; it cannot be calculated for categorical variables.
3. Because we use standardized values (z-scores), the units of measure for x and y do not matter. Correlation itself has no unit of measure.
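Facts 1 and 3 can be checked numerically. The sketch below uses a hypothetical helper `correlation` (again assuming the n - 1 form of the formula) and made-up heights and weights; it swaps the roles of x and y and then converts the units.

```python
import math
from statistics import mean, stdev


def correlation(xs, ys):
    """Return r as the sum of z-score products divided by n - 1."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    products = (((x - x_bar) / s_x) * ((y - y_bar) / s_y)
                for x, y in zip(xs, ys))
    return sum(products) / (n - 1)


# Hypothetical data: heights in inches and weights in pounds.
height_in = [60, 62, 65, 68, 71]
weight_lb = [110, 120, 140, 155, 180]

r_original = correlation(height_in, weight_lb)

# Fact 1: swapping which variable is called x and which is called y.
r_swapped = correlation(weight_lb, height_in)

# Fact 3: changing units (inches to centimeters, pounds to kilograms).
height_cm = [h * 2.54 for h in height_in]
weight_kg = [w * 0.45359237 for w in weight_lb]
r_converted = correlation(height_cm, weight_kg)

print(math.isclose(r_original, r_swapped))
print(math.isclose(r_original, r_converted))
```

All three values of r agree up to floating-point rounding, because standardizing removes both the units and any dependence on which variable is listed first.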
Facts about correlation (continued):
4. Positive r indicates positive association between the variables, and negative r indicates negative association.
5. When r is close to 0, there is a weak linear relationship. When r is close to -1 or 1, there is a strong linear relationship. If r = -1 or r = 1, the points lie exactly on a line.
6. Correlation only measures the strength of a linear relationship!
7. Correlation is not resistant (recall that the mean and standard deviation are used in the formula, and they are not resistant), so it is affected by outliers.
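Facts 5 through 7 can also be illustrated with a few small, made-up data sets. In the sketch below, the helper `correlation` is hypothetical and assumes the n - 1 form of the formula; the three data sets were chosen only to make each fact easy to see.

```python
from statistics import mean, stdev


def correlation(xs, ys):
    """Return r as the sum of z-score products divided by n - 1."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    products = (((x - x_bar) / s_x) * ((y - y_bar) / s_y)
                for x, y in zip(xs, ys))
    return sum(products) / (n - 1)


# Fact 5: points exactly on a downward-sloping line give r = -1.
x_line = [1, 2, 3, 4, 5]
y_line = [3 - 2 * x for x in x_line]
r_line = correlation(x_line, y_line)

# Fact 6: a perfect but curved relationship (y = x^2) can give r = 0,
# because r only measures the strength of a *linear* relationship.
x_curve = [-2, -1, 0, 1, 2]
y_curve = [x ** 2 for x in x_curve]
r_curve = correlation(x_curve, y_curve)

# Fact 7: a single outlier can change r dramatically.
x_clean = [1, 2, 3, 4, 5]
y_clean = [2, 4, 5, 4, 5]
r_clean = correlation(x_clean, y_clean)
r_outlier = correlation(x_clean + [10], y_clean + [0])

print(round(r_line, 3), round(r_curve, 3))
print(round(r_clean, 3), round(r_outlier, 3))
```

In this example, adding the one point (10, 0) is enough to turn a positive correlation into a negative one, which is why it helps to look at the scatterplot before trusting r.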
Insert fig 3.9
3.24 – Classifying Fossils
With your partner, work on problem 3.24 in the textbook.
Summary…
r measures the direction and strength of a linear relationship.
If r approaches 1, there is a strong positive linear relationship.
We say a linear relationship is weak as r approaches 0.
If r approaches -1, there is a strong negative linear relationship.