Published by Beverly Reed. Modified over 9 years ago.
1
MAT 1000 Mathematics in Today's World
2
Last Time We saw how to use the mean and standard deviation of a normal distribution to determine the percentile of a data value from that distribution. The Pth percentile of a distribution is the value below which P percent of the data falls. For instance: 80% of a distribution is less than the 80th percentile. For a normal distribution, we can find percentiles by computing a standard score, and then using a table to look up the percentile.
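As a rough illustration of that recap, here is a minimal Python sketch. It computes a standard score and then swaps the printed table for the standard normal cumulative distribution from Python's `statistics` module, which is an assumption about tooling rather than the method used in class. The function name and the numbers (mean 100, standard deviation 15, value 115) are hypothetical and do not come from the slides.

```python
from statistics import NormalDist

def percentile_from_normal(value, mean, sd):
    """Return (standard score, percentile) for a value from a normal distribution."""
    z = (value - mean) / sd            # standard score: distance from the mean in SDs
    pct = NormalDist().cdf(z) * 100    # percent of a standard normal below z
    return z, pct

# Hypothetical numbers, not taken from the slides.
z, pct = percentile_from_normal(115, mean=100, sd=15)
print(f"standard score = {z:.2f}, percentile = {pct:.1f}")
```

A value one standard deviation above the mean lands at roughly the 84th percentile, which is what a standard normal table gives for a standard score of 1.00.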
3
Today Recall that “variables” are characteristics or attributes of individuals. We will consider pairs of variables. In other words, we will look at a pair of characteristics of an individual. The key question will be: are these variables related? We will discuss scatterplots, which are a way to visualize data that consists of pairs of variables. We will talk about the key features of scatterplots: form, direction, and strength.
4
Today We will also talk about correlation. For a data set consisting of pairs of numbers, correlation is a number between -1 and 1. If the data has a linear form, then correlation tells us about the strength and direction of the relationship between the variables.
5
Pairs of variables What are some examples of pairs of variables? Height and weight of people. This gives us two numbers for each individual. The time it takes me to run a mile and my heart rate afterwards. Each time I run a mile I get another pair of numbers.
6
Pairs of variables We collect data on pairs of variables in order to study relationships between those variables. Sometimes we are interested in cause and effect relationships. Example: Will you live longer if you increase your intake of vitamin A? Cause: amount of vitamin A taken (in IUs). Effect: lifespan (in years). For each individual we get two numbers: average daily intake of vitamin A, and lifespan.
7
Pairs of variables When we talk about a pair of variables that we know, or at least believe or hope, have a cause and effect relationship, we use the following terms: The explanatory variable is what we believe to be the cause. The response variable is what we believe to be the effect. Statistics give us evidence for a cause and effect relationship, but statistics will not prove this.
8
Scatterplots Scatterplots are visual representations of pairs of data. There is a horizontal scale and a vertical scale. Each axis corresponds to one of the variables. Each individual is represented by one dot. The horizontal and vertical location of the dot corresponds to that individual's values of the two variables.
9
Scatterplots Scatterplot of the life expectancy of people in many nations against each nation’s gross domestic product per person.
10
Scatterplots
11
Interpreting scatterplots To interpret a scatterplot, look for three things: 1. Form 2. Direction 3. Strength. The form of a scatterplot is its overall shape. This may be a straight line, a curved line, or some other shape altogether. The strength is how closely the points follow that form.
12
Interpreting scatterplots We distinguish between two directions: positive and negative. This is especially useful when the form of a scatterplot is a straight line. (In this case, the direction corresponds to the sign of the slope of the line: positive slope = positive direction.) The rule for a positive direction: larger values of the explanatory variable correspond to larger values of the response variable. The rule for a negative direction: larger values of the explanatory variable correspond to smaller values of the response variable.
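The two rules can be sketched in Python. This is a minimal sketch, not part of the original slides: the function `direction` is a hypothetical helper that looks at the sign of the summed products of deviations from the means, which has the same sign as the slope of the least-squares line. The height and weight values are the five-man data set that appears later in this deck; the mile times and heart rates are made-up numbers for illustration only.

```python
def direction(xs, ys):
    """Classify direction from the sign of the summed products of deviations."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Positive when above-average x values tend to pair with above-average y values.
    total = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "no clear direction"

# Height (inches) and weight (pounds) from the deck's five-man example.
heights = [67, 72, 77, 74, 69]
weights = [155, 220, 240, 195, 175]

# Hypothetical mile times (minutes) and heart rates (beats per minute).
mile_times = [7.0, 7.5, 8.0, 8.5, 9.0]
heart_rates = [172, 168, 160, 155, 149]

print(direction(heights, weights))         # taller men tend to weigh more
print(direction(mile_times, heart_rates))  # longer times go with lower heart rates
```

The check only reports a general tendency; it says nothing about form or strength, which still have to be read off the scatterplot.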
13
Interpreting scatterplots Form: curved line. Strength: fairly strong. Direction: positive.
14
Interpreting scatterplots Form: straight line. Strength: moderate. Direction: positive.
15
If height and weight have a positive association what does that tell us? It means that taller people tend to weigh more. This is a statement about a general tendency. We don’t worry about the exceptions.
16
Interpreting scatterplots What about the time it takes me to run a mile and my heart rate afterwards? If I run faster, my time is less, and I’m working harder so my heart rate will go up. If I run slower, I will have a longer time, and I won’t be working as hard, so my heart rate won’t go up as much. What direction is this association? Negative: larger values of the explanatory variable (time) correspond to lower values of the response variable (heart rate).
17
Interpreting scatterplots In addition to form, direction, and strength, which are general features of a scatterplot, you should also note any outliers. On a scatterplot the outliers are dots that don’t fit into the overall pattern.
18
Interpreting scatterplots Sierra Leone is a clear outlier on this scatterplot.
19
Linear form To find the form of an association, look at a scatterplot. If one straight line gives a reasonable approximation to the scatterplot, the form is said to be “linear.” Let’s consider some examples.
20
Linear form
22
Non-linear form Not every relationship is linear. Example: Consider the relationship between the speed you drive and the gas mileage you get. As your speed increases, your mileage increases, up to a certain speed (usually around 55 or 60 mph). This will look roughly like a straight line. But around 55 or 60 mph (the exact speed depends on the type of car), your mileage begins to decrease. Let’s look at a scatterplot.
23
Non-linear form This is not a linear scatterplot
24
Non-linear form
25
Correlation
26
Interpreting correlation
Value of correlation | Strength of relationship
0.8 to 1.0, or -1.0 to -0.8 | Very strong
0.6 to 0.8, or -0.8 to -0.6 | Strong
0.4 to 0.6, or -0.6 to -0.4 | Moderate
0.2 to 0.4, or -0.4 to -0.2 | Weak
-0.2 to 0.2 | Either very weak, or not a linear relationship
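The table can be read as a simple lookup on the size of the correlation. The Python sketch below is one possible encoding, not something from the slides. The function name `strength` is hypothetical, and because adjacent ranges in the table share their endpoints, the sketch assumes a boundary value such as 0.6 belongs to the stronger band.

```python
def strength(r):
    """Map a correlation to the strength label in the table above."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation must lie between -1 and 1")
    size = abs(r)  # the sign gives direction; only the size matters for strength
    # Assumption: a value exactly on a boundary is placed in the stronger band.
    if size >= 0.8:
        return "very strong"
    if size >= 0.6:
        return "strong"
    if size >= 0.4:
        return "moderate"
    if size >= 0.2:
        return "weak"
    return "very weak, or not a linear relationship"

for r in (0.9, -0.5, 0.1):
    print(r, strength(r))
```

Taking the absolute value first reflects the table's symmetry: a correlation of -0.5 is as strong as one of 0.5, only in the opposite direction.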
27
Interpreting correlation Here are some concrete examples to give you a better feel for correlations: The correlation between SAT score and college GPA is about 0.6. The correlation between height and weight for American males is about 0.4. The correlation between income and education level in the United States is about 0.4. The correlation between a person’s income and the last 4 digits of their phone number is 0.
28
Interpreting correlation Here are examples of scatterplots for various values of the correlation, r. Notice the relationship between direction and sign, and also that the closer r is to 1 or -1, the stronger the association.
29
Calculating correlation Calculating correlations by hand takes some work. Example: Find the correlation between the height and weight of the following five men:
Height (inches): 67, 72, 77, 74, 69
Weight (pounds): 155, 220, 240, 195, 175
Notice that our data set has five individuals and two variables.
30
Calculating correlation
33
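Multiply the standard score of a person’s weight by the standard score of their height. Then we add up this last column.
Height (inches) | Standard score | Weight (pounds) | Standard score | Product
67 | -1.21 | 155 | -1.23 | 1.50
72 | 0.05 | 220 | 0.68 | 0.03
77 | 1.31 | 240 | 1.26 | 1.66
74 | 0.56 | 195 | -0.06 | -0.03
69 | -0.71 | 175 | -0.65 | 0.46
Sum of products: 3.61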
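The hand calculation can be checked with a short Python sketch. It assumes the standard scores use the sample standard deviation (dividing by n - 1), which reproduces the scores in the table, and it assumes the usual final step of dividing the sum of products by n - 1 to get the correlation, since that step is not visible in this transcript. The helper name `standard_scores` is hypothetical.

```python
from statistics import mean, stdev

heights = [67, 72, 77, 74, 69]        # inches, from the example
weights = [155, 220, 240, 195, 175]   # pounds, from the example

def standard_scores(values):
    """Standard score of each value, using the sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

z_heights = standard_scores(heights)
z_weights = standard_scores(weights)

# Multiply each man's two standard scores, then add up the products.
products = [zh * zw for zh, zw in zip(z_heights, z_weights)]
total = sum(products)

# Assumed final step: divide the sum by n - 1 to get the correlation.
r = total / (len(heights) - 1)

print(f"sum of products = {total:.2f}")
print(f"r = {r:.2f}")
```

Under these assumptions the sum of products comes out near 3.61, matching the table, and the correlation is about 0.90, a very strong positive association.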
34
Calculating correlation
36
Height (inches): 67, 72, 77, 74, 69
Weight (pounds): 155, 220, 240, 195, 175
Height (cm): 170, 183, 196, 188, 175
Weight (kg): 70, 100, 109, 88, 79
37
Calculating correlation
Height (cm): 170, 183, 196, 188, 175
Weight (kg): 70, 100, 109, 88, 79
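The slide's conclusion did not survive in this transcript, but the converted data presumably illustrates that correlation does not depend on the units of measurement. The Python sketch below checks this under that assumption. The helper `correlation` is hypothetical and uses the standard-score method with the sample standard deviation and a final division by n - 1. The conversion factors 2.54 cm per inch and 0.45359237 kg per pound are standard values, not taken from the slides.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Average product of standard scores, dividing by n - 1."""
    n = len(xs)
    mx, sx = mean(xs), stdev(xs)
    my, sy = mean(ys), stdev(ys)
    return sum((x - mx) / sx * (y - my) / sy for x, y in zip(xs, ys)) / (n - 1)

heights_in = [67, 72, 77, 74, 69]
weights_lb = [155, 220, 240, 195, 175]
heights_cm = [170, 183, 196, 188, 175]   # rounded values from the slide
weights_kg = [70, 100, 109, 88, 79]      # rounded values from the slide

r_imperial = correlation(heights_in, weights_lb)
r_metric = correlation(heights_cm, weights_kg)

# Exact unit conversion, without the slide's rounding to whole numbers.
r_exact = correlation([h * 2.54 for h in heights_in],
                      [w * 0.45359237 for w in weights_lb])

print(f"inches and pounds: r = {r_imperial:.2f}")
print(f"cm and kg (slide values): r = {r_metric:.2f}")
print(f"cm and kg (exact conversion): r = {r_exact:.2f}")
```

With exact conversion the correlation is unchanged, because standard scores strip out the units. The slide's rounded metric values give a result that differs only slightly, and all three print as about 0.90.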