MAT 1000 Mathematics in Today's World. Last Time We saw how to use the mean and standard deviation of a normal distribution to determine the percentile.


1 MAT 1000 Mathematics in Today's World

2 Last Time We saw how to use the mean and standard deviation of a normal distribution to determine the percentile of a data value from that distribution. The Pth percentile of a distribution is a value which P percent of the data is less than. For instance: 80% of a distribution is less than the 80th percentile. For a normal distribution, we can find percentiles by computing a standard score, and then using a table to look up the percentile.
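As a rough sketch of that recap, the percentile lookup can be done in Python. This swaps the printed table for the normal cumulative distribution function from the standard library. The mean, standard deviation, and data value below are made-up numbers for illustration, not values from the slides.

```python
from statistics import NormalDist

# Hypothetical normal distribution (assumed values, not from the slides).
mean = 100
standard_deviation = 15
value = 115

# Standard score: how many standard deviations the value lies above the mean.
standard_score = (value - mean) / standard_deviation

# The standard normal CDF plays the role of the lookup table:
# it gives the fraction of the distribution below the standard score.
percentile = 100 * NormalDist().cdf(standard_score)

print(f"standard score = {standard_score:.2f}")
print(f"percentile = {percentile:.1f}")
```

A standard score of 1 lands at roughly the 84th percentile, which is the same figure a standard normal table would give.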

3 Today Recall that “variables” are characteristics or attributes of individuals. We will consider pairs of variables. In other words, we will look at a pair of characteristics of an individual. The key question will be: are these variables related? We will discuss scatterplots, which are a way to visualize data that consists of pairs of variables. We will talk about the key features of scatterplots: form, direction, and strength.

4 Today We will also talk about correlation. For a data set consisting of pairs of numbers, correlation is a number between -1 and 1. If the data has a linear form, then correlation tells us about the strength and direction of the relationship between the variables.

5 Pairs of variables What are some examples of pairs of variables? Height and weight of people. This gives us a pair of numbers for each individual. The time it takes me to run a mile and my heart rate afterwards. Each time I run a mile I get another pair of numbers.

6 Pairs of variables We collect data on pairs of variables in order to study relationships between those variables. Sometimes we are interested in cause and effect relationships. Example Will you live longer if you increase your intake of vitamin A? Cause: Amount of vitamin A taken (in IUs) Effect: Lifespan (in years) For each individual we get two numbers: average daily intake of vitamin A, and lifespan.

7 Pairs of variables When we talk about a pair of variables that we know, or at least believe or hope, have a cause and effect relationship, we use the following terms: The explanatory variable is what we believe to be the cause. The response variable is what we believe to be the effect. Statistics give us evidence for a cause and effect relationship, but statistics will not prove this.

8 Scatterplots Scatterplots are visual representations of pairs of data. There is a horizontal scale and a vertical scale. Each direction corresponds to one of the variables. Each individual is represented by one dot. The horizontal and vertical location of the dot corresponds to the values of each variable.

9 Scatterplots Scatterplot of the life expectancy of people in many nations against each nation’s gross domestic product per person.

10 Scatterplots

11 Interpreting scatterplots To interpret a scatterplot, look for three things: 1. Form 2. Direction 3. Strength. The form of a scatterplot is its overall shape. This may be a straight line, a curved line, or some other shape altogether. The strength is how close the scatterplot is to its form.

12 Interpreting scatterplots We distinguish between two directions: positive and negative. This is especially useful when the form of a scatterplot is a straight line. (In this case, the direction corresponds to the sign of the slope of the line: positive slope = positive direction.) The rule for a positive direction: larger values of the explanatory variable correspond to larger values of the response variable. The rule for a negative direction: larger values of the explanatory variable correspond to smaller values of the response variable.

13 Interpreting scatterplots Form: curved line. Strength: fairly strong. Direction: positive.

14 Interpreting scatterplots Form: straight line. Strength: moderate. Direction: positive.

15 If height and weight have a positive association what does that tell us? It means that taller people tend to weigh more. This is a statement about a general tendency. We don’t worry about the exceptions.

16 Interpreting scatterplots What about the time it takes me to run a mile and my heart rate afterwards? If I run faster, my time is less, and I’m working harder so my heart rate will go up. If I run slower, I will have a longer time, and I won’t be working as hard, so my heart rate won’t go up as much. What direction is this association? Negative: larger values of the explanatory variable (time) correspond to lower values of the response variable (heart rate).

17 Interpreting scatterplots In addition to form, direction, and strength, which are general features of a scatterplot, you should also note any outliers. On a scatterplot the outliers are dots that don’t fit into the overall pattern.

18 Interpreting scatterplots Sierra Leone is a clear outlier on this scatterplot.

19 Linear form To find the form of an association, look at a scatterplot. If one straight line gives a reasonable approximation to the scatterplot, the form is said to be “linear.” Let’s consider some examples.

20 Linear form

21

22 Non-linear form Not every relationship is linear. Example Consider the relationship between the speed you drive and the gas mileage you get. As your speed increases, your mileage increases, up to a certain speed (usually around 55 or 60 mph). This will look roughly like a straight line. But around 55 or 60 mph (the exact speed depends on the type of car), your mileage begins to decrease. Let’s look at a scatterplot.

23 Non-linear form This is not a linear scatterplot.

24 Non-linear form

25 Correlation

26 Interpreting correlation
Value of correlation | Strength of relationship
0.8 to 1.0 or -1.0 to -0.8 | Very strong
0.6 to 0.8 or -0.8 to -0.6 | Strong
0.4 to 0.6 or -0.6 to -0.4 | Moderate
0.2 to 0.4 or -0.4 to -0.2 | Weak
-0.2 to 0.2 | Either very weak, or not a linear relationship
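The table on this slide can be read as a simple lookup on the size of the correlation, ignoring its sign. Here is a minimal sketch of that lookup. The function name `describe_correlation` is invented for illustration. The table's ranges share their endpoints (0.8 appears in two rows), so the sketch assumes a boundary value goes to the stronger category.

```python
def describe_correlation(r):
    """Return the strength label from the slide's table for a correlation r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation must lie between -1 and 1")

    size = abs(r)  # strength depends only on the size, not the sign

    # Assumption: a value exactly on a boundary counts as the stronger label.
    if size >= 0.8:
        return "very strong"
    if size >= 0.6:
        return "strong"
    if size >= 0.4:
        return "moderate"
    if size >= 0.2:
        return "weak"
    return "either very weak, or not a linear relationship"


for r in (0.9, -0.65, 0.4, -0.25, 0.0):
    print(r, "->", describe_correlation(r))
```

The sign of r is deliberately dropped here because it describes direction, not strength.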

27 Interpreting correlation Here are some concrete examples to give you a better feel for correlations: The correlation between SAT score and college GPA is about 0.6. The correlation between height and weight for American males is about 0.4. The correlation between income and education level in the United States is about 0.4. The correlation between a person’s income and the last 4 digits of their phone number is 0.

28 Interpreting correlation Here are examples of scatterplots for various values of r. Notice the relationship between direction and sign, and also that the closer r is to 1 or -1, the stronger the association.

29 Calculating correlation Calculating correlations by hand takes some work. Example Find the correlation between the height and weight of the following five men:
Height (inches) | 67 | 72 | 77 | 74 | 69
Weight (pounds) | 155 | 220 | 240 | 195 | 175
Notice that our data set has five individuals and two variables.

30 Calculating correlation

31

32

33 Multiply the standard score of a person’s weight by the standard score of their height. Then we add up this last column.
Height | Standard score | Weight | Standard score | Product
67 | -1.21 | 155 | -1.23 | 1.50
72 | 0.05 | 220 | 0.68 | 0.03
77 | 1.31 | 240 | 1.26 | 1.66
74 | 0.56 | 195 | -0.06 | -0.03
69 | -0.71 | 175 | -0.65 | 0.46
Sum of products: 3.61
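The hand calculation on this slide can be checked with a short script. The sketch below recomputes the standard scores and their products for the five men. The text of the following slides did not survive, so the last step, dividing the sum by n - 1, is an assumption based on the usual formula for correlation. The standard deviations use the n - 1 version, which is what reproduces the standard scores shown on the slide.

```python
from statistics import mean, stdev

heights = [67, 72, 77, 74, 69]        # inches, from the slides
weights = [155, 220, 240, 195, 175]   # pounds, from the slides


def standard_scores(values):
    """Convert each value to (value - mean) / standard deviation."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation (divides by n - 1)
    return [(v - m) / s for v in values]


height_scores = standard_scores(heights)
weight_scores = standard_scores(weights)

# Multiply each man's two standard scores, then add up the products.
products = [h * w for h, w in zip(height_scores, weight_scores)]
total = sum(products)

# Assumed final step: divide the sum by n - 1 to get the correlation.
r = total / (len(heights) - 1)

for h, zh, w, zw, p in zip(heights, height_scores, weights, weight_scores, products):
    print(f"{h:3d} {zh:6.2f} {w:4d} {zw:6.2f} {p:6.2f}")
print(f"sum of products = {total:.2f}")
print(f"correlation r   = {r:.2f}")
```

A correlation of about 0.90 falls in the "very strong" range of the table on slide 26.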

34 Calculating correlation

35

36
Height (inches) | 67 | 72 | 77 | 74 | 69
Weight (pounds) | 155 | 220 | 240 | 195 | 175
Height (cm) | 170 | 183 | 196 | 188 | 175
Weight (kg) | 70 | 100 | 109 | 88 | 79

37 Calculating correlation
Height (cm) | 170 | 183 | 196 | 188 | 175
Weight (kg) | 70 | 100 | 109 | 88 | 79
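Slides 36 and 37 restate the same five men in metric units, but the accompanying explanation did not survive. One plausible reading, and it is only an assumption about the slides' intent, is that they illustrate that correlation does not depend on the units of measurement. The sketch below computes the correlation for both versions of the data. The helper name `correlation` is illustrative, and dividing by n - 1 follows the usual formula.

```python
from statistics import mean, stdev


def correlation(xs, ys):
    """Sum of products of standard scores, divided by n - 1."""
    mean_x, mean_y = mean(xs), mean(ys)
    sd_x, sd_y = stdev(xs), stdev(ys)
    total = sum(
        ((x - mean_x) / sd_x) * ((y - mean_y) / sd_y)
        for x, y in zip(xs, ys)
    )
    return total / (len(xs) - 1)


# Data from the slides, in both unit systems.
r_imperial = correlation([67, 72, 77, 74, 69], [155, 220, 240, 195, 175])
r_metric = correlation([170, 183, 196, 188, 175], [70, 100, 109, 88, 79])

print(f"inches and pounds: r = {r_imperial:.2f}")
print(f"cm and kg:         r = {r_metric:.2f}")
```

The two results agree to two decimal places. The tiny difference that remains comes from the metric values on the slide being rounded to whole numbers.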

