1
Scatter Plots and Bivariate Data
Bridge Design Experiment: Today you will conduct an experiment to study the relationship between a bridge’s thickness and its strength. Because we are collecting data for two different changing quantities, this is called bivariate data. We will model bridge design by using card stock paper strips. We will model thickness by adding layers to our bridge. We will test strength by counting the number of pennies it takes to make our bridge collapse. The more pennies it takes, the stronger our bridge.
2
Methodology (controls)
As in science, because we only want to test the relationship between thickness and strength, it’s important to control for other possible variables by making sure we all conduct our experiments consistently.
Bridge height? All our bridges will rest upon copies of the same “Pre-Algebra” textbook, for uniform height.
Bridge length? All our strips are the same length, and each strip overlaps each textbook by 1 inch at each end.
Material integrity? Each time a new layer is added, flip the paper over so that the paper does not bend more and more easily in just a single direction.
Weight placement? All our weight must be placed at the center of the bridge. A bridge is naturally stronger in some places than in others, so it’s important that we all put the pennies in the same location. How we place the pennies should also be consistent. Pennies should be placed in the cup gently and carefully. We only want to test the weight, not the force of that weight hitting the cup. After each new layer is added, empty the cup so each new trial begins with adding just one penny to an empty cup.
Bridge collapse? We will all agree that the bridge “collapses” when the center touches the desk.
3
Group Directions
Quality Control: 1 person will ensure that each end consistently overlaps its textbook by 1 inch before the weight is added, that the cup is placed at the center of the bridge, and that the pennies are being added GENTLY to the cup.
Paper Manager: 1 person will add the layers after each collapse, making sure that the previous layers are flipped.
Penny Placer: 1 person will place the pennies in the cup, consistently and gently, counting out loud as each new penny is added.
Data Recorder: 1 person will keep a table, and fill in the number of pennies it took to collapse the bridge for each new layer.
Extra Person: Any students not assigned to the above roles should be extra Data Recorders, conferring with the other Data Recorders to ensure you all have the same data.
Upon completion of the experiment, the Data Recorder will plot 5 points from their results on the class scatter plot with a given marker. Each point will be an ordered pair, where x = # of layers, and y = # of pennies.
Thickness (# of layers) | 1 | 2 | 3 | 4 | 5
Strength (# of pennies to cause collapse) | | | | |
4
Scatter Plots and Bivariate Data
We’ve been learning a lot about a special mathematical relationship called a function, and certain subcategories, like linear functions and proportional relationships. However, when one quantity is NOT a function of another, are those two quantities then completely unrelated? Do changes to one quantity have absolutely no effect on the other? Actually, in many cases relations that are not functions are still very important to study and understand. Any time we collect data for two changing quantities, this is called Bivariate Data: data for two variables. The way we can tell if these quantities are related, and how they are related, is by analyzing a graph called a Scatter Plot: a graph where each data point is represented by a specific ordered pair, but the points are not connected. When a specific change in one quantity is associated with a specific change in another quantity, we call this a correlation (also called an association) between the two quantities. There are different types of correlations, which we will study.
5
Different Types of Correlations
Negative Correlation (association): As x increases, y tends to decrease.
Positive Correlation (association): As x increases, y also tends to increase.
No Correlation: As x increases, there is no noticeable trend or association in y.
[Example graph axes: Atmospheric Pressure (in 100’s kPa) vs. Altitude (in 1,000’s feet)]
6
Weaker but Positive Correlation vs. Stronger Correlation
Data Clustering: We notice trends in the data when we see “clustering”, which means that much of the data is close together.
- Weaker associations are seen when clustering is still observed, but the points are not as tightly packed.
- Stronger associations are seen when the points are clustered much closer together.
Outliers in a data set are data points that lie outside of the main cluster. Outliers generally do not affect the trend seen in a cluster.
7
How do we Analyze our Bivariate Data, using a Scatter Plot?
We will try to see if there is a certain type of correlation between our two sets of data (our two variables), so that we may predict how one quantity will be affected when the other quantity changes.
Positive Correlation (association): a correlation such that while one quantity (x) increases, the other quantity (y) tends to increase as well. Ex: There is a positive correlation between saturated fat intake and risk of heart attack, because as one’s saturated fat intake INCREASES, one’s risk of heart attack TENDS to INCREASE.
Negative Correlation (association): a correlation such that while one quantity (x) increases, the other quantity (y) tends to decrease. Ex: There is a negative correlation between how many years one has been a cigarette smoker and lung capacity, because as the number of years someone smokes cigarettes INCREASES, their lung capacity TENDS to DECREASE.
No Correlation: We find there is no correlation when a change in one quantity (x) produces no noticeable effect or trend in another quantity (y). Ex: There is no correlation between a city’s average annual temperature and that city’s average household income, because as a city’s average annual temperature increases, that city’s average household income tends neither to decrease nor to increase. There is no correlation because these two quantities are unrelated.
8
For each, explain why you see a positive correlation, a negative correlation, or no correlation. If you want, include whether you think the association is strong or weak.
9
POD: How many times farther is $2 \times 10^5$ miles than $4 \times 10^3$ miles?
10
Data For 2 Scatter Plots
City | Location (degrees north latitude) | Daily Mean Temperature (F) | Mean Annual Precipitation (inches)
Atlanta, GA | 34 | 61 | 51
Boston, MA | 42 | (missing) | (missing)
Chicago, IL | (missing) | 49 | 36
Duluth, MN | 47 | 39 | 30
Honolulu, HI | 21 | 77 | 22
Houston, TX | (missing) | 68 | 46
Juneau, AK | 58 | 41 | 54
Miami, FL | 26 | 76 | 56
Phoenix, AZ | 33 | 73 | 8
Portland, ME | 44 | 45 | (missing)
San Diego, CA | (missing) | 64 | 10
Wichita, KS | 38 | (missing) | 29
In both of your scatter plots, “location” (latitude) is on your x axis. One scatter plot has temperature on the y axis. The other scatter plot puts precipitation on the y axis.
11
[Graph: Climate Data. Vertical axis: Temp. (F), 10 to 80. Horizontal axis: Location (latitude).]
12
[Graph: Precipitation Data. Vertical axis: Mean Precipitation (inches), 10 to 80. Horizontal axis: Location (latitude).]
13
[Graph: Climate Data. Vertical axis: Temp. (F), 10 to 80. Horizontal axis: Location (latitude).]
14
[Graph: Precipitation Data. Vertical axis: Mean Precipitation (inches), 10 to 80. Horizontal axis: Location (latitude).]
15
Scatter Plot Analysis: Temperature versus Latitude, and Precipitation versus Latitude.
All Groups: Using your scatter plots, justify what type of correlation, if any, exists between…
a.) latitude and temperature
b.) latitude and precipitation
Group A:
- Justify whether the relationship between temperature and the amount of ice cream sold shows a positive or negative correlation.
- Justify whether the relationship between temperature and the number of layers of clothing worn shows a positive or negative correlation.
Group B:
- Justify what type of correlation there is between the number of miles a car has driven in its lifetime and its selling price.
- Justify what type of correlation exists between the volume of an engine and how much horsepower it has.
- Justify what type of correlation exists between a person’s height and a person’s win percentage in the game of chess.
- Justify what type of correlation exists between a person’s height and a person’s weight.
Group C: Discuss and come up with 5 different pairs of quantities, such that…
- 1 pair shows a strong positive correlation,
- 1 pair shows a weak positive correlation,
- 1 pair shows a strong negative correlation,
- 1 pair shows a weak negative correlation, and
- 1 pair shows no correlation.
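One way to check a visual judgment about direction and strength is to compute a correlation coefficient. The sketch below is not part of the lesson: it assumes the standard Pearson formula and uses only the six cities whose rows are complete in this transcript's data table. The names `cities` and `correlation` are made up for illustration.

```python
from math import sqrt

# Complete rows from the lesson's city table:
# (latitude, daily mean temperature in F, mean annual precipitation in inches).
# Cities with a missing value in the transcript are left out.
cities = {
    "Atlanta, GA": (34, 61, 51),
    "Duluth, MN": (47, 39, 30),
    "Honolulu, HI": (21, 77, 22),
    "Juneau, AK": (58, 41, 54),
    "Miami, FL": (26, 76, 56),
    "Phoenix, AZ": (33, 73, 8),
}


def correlation(xs, ys):
    """Pearson correlation coefficient r, a number between -1 and 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


latitudes = [row[0] for row in cities.values()]
temperatures = [row[1] for row in cities.values()]
precipitation = [row[2] for row in cities.values()]

r_temp = correlation(latitudes, temperatures)
r_precip = correlation(latitudes, precipitation)
print(f"latitude vs. temperature:   r = {r_temp:.2f}")
print(f"latitude vs. precipitation: r = {r_precip:.2f}")
```

A value of r near -1 or 1 suggests a strong association, and a value near 0 suggests a weak association or none. For these six cities, r for temperature is strongly negative, while r for precipitation is much closer to 0.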
16
Using a scatter plot to accurately estimate a trend line.
Trend Line: A line, or linear function, which best approximates the trend of data on a scatter plot. Also called a “line of best fit”, a “linear regression line”, or a “linear regression model”.
- In advanced statistics you learn a precise formula which places the trend line over the data so that the vertical distances from the data points to the line are, overall, as small as possible (the usual method minimizes the sum of the squared distances).
- Many mathematicians use technology to determine this trend line, which you will do in high school (or Regents).
- We will use reasonable estimation to approximate where the trend line belongs.
Why are trend lines important?
- The most important application of trend lines is their power to predict. Although we may discover a particular trend, positive or negative, between two quantities, a trend line represents that trend more specifically, so that we can predict where the data from one quantity will be after the data from the other quantity reaches a certain point, beyond what we have already observed or collected.
- Example: See next slide.
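For the curious, here is a minimal sketch of the kind of formula the first bullet refers to, assuming the ordinary least-squares method. It is not something this lesson asks you to compute. The points are the latitude and temperature values for the six cities with complete rows in this transcript's data table, and the function name `least_squares_line` is made up for illustration.

```python
# (latitude, daily mean temperature in F) for the cities with complete rows
# in the lesson's data table.
points = [(34, 61), (47, 39), (21, 77), (58, 41), (26, 76), (33, 73)]


def least_squares_line(points):
    """Return (slope, intercept) of the line that minimizes the sum of
    squared vertical distances from the points to the line."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    # The least-squares line always passes through (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = least_squares_line(points)
print(f"y = {slope:.2f}x + {intercept:.2f}")
```

The slope comes out negative, which agrees with the downward trend you would expect to see when temperature is plotted against latitude.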
17
Advertising dollars spent vs. Number of items sold.
As we see below, there is a positive association between the money a company spends on advertising a product and the number of those products sold. But what if you are in charge of the advertising department’s budget at the company you work for? If you wanted to increase the budget, your boss may want to know more specifics about the projected sales, to justify the extra spending on advertising. What if you…
- needed to project how many sales are likely to be made if the company spent 8 thousand dollars on advertising?
- had a particular sales goal of 200 units, but needed to know more accurately how many dollars would need to be added to the advertising budget to help make that happen?
18
Advertising dollars spent vs. Number of items sold.
[Graph: scatter plot of advertising dollars spent vs. number of items sold, with the point (8, 136) marked.]
19
First, let’s look at the scatter plot below, comparing data collected on the selling price of a used car and how many total miles the car has driven. How would we communicate what type of association we see, and why?
20
[Three copies of the scatter plot, labeled #1, #2, and #3, each with a different estimated trend line.]
We see 3 attempts at estimating where the appropriate linear trend line would fit over the data on our scatter plot. Which one do we think is the most reasonable approximation to fit the data displayed?
- Is the line straight?
- Is most of the data above or below the line?
- Is about half of the data above the line, while about half of the data is below?
- Are most of the data points about the same average distance, or close to the average distance, from the trend line?
21
We have a winner!! From the previous slide, #3 showed the most accurate trend line to approximate the data. About half of the data is above the line, while about half is below. The average distance of all the points to the trend line is close to as small as possible.
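The two checks above can also be done with a few lines of code. This is only a sketch: the used-car data from the slide is not available in this transcript, so the example reuses the latitude and temperature values for the six cities with complete rows in the lesson's data table. The candidate line y = -1.2x + 105 is a hypothetical pencil-and-straightedge estimate, not a line from the lesson.

```python
# (latitude, daily mean temperature in F) for the cities with complete rows
# in the lesson's data table.
points = [(34, 61), (47, 39), (21, 77), (58, 41), (26, 76), (33, 73)]


def check_trend_line(points, slope, intercept):
    """Count points above and below the line y = slope*x + intercept and
    return the mean vertical distance from the points to the line."""
    above = 0
    below = 0
    total_distance = 0.0
    for x, y in points:
        gap = y - (slope * x + intercept)  # positive when the point is above the line
        if gap > 0:
            above += 1
        elif gap < 0:
            below += 1
        total_distance += abs(gap)
    return above, below, total_distance / len(points)


# Hypothetical estimated trend line: y = -1.2x + 105
above, below, mean_gap = check_trend_line(points, slope=-1.2, intercept=105)
print(f"above: {above}, below: {below}, mean vertical distance: {mean_gap:.2f}")
```

For this candidate line, half of the points fall above the line and half fall below, which is the balance the slide asks for. Trying other slopes and intercepts shows how the mean distance grows when the line fits less well.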
22
You Practice Estimating, and Using a Trend Line
Estimate a trend line for each scatter plot provided (#1-4). Use a pencil and a straight edge!! We will now use our trend lines to make reasonable predictions about these quantities, about DATA WE DO NOT HAVE. Choose 2 below, and reasonably predict…
#1.) …what a wife’s age would likely be if her husband is 65.
#2.) …the profit 4 months after the recall.
#3.) …the money likely to be spent on snacks if one day the attendance reached 90.
#4.) …the likely earnings, if 50 shoppers entered the store one day.
23
[Four scatter plots, #1 through #4. Recoverable axis labels: #2) Profit (in $100,000’s) vs. months following a product recall. #3) Total daily $ spent on snacks vs. daily county fair attendance. #4) Daily store earnings ($) vs. daily # of store shoppers.]
24
Predicting from the Trend Line: What if we want the best prediction for how many advertising dollars could get us to 200 sales? Or how many sales we could predict for an advertising budget of $15,000? What if the graph is not big enough to display as far as we want to predict?
[Graph: scatter plot of advertising dollars spent vs. number of items sold, with the point (8, 136) marked.]
25
Forming the rule for the Trend Line: We have previously learned how to form a linear function rule (y = mx + b) from two ordered pairs. So for our trend line, we need to pick any 2 points on our line. These do not have to be points from the data already plotted. The points must lie directly on the trend line.
[Graph: scatter plot of advertising dollars spent vs. number of items sold, with the points (8, 136) and (6, 112) marked on the trend line.]
26
Review of finding the rule for a linear function from 2 ordered pairs.
1.) Use the slope formula to find the slope: $m = \frac{y_2 - y_1}{x_2 - x_1} = \frac{136 - 112}{8 - 6} = \frac{24}{2} = 12$
2.) Substitute your slope for “m” into y = mx + b: $y = 12x + b$
3.) Borrow an (x, y) pair, and also substitute those into y = mx + b: $112 = 12(6) + b$
4.) Solve for (isolate) “b”: $112 = 72 + b$, so $40 = b$
5.) Now write your rule with slope as “m” and your y-intercept as “b”: $y = 12x + 40$
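The five steps above can be sketched in code, using the two trend-line points from the advertising example, (6, 112) and (8, 136). The function names are made up for illustration, and the sketch assumes x is advertising spending in thousands of dollars and y is the number of items sold, as in the earlier slides.

```python
def line_through(p1, p2):
    """Return (m, b) for the line y = m*x + b through two points."""
    (x1, y1), (x2, y2) = p1, p2
    m = (y2 - y1) / (x2 - x1)  # step 1: slope formula
    b = y1 - m * x1            # steps 2-4: substitute a point and isolate b
    return m, b


# Two points picked from the trend line on the advertising graph.
m, b = line_through((6, 112), (8, 136))
print(f"y = {m}x + {b}")  # step 5: the rule y = 12.0x + 40.0


def predict_sales(ad_thousands):
    """Predicted items sold for a given advertising budget (in $1,000s)."""
    return m * ad_thousands + b


def ad_spend_for(sales_goal):
    """Advertising budget (in $1,000s) the rule suggests for a sales goal."""
    return (sales_goal - b) / m


print(predict_sales(15))   # budget of $15,000 -> 220.0 predicted sales
print(ad_spend_for(200))   # goal of 200 sales -> about 13.33 (thousand dollars)
```

Both predictions fall outside the range of the graph, which is why having the rule is useful. They are only as reliable as the assumption that the trend continues.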
27
[Four scatter plots, #1 through #4. Recoverable axis labels: #2) Profit (in $100,000’s) vs. months following a product recall. #3) Total daily $ spent on snacks vs. daily county fair attendance. #4) Daily store earnings ($) vs. daily # of store shoppers.]
28
Practice using the rules for our trend lines to make predictions
Group A:
1.) For Graph #2, find the rule for your trend line, and use your rule to predict the profit at 10 months since the product recall.
2.) How could you use your rule to determine how many months after the recall it would take for the profit to reach zero, if the trend continues?
3.) Why is it useful or necessary to have the rule for the trend line?
Group B:
1.) For Graph #3, find the rule for your trend line, and use your rule to predict the money expected to be spent on snacks if 500 people attended the fair.
2.) What does the slope of your trend line represent in Graph #3?
3.) Why is it useful or necessary to have the rule for a trend line?
Group C:
1.) For Graph #4, find the rule for your trend line, and use your rule to predict the money expected to be earned on a day when 500 shoppers entered the store.
2.) For Graph #4, suppose you wanted your store to earn at least $2,300 every day. How many daily shoppers would likely be needed to reach this goal?
3.) Why is it useful or necessary to have the rule for a trend line?
29
Homework: (When making scatter plots, they must be on graph paper)
#17: p , read, do quickchecks #1, 2. (Scatter plot must be on graph paper.)
#18: p , read all examples, do quickchecks #1-3
#19: p337 #1-6
#20: p340-342, read all examples, do quickchecks #1-3
#21: p343, #1-6