Published by Job Bailey. Modified over 6 years ago.
1
Statistics 200 Lecture #5 Tuesday, September 6, 2016
Textbook: Sections 2.7 through 3.2
Objectives:
• Define z-scores and relate them to the empirical (68-95-99.7) rule
• Explore scatterplots as a tool for visualizing two quantitative variables
• Familiarize yourselves with least squares regression lines:
– slope interpretation
– y-intercept interpretation
– dangerous to extrapolate
2
Standardized z-scores
A z-score tells us how many standard deviations an observation is from the mean.
It is a useful measure of the relative value of any observation in a dataset.
It allows comparison of observations in different data sets.
3
Standardized z-scores
Z-scores correspond directly to the Empirical Rule.
About 68% of values have z-scores between –1 and 1.
About 95% of values have z-scores between –2 and 2.
About 99.7% of values have z-scores between –3 and 3.
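The Empirical Rule percentages can be checked against a normal distribution. The sketch below is only an illustration: it assumes the data are exactly normal, and the helper name `share_within` is made up for this example. It uses `NormalDist` from Python's standard `statistics` module.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0 and standard deviation 1,
# so a value on this scale is already a z-score.
standard_normal = NormalDist(mu=0.0, sigma=1.0)


def share_within(k: float) -> float:
    """Proportion of a normal distribution with z-scores between -k and k."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"z-scores between -{k} and {k}: {share_within(k):.1%} of values")
```

The printed shares should land close to 68%, 95%, and 99.7%. Real data sets are only approximately normal, so the rule is a guide and not an exact statement.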
4
Example 1
What is the z-score and interpretation in the following situation?
Obs = 3, mean = 4, SD = 0.5
Z-score = (observation – mean)/SD = (3 – 4) / 0.5 = –1 / 0.5 = –2
Interpretation: The observation of 3 is 2 standard deviations below the mean.
5
Example 2
What is the z-score and interpretation in the following situation?
Obs = 200, mean = 150, SD = 20
Z-score = (observation – mean)/SD = (200 – 150) / 20 = 50 / 20 = 2.5
Interpretation: The observation of 200 is 2.5 standard deviations above the mean.
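The two examples use the same formula, so they can be reproduced with one small function. This is a minimal sketch. The names `z_score` and `interpret` are hypothetical helpers written for this illustration, not part of the lecture.

```python
def z_score(observation: float, mean: float, sd: float) -> float:
    """Number of standard deviations an observation lies from the mean."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (observation - mean) / sd


def interpret(observation: float, mean: float, sd: float) -> str:
    """Put a z-score into words, in the style of the examples."""
    z = z_score(observation, mean, sd)
    if z == 0:
        return f"The observation of {observation} is exactly at the mean."
    direction = "above" if z > 0 else "below"
    return (f"The observation of {observation} is {abs(z):g} "
            f"standard deviations {direction} the mean.")


print(interpret(3, 4, 0.5))      # Example 1: obs = 3, mean = 4, SD = 0.5
print(interpret(200, 150, 20))   # Example 2: obs = 200, mean = 150, SD = 20
```

The sign of the z-score carries the direction (negative means below the mean) and its absolute value carries the distance.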
6
More complicated example: which person has a more unusual height?
Me: a 53” tall woman
My husband: a 73” tall man
Women’s heights are normal with mean 54” and std. dev. 3”.
Men’s heights are normal with mean 70” and std. dev. 3”.
These heights come from different distributions, so we cannot compare them directly. We need a tool to make them comparable: the z-score!
7
Calculate Z-scores for both:
Me: Z-score = (obs – mean) / (std. dev.) = (53 – 54) / 3 = –1/3 ≈ –0.33
Husband: Z-score = (obs – mean) / (std. dev.) = (73 – 70) / 3 = 3/3 = 1
8
Compare Z-scores – draw them below
[Drawing area: mark the z-scores for "Me" and "Husband"]
9
Compare Z-scores
Me: 0.33 std. dev. below the mean
Husband: 1 std. dev. above the mean
Conclusion: My husband’s height is more unusual than mine, because it is more standard deviations away from its mean.
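The height comparison can be sketched in code as follows. It uses the means and standard deviations given in the example. The rule "more unusual means larger absolute z-score" is the reading used in the conclusion, and the variable names are invented for this illustration.

```python
def z_score(observation: float, mean: float, sd: float) -> float:
    """Number of standard deviations an observation lies from the mean."""
    return (observation - mean) / sd


# Each person is standardized against their own distribution.
z_scores = {
    "Me": z_score(53, mean=54, sd=3),        # women's heights
    "Husband": z_score(73, mean=70, sd=3),   # men's heights
}

# "More unusual" is taken to mean farther from the mean in either
# direction, so compare absolute values of the z-scores.
most_unusual = max(z_scores, key=lambda person: abs(z_scores[person]))

for person, z in z_scores.items():
    direction = "above" if z > 0 else "below"
    print(f"{person}: {abs(z):.2f} std. dev. {direction} the mean")
print(f"More unusual height: {most_unusual}")
```

Comparing the raw heights (53 and 73 inches) would be meaningless here. Only after standardizing are the two values on the same scale.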
10
So far… We have talked about quantitative variables, but only one at a time. Now we’re going to begin looking at the relationships between two different quantitative variables. We start with the scatterplot.
11
Scatterplots: A scatterplot is a two-dimensional graph of two numeric variables.
There are two axes on a scatterplot, the vertical axis (y-axis) and the horizontal axis (x-axis).
The y-axis is assigned to the response variable.
The x-axis is assigned to the explanatory variable.
12
Example 1: Apartment size and rent
Two variables:
size of one-bedroom apartment (square feet)
monthly rent ($)
Size (Square Ft) Rent ($) 415 438 485 636 548 666 646 545 690 688 538 469 1000 833 1003 1089 1150 1181 1237 1225 1469 1501 1177 958
13
What is the average pattern? What is the direction of the pattern?
A positive, linear association.
[Plot axis labels: y-axis = response / dependent / y variable; x-axis = explanatory / independent / x variable]
14
Linear versus curvilinear
Linear relationship: a relationship that, on average, will follow a line.
Curvilinear or nonlinear relationship: a relationship that, on average, will follow a curve.
15
Association: a term used to describe the direction of the pattern shown by the two variables.
A positive association occurs when the values of one variable tend to increase as the values of the other variable increase.
A negative association occurs when the values of one variable tend to decrease as the values of the other variable increase.
16
Outliers
When we consider two variables, an outlier is a point with an unusual combination of values. Outliers may be unusual and interesting data points, or they may be errors.
17
Example – Tornado Activity
Variables:
year
number of tornadoes (Jan – May)
[Plot annotation: unusually high observations that do not follow the trend of the other observations]
Source: National Weather Service
18
Formalize the trend: Regression lines
Regression line: a straight line that describes how values of the response variable (y) are related, on average, to values of the explanatory variable (x).
We can use the regression line to…
Estimate the average value of y at a specified value of x.
Predict the unknown value of y for an individual using that individual’s x value.
19
Specify Linear Relationships with Simple Linear Regression Model
Used to find the best straight line to fit the data points.
Name of procedure: Least Squares
Least squares model: the line with the smallest sum of the squared differences (between the observed y values and the line) among all possible lines.
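The least squares idea can be illustrated numerically. The sketch below is not the lecture's data: the (x, y) pairs are hypothetical and invented for this example. The slope and intercept come from the standard closed-form least squares formulas, and the sum of squared differences is then compared against a few other candidate lines.

```python
# Hypothetical (x, y) pairs, invented for this sketch only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]


def sum_squared_differences(intercept: float, slope: float) -> float:
    """Sum of squared vertical differences between the data and a line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


def least_squares_line() -> tuple[float, float]:
    """Intercept and slope that minimize the sum of squared differences."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


b0, b1 = least_squares_line()
best = sum_squared_differences(b0, b1)
print(f"least squares line: y-hat = {b0:.2f} + {b1:.2f} x (sum of squares {best:.4f})")

# A few other candidate lines, chosen arbitrarily; none should do better.
for other_b0, other_b1 in [(b0 + 0.5, b1), (b0, b1 - 0.2), (0.0, 2.0)]:
    other = sum_squared_differences(other_b0, other_b1)
    print(f"line {other_b0:.2f} + {other_b1:.2f} x: sum of squares {other:.4f}")
```

Checking three alternative lines does not prove minimality, but it shows what "smallest sum of squared differences" means in practice.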
20
The regression equation
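In math: a line is described by its slope and its y-intercept.
In statistics: ŷ = b0 + b1 x, where ŷ is the average value of y, b0 is the y-intercept, and b1 is the slope.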
21
In a picture:
22
Example: Positive linear relationship between meal bill ($) and amount of tip ($)
Data from a restaurant: r = [not shown], n = 10 bills
23
Example: Tip example
Question: Can we use the amount of the bill ($) to estimate the amount of tip left ($), on the average?
Identify the variables:
Bill ($): explanatory
Tip ($): response
Note: the explanatory variable is also called the predictor variable.
24
To fit a regression line in Minitab: Stat > Regression > Fitted Line Plot
Correctly identify the explanatory variable and the response.
Straight line: simple linear regression.
25
Least Squares Regression Equation
The regression equation is Tip = -0.60 + 0.19 Bill
sample y-intercept (b0) = -0.60
sample slope (b1) = 0.19
26
Slope Interpretation: Tip = -0.60 + 0.19 Bill
For each additional $1 found on the bill, you can expect the tip to increase by 19 cents, on the average.
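The slope interpretation can be checked by evaluating the fitted equation Tip = -0.60 + 0.19 Bill at two bills that differ by $1. This is a small sketch. The function name `predicted_tip` and the $25 starting bill are arbitrary choices for illustration.

```python
B0 = -0.60  # sample y-intercept, in dollars
B1 = 0.19   # sample slope: dollars of tip per dollar of bill


def predicted_tip(bill: float) -> float:
    """Average tip predicted by the fitted line for a given bill."""
    return B0 + B1 * bill


# Two bills one dollar apart; $25 is an arbitrary illustrative value.
difference = predicted_tip(26) - predicted_tip(25)
print(f"A $1 larger bill raises the predicted tip by {difference * 100:.0f} cents")
```

The difference equals the slope no matter which starting bill is used, which is exactly what "for each additional $1" means for a straight line.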
27
Y-intercept Interpretation
Tip = -0.60 + 0.19 Bill
b0 = -$0.60
In theory it says: when you have no bill, you can expect a tip to be -$0.60.
So does the y-intercept have a logical interpretation in the context of this problem?
No: we have no data for bill = 0.
28
Estimation & Limitations
Question: If the bill is $30, estimate the average amount left for a tip.
Tip = -0.60 + 0.19 × Bill
Tip = -0.60 + 0.19 × (30)
Tip = $5.10
Note: Bill = $30 is not an actual observation in the sample.
Can: estimate within the range of $15 to $45.
29
Example 5B: Estimation & Limitations
Question: If the bill is $70, estimate the average amount left for a tip.
x = $70
Tip = -0.60 + 0.19 × Bill
Can’t: extrapolate outside the range of $15 to $45.
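One way to make the "no extrapolation" rule concrete is to have the estimating function refuse bills outside the sampled range. The sketch below assumes the fitted line Tip = -0.60 + 0.19 Bill and the $15 to $45 range stated in these examples. The function name and the error message are hypothetical.

```python
BILL_MIN, BILL_MAX = 15.0, 45.0  # range of bills observed in the sample


def estimate_average_tip(bill: float) -> float:
    """Estimate the average tip, but only inside the observed range of bills."""
    if not BILL_MIN <= bill <= BILL_MAX:
        raise ValueError(
            f"bill ${bill:.2f} is outside ${BILL_MIN:.0f} to ${BILL_MAX:.0f}: "
            "that would be extrapolation"
        )
    return -0.60 + 0.19 * bill


print(f"Bill $30: estimated average tip ${estimate_average_tip(30):.2f}")

try:
    estimate_average_tip(70)
except ValueError as err:
    print(f"Bill $70: {err}")
```

The equation would happily produce a number for a $70 bill. The guard exists because nothing in the sample tells us whether the same straight-line pattern continues that far.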
30
To remember about regression equations:
Y-intercept: a logical interpretation is restricted to data where x = 0 is in the range of data in the sample.
No extrapolation: don’t use a regression equation to estimate a value for the response variable outside the range of x values.
Estimation: the regression equation estimates the average value for y at a given value of x.
31
Review: If you understood today’s lecture, you should be able to solve
3.1, 3.3, 3.5, 3.13, 3.15, 3.19, 3.21
Recall Objectives:
• Define z-scores and relate them to the empirical (68-95-99.7) rule
• Explore scatterplots as a tool for visualizing two quantitative variables
• Familiarize yourselves with least squares regression lines:
– slope interpretation
– y-intercept interpretation
– dangerous to extrapolate