Published by Allan York. Modified over 7 years ago.
1
Statistics 200
Lecture #3
Tuesday, August 30, 2016
Textbook: Sections 2.4 through 2.6
Objectives (all relating to quantitative variables):
• Recognize and interpret two plots:
– Histogram
– Boxplot
• Calculate and interpret two measures of center:
– Mean
– Median
• Calculate and interpret five-number summary
• Recognize and understand effects of outliers & skewness
2
Motivating example A group of students was randomly assigned to one of two classes. One class was taught by teacher A and the other by teacher B. At the end of the semester, all students took the same exam. Investigate whether there is any difference in exam scores between the two teachers.
3
Summarizing Quantitative Variables
The distribution of a quantitative variable is the overall pattern of how often the possible values occur. Four key aspects of the distribution are:
• Location: center, average
• Spread: variability
• Shape: symmetric, bell, skew
• Outliers
Let’s begin with the shape, which is best seen with a visual summary.
4
Visual summaries for quantitative variables
Two visual summaries: histogram and boxplot.
Histogram: a chart of the data that shows how many observations are in each equally spaced interval.
• Usually use 6-15 intervals
• Can use frequency or relative frequency
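The counting behind a histogram can be sketched in a few lines. This is a minimal illustration, not the lecture's software output: the scores, the starting value, and the interval width are all hypothetical, and the helper name `histogram_counts` is made up for this example.

```python
# Hypothetical exam scores; these are NOT the data from the lecture.
scores = [52, 61, 64, 67, 70, 71, 73, 75, 76, 78, 80, 82, 85, 88, 93]


def histogram_counts(data, low, width, n_bins):
    """Count observations in n_bins equally spaced intervals starting at low.

    Interval i covers [low + i*width, low + (i+1)*width). A value equal to
    the right edge of the last interval is counted in the last interval.
    """
    counts = [0] * n_bins
    for x in data:
        i = int((x - low) // width)
        if i == n_bins and x == low + n_bins * width:
            i = n_bins - 1  # include the right edge in the last interval
        if 0 <= i < n_bins:
            counts[i] += 1
    return counts


# Ten intervals of width 5 from 50 to 100 (within the 6-15 guideline).
counts = histogram_counts(scores, low=50, width=5, n_bins=10)

# Relative frequency = frequency divided by the number of observations.
rel_freq = [c / len(scores) for c in counts]

for i, (c, r) in enumerate(zip(counts, rel_freq)):
    left = 50 + 5 * i
    print(f"{left}-{left + 5}: frequency={c}, relative frequency={r:.2f}")
```

Plotting the frequencies or the relative frequencies gives the same shape; only the scale of the vertical axis changes.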
5
Histograms
[Figure: histograms of Teacher A scores and Teacher B scores]
6
Outlier: an individual value that is unusual compared to the bulk of the other values.
7
Example
When considering study hours/week, what percent of the students spend:
• at most 3 hours?
• at least 11 hours?
• between 5 and 9 hours?
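On the slide these percents are read off the histogram. With raw data the same questions reduce to counting, as in the sketch below. The study hours are hypothetical, `percent` is an illustrative helper, and treating "between 5 and 9" as including both endpoints is an assumption.

```python
# Hypothetical study hours per week for 20 students (not the lecture data).
study_hours = [1, 2, 3, 3, 4, 5, 5, 6, 7, 7, 8, 9, 10, 11, 12, 14, 15, 6, 8, 4]


def percent(data, condition):
    """Percent of observations in data for which condition(x) is true."""
    matches = sum(1 for x in data if condition(x))
    return 100 * matches / len(data)


at_most_3 = percent(study_hours, lambda h: h <= 3)
at_least_11 = percent(study_hours, lambda h: h >= 11)
# Assumption: "between 5 and 9" includes both endpoints.
between_5_and_9 = percent(study_hours, lambda h: 5 <= h <= 9)

print(at_most_3, at_least_11, between_5_and_9)
```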
8
Shapes of distributions
Symmetric: the shape of the data is similar on both sides of the center.
• Bell-shaped is a special case of symmetric.
Skewed: values are more spread out on one side than the other.
• Left-skewed: lower values are more spread out than higher values.
• Right-skewed: higher values are more spread out than lower values.
9
Shape Examples: Symmetric
Question: What is the fastest you have ever driven a car? Symmetric
10
Shape Examples: Right-skewed and Left-skewed
Question: How many coins are you carrying? Right-skewed
Question: What is your grade point average? Left-skewed
11
Breakdown of Descriptive Statistical Methods: Quantitative Data
• Graphs: did one (histogram)
• Numbers (statistics): measures of center (do now)
12
Quantitative Data: Measures of Center
Mean: average of all numbers
• symbol for sample mean: x̄
• value is sensitive to outliers
Median: middle observation of ordered data
• value is resistant to outliers
Mode: observation that occurs most frequently
• don’t really use in this course
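As a quick check on these definitions, the sketch below computes all three measures with Python's standard `statistics` module. The two samples are hypothetical and chosen only to show the odd and even cases for the median.

```python
import statistics

# Hypothetical samples (not from the lecture).
odd_sample = [10, 3, 7, 13, 7]        # n = 5
even_sample = [10, 3, 7, 13, 7, 20]   # n = 6

# Mean: add up all the values and divide by n.
mean_odd = statistics.mean(odd_sample)      # 40 / 5

# Median: sort the data, then take the middle observation.
# With an even n, it is the average of the two middle observations.
median_odd = statistics.median(odd_sample)    # middle of 3, 7, 7, 10, 13
median_even = statistics.median(even_sample)  # average of 7 and 10

# Mode: the value that occurs most frequently.
mode_odd = statistics.mode(odd_sample)

print(mean_odd, median_odd, median_even, mode_odd)
```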
13
Example: Center and outliers
Sample 1 (n = 5)
• Sample mean: ( )/5 = 32/5 = ____
• Ordered data / median: Median = ____
Sample 2 (n = 6)
• Sample mean: ( )/5 = 55/5 = ____
• Ordered data / median: Median = ____
14
Sensitive vs. Resistant statistics
Sensitive statistic
• Calculated using ALL observations
• Affected by skewness and/or unusual observations
• Example: mean
Resistant statistic
• Calculated using only some observations
• Not affected much by outliers
• Example: median
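The difference between a sensitive and a resistant statistic shows up when a single unusual value is added to a small sample. The numbers below are hypothetical and chosen only for illustration.

```python
import statistics

# Hypothetical sample, then the same sample with one extreme value added.
base = [2, 4, 5, 7, 8]
with_outlier = base + [100]

mean_before = statistics.mean(base)            # 26 / 5 = 5.2
mean_after = statistics.mean(with_outlier)     # 126 / 6 = 21

median_before = statistics.median(base)          # middle value: 5
median_after = statistics.median(with_outlier)   # average of 5 and 7: 6

# The mean uses every value, so the outlier drags it far upward.
# The median depends only on the middle of the ordered data, so it barely moves.
print(f"mean:   {mean_before} -> {mean_after}")
print(f"median: {median_before} -> {median_after}")
```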
15
Examples:
• Fastest speed: mean = 94.8 mph, median = 95 mph
• Coins carried: mean = 17.3 coins, median = 9 coins
16
Work together question:
Which is most likely true when considering salaries ($) in a company that employs:
1. 20 factory workers and 2 very highly paid executives: one would find with the salaries that the:
• mean > median
• mean < median
• mean ≈ median
2. 2 factory workers and 20 very highly paid executives: one would find with the salaries that the:
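One way to check your reasoning is to try made-up salaries. The dollar amounts below are hypothetical assumptions, not values from the lecture; only the head counts come from the question.

```python
import statistics

# Hypothetical salaries in dollars (assumed for illustration only).
WORKER = 40_000
EXECUTIVE = 1_000_000

# Company 1: 20 factory workers and 2 very highly paid executives.
company_1 = [WORKER] * 20 + [EXECUTIVE] * 2
# Company 2: 2 factory workers and 20 very highly paid executives.
company_2 = [WORKER] * 2 + [EXECUTIVE] * 20

mean_1, median_1 = statistics.mean(company_1), statistics.median(company_1)
mean_2, median_2 = statistics.mean(company_2), statistics.median(company_2)

# A few very high salaries pull the mean above the median (right skew).
print(f"Company 1: mean = {mean_1:,.0f}, median = {median_1:,.0f}")
# A few very low salaries pull the mean below the median (left skew).
print(f"Company 2: mean = {mean_2:,.0f}, median = {median_2:,.0f}")
```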
18
A percentile tells us how much of the data is below a specific value.
Percentiles
What is the value (in study hours/week) for the:
• 5th percentile?
• 90th percentile?
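With raw data, a percentile can be found by ordering the values and counting up from the bottom. The sketch below uses the "nearest rank" convention, which is an assumption: software packages and textbooks use several slightly different rules. The study hours and both function names are hypothetical.

```python
import math

# Hypothetical study hours per week for 20 students, already ordered.
hours = [2, 3, 3, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 10, 10, 11, 12, 13, 15, 20]


def value_at_percentile(data, p):
    """Nearest-rank rule: smallest value with at least p% of the data at or below it."""
    ordered = sorted(data)
    rank = max(1, math.ceil(p * len(ordered) / 100))  # 1-based position
    return ordered[rank - 1]


def percent_at_or_below(data, value):
    """Percent of observations that are less than or equal to value."""
    return 100 * sum(1 for x in data if x <= value) / len(data)


p5 = value_at_percentile(hours, 5)
p90 = value_at_percentile(hours, 90)
check_90 = percent_at_or_below(hours, p90)

print(p5, p90, check_90)
```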
19
Percentiles of Interest
• 25th percentile: Lower Quartile (QL), First Quartile (Q1)
• 50th percentile: Second Quartile (Q2), Median
• 75th percentile: Upper Quartile (QU), Third Quartile (Q3)
20
We use quartiles for the…
Five Number Summary: a numerical method for summarizing quantitative data.
• smallest number (min)
• lower or first quartile
• median
• upper or third quartile
• largest number (max)
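A minimal sketch of the calculation follows. It assumes one common hand-calculation convention: Q1 is the median of the lower half of the ordered data and Q3 is the median of the upper half, leaving out the overall median when n is odd. Statistical software may use a slightly different quartile rule. The data and the function name are hypothetical.

```python
import statistics


def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using the median-of-halves convention."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # When n is odd, the middle observation belongs to neither half.
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return (
        ordered[0],
        statistics.median(lower),
        statistics.median(ordered),
        statistics.median(upper),
        ordered[-1],
    )


# Hypothetical data (n = 9), not from the lecture.
sample = [7, 1, 12, 4, 15, 3, 8, 10, 6]
summary = five_number_summary(sample)
print(summary)
```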
21
Example: 5-Number summary
Descriptive Statistics: Fastest_Speed
Variable N Minimum Q1 Median Q3 Maximum
Fastest_Speed
Fill in the five-number summary:
• Min
• Q1 (25th percentile)
• Median (50th percentile)
• Q3 (75th percentile)
• Max
22
Another look: 5-number summary
The 5-number summary divides your data into 4 quarters:
23
Five-number summary of fastest speeds (mph): Min 45, Q1 90, Median 95, Q3 100, Max 135
Approximately what percent of the fastest speeds:
• are at least 100 mph?
• are at most 90 mph?
24
Five-number summary of fastest speeds (mph): Min 45, Q1 90, Median 95, Q3 100, Max 135
Approximately what percent of the fastest speeds lie:
• between 90 and 100 mph?
• (at most 95) or (at least 100)?
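Because the five-number summary splits the data into four parts of roughly 25% each, these questions come down to counting quarters. The sketch below uses the summary values from the slide; the dictionary layout and the function name are illustrative assumptions, and the results are approximations, not exact percents.

```python
# Five-number summary of fastest speeds (mph), as given on the slide.
summary = {"min": 45, "q1": 90, "median": 95, "q3": 100, "max": 135}

# Each step between neighbouring summary points holds about 25% of the data.
ORDER = ["min", "q1", "median", "q3", "max"]


def approx_percent_between(lower, upper):
    """Approximate percent of observations between two summary points."""
    return 25 * (ORDER.index(upper) - ORDER.index(lower))


at_least_100 = approx_percent_between("q3", "max")        # Q3 to Max
at_most_90 = approx_percent_between("min", "q1")          # Min to Q1
between_90_and_100 = approx_percent_between("q1", "q3")   # Q1 to Q3
# "At most 95" and "at least 100" do not overlap, so the percents add.
at_most_95_or_at_least_100 = (
    approx_percent_between("min", "median") + approx_percent_between("q3", "max")
)

print(at_least_100, at_most_90, between_90_and_100, at_most_95_or_at_least_100)
```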
25
Visual summaries for quantitative variables
Histogram: a chart of the data that shows how many observations are in each equally spaced interval.
• Usually use 6-15 intervals
• Can use frequency or relative frequency
Boxplot: a visualization of the 5-number summary.
• Shows Q1, Median, Q3 as lines around and through a middle box
• Identifies outliers
26
Boxplots: Examples
• Fastest speed: Max 135 mph, Q3 100 mph, Median 95 mph, Q1 90 mph, Min 45 mph
• Coins: Max 80 coins, Q3 25 coins, Median 9 coins, Q1 5 coins, Min 0 coins
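The slides say a boxplot identifies outliers but do not state the rule. A common convention, assumed here, is the 1.5 × IQR rule: a value is flagged if it lies more than 1.5 interquartile ranges below Q1 or above Q3. The sketch applies that assumed rule to the quartiles from the two examples; the function name is made up for illustration.

```python
def outlier_fences(q1, q3):
    """Return (lower fence, upper fence) under the assumed 1.5 * IQR rule."""
    iqr = q3 - q1  # interquartile range: the length of the box
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


# Fastest speed (mph): Q1 = 90, Q3 = 100, so IQR = 10.
speed_low, speed_high = outlier_fences(90, 100)
# Coins: Q1 = 5, Q3 = 25, so IQR = 20.
coins_low, coins_high = outlier_fences(5, 25)

# Under this rule, the maximum speed (135 mph) and the maximum number of
# coins (80) both fall beyond the upper fence and would be drawn as outliers.
speed_max_flagged = 135 > speed_high
coins_max_flagged = 80 > coins_high

print(speed_low, speed_high, coins_low, coins_high)
```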
27
Boxplot shows same shape as histogram
Symmetric
28
Boxplot shows same shape as histogram
Right-skewed
29
Boxplot shows same shape as histogram
Left-skewed
30
Link measures of center to shape
31
Another example: Parties per month
Outliers!
32
Parties per month, without the outliers
33
Median: 50% of students surveyed partied less than 4.5 times per month.
Right-skewed: mean > median
34
Consider the variables Party and Year
Response: How many parties do you attend in a month?
Explanatory: What year are you in school?
35
Consider the variables Party and Year
How many parties do you attend in a month? Quantitative
What year are you in school? Categorical (ordinal)
36
Explore relationship with boxplot
37
Which year has highest median?
Largest box? Most outliers? Do we observe a trend?
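The same comparisons can be made numerically by summarizing the response separately for each category of the explanatory variable. The sketch below uses entirely hypothetical party counts, so its answers say nothing about the lecture's survey. It relies on `statistics.quantiles` with `method="inclusive"`, which is one of several quartile conventions.

```python
import statistics

# Hypothetical parties-per-month data by school year (NOT the survey data).
parties = {
    "First year": [2, 4, 5, 6, 8, 10, 12],
    "Sophomore": [1, 3, 4, 5, 6, 8, 9],
    "Junior": [0, 2, 3, 4, 5, 6, 15],
    "Senior": [0, 1, 2, 3, 4, 5, 8],
}

medians = {}
box_lengths = {}
for year, values in parties.items():
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    medians[year] = q2
    box_lengths[year] = q3 - q1  # IQR: the length of the box in a boxplot

highest_median_year = max(medians, key=medians.get)
largest_box_year = max(box_lengths, key=box_lengths.get)

for year in parties:
    print(f"{year}: median = {medians[year]}, box length (IQR) = {box_lengths[year]}")
```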
38
Review: If you understood today’s lecture, you should be able to solve
Objectives (all relating to quantitative variables):
• Recognize and interpret two plots:
– Histogram
– Boxplot
• Calculate and interpret two measures of center:
– Mean
– Median
• Calculate and interpret five-number summary
• Recognize and understand effects of outliers & skewness