Published by Patrick Gardner. Modified over 9 years ago.
1
More Univariate Data: Quantitative Graphs & Describing Distributions with Numbers
2
Quantitative Data: Quantitative variables take numerical values for which it makes sense to do arithmetic operations like adding or averaging. Possible graphs: dotplots, stemplots, histograms, cumulative frequency plots, boxplots.
3
Graphs: Be sure to always:
* Title your graphs
* Label your axes, including units of measure
* Number your axes in a consistent and reasonable manner
4
Quantitative Graphs: Histograms. A histogram’s vertical axis shows counts, while a relative frequency histogram’s vertical axis shows percents.
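The difference between the two vertical axes can be sketched in a few lines of Python. This is only an illustration: the data, the bin width, the starting value, and the function name `histogram_counts` are all made up for the example.

```python
def histogram_counts(values, bin_width, start):
    """Count how many values fall in each bin, keyed by the bin's left edge.

    Assumes every value is >= start; bin_width and start are illustrative choices.
    """
    counts = {}
    for v in values:
        left_edge = start + ((v - start) // bin_width) * bin_width
        counts[left_edge] = counts.get(left_edge, 0) + 1
    return dict(sorted(counts.items()))

scores = [62, 67, 71, 74, 74, 78, 83, 85, 99]  # made-up data

# Histogram: the height of each bar is a count.
counts = histogram_counts(scores, bin_width=10, start=60)

# Relative frequency histogram: the height of each bar is a percent of all values.
percents = {edge: 100 * c / len(scores) for edge, c in counts.items()}

print(counts)
print(percents)
```

Both versions have the same shape; only the scale on the vertical axis changes.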
5
Quantitative Graphs: Stem & Leaf. This type of graph uses place values as the stems & units as the leaves. (It’s very hard to describe, so we are going to make one as an example.) We can also create what’s called a back-to-back stemplot with two data sets. It is helpful for comparing two sets of univariate data.
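Since a stemplot is easier to see than to describe, here is a minimal sketch in Python. It assumes non-negative whole numbers, with the tens (and higher) digits as the stem and the units digit as the leaf. The data and the function name `stem_and_leaf` are made up for the example.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return the rows of a stemplot as strings like '7 | 1448'.

    Assumption: values are non-negative integers; stem = value // 10, leaf = value % 10.
    """
    rows = defaultdict(list)
    for v in sorted(values):
        rows[v // 10].append(v % 10)
    lines = []
    # Include stems with no leaves so that gaps in the data stay visible.
    for stem in range(min(rows), max(rows) + 1):
        leaves = "".join(str(leaf) for leaf in rows.get(stem, []))
        lines.append(f"{stem} | {leaves}")
    return lines

scores = [62, 67, 71, 74, 74, 78, 83, 85, 99]  # made-up data
for line in stem_and_leaf(scores):
    print(line)
```

Every original value can be read back from the plot (stem 7 with leaves 1, 4, 4, 8 means 71, 74, 74, 78), which is the property that a histogram loses.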
6
Quantitative Graphs: A histogram is sometimes preferred for larger data sets. Its strongest asset is that it shows shape well. Its weakness is that the individual data values are lost. A stem & leaf is sometimes preferred because it retains all data values, but it is very difficult to create for large data sets.
7
Quantitative Data The distribution of a variable tells us what values the variable typically takes and how often it takes them. It is a generalization about the variable values.
8
When describing any quantitative distribution:
C – Center
U – Unusual Features
S – Shape
S – Spread
& B – Be
S – Specific
9
Common shapes of distributions/graphs: symmetric, skewed to the right, skewed to the left, bimodal, uniform.
10
Once you have chosen a shape, you choose a measure of center and spread based on that shape.
11
Measure of center when the distribution is symmetric: Mean, the average. Formula: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$
12
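Measure of spread or variability when the distribution is symmetric: Standard deviation: $s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2}$ (the sample form, which divides by $n - 1$)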
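As a quick check of the mean and standard deviation, here is a small sketch using Python's standard `statistics` module on made-up data. It shows both the sample form (divide by n − 1) and the population form (divide by n), since courses differ on which one they use.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data, roughly symmetric

mean = statistics.mean(data)             # sum of the values divided by n
sample_sd = statistics.stdev(data)       # divides the squared deviations by n - 1
population_sd = statistics.pstdev(data)  # divides the squared deviations by n

print(mean, sample_sd, population_sd)
```

For this data the squared deviations from the mean add up to 32, so the population form gives the square root of 32/8 and the sample form gives the square root of 32/7.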
13
Measure of Center when the distribution is not symmetric: Median – the middle value in an ordered list. If there are two values in the middle, then average them.
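The rule for the median (take the middle value, or average the two middle values) can be written out directly. This is a minimal sketch; the function name and data are made up for the example.

```python
def median(values):
    """Middle value of the ordered list; average the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([5, 1, 3]))     # odd count: the single middle value
print(median([4, 1, 3, 2]))  # even count: the average of the two middle values
```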
14
Measure of spread or variability when the distribution is not symmetric: We can also examine spread by looking at the range of the middle 50% of the data. This is called the interquartile range (IQR). IQR = Q3 – Q1
15
We also need to talk about the 5-number summary. The 5-number summary is made up of the minimum, the first quartile, Q1 (where 25% of the data lies below this value), the median, the third quartile, Q3 (where 75% of the data lies below this value), and the maximum.
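Here is a hedged sketch of the five-number summary in Python. Quartile conventions vary between textbooks and software. This version assumes the "median of each half" rule, where Q1 is the median of the values below the median's position and Q3 is the median of the values above it. The function name and data are made up.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule.

    Assumption: when n is odd, the middle value is left out of both halves.
    Needs at least two values.
    """
    ordered = sorted(values)
    n = len(ordered)
    lower_half = ordered[: n // 2]
    upper_half = ordered[(n + 1) // 2 :]
    return (ordered[0], median(lower_half), median(ordered),
            median(upper_half), ordered[-1])

data = [1, 3, 4, 6, 7, 8, 10, 12, 15]  # made-up data
minimum, q1, med, q3, maximum = five_number_summary(data)
iqr = q3 - q1
print(minimum, q1, med, q3, maximum, iqr)
```

Other tools (a graphing calculator, a spreadsheet, a statistics library) may report slightly different quartiles for the same data because they interpolate differently.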
16
Another Measure of Spread or Variability Range – the difference between the maximum and the minimum observations. This is the simplest measure of spread. We typically use this as preliminary information or if it is the only measure of spread we can calculate.
17
Another measure of spread or variability Variance is the average of the squares of the deviations of the observations from their mean. It is the standard deviation squared.
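The steps behind the variance can be worked through one at a time. This sketch uses made-up data and assumes the sample form, in which the "average" of the squared deviations divides by n − 1 rather than n.

```python
import math

data = [1, 2, 3, 4, 5]  # made-up observations
n = len(data)

mean = sum(data) / n                      # step 1: find the mean
deviations = [x - mean for x in data]     # step 2: deviation of each value from the mean
squared = [d ** 2 for d in deviations]    # step 3: square each deviation
variance = sum(squared) / (n - 1)         # step 4: "average" them (sample form, n - 1)
std_dev = math.sqrt(variance)             # the standard deviation is the square root

print(mean, variance, std_dev)
```

Squaring `std_dev` gives back `variance`, which is the relationship the slide states.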
18
An outlier is an individual observation in data that falls outside the overall pattern of the data.
19
Using the IQR, we can perform a test for outliers. Outlier Test: Any value below Q1 – 1.5(IQR) or above Q3 + 1.5(IQR) is considered an outlier.
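The outlier test translates into a short function. In this sketch the quartiles are passed in rather than computed, because quartile conventions differ. The function name and data are made up, and the quartiles shown were found by hand using the median of each half of the data.

```python
def find_outliers(values, q1, q3):
    """Return the values below Q1 - 1.5(IQR) or above Q3 + 1.5(IQR)."""
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [v for v in values if v < low_fence or v > high_fence]

data = [1, 3, 4, 6, 7, 8, 10, 12, 15, 40]  # made-up data
# Lower half is 1, 3, 4, 6, 7 (median 4); upper half is 8, 10, 12, 15, 40 (median 12).
outliers = find_outliers(data, q1=4, q3=12)
print(outliers)
```

Here the IQR is 8, so the fences sit at 4 − 12 = −8 and 12 + 12 = 24, and only 40 falls outside them.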
20
Another Graph… When we graph the five-number summary along with outliers if present, it leads to a modified boxplot.
21
Measures that are not strongly affected by extreme values are said to be resistant. The median and IQR are more resistant than the mean and standard deviation. The standard deviation is even less resistant than the mean.
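Resistance is easy to see by changing one value into an extreme one and recomputing each measure. The sketch below uses made-up data. It assumes the median-of-halves rule for quartiles and the sample (n − 1) standard deviation. The helper name `iqr` is hypothetical.

```python
from statistics import mean, median, stdev

def iqr(values):
    """IQR using the median-of-halves rule (an assumed quartile convention)."""
    ordered = sorted(values)
    n = len(ordered)
    return median(ordered[(n + 1) // 2 :]) - median(ordered[: n // 2])

original = [10, 11, 12, 13, 14, 15, 16, 17, 18]       # made-up data
with_extreme = [10, 11, 12, 13, 14, 15, 16, 17, 180]  # the largest value made extreme

for name, measure in [("median", median), ("IQR", iqr),
                      ("mean", mean), ("std dev", stdev)]:
    print(name, measure(original), measure(with_extreme))
```

In this example the median and IQR do not move at all, the mean rises from 14 to 32, and the standard deviation grows by a much larger factor than the mean does.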
22
Measures of Spread or Variability – Why? We measure spread because it’s an important description of what is happening with the data. We need to know about the amount of variation we can expect in a data set.