Published by Bruno Erick Black. Modified over 8 years ago.
1
Section 2.1 Visualizing Distributions: Shape, Center, and Spread
2
Exploration of univariate data should follow this sequence: 1. plot → 2. shape → 3. center → 4. spread
1. Choose an appropriate plot. Four kinds of graphs will be shown: dot plots, stem plots, bar charts, and histograms (or relative frequency plots).
2. Describe the shape. Four kinds of shapes will be demonstrated: uniform, normal, skewed, and bimodal.
3. Find a measure of center appropriate to the shape. Two kinds of centers will be calculated (formulas to be taught): mean and median (sometimes mode).
4. Find a measure of spread that agrees with the measure of center. Three kinds of spread will be calculated (formulas to be taught): deviations, standard deviation, and variance.
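The measures of center and spread named in steps 3 and 4 can be sketched with Python's standard `statistics` module. The data values are invented for illustration, and the sample (n - 1) form of the variance is an assumption, since the slides do not say which form will be taught.

```python
import statistics

# Hypothetical sample, invented purely for illustration.
data = [2, 4, 4, 5, 7, 8]

# Measures of center.
mean = statistics.mean(data)
median = statistics.median(data)

# Measures of spread.
deviations = [x - mean for x in data]   # distance of each value from the mean
variance = statistics.variance(data)    # sample variance (divides by n - 1); an assumption
sd = statistics.stdev(data)             # square root of the sample variance

print("mean:", mean, "median:", median)
print("deviations:", deviations)
print("variance:", variance, "standard deviation:", round(sd, 2))
```

The deviations always sum to zero, which is why they are squared before being averaged into the variance.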
3
Data Distributions - “shapes”
We will be considering four types of data distribution shapes: uniform, normal, skewed, and bimodal.
4
Uniform Distribution – Ideal Shape
In a uniform distribution, all the values occur equally often. It is also known as a rectangular distribution because of its shape.
5
Dot Plots – Ideal Graph; Uniform Distribution – Ideal Shape
A dot plot shows a set of data (a group of numbers) in which each datum (a single number) is drawn as one dot above its value on a number line; repeated values are stacked.
Example: If you roll a 10-sided die (labeled 0 to 9) 50 times, you may get a dot plot of 50 dots that looks roughly uniform.
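The die-rolling example can be simulated as a rough sketch. The function names and the seed are hypothetical, and the die is assumed to be fair; the output is a text-only dot plot with one `o` per roll.

```python
import random
from collections import Counter

def roll_counts(n_rolls, sides=10, seed=None):
    """Roll a fair die labeled 0..sides-1 and count how often each face appears."""
    rng = random.Random(seed)
    rolls = [rng.randrange(sides) for _ in range(n_rolls)]
    tally = Counter(rolls)
    return [tally[face] for face in range(sides)]

def text_dot_plot(counts):
    """Stack one 'o' per observation above each face label."""
    rows = []
    for level in range(max(counts), 0, -1):
        rows.append(" ".join("o" if c >= level else " " for c in counts))
    rows.append(" ".join(str(face) for face in range(len(counts))))
    return "\n".join(rows)

counts = roll_counts(50, seed=1)  # the seed is arbitrary, used only for repeatability
print(text_dot_plot(counts))
```

With only 50 rolls the columns will usually be uneven, even though the die itself is fair.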
6
Histogram – Not Ideal Graph; Uniform Distribution – Not Ideal Shape
Description: The graph shows a roughly uniform distribution of births across the months.
7
Dot Plot – Not Ideal Graph; Uniform Distribution – Not Ideal Shape
Activity 2.1a, page 28: Last Digit of 30 Phone Numbers
Does this graph look like a uniform distribution? Is this a uniform distribution graph? Why or why not?
[Hint: Would only 2 phone number digits look uniform? How about 1000 phone number digits?]
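The hint can be explored with a short simulation. This is a sketch under the assumption that every last digit 0 to 9 is equally likely; the function name and seed are hypothetical. It compares how far the relative frequencies stray from each other for 30 digits versus 1000 digits.

```python
import random

def digit_relative_frequencies(n_digits, seed=None):
    """Draw n_digits random last digits (0-9) and return each digit's relative frequency."""
    rng = random.Random(seed)
    counts = [0] * 10
    for _ in range(n_digits):
        counts[rng.randrange(10)] += 1
    return [c / n_digits for c in counts]

for n in (30, 1000):
    freqs = digit_relative_frequencies(n, seed=2)
    # A perfectly uniform plot would have every frequency equal to 0.10.
    gap = max(freqs) - min(freqs)
    print(f"n={n}: largest minus smallest relative frequency = {gap:.3f}")
```

A small sample from a uniform process typically looks lumpy, while a large sample tends to look flatter, so a ragged plot of 30 digits does not by itself rule out a uniform distribution.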
8
Normal Distribution – Ideal Shape
The idealized normal shape is perfectly symmetric, with a single peak, or mode, at the line of symmetry. The curve drops off smoothly on both sides, never touching the x-axis, and stretches infinitely far in both directions.
9
Normal Distribution – Ideal Shape: Center & Spread
Center: Mean = Median = Mode
Spread: The standard deviation (SD) is the horizontal distance from the mean to an inflection point.
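The inflection-point claim can be checked numerically. The sketch below uses hypothetical helper names and borrows a mean of 47 and an SD of 4 only as sample values. It estimates the curvature of the normal curve just inside and just outside one SD from the mean; a sign change between the two points marks an inflection point.

```python
import math

def normal_pdf(x, mu, sigma):
    """Height of the normal curve with mean mu and standard deviation sigma at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def curvature(x, mu, sigma, h=1e-2):
    """Central second-difference estimate of the second derivative at x."""
    return (normal_pdf(x + h, mu, sigma)
            - 2 * normal_pdf(x, mu, sigma)
            + normal_pdf(x - h, mu, sigma)) / (h * h)

mu, sigma = 47.0, 4.0  # sample values chosen for illustration
inside = curvature(mu + 0.9 * sigma, mu, sigma)   # slightly closer to the mean than 1 SD
outside = curvature(mu + 1.1 * sigma, mu, sigma)  # slightly farther from the mean than 1 SD

print("curvature inside 1 SD: ", inside)   # negative: curve bends downward
print("curvature outside 1 SD:", outside)  # positive: curve bends upward
```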
10
Normal Distribution – Ideal Shape: Center & Spread
Description: Use the mean to describe the center and the standard deviation to describe the spread of a normal distribution. About 68% of the area under the curve lies within 1 standard deviation (SD) of the mean.
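The 68% figure can be reproduced from the error function in Python's `math` module. This is a minimal sketch; `area_within` is a hypothetical helper name.

```python
import math

def area_within(k):
    """Fraction of the area under a normal curve within k SDs of the mean."""
    return math.erf(k / math.sqrt(2))

print(round(area_within(1), 4))  # 0.6827
```

The result does not depend on the particular mean or SD, because every normal curve has the same shape once distances are measured in SDs.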
11
Normal Distribution – Ideal Shape: Center & Spread
Use the mean to describe the center and the standard deviation to describe the spread of a normal distribution.
Example: A typical random sample of five workers has an average age of 47 years, give or take about 4 years.
Statistically speaking: IF the population is normally distributed, the mean is 47 years and the standard deviation is 4 years. (That is a “big if.”)
Remember: 1. plot → 2. shape → 3. center → 4. spread
If step 1 (plot) or step 2 (shape) is wrong, the calculations (center and spread) can still be carried out correctly, but the interpretation (i.e., the story) will be incorrect.
12
Skewed Distributions – Ideal Shape
Uniform and normal distributions are symmetric. Many common distributions instead show bunching at one end and a long tail stretching out in the other direction. These distributions are called skewed.
13
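Skewed Distributions – Ideal Shape
The direction of the tail tells whether a distribution is skewed left or skewed right: a tail stretching to the left means skewed left, and a tail stretching to the right means skewed right.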
14
Skewed Distributions – Ideal Shape
Often the bunching in a skewed distribution occurs because values “bump up against a wall”: either a minimum that values cannot go below, such as 0 for distances, or a maximum that values cannot go above, such as 100 for percentages.
Note: To have a “wall,” values must be bunched up against it.
15
Skewed Distributions – Ideal Shape: Center & Spread
Use the median to describe the center and the lower and upper quartiles to describe the spread.
Example: The middle 50% of the SAT math scores were between 630 and 720, with half above 680 and half below.
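A minimal sketch of the median-and-quartiles summary follows. The data set is invented and right-skewed, and the function name is hypothetical. It uses the "median of each half" convention for quartiles; this is an assumption, since several quartile conventions exist and the slides do not name one.

```python
import statistics

def median_and_quartiles(values):
    """Return (lower quartile, median, upper quartile) using medians of the two halves."""
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]
    # With an odd count, the middle value belongs to neither half.
    upper = ordered[half + 1:] if len(ordered) % 2 else ordered[half:]
    return statistics.median(lower), statistics.median(ordered), statistics.median(upper)

# Hypothetical right-skewed data: bunched at low values, long tail to the right.
data = [1, 2, 2, 3, 3, 4, 5, 8, 20]

q1, med, q3 = median_and_quartiles(data)
print(f"The middle 50% of values lie between {q1} and {q3}, "
      f"with half above {med} and half below.")
print("mean:", round(statistics.mean(data), 2))  # pulled toward the long tail
```

In this example the mean is pulled above the median by the single large value, which is why the median is the safer description of center for a skewed shape.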
16
Skewed Distributions – Not Ideal Shape: Center & Spread
The middle 50% of the polar bear weights are between about 115 and 250 lbs, with half above about 155 lbs and half below.
17
Bimodal Distributions – Ideal Shape
Many distributions have only one peak (unimodal). Some have two peaks (bimodal) or even more.
18
Bimodal Distributions – Not Ideal Shape
Locate the two peaks. It is more useful, however, if you can find another variable that divides the data into two groups centered at the two peaks.
19
Other Possible Data Characteristics
Outlier: an unusual value that stands apart from the bulk of the data
Cluster: a group of data “clustering” close to the same value, away from other groups
Gap: on a plot, the space that separates clusters of data
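As a rough sketch of these definitions, the snippet below splits sorted data into clusters wherever two neighboring values are farther apart than a chosen gap size. The function name, the data, and the `min_gap` threshold are all hypothetical; the slides give no numeric rule for what counts as a gap or an outlier.

```python
def split_into_clusters(values, min_gap):
    """Group sorted values into clusters separated by gaps wider than min_gap."""
    ordered = sorted(values)
    if not ordered:
        return []
    clusters = [[ordered[0]]]
    for previous, current in zip(ordered, ordered[1:]):
        if current - previous > min_gap:
            clusters.append([current])    # a gap: start a new cluster
        else:
            clusters[-1].append(current)  # still in the same cluster
    return clusters

# Hypothetical data, invented for illustration.
data = [1, 2, 2, 3, 10, 11, 12, 30]
clusters = split_into_clusters(data, min_gap=4)
print(clusters)  # [[1, 2, 2, 3], [10, 11, 12], [30]]
```

A cluster containing a single value far from the rest, such as 30 here, is a candidate outlier worth a closer look.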