Measures of Spread Chapter 3.3 – Tools for Analyzing Data Mathematics of Data Management (Nelson) MDM 4U.


Foot size of gongshowhockey.com users
What shape are the distributions?

What is spread?
Spread tells you how widely the data are dispersed.
For example, two histograms can have an identical mean and median while their spread differs significantly.

Why worry about spread?
Spread indicates how closely the values cluster around the middle value.
Less spread means you have greater confidence that values will fall within a particular range.

Vocabulary
Spread and dispersion refer to the same thing.
Range is the difference between the largest and smallest values.
A quartile is one of three numerical values (Q1, Q2, Q3) that divide an ordered group of numbers into 4 equal parts.
The interquartile range (IQR) is the difference between the third and first quartiles: IQR = Q3 – Q1.

Quartiles Example
range = 55 – 26 = 29
Q2 = 41 (median)
Q1 = 36 (median of lower half of data)
Q3 = 46 (median of upper half of data)
IQR = Q3 – Q1 = 46 – 36 = 10 (contains the middle 50% of the data)
If a quartile occurs between 2 values, it is calculated as the average of the two values.
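The median-of-halves method described in the quartiles example can be sketched in Python. The data set behind the slide is not part of the transcript, so the list `sample` below is hypothetical, chosen only so that it reproduces the slide's summary figures. The helper name `quartiles` is also an assumption made for illustration.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]            # values below the median position
    upper = data[half + (n % 2):]  # skip the middle value when n is odd
    return median(lower), median(data), median(upper)

# Hypothetical data, not the slide's original values.
sample = [26, 30, 36, 38, 40, 41, 43, 45, 46, 50, 55]

q1, q2, q3 = quartiles(sample)
data_range = max(sample) - min(sample)
iqr = q3 - q1

print("range =", data_range)
print("Q1, Q2, Q3 =", q1, q2, q3)
print("IQR =", iqr)
```

Other quartile conventions exist (spreadsheets and statistics packages often interpolate), so software may report slightly different quartiles for the same data.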

A More Useful Measure of Spread
The interquartile range is a somewhat useful measure of spread, but it depends only on the two quartiles.
Standard deviation is more useful because it takes every data point into account.
To calculate it, we need to find the mean and the deviation for each data point.
The mean is easy, as we have done that before.
Deviation is the difference between a particular point and the mean.

Deviation
The mean of these numbers is 48.
The deviation for 24 is 24 – 48 = –24.
The deviation for 84 is 84 – 48 = 36.

Standard Deviation
Deviation is the signed distance from the mean to the piece of data you are examining (value minus mean).
Variance is a measure of spread found by averaging the squares of the deviations calculated for each piece of data.
Taking the square root of the variance gives the standard deviation.
Standard deviation is a very important and useful measure of spread.

Standard Deviation
σ² (lower case sigma squared) is used to represent variance.
σ is used to represent standard deviation.
σ is commonly used to measure the spread of data, with larger values of σ indicating greater spread.
We are using the population standard deviation, which divides by the number of values n.
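To make the "population" qualifier concrete, here is a minimal sketch comparing the population and sample standard deviations with Python's standard `statistics` module. The list `values` is hypothetical data chosen so the arithmetic comes out cleanly.

```python
from statistics import pstdev, stdev

# Hypothetical data: mean is 5, squared deviations sum to 32.
values = [2, 4, 4, 4, 5, 5, 7, 9]

population_sd = pstdev(values)  # divides the squared deviations by n
sample_sd = stdev(values)       # divides the squared deviations by n - 1

print("population sigma:", round(population_sd, 3))
print("sample s:        ", round(sample_sd, 3))
```

The sample version is always slightly larger for the same data because it divides by n - 1 instead of n.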

Example of Standard Deviation
mean = (26 + 28 + 34 + 36) / 4 = 31
σ² = [(26 – 31)² + (28 – 31)² + (34 – 31)² + (36 – 31)²] / 4
σ² = 68 / 4 = 17
σ = √17 ≈ 4.12
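The same calculation can be followed step by step in code. This is a sketch that mirrors the hand method (mean, deviations, squared deviations, variance, square root) using the four values that appear in the variance formula. The variable names are chosen for illustration.

```python
from math import sqrt

data = [26, 28, 34, 36]

mean = sum(data) / len(data)              # step 1: the mean
deviations = [x - mean for x in data]     # step 2: value minus mean
squared = [d ** 2 for d in deviations]    # step 3: square each deviation
variance = sum(squared) / len(data)       # step 4: average the squares (population)
sigma = sqrt(variance)                    # step 5: square root of the variance

print("mean =", mean)
print("deviations =", deviations)
print("variance =", variance)
print("sigma =", round(sigma, 2))
```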

Standard Deviation with Grouped Data
Hours of TV:  2  3  4  5
Frequency:    2  6  6  2
grouped mean = (2×2 + 3×6 + 4×6 + 5×2) / 16 = 56 / 16 = 3.5
deviations:
2: 2 – 3.5 = –1.5
3: 3 – 3.5 = –0.5
4: 4 – 3.5 = 0.5
5: 5 – 3.5 = 1.5
σ² = [2(–1.5)² + 6(–0.5)² + 6(0.5)² + 2(1.5)²] / 16
σ² = 12 / 16 = 0.75
σ = √0.75 ≈ 0.87
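For grouped data, each squared deviation is weighted by its frequency. The sketch below applies that idea to the hours-of-TV table. Storing the table as a dictionary named `hours_of_tv` is a choice made here for illustration.

```python
from math import sqrt

# value -> frequency, taken from the hours-of-TV table
hours_of_tv = {2: 2, 3: 6, 4: 6, 5: 2}

n = sum(hours_of_tv.values())
mean = sum(value * freq for value, freq in hours_of_tv.items()) / n

# Weight each squared deviation by how often the value occurs.
variance = sum(freq * (value - mean) ** 2
               for value, freq in hours_of_tv.items()) / n
sigma = sqrt(variance)

print("n =", n)
print("grouped mean =", mean)
print("variance =", variance)
print("sigma =", round(sigma, 2))
```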

MSIP / Homework
Read through the worked examples in the text.
Complete p. 168 #2b, 3b, 4, 6, 7, 10.
You are responsible for knowing how to do simple examples by hand (fewer than 10 pieces of data).
However, we will use technology (Fathom) to calculate larger examples.
Have a look at your calculator and see if you have this feature (Σσn and Σσn-1).

Normal Distribution Chapter 3.4 – Tools for Analyzing Data Mathematics of Data Management (Nelson) MDM 4U

Histograms
Histograms may be skewed (right-skewed or left-skewed) or symmetrical.

Normal?
A normal distribution creates a histogram that is symmetrical and bell-shaped, and it is widely used in statistical analysis.
It is also called a Gaussian distribution.
Its mean, median and mode are equal and fall on the line of symmetry of the curve.

A Real Example
The heights of 600 randomly chosen Canadian students from the “Census at School” data set.
The data approximate a normal distribution.

The 68-95-99.7% Rule
The area under the curve is 1 (i.e. it represents 100% of the population surveyed).
Approx. 68% of the data fall within 1 standard deviation of the mean.
Approx. 95% of the data fall within 2 standard deviations of the mean.
Approx. 99.7% of the data fall within 3 standard deviations of the mean.
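These percentages can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library and a standard normal curve (mean 0, standard deviation 1). The helper name `share_within` is an assumption made for illustration.

```python
from statistics import NormalDist

standard = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal population within k standard deviations of the mean."""
    return standard.cdf(k) - standard.cdf(-k)

shares = {k: share_within(k) for k in (1, 2, 3)}

for k, share in shares.items():
    print(f"within {k} standard deviation(s): {share:.1%}")
```

The exact proportions are close to, but not exactly, 68%, 95% and 99.7%, which is why the rule is stated with "approx."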

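Distribution of Data
(Figure: normal curve with the horizontal axis marked at x̄ – 3σ, x̄ – 2σ, x̄ – 1σ, x̄, x̄ + 1σ, x̄ + 2σ and x̄ + 3σ.)
Between the mean and 1σ away: 34% on each side (68% in total).
Between 1σ and 2σ from the mean: 13.5% on each side (95% within 2σ).
Between 2σ and 3σ from the mean: 2.35% on each side (99.7% within 3σ).
Beyond 3σ from the mean: 0.15% on each side.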

Normal Distribution Notation
The notation X ~ N(x̄, σ²) is used to describe the Normal distribution, where x̄ is the mean and σ² is the variance (the square of the standard deviation).
e.g. X ~ N(70, 8²) describes a Normal distribution with mean 70 and standard deviation 8 (our class at midterm?).

Percentage of data between two values
The area under any normal curve is 1.
The percent of data that lies between two values in a normal distribution is equivalent to the area under the normal curve between these values.
See examples 2 and 3 on page 175.
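As a sketch of the area idea, the proportion between two values is the difference of two cumulative areas. The example reuses the distribution with mean 70 and standard deviation 8 from the notation slide. The bounds 62 and 86 are hypothetical, chosen to sit 1 standard deviation below and 2 above the mean. Note that `NormalDist` takes the standard deviation, not the variance.

```python
from statistics import NormalDist

marks = NormalDist(mu=70, sigma=8)  # sigma = 8, so the variance is 8**2

low, high = 62, 86  # hypothetical bounds: mean - 1 sigma and mean + 2 sigma

# Area under the curve between the two values.
proportion = marks.cdf(high) - marks.cdf(low)

print(f"{proportion:.1%} of the data lie between {low} and {high}")
```

This agrees with the percentages in the rule above: 34% + 34% + 13.5% = 81.5%, compared with the computed value of about 81.9%.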

Why is the Normal distribution so important?
Many psychological and educational variables are distributed approximately normally: reading ability, memory, etc.
Normal distributions are statistically easy to work with: all kinds of statistical tests are based on them.
(Lane, 2003)

An example
Suppose the time before burnout for an LED averages 120 months with a standard deviation of 10 months and is approximately Normally distributed.
What is the length of time a user might expect an LED to last?
95% of the data will be within 2 standard deviations of the mean.
This means that 95% of the bulbs will last between 120 – 2×10 months and 120 + 2×10 months.
So 95% of the bulbs will last 100 to 140 months.

Example continued…
Suppose you wanted to know how long 99.7% of the bulbs will last.
This is the area covering 3 standard deviations on either side of the mean.
This means that 99.7% of the bulbs will last between 120 – 3×10 months and 120 + 3×10 months.
So 99.7% of the bulbs will last 90 to 150 months.
This assumes that all the bulbs are produced to the same standard.
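The interval arithmetic in the LED example can be checked with a short sketch. It uses the stated mean of 120 months and standard deviation of 10 months. The helper name `interval` is an assumption made for illustration.

```python
from statistics import NormalDist

led = NormalDist(mu=120, sigma=10)  # lifetime in months

def interval(dist, k):
    """Return (low, high) bounds k standard deviations either side of the mean."""
    return dist.mean - k * dist.stdev, dist.mean + k * dist.stdev

low95, high95 = interval(led, 2)
low997, high997 = interval(led, 3)

# Proportion of bulbs the 2-sigma interval actually covers.
coverage95 = led.cdf(high95) - led.cdf(low95)

print(f"about 95% of bulbs last {low95:.0f} to {high95:.0f} months")
print(f"about 99.7% of bulbs last {low997:.0f} to {high997:.0f} months")
print(f"2-sigma coverage: {coverage95:.1%}")
```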

Example continued…
(Figure: Normal curve of LED lifetime in months, with the 34%, 13.5% and 2.35% bands and the 95% and 99.7% intervals marked.)

Exercises
Try page 176 #1, 3b, 6, 8, 9, 10.

References
Lane, D. (2003). What's so important about the normal distribution? Retrieved October 5, 2004 from bution.html
Wikipedia (2004). Online Encyclopedia. Retrieved September 1, 2004 from