Minds on! Two students are being considered for a bursary. Sal’s marks are 97, 92, 84, 71. Val’s marks are 95, 90, 86, 73. Which student would you award the bursary to?

Presentation transcript:

Minds on! Two students are being considered for a bursary. Sal’s marks are 97, 92, 84, 71. Val’s marks are 95, 90, 86, 73. Which student would you award the bursary to? Justify your choice.

3.3 Measures of Spread
Chapter 3 - Tools for Analyzing Data
Learning goal: Calculate and interpret measures of spread (4)
Questions? p. 159 #4, 5, 6, 8,
MSIP / Home Learning: p. 168 #2b, 3b, 4, 6, 7, 10
What is more important: potential or consistency?

What is spread? Measures of central tendency do not always tell you everything! These histograms have identical mean and median, but the spread is different! Spread is how closely the values cluster around the middle value.

Why worry about spread? Less spread → greater confidence that values will fall within a particular range. This is important for making predictions.

Measures of Spread: We will study 4 Measures of Spread: Range, Interquartile Range (IQR), Variance, and Standard Deviation (Std. Dev.). All 4 measure how spread out data is. Smaller value = less spread (more consistent). Larger value = more spread (less consistent).

Measures of Spread
1) Range = (max) – (min). Indicates the size of the interval that contains 100% of the data.
2) Interquartile Range: IQR = Q3 – Q1, where Q1 is the lower half median and Q3 is the upper half median. Indicates the size of the interval that contains the middle 50% of the data.

Quartiles Example – 15 data values: Q2 = 41 (median), Q1 = 36 (lower half median), Q3 = 46 (upper half median). IQR = Q3 – Q1 = 46 – 36 = 10, so the middle 50% of the data is within a span of 10 units.

Quartiles Example – 14 data values: If a quartile occurs between 2 values, it is calculated as the average of the two values. Q2 = 40.5, Q1 = 36, Q3 = 45. IQR = Q3 – Q1 = 45 – 36 = 9. The middle 50% of the data is within 9 units, so this data is more consistent than the 15-value example (IQR = 10).
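The range and the "median of each half" quartile rule can be sketched in a few lines of Python. This is only an illustration: the function names `quartiles` and `spread_summary` are made up for this sketch, and the data are Sal's and Val's marks from the opening question. When the number of values is odd, the middle value is left out of both halves, which matches the 15-value example.

```python
from statistics import median

def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves rule."""
    values = sorted(data)
    n = len(values)
    half = n // 2
    lower = values[:half]          # values before the median position
    upper = values[half + n % 2:]  # skip the middle value when n is odd
    return median(lower), median(values), median(upper)

def spread_summary(data):
    """Collect the range, the quartiles and the IQR in one dictionary."""
    q1, q2, q3 = quartiles(data)
    return {"range": max(data) - min(data), "Q1": q1, "Q2": q2, "Q3": q3, "IQR": q3 - q1}

sal = [97, 92, 84, 71]
val = [95, 90, 86, 73]
print("Sal:", spread_summary(sal))
print("Val:", spread_summary(val))
```

Both students have the same median, but Val's range and IQR are smaller, so Val's marks are the more consistent of the two.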

Box (and Whisker) Plot: Min = 26, Q1 = 36 (lower half median), Q2 = 41 (median), Q3 = 46 (upper half median), Max = [missing]. This is one of the graph types in Fathom / Excel 2016 – you can hack Excel 2013 (see website).

A More Useful Measure of Spread: Range is very basic: it does not take clusters or outliers into account. Interquartile Range is somewhat useful: it takes clusters into account and is visual in a Box-and-Whisker Plot. Standard deviation is very useful: it is roughly the typical distance from the mean across all data points (the square root of the average squared deviation).

Deviation: The mean of the numbers below is 48. deviation = (data) – (mean). The deviation for 24 is 24 – 48 = –24. The deviation for 84 is 84 – 48 = 36.

Calculating Standard Deviation (σ)
1. Find the mean (average).
2. Find the deviation for each data point: data point – mean.
3. Square the deviations: (data point – mean)².
4. Average the squares of the deviations (this is called the variance, σ²).
5. Take the square root of the variance.

Example of Standard Deviation
mean = (26 + 28 + 34 + 36) ÷ 4 = 31
$\sigma^2 = \frac{(26-31)^2 + (28-31)^2 + (34-31)^2 + (36-31)^2}{4} = \frac{25 + 9 + 9 + 25}{4} = 17$
$\sigma = \sqrt{17} \approx 4.1$ (1 dec. pl.)
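The five steps can be followed one at a time in code. The sketch below is a minimal illustration of the population standard deviation, not a replacement for a statistics library. The function name `population_std_dev` is invented here, and the data are the four values from the example.

```python
from math import sqrt

def population_std_dev(data):
    """Return (mean, variance, standard deviation) for a whole population."""
    mean = sum(data) / len(data)            # step 1: find the mean
    deviations = [x - mean for x in data]   # step 2: deviation of each point
    squared = [d ** 2 for d in deviations]  # step 3: square the deviations
    variance = sum(squared) / len(data)     # step 4: average the squares
    return mean, variance, sqrt(variance)   # step 5: square root of variance

mean, variance, sigma = population_std_dev([26, 28, 34, 36])
print(mean, variance, round(sigma, 1))  # prints: 31.0 17.0 4.1
```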

Standard Deviation: σ² (lower case sigma squared) is used to represent variance. σ is used to represent standard deviation. σ is commonly used to measure the spread of data, with larger values of σ indicating greater spread. We are using a population standard deviation (next slide).

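Different Types of Std. Dev.
Population Standard Deviation: $\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$, where $\mu$ is the population mean and $N$ is the number of values in the population.
Sample Standard Deviation: $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$, where $\bar{x}$ is the sample mean and $n$ is the number of values in the sample.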
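To see how much the choice of divisor matters, the sketch below applies both versions to Sal's and Val's marks from the opening question. It uses `pstdev` (divide by N) and `stdev` (divide by n – 1) from Python's standard `statistics` module. Treating four marks as either a population or a sample is an assumption made only for illustration.

```python
from statistics import mean, pstdev, stdev

marks = {"Sal": [97, 92, 84, 71], "Val": [95, 90, 86, 73]}

for name, data in marks.items():
    # pstdev divides by N (population); stdev divides by n - 1 (sample)
    print(name, "mean:", mean(data),
          "population sd:", round(pstdev(data), 2),
          "sample sd:", round(stdev(data), 2))
```

Both students have the same mean, and for each of them the sample value is larger than the population value because of the smaller divisor. Under either formula Val has the smaller standard deviation.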

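Different Types of Std. Dev.
Standard Deviation: $\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$
Standard Deviation with Grouped Data: $\sigma = \sqrt{\frac{\sum f_i (m_i - \mu)^2}{\sum f_i}}$, where $f_i$ is the frequency of the value (or interval midpoint) $m_i$ and $\mu$ is the grouped mean.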

Hours of TV: 2, 3, 4, 5
Frequency: 2, 6, 6, 2
grouped mean = (2×2 + 3×6 + 4×6 + 5×2) / 16 = 3.5
deviations: 2: 2 – 3.5 = -1.5; 3: 3 – 3.5 = -0.5; 4: 4 – 3.5 = 0.5; 5: 5 – 3.5 = 1.5
$\sigma^2 = \frac{2(-1.5)^2 + 6(-0.5)^2 + 6(0.5)^2 + 2(1.5)^2}{16} = \frac{12}{16} = 0.75$
$\sigma = \sqrt{0.75} \approx 0.9$
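The grouped calculation weights each squared deviation by its frequency. A minimal sketch, using the hours-of-TV table above and a hypothetical helper named `grouped_std_dev`:

```python
from math import sqrt

def grouped_std_dev(values, frequencies):
    """Population mean, variance and std. dev. for a frequency table."""
    n = sum(frequencies)
    mean = sum(v * f for v, f in zip(values, frequencies)) / n
    # each squared deviation is counted once per occurrence of the value
    variance = sum(f * (v - mean) ** 2 for v, f in zip(values, frequencies)) / n
    return mean, variance, sqrt(variance)

mean, variance, sigma = grouped_std_dev([2, 3, 4, 5], [2, 6, 6, 2])
print(mean, variance, round(sigma, 1))  # prints: 3.5 0.75 0.9
```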

Measures of Spread in Excel
Range: =MAX(data) – MIN(data)
IQR: =QUARTILE(data, 3) – QUARTILE(data, 1)
Population Standard Deviation: =STDEV.P(data)
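Software quartiles may not match the by-hand rule. The Python sketch below computes the same three measures for Sal's marks, using `statistics.quantiles` with `method="inclusive"`, which interpolates between data points. To the best of my knowledge Excel's QUARTILE interpolates in the same way, but treat that equivalence as an assumption and check it against your own spreadsheet.

```python
from statistics import pstdev, quantiles

data = [97, 92, 84, 71]  # Sal's marks

# "inclusive" interpolates between sorted values, treating min and max
# as the 0th and 100th percentiles
q1, q2, q3 = quantiles(data, n=4, method="inclusive")

print("range:", max(data) - min(data))
print("IQR:", q3 - q1)
print("population sd:", round(pstdev(data), 2))
```

For these four marks the interpolated quartiles are Q1 = 80.75 and Q3 = 93.25 (IQR = 12.5), whereas the median-of-halves rule gives Q1 = 77.5 and Q3 = 94.5 (IQR = 17). Small data sets can therefore give noticeably different IQRs depending on the method.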

Measure of Spread - Recap: Measures of Spread indicate how spread out data is. A smaller value means the data is more consistent.
1) Range = Max – Min
2) Interquartile Range: IQR = Q3 – Q1, where Q1 = first half median and Q3 = second half median
3) Standard Deviation:
i. Find the mean (average).
ii. Find all deviations: (data point) – (mean).
iii. Square all the deviations.
iv. Average them (this is the variance, σ²).
v. Take the square root to get the std. dev., σ.

MSIP / Home Learning: Read through the examples on pp. … Complete p. 168 #2b, 3b, 4, 6, 7, 10. You are responsible for knowing how to do simple examples by hand (~6 pieces of data). We will use technology (Fathom/Excel) to calculate larger examples. Have a look at your calculator and see if you have this feature (Σσn and Σσn-1).

3.4 Normal Distribution
Chapter 3 – Tools for Analyzing Data
Learning goal: Determine the % of data within intervals of a Normal Distribution
Questions? p. 168 #2b, 3b, 4, 6, 7, 10
MSIP / Home Learning: p. 176 #1, 3b, 6, 8-10
“Think of how stupid the average person is, and realize half of them are stupider than that.” -George Carlin

Histograms: Histograms can be skewed... [Figure: two histograms, titled "CD Collection" and "Roll of coins", illustrating right-skewed and left-skewed shapes]

Histograms... or symmetrical

Normal? A Normal distribution is a histogram that is symmetrical and has a bell shape. It is used quite a bit in statistical analysis. It is also called a Gaussian Distribution. It is symmetrical, with equal mean, median and mode that fall on the line of symmetry of the curve.

A Real Example: The heights of 600 randomly chosen Canadian students from the “Census at School” data set. The data approximates a Normal distribution.

The 68-95-99.7% Rule: The area under the curve is 1 (i.e. it represents 100% of the data). Approx. 68% of the data falls within 1 standard deviation of the mean. Approx. 95% of the data falls within 2 standard deviations of the mean. Approx. 99.7% of the data falls within 3 standard deviations of the mean.
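The three percentages can be checked numerically. The sketch below uses `NormalDist` from Python's standard `statistics` module. The standard Normal (mean 0, standard deviation 1) is used because the share of data within k standard deviations is the same for every Normal distribution.

```python
from statistics import NormalDist

standard = NormalDist()  # mean 0, standard deviation 1

for k in (1, 2, 3):
    # area under the curve between -k and +k standard deviations
    share = standard.cdf(k) - standard.cdf(-k)
    print(f"within {k} standard deviation(s): {share:.2%}")
```

The exact areas are about 68.27%, 95.45% and 99.73%, which the rule rounds to 68%, 95% and 99.7%.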

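Distribution of Data [Figure: Normal curve with the horizontal axis marked at x – 3σ, x – 2σ, x – 1σ, x, x + 1σ, x + 2σ and x + 3σ. On each side of the mean, 34% of the data lies within 1σ, 13.5% between 1σ and 2σ, 2.35% between 2σ and 3σ, and 0.15% beyond 3σ, giving 68%, 95% and 99.7% within 1, 2 and 3 standard deviations.]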

Normal Distribution Notation: The notation $X \sim N(\bar{x}, \sigma^2)$ is used to describe the Normal distribution, where $\bar{x}$ is the mean and $\sigma^2$ is the variance (square of the standard deviation). e.g., $X \sim N(74, 8^2)$ describes a Normal distribution with mean 74 and standard deviation 8 (e.g., class marks).

Example 1a) The time before burnout for a brand of LED averages 120 months with a standard deviation of 10 months and is approximately Normally distributed. So $X \sim N(120, 10^2)$. What is the length of time a user might expect an LED to last with: a) 68% confidence? b) 95% confidence?

Example 1) continued… [Figure: Normal curve for the LED lifetimes with the 34%, 13.5% and 2.35% bands marked; the central 68% region runs from 110 months to 130 months.]

Example 1) cont’d: 68% of the data will be within 1 standard deviation of the mean. This means that 68% of the bulbs will last between 120 – 10 = 110 months and 120 + 10 = 130 months. So 68% of the bulbs will last 110–130 months. 95% of the data will be within 2 standard deviations of the mean. This means that 95% of the bulbs will last between 120 – 2×10 = 100 months and 120 + 2×10 = 140 months. So 95% of the bulbs will last 100–140 months.

Example 1b) Suppose you wanted to know how long 99.7% of the bulbs will last. This is the area covering 3 standard deviations on either side of the mean. This means that 99.7% of the bulbs will last between 120 – 3×10 = 90 months and 120 + 3×10 = 150 months. So 99.7% of the bulbs will last 90–150 months. This assumes that all the bulbs are produced to the same standard.
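All three intervals follow the same pattern, mean ± k standard deviations, so they can be produced by one small helper. The function name `interval` is hypothetical. The mean of 120 months and standard deviation of 10 months come from the example.

```python
def interval(mean, sd, k):
    """Return (low, high) for the interval mean - k*sd to mean + k*sd."""
    return mean - k * sd, mean + k * sd

for k, share in ((1, "68%"), (2, "95%"), (3, "99.7%")):
    low, high = interval(120, 10, k)
    print(f"about {share} of LEDs last between {low} and {high} months")
```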

Example 1c) What % of LEDs will last between: 90 and 130 months? 110 and 140 months?

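Example 1c: What % of LEDs will last between 90 and 130 months? 90 months is 3 standard deviations below the mean and 130 months is 1 standard deviation above it, so 2.35% + 13.5% + 34% + 34% = 83.85%. What % of LEDs will last between 110 and 140 months? 110 months is 1 standard deviation below the mean and 140 months is 2 standard deviations above it, so 34% + 34% + 13.5% = 81.5%.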

Percentage of data between two values: The area under any normal curve is 1. The percent of data that lies between two values in a normal distribution is equivalent to the area under the normal curve between these values. See examples 2 and 3 on page 175.
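The area between two values is the difference between the cumulative areas up to each value. A minimal sketch for the LED example follows, assuming lifetimes follow a Normal distribution with mean 120 and standard deviation 10. The helper name `percent_between` is invented for this illustration.

```python
from statistics import NormalDist

led = NormalDist(mu=120, sigma=10)  # LED lifetime in months

def percent_between(dist, low, high):
    """Percent of the distribution lying between low and high."""
    return 100 * (dist.cdf(high) - dist.cdf(low))

print(round(percent_between(led, 90, 130), 2))
print(round(percent_between(led, 110, 140), 2))
```

The results are about 84.0% and 81.9%. They differ slightly from the totals obtained by adding the 34%, 13.5% and 2.35% bands because those band percentages are rounded.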

Why is the Normal distribution so important? Many psychological and educational variables are distributed approximately Normally: height, memory, IQ, reading ability, etc. Normal distributions are statistically easy to work with: all kinds of statistical tests are based on them. Lane (2003)

MSIP / Home Learning: Complete p. 176 #1, 3b, 6, 8-10

References
Lane, D. (2003). What's so important about the normal distribution? Retrieved October 5, 2004 from bution.html
Wikipedia (2004). Online Encyclopedia. Retrieved September 1, 2004 from