1 ES Chapter 2 ~ Descriptive Analysis & Presentation of Single-Variable Data Mean: 60.07 inches Median: 62.50 inches Range: 42 inches Variance: 117.681.

Slides:



Advertisements
Similar presentations
Measures of Dispersion
Advertisements

Chapter 3 Describing Data Using Numerical Measures
Business Statistics: A Decision-Making Approach, 7e © 2008 Prentice-Hall, Inc. Chap 3-1 Business Statistics: A Decision-Making Approach 7 th Edition Chapter.
Business Statistics: A Decision-Making Approach, 7e © 2008 Prentice-Hall, Inc. Chap 3-1 Business Statistics: A Decision-Making Approach 7 th Edition Chapter.
Business Statistics: A Decision-Making Approach, 7e © 2008 Prentice-Hall, Inc. Chap 3-1 Business Statistics: A Decision-Making Approach 7 th Edition Chapter.
Slides by JOHN LOUCKS St. Edward’s University.
B a c kn e x t h o m e Classification of Variables Discrete Numerical Variable A variable that produces a response that comes from a counting process.
Chap 3-1 Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc. Chapter 3 Describing Data: Numerical Statistics for Business and Economics.
12.3 – Measures of Dispersion
Descriptive Statistics  Summarizing, Simplifying  Useful for comprehending data, and thus making meaningful interpretations, particularly in medium to.
Describing Data: Numerical
Chapter 2 Describing Data with Numerical Measurements
Chapter Goals Learn how to present and describe sets of data.
Mean: inches Median: inches Range: 42 inches Variance:
Chapter 13 Section 5 - Slide 1 Copyright © 2009 Pearson Education, Inc. AND.
Department of Quantitative Methods & Information Systems
Describing distributions with numbers
Chapter 2 Describing Data with Numerical Measurements General Objectives: Graphs are extremely useful for the visual description of a data set. However,
Descriptive Statistics
Numerical Descriptive Techniques
Chapter 3 Averages and Variations
McGraw-Hill/IrwinCopyright © 2009 by The McGraw-Hill Companies, Inc. All Rights Reserved. Chapter 3 Descriptive Statistics: Numerical Methods.
© Copyright McGraw-Hill CHAPTER 3 Data Description.
© 2008 Brooks/Cole, a division of Thomson Learning, Inc. 1 Chapter 4 Numerical Methods for Describing Data.
Copyright © Cengage Learning. All rights reserved. 2 Descriptive Analysis and Presentation of Single-Variable Data.
Copyright (C) 2002 Houghton Mifflin Company. All rights reserved. 1 Understandable Statistics Seventh Edition By Brase and Brase Prepared by: Lynn Smith.
Chapter 3 Descriptive Statistics: Numerical Methods Copyright © 2014 by The McGraw-Hill Companies, Inc. All rights reserved.McGraw-Hill/Irwin.
Review Measures of central tendency
STAT 280: Elementary Applied Statistics Describing Data Using Numerical Measures.
Business Statistics: A Decision-Making Approach, 6e © 2005 Prentice-Hall, Inc. Chap 3-1 Business Statistics: A Decision-Making Approach 6 th Edition Chapter.
What is variability in data? Measuring how much the group as a whole deviates from the center. Gives you an indication of what is the spread of the data.
Chapter 2 Describing Data.
Describing distributions with numbers
McGraw-Hill/Irwin Copyright © 2010 by The McGraw-Hill Companies, Inc. All rights reserved. Chapter 3 Descriptive Statistics: Numerical Methods.
Lecture 3 Describing Data Using Numerical Measures.
Chapter 2: Descriptive Analysis and Presentation of Single- Variable Data.
1 CHAPTER 3 NUMERICAL DESCRIPTIVE MEASURES. 2 MEASURES OF CENTRAL TENDENCY FOR UNGROUPED DATA  In Chapter 2, we used tables and graphs to summarize a.
Larson/Farber Ch 2 1 Elementary Statistics Larson Farber 2 Descriptive Statistics.
Dr. Serhat Eren 1 CHAPTER 6 NUMERICAL DESCRIPTORS OF DATA.
Understanding Basic Statistics Fourth Edition By Brase and Brase Prepared by: Lynn Smith Gloucester County College Chapter Three Averages and Variation.
Chap 3-1 A Course In Business Statistics, 4th © 2006 Prentice-Hall, Inc. A Course In Business Statistics 4 th Edition Chapter 3 Describing Data Using Numerical.
© Copyright McGraw-Hill CHAPTER 3 Data Description.
Business Statistics Spring 2005 Summarizing and Describing Numerical Data.
Copyright © Cengage Learning. All rights reserved. 2 Descriptive Analysis and Presentation of Single-Variable Data.
Basic Business Statistics Chapter 3: Numerical Descriptive Measures Assoc. Prof. Dr. Mustafa Yüzükırmızı.
1 Descriptive Statistics 2-1 Overview 2-2 Summarizing Data with Frequency Tables 2-3 Pictures of Data 2-4 Measures of Center 2-5 Measures of Variation.
 The mean is typically what is meant by the word “average.” The mean is perhaps the most common measure of central tendency.  The sample mean is written.
1 Chapter 4 Numerical Methods for Describing Data.
1 Descriptive Statistics Descriptive Statistics Ernesto Diaz Faculty – Mathematics Redwood High School.
PROBABILITY AND STATISTICS WEEK 1 Onur Doğan. What is Statistics? Onur Doğan.
Summary Statistics: Measures of Location and Dispersion.
Basic Business Statistics, 11e © 2009 Prentice-Hall, Inc. Chap 3-1 Chapter 3 Numerical Descriptive Measures Basic Business Statistics 11 th Edition.
Data Summary Using Descriptive Measures Sections 3.1 – 3.6, 3.8
Business Statistics, 4e, by Ken Black. © 2003 John Wiley & Sons. 3-1 Business Statistics, 4e by Ken Black Chapter 3 Descriptive Statistics.
Statistics Unit 9 only requires us to do Sections 1 & 2. * If we have time, there are some topics in Sections 3 & 4, that I will also cover. They tie in.
Statistics topics from both Math 1 and Math 2, both featured on the GHSGT.
Copyright © 2016 Brooks/Cole Cengage Learning Intro to Statistics Part II Descriptive Statistics Intro to Statistics Part II Descriptive Statistics Ernesto.
Copyright © Cengage Learning. All rights reserved. 2 Descriptive Analysis and Presentation of Single-Variable Data.
Describing Data: Summary Measures. Identifying the Scale of Measurement Before you analyze the data, identify the measurement scale for each variable.
MM150 ~ Unit 9 Statistics ~ Part II. WHAT YOU WILL LEARN Mode, median, mean, and midrange Percentiles and quartiles Range and standard deviation z-scores.
Descriptive Statistics ( )
PROBABILITY AND STATISTICS
Chapter 3 Describing Data Using Numerical Measures
CHAPTER 3 Data Description 9/17/2018 Kasturiarachi.
NUMERICAL DESCRIPTIVE MEASURES
Chapter 3 Describing Data Using Numerical Measures
Numerical Descriptive Measures
Quartile Measures DCOVA
MBA 510 Lecture 2 Spring 2013 Dr. Tonya Balan 4/20/2019.
Presentation transcript:

1 ES Chapter 2 ~ Descriptive Analysis & Presentation of Single-Variable Data Mean: inches Median: inches Range: 42 inches Variance: Standard deviation: inches Minimum: 36 inches Maximum: 78 inches First quartile: inches Third quartile: inches Count: 58 bears Sum: inches Frequency Length in Inches Black Bears

2 ES Chapter Goals Learn how to present and describe sets of data Learn measures of central tendency, measures of dispersion (spread), measures of position, and types of distributions Learn how to interpret findings so that we know what the data is telling us about the sampled population

3 ES Measures of Central Tendency Numerical values used to locate the middle of a set of data, or where the data is clustered The term average is often associated with all measures of central tendency

4 ES Mean Mean: The type of average with which you are probably most familiar. The mean is the sum of all the values divided by the total number of values, n: The population mean, , (lowercase mu, Greek alphabet), is the mean of all x values for the entire population Notes: We usually cannot measure  but would like to estimate its value A physical representation: the mean is the value that balances the weights on the number line x n x n xxx in   () 

5 ES Example Example:The following data represents the number of accidents in each of the last 6 years at a dangerous intersection. Find the mean number of accidents: 8, 9, 3, 5, 2, 6, 4, 5: x  (). Solution:  In the data above, change 6 to 26: Note: The mean can be greatly influenced by outliers x  (). Solution:

6 ES Median Median: The value of the data that occupies the middle position when the data are ranked in order according to size Notes: Denoted by “x tilde”: The population median,  (uppercase mu, Greek alphabet), is the data value in the middle position of the entire population To find the median: 1.Rank the data 2.Determine the depth of the median: 3.Determine the value of the median

7 ES Example Example:Find the median for the set of data: Solution: 1.Rank the data: 2, 2, 3, 3, 4, 8, 8, 9, 11 2.Find the depth: 3.The median is the fifth number from either end in the ranked data: Suppose the data set is {4, 8, 3, 8, 2, 9, 2, 11, 3, 15}: 1.Rank the data: 2, 2, 3, 3, 4, 8, 8, 9, 11, 15 2.Find the depth: 3.The median is halfway between the fifth and sixth observations: {4, 8, 3, 8, 2, 9, 2, 11, 3}

8 ES Mode & Midrange Mode: The mode is the value of x that occurs most frequently Midrange: The number exactly midway between a lowest value data L and a highest value data H. It is found by averaging the low and the high values: Note:If two or more values in a sample are tied for the highest frequency (number of occurrences), there is no mode

9 ES Example:Consider the data set {12.7, 27.1, 35.6, 44.2, 18.0} Midrange      LH When rounding off an answer, a common rule-of-thumb is to keep one more decimal place in the answer than was present in the original data To avoid round-off buildup, round off only the final answer, not intermediate steps Notes: Example

10 ES Measures of Dispersion Measures of central tendency alone cannot completely characterize a set of data. Two very different data sets may have similar measures of central tendency. Measures of dispersion are used to describe the spread, or variability, of a distribution Common measures of dispersion: range, variance, and standard deviation

11 ES Range Range: The difference in value between the highest-valued (H) and the lowest-valued (L) pieces of data: Other measures of dispersion are based on the following quantity Deviation from the Mean: A deviation from the mean,, is the difference between the value of x and the mean

12 ES Example Example:Consider the sample {12, 23, 17, 15, 18}. Find 1) the range and 2) each deviation from the mean. Solutions: 1) DataDeviation from Mean _________________________ )

13 ES Sample Variance: The sample variance, s 2, is the mean of the squared deviations, calculated using n  1 as the divisor: Standard Deviation: The standard deviation of a sample, s, is the positive square root of the variance: Sample Variance & Standard Deviation s n xx     () where n is the sample size  SS()()xxxx n x   Note:The numerator for the sample variance is called the sum of squares for x, denoted SS(x): where

14 ES Example:Find the 1) variance and 2) standard deviation for the data {5, 7, 1, 3, 8}: Example 2.8)8.32(  s 1)  s 2) x xx  ()xx  Sum: Solutions: x  (). First:

15 ES Notes The shortcut formula for the sample variance: The unit of measure for the standard deviation is the same as the unit of measure for the data

16 ES Measures of Position Measures of position are used to describe the relative location of an observation Quartiles and percentiles are two of the most popular measures of position An additional measure of central tendency, the midquartile, is defined using quartiles Quartiles are part of the 5-number summary

17 ES Ranked data, increasing order Quartiles 1.The first quartile, Q 1, is a number such that at most 25% of the data are smaller in value than Q 1 and at most 75% are larger 2.The second quartile, Q 2, is the median 3.The third quartile, Q 3, is a number such that at most 75% of the data are smaller in value than Q 3 and at most 25% are larger Quartiles: Values of the variable that divide the ranked data into quarters; each set of data has three quartiles

18 ES Percentiles:Values of the variable that divide a set of ranked data into 100 equal subsets; each set of data has 99 percentiles. The kth percentile, P k, is a value such that at most k% of the data is smaller in value than P k and at most (100  k)% of the data is larger. Percentiles ~ xQP  250 Notes: The 1st quartile and the 25th percentile are the same: Q 1 = P 25 The median, the 2nd quartile, and the 50th percentile are all the same:

19 ES Finding P k (and Quartiles) Procedure for finding P k (and quartiles): 1.Rank the n observations, lowest to highest 2.Compute A = (nk)/100 3.If A is an integer: –d(P k ) = A.5 (depth) –P k is halfway between the value of the data in the Ath position and the value of the next data If A is a fraction: –d(P k ) = B, the next larger integer –P k is the value of the data in the Bth position

20 ES 1) k = 25: (20) (25) / 100 = 5, depth = 5.5, Q 1 = 6 Example Example:The following data represents the pH levels of a random sample of swimming pools in a California town. Find: 1) the first quartile, 2) the third quartile, and 3) the 37th percentile: 2) k = 75: (20) (75) / 100 = 15, depth = 15.5, Q 3 = ) k = 37: (20) (37) / 100 = 7.4, depth = 8, P 37 = 6.2 Solutions:

21 ES 5-Number Summary: The 5-number summary is composed of: 1.L, the smallest value in the data set 2.Q 1, the first quartile (also P 25 ) 3., the median (also P 50 and 2nd quartile) 4.Q 3, the third quartile (also P 75 ) 5.H, the largest value in the data set 5-Number Summary Notes: The 5-number summary indicates how much the data is spread out in each quarter The interquartile range is the difference between the first and third quartiles. It is the range of the middle 50% of the data

22 ES Box-and-Whisker Display Box-and-Whisker Display: A graphic representation of the 5-number summary: The five numerical values (smallest, first quartile, median, third quartile, and largest) are located on a scale, either vertical or horizontal The box is used to depict the middle half of the data that lies between the two quartiles The whiskers are line segments used to depict the other half of the data One line segment represents the quarter of the data that is smaller in value than the first quartile The second line segment represents the quarter of the data that is larger in value that the third quartile

23 ES Example: A random sample of students in a sixth grade class was selected. Their weights are given in the table below. Find the 5-number summary for this data and construct a boxplot: Solution: Example

24 ES Boxplot for Weight Data Weights from Sixth Grade Class Weight

25 ES z-Score z-Score: The position a particular value of x has relative to the mean, measured in standard deviations. The z-score is found by the formula: Notes: Typically, the calculated value of z is rounded to the nearest hundredth The z-score measures the number of standard deviations above/below, or away from, the mean z-scores typically range from to z-scores may be used to make comparisons of raw scores

26 ES Example:A certain data set has mean 35.6 and standard deviation 7.1. Find the z-scores for 46 and 33: Example Solutions: 46 is 1.46 standard deviations above the mean z xx s      is below standard deviations below the mean. 0

27 ES Interpreting & Understanding Standard Deviation Standard deviation is a measure of variability, or spread Two rules for describing data rely on the standard deviation: –Empirical rule: applies to a variable that is normally distributed –Chebyshev’s theorem: applies to any distribution

28 ES Notes: The empirical rule is more informative than Chebyshev’s theorem since we know more about the distribution (normally distributed) Also applies to populations Can be used to determine if a distribution is normally distributed 1.Approximately 68% of the observations lie within 1 standard deviation of the mean 2.Approximately 95% of the observations lie within 2 standard deviations of the mean 3.Approximately 99.7% of the observations lie within 3 standard deviations of the mean Empirical Rule: If a variable is normally distributed, then: Empirical Rule

29 ES 68% 95% 99.7% Illustration of the Empirical Rule

30 ES 1)What percentage of weights fall between 5.7 and 7.3? 2)What percentage of weights fall above 7.7? Example Example:A random sample of plum tomatoes was selected from a local grocery store and their weights recorded. The mean weight was 6.5 ounces with a standard deviation of 0.4 ounces. If the weights are normally distributed: Solutions: (,)(.(0.),.(0.))(.,.)xsxs  Approximately 95% of the weights fall between 5.7and 7.3 1) (,)(.(0.),.(0.))(.,.)xsxs  Approximately 99.7%of the weights fallbetween 5.3 and 7.7 Approximately 0.3% of the weights fall outside (5.3,7.7) Approximately (0.3/2)=0.15% of the weights fall above 7.7 2)

31 ES A Note about the Empirical Rule 1.Find the mean and standard deviation for the data 2.Compute the actual proportion of data within 1, 2, and 3 standard deviations from the mean 3.Compare these actual proportions with those given by the empirical rule 4.If the proportions found are reasonably close to those of the empirical rule, then the data is approximately normally distributed Note:The empirical rule may be used to determine whether or not a set of data is approximately normally distributed

32 ES Chebyshev’s Theorem: The proportion of any distribution that lies within k standard deviations of the mean is at least 1  (1/k 2 ), where k is any positive number larger than 1. This theorem applies to all distributions of data. Illustration: Chebyshev’s Theorem

33 ES  Chebyshev’s theorem is very conservative and holds for any distribution of data Important Reminders!  Chebyshev’s theorem also applies to any population  The two most common values used to describe a distribution of data are k = 2, 3  The table below lists some values for k and 1 - (1/k 2 ):

34 ES Example:At the close of trading, a random sample of 35 technology stocks was selected. The mean selling price was and the standard deviation was Use Chebyshev’s theorem (with k = 2, 3) to describe the distribution. Example Using k=3:At least 89% of the observations lie within 3 standard deviations of the mean: Solutions: Using k=2:At least 75% of the observations lie within 2 standard deviations of the mean: