Presentation is loading. Please wait.

Presentation is loading. Please wait.

1 Descriptive Statistics Frequency Tables Visual Displays Measures of Center.

Similar presentations


Presentation on theme: "1 Descriptive Statistics Frequency Tables Visual Displays Measures of Center."— Presentation transcript:

1 1 Descriptive Statistics Frequency Tables Visual Displays Measures of Center

2 2  Descriptive Statistics summarize or describe the important characteristics of a known set of population data  Inferential Statistics use sample data to make inferences (or generalizations) about a population Overview

3 3 1. Center: A representative or average value that indicates where the middle of the data set is located 2. Variation: A measure of the amount that the values vary among themselves 3. Distribution: The nature or shape of the distribution of data (such as bell-shaped, uniform, or skewed) 4. Outliers: Sample values that lie very far away from the vast majority of other sample values 5. Time: Changing characteristics of the data over time Important Characteristics of Data

4 4  Frequency Table lists classes (or categories) of values, along with frequencies (or counts) of the number of values that fall into each class Summarizing Data With Frequency Tables

5 5 Qwerty Keyboard Word Ratings Table 2-1 2251263342 40577566810 72210582542 6261727238 15252142263 17

6 6 Frequency Table of Qwerty Word Ratings Table 2-3 0 - 220 3 - 514 6 - 815 9 - 11 2 12 - 14 1 RatingFrequency

7 7 Frequency Table Definitions

8 8 Lower Class Limits Lower Class Limits 0 - 220 3 - 514 6 - 815 9 - 11 2 12 - 14 1 RatingFrequency are the smallest numbers that can actually belong to different classes

9 9 Upper Class Limits Upper Class Limits 0 - 220 3 - 514 6 - 815 9 - 11 2 12 - 14 1 RatingFrequency are the largest numbers that can actually belong to different classes

10 10 Class Boundaries Class Boundaries 0 - 220 3 - 514 6 - 815 9 - 11 2 12 - 14 1 RatingFrequency - 0.5 2.5 5.5 8.5 11.5 14.5 number separating classes

11 11 midpoints of the classes Class Midpoints Class Midpoints 0 - 1 220 3 - 4 514 6 - 7 815 9 - 10 11 2 12 - 13 14 1 RatingFrequency

12 12 Class Width 3 0 - 220 3 3 - 514 3 6 - 815 3 9 - 11 2 3 12 - 14 1 RatingFrequency is the difference between two consecutive lower class limits or two consecutive class boundaries

13 13 1. Be sure that the classes are mutually exclusive. 2. Include all classes, even if the frequency is zero. 3. Try to use the same width for all classes. 4. Select convenient numbers for class limits. 5. Use between 5 and 20 classes. 6. The sum of the class frequencies must equal the number of original data values. Guidelines For Frequency Tables

14 14 3. Select for the first lower limit either the lowest score or a convenient value slightly less than the lowest score. 4. Add the class width to the starting point to get the second lower class limit, add the width to the second lower limit to get the third, and so on. 5. List the lower class limits in a vertical column and enter the upper class limits. 6. Represent each score by a tally mark in the appropriate class. Total tally marks to find the total frequency for each class. Constructing A Frequency Table 1. Decide on the number of classes. 2. Determine the class width by dividing the range by the number of classes (range = highest score - lowest score) and round up. class width  round up of range number of classes

15 15 Inputting a Frequency Table into the TI83 Calculator Push Stat Select Edit Input Class Marks in List 1 Input Frequencies in List 2

16 16 Relative Frequency Table relative frequency = class frequency sum of all frequencies

17 17 Relative Frequency Table 0 - 220 3 - 514 6 - 815 9 - 11 2 12 - 14 1 RatingFrequency 0 - 238.5% 3 - 526.9% 6 - 828.8% 9 - 11 3.8% 12 - 14 1.9% Rating Relative Frequency 20/52 = 38.5% 14/52 = 26.9% etc. Total frequency = 52

18 18 Cumulative Frequency Table Cumulative Frequencies 0 - 220 3 - 514 6 - 815 9 - 11 2 12 - 14 1 RatingFrequency Less than 320 Less than 634 Less than 949 Less than 12 51 Less than 15 52 Rating Cumulative Frequency

19 19 Frequency Tables 0 - 2 20 3 - 5 14 6 - 8 15 9 - 11 2 12 - 14 1 Rating Frequency 0 - 2 38.5% 3 - 5 26.9% 6 - 8 28.8% 9 - 11 3.8% 12 - 14 1.9% Rating Relative Frequency Less than 3 20 Less than 6 34 Less than 9 49 Less than 12 51 Less than 15 52 Rating Cumulative Frequency

20 20 depict the nature or shape of the data distribution Histogram a bar graph in which the horizontal scale represents classes and the vertical scale represents frequencies Pictures of Data

21 21 Histogram of Qwerty Word Ratings 0 - 220 3 - 514 6 - 815 9 - 11 2 12 - 14 1 Rating Frequency

22 22 Relative Frequency Histogram of Qwerty Word Ratings 0 - 238.5% 3 - 526.9% 6 - 828.8% 9 - 11 3.8% 12 - 14 1.9% Rating Relative Frequency

23 23 Histogram and Relative Frequency Histogram

24 24 Viewing a Histogram Using the TI83 Calculator First input class marks and frequencies in L1 & L2 2 nd y= (Statplot) Select 1 Select “on” Select histogram type (top row, 3 rd graph) X list = 2 nd 1 (L1) Freq = 2 nd 2 (L2) Zoom 9 (ZoomStat)

25 25 Frequency Polygon

26 26 Viewing a Frequency Polygon Using the TI83 Calculator First input class marks and frequencies in L1 & L2 2 nd y= (Statplot) Select 1 Select “on” Select polygon type (top row, 2 nd graph) X list = 2 nd 1 (L1) Freq = 2 nd 2 (L2) Zoom 9 (ZoomStat)

27 27 Ogive

28 28 Dot Plot

29 29 Stem-and Leaf Plot Raw Data (Test Grades) 67 72 85 75 89 89 88 90 99 100 StemLeaves 6 7 8 9 10 7 2 5 5 8 9 9 0 9 0

30 30 Pareto Chart 5,000 10,000 15,000 20,000 25,000 30,000 35,000 40,000 45,000 0 Motor Vehicle Falls Poison Drowning Fire Ingestion of food or object Firearms Frequency Accidental Deaths by Type

31 31 Pie Chart Firearms (1400. 1.9%) Ingestion of food or object (2900. 3.9% Fire (4200. 5.6%) Drowning (4600. 6.1%) Poison (6400. 8.5%) Falls (12,200. 16.2%) Motor vehicle (43,500. 57.8%) Accidental Deaths by Type

32 32 Scatter Diagram 0 0.0 0.5 1.01.5 10 20 NICOTINE TAR

33 33  Boxplots (textbook section 2-7)  Pictographs  Pattern of data over time Other Graphs

34 34 a value at the center or middle of a data set Measures of Center

35 35 Mean (Arithmetic Mean) AVERAGE the number obtained by adding the values and dividing the total by the number of values Definitions

36 36 Notation  denotes the addition of a set of values x is the variable usually used to represent the individual data values n represents the number of data values in a sample N represents the number of data values in a population

37 37 Notation µ is pronounced ‘mu’ and denotes the mean of all values in a population is pronounced ‘x-bar’ and denotes the mean of a set of sample values Calculators can calculate the mean of data x = n  x x x N µ =  x x

38 38 Finding a Mean Using the TI83 Calculator Push Stat Select Edit Enter data into L1 2 nd mode (quit) Push Stat >Calc Select 1 (1-VarStats) 2 nd 1 (L1) The first output item is the mean

39 39 Definitions  Median the middle value when the original data values are arranged in order of increasing (or decreasing) magnitude  often denoted by x (pronounced ‘x-tilde’)  is not affected by an extreme value ~

40 40 6.72 3.46 3.606.44 26.70 3.46 3.60 6.446.72 26.70 (in order - odd number of values) exact middle MEDIAN is 6.44

41 41 6.72 3.46 3.606.44 3.46 3.60 6.446.72 no exact middle -- shared by two numbers 3.60 + 6.44 2 (even number of values) MEDIAN is 5.02

42 42 Finding a Median Using the TI83 Calculator Push Stat Select Edit Enter data into L1 2 nd mode (quit) Push Stat >Calc Select 1 (1-VarStats) 2 nd 1 (L1) Arrow down to “Med” to see the value for median

43 43 Definitions  Mode the score that occurs most frequently Bimodal Multimodal No Mode denoted by M the only measure of central tendency that can be used with nominal data

44 44 a. 5 5 5 3 1 5 1 4 3 5 b. 1 2 2 2 3 4 5 6 6 6 7 9 c. 1 2 3 6 7 8 9 10 Examples  Mode is 5  Bimodal 2 and 6  No Mode

45 45  Midrange the value midway between the highest and lowest values in the original data set Definitions Midrange = highest score + lowest score 2

46 46 Finding a Midrange Using the TI83 Calculator Push Stat Select Edit Enter data into L1 2 nd mode (quit) Push Stat >Calc Select 1 (1-VarStats) 2 nd 1 (L1) Arrow down to see the “min” and “max” values for the midrange calculation

47 47 Carry one more decimal place than is present in the original set of values Round-off Rule for Measures of Center

48 48 use class midpoint of classes for variable x Mean from a Frequency Table x = class midpoint f = frequency  f = n f = n x = f  (f x) 

49 49 Finding a Grouped Mean Using the TI83 Calculator Push Stat Select Edit Enter class marks into L1 and frequencies into L2 2 nd mode (quit) Push Stat >Calc Select 1 (1-VarStats) 2 nd 1, 2 nd 2 (L1,L2) The first output item is the mean

50 50 Descriptive Statistics Measures of Variability Measures of Position

51 51  Symmetric Data is symmetric if the left half of its histogram is roughly a mirror of its right half.  Skewed Data is skewed if it is not symmetric and if it extends more to one side than the other. Definitions

52 52 Skewness Mode = Mean = Median SKEWED LEFT (negatively ) SYMMETRIC Mean Mode Median SKEWED RIGHT (positively) Mean Mode Median

53 53 Jefferson Valley Bank Bank of Providence 6.5 4.2 6.6 5.4 6.7 5.8 6.8 6.2 7.1 6.7 7.3 7.7 7.4 7.7 8.5 7.7 9.3 7.7 10.0 Jefferson Valley Bank 7.15 7.20 7.7 7.10 Bank of Providence 7.15 7.20 7.7 7.10 Mean Median Mode Midrange Waiting Times of Bank Customers at Different Banks in minutes

54 54 Dotplots of Waiting Times

55 55 Measures of Variation Range value highest lowest value

56 56 a measure of variation of the scores about the mean (average deviation from the mean) Measures of Variation Standard Deviation

57 57 Sample Standard Deviation Formula calculators can compute the sample standard deviation of data  ( x - x ) 2 n - 1 S =S =

58 58 Sample Standard Deviation Shortcut Formula n ( n - 1) s = n (  x 2 ) - (  x ) 2 calculators can compute the sample standard deviation of data

59 59 Population Standard Deviation calculators can compute the population standard deviation of data 2  ( x - µ ) N  =

60 60 Symbols for Standard Deviation Sample Population   x x  n s S x x  n-1 Textbook Some graphics calculators Some non-graphics calculators Textbook Some graphics calculators Some non-graphics calculators Articles in professional journals and reports often use SD for standard deviation and VAR for variance.

61 61 Finding a Standard Deviation Using the TI83 Calculator Push Stat Select Edit Enter data into L1 2 nd mode (quit) Push Stat >Calc Select 1 (1-VarStats) 2 nd 1 (L1) The 4th output item is the sample s.d. and the 5 th output item is the population s.d.

62 62 Measures of Variation Variance standard deviation squared s  2 2 } use square key on calculator Notation

63 63 Sample Variance Population Variance  ( x - x ) 2 n - 1 s 2 =  (x - µ)2 (x - µ)2 N  2 =

64 64 Round-off Rule for measures of variation Carry one more decimal place than is present in the original set of values. Round only the final answer, never in the middle of a calculation.

65 65 Standard Deviation from a Frequency Table Use the class midpoints as the x values Calculators can compute the standard deviation for frequency table n ( n - 1) S =S = n [  ( f x 2 )] -[  ( f x )] 2

66 66 Finding a Grouped Standard Deviation Using the TI83 Calculator Push Stat Select Edit Enter class marks into L1 and frequencies into L2 2 nd mode (quit) Push Stat >Calc Select 1 (1-VarStats) 2 nd 1, 2 nd 2 (L1, L2) The 4 th output item is the sample s.d and the 5 th output item is the population s.d

67 67 Estimation of Standard Deviation Range Rule of Thumb x - 2 s x x + 2 s Range  4 s or (minimum usual value) (maximum usual value) Range 4 s  = highest value - lowest value 4

68 68 minimum ‘usual’ value  (mean) - 2 (standard deviation) minimum  x - 2(s) maximum ‘usual’ value  (mean) + 2 (standard deviation) maximum  x + 2(s) Usual Sample Values

69 69 x The Empirical Rule (applies to bell-shaped distributions )

70 70 x - s x x + sx + s 68% within 1 standard deviation 34% The Empirical Rule (applies to bell-shaped distributions )

71 71 x - 2s x - s x x + 2s x + sx + s 68% within 1 standard deviation 34% 95% within 2 standard deviations The Empirical Rule (applies to bell-shaped distributions ) 13.5%

72 72 x - 3s x - 2s x - s x x + 2s x + 3s x + sx + s 68% within 1 standard deviation 34% 95% within 2 standard deviations 99.7% of data are within 3 standard deviations of the mean The Empirical Rule (applies to bell-shaped distributions ) 0.1% 2.4% 13.5%

73 73 Chebyshev’s Theorem  applies to distributions of any shape.  the proportion (or fraction) of any set of data lying within K standard deviations of the mean is always at least 1 - 1/K 2, where K is any positive number greater than 1.  at least 3/4 (75%) of all values lie within 2 standard deviations of the mean.  at least 8/9 (89%) of all values lie within 3 standard deviations of the mean.

74 74 Measures of Variation Summary For typical data sets, it is unusual for a score to differ from the mean by more than 2 or 3 standard deviations.

75 75  z Score (or standard score) the number of standard deviations that a given value x is above or below the mean Measures of Position

76 76 Sample z = x - x s Population z = x - µ  Round to 2 decimal places Measures of Position z score

77 77 - 3- 2- 101 23 Z Unusual Values Unusual Values Ordinary Values Interpreting Z Scores

78 78 Measures of Position Quartiles, Deciles, Percentiles

79 79 Q 1, Q 2, Q 3 divides ranked scores into four equal parts Quartiles 25% Q3Q3 Q2Q2 Q1Q1 (minimum)(maximum) (median)

80 80 D 1, D 2, D 3, D 4, D 5, D 6, D 7, D 8, D 9 divides ranked data into ten equal parts Deciles 10% D 1 D 2 D 3 D 4 D 5 D 6 D 7 D 8 D 9

81 81 99 Percentiles Percentiles

82 82 Quartiles, Deciles, Percentiles Fractiles (Quantiles) partitions data into approximately equal parts

83 83 Quartiles Q 1 = P 25 Q 2 = P 50 Q 3 = P 75 D 1 = P 10 D 2 = P 20 D 3 = P 30 D 9 = P 90 Deciles

84 84 Viewing Quartiles Using the TI83 Calculator Push Stat Select Edit Enter data into L1 2 nd mode (quit) Push Stat >Calc Select 1 (1-VarStats) 2 nd 1 (L1) Arrow down to view the values for Q2 and Q3 (the median “med” is Q2)

85 85 Exploratory Data Analysis the process of using statistical tools (such as graphs, measures of center, and measures of variation) to investigate the data sets in order to understand their important characteristics

86 86 Outliers  a value located very far away from almost all of the other values  an extreme value  can have a dramatic effect on the mean, standard deviation, and on the scale of the histogram so that the true nature of the distribution is totally obscured

87 87 Boxplots ( Box-and-Whisker Diagram) Reveals the:  center of the data  spread of the data  distribution of the data  presence of outliers Excellent for comparing two or more data sets

88 88 Boxplots 5 - number summary  Minimum  first quartile Q 1  Median (Q 2 )  third quartile Q 3  Maximum

89 89 Boxplots Boxplot of Qwerty Word Ratings 2 46 8101214 0 2 46 0

90 90 Viewing Boxplots Using the TI83 Calculator First input class marks and frequencies in L1 & L2 2 nd y= (Statplot) Select 1 Select “on” Select boxplot type (bottom row, 2 nd graph) X list = 2 nd 1 (L1) Freq = 2 nd 2 (L2) Zoom 9 (ZoomStat)

91 91 Bell-ShapedSkewed Boxplots Uniform

92 92 Exploring  Measures of center: mean, median, and mode  Measures of variation: standard deviation and range  Measures of spread & relative location: minimum values, maximum value, and quartiles  Unusual values: outliers  Distribution: histograms, stem-leaf plots, and boxplots


Download ppt "1 Descriptive Statistics Frequency Tables Visual Displays Measures of Center."

Similar presentations


Ads by Google