General Divisions
  Descriptive Statistics
    – Goal is to summarize or describe the data
  Inferential Statistics
    – Using data from a sample to make inferences (generalizations) about the population
Major Descriptors
  Center: Where the “middle” of the data is
  Variation: How spread out the data is
  Distribution: The shape of the distribution of the data (if the data follows a pattern)
  Outliers: Data that is unusually separated from the rest of the data
  Time: How data changes over time
Frequency Distribution
  A frequency distribution lists data values (or groups of data values) along with how many data points had each value (the frequency, or count)
Some data: Quiz scores
Quiz scores: Frequency Distribution
  Score    Frequency
Quiz scores: Frequency Distribution Using Classes
  Score    Frequency
Definitions
  Lower class limits
    – Smallest numbers that can belong to a class
  Upper class limits
    – Largest numbers that can belong to a class
  Class boundaries
    – Numbers used to separate classes so that there are no gaps
    – For our purposes, we will just use lower class limits
  Class midpoint
    – Add upper and lower limits and divide by 2
  Class width
    – The difference between consecutive lower class limits
Example
  Lower class limits: 12, 15, 18
  Upper class limits: 14, 17, 20
  Class midpoints: 13, 16, 19
  Class width: 3
  Score    Frequency
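The definitions can be checked with a few lines of Python. This is a minimal sketch, assuming whole-number scores, a first lower limit of 12, a class width of 3, and three classes; the variable names are illustrative only.

```python
# Assumed inputs: first lower limit, class width, and number of classes.
first_lower = 12
class_width = 3
num_classes = 3

# Consecutive lower class limits differ by the class width.
lower_limits = [first_lower + i * class_width for i in range(num_classes)]

# For whole-number scores, each upper limit is one less than the next lower limit.
upper_limits = [lower + class_width - 1 for lower in lower_limits]

# Class midpoint: add the lower and upper limits and divide by 2.
midpoints = [(lower + upper) / 2 for lower, upper in zip(lower_limits, upper_limits)]

print(lower_limits)  # [12, 15, 18]
print(upper_limits)  # [14, 17, 20]
print(midpoints)     # [13.0, 16.0, 19.0]
```

Note that the class width is the gap between consecutive lower limits (15 - 12 = 3), not the upper limit minus the lower limit of a single class (14 - 12 = 2).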
Constructing a Frequency Distribution
  Choose the number of classes you want
    – Usually 5 to 20, based on data and convenience
  Calculate class width
    – (highest value – lowest value) / number of classes
    – Usually round (up)
    – Sometimes handy to work backwards
  Choose starting point
    – Usually lowest value, or a little smaller
Constructing a Frequency Distribution
  Use starting point and class width to list other lower class limits
    – Add class width to previous lower limit
  Add upper class limits
  Tally data into frequency table
Example: Hours slept by caffeine drinkers
  Data: 1.2, 2.9, 3.1, 3.5, 4.1, 4.6, 4.8, 5.0, 5.3, 5.3, 5.4, 5.7, 6.3, 6.7, 6.7, 6.8, 6.9, 7.0, 7.1, 7.2, 7.2, 7.4, 7.5, 7.5, 7.8, 8.1, 8.2, 8.2, 8.5, 8.6, 9.3, 10.1
  Choose number of classes: 5 (?)
  Class width: (10.1 – 1.2)/5 = 1.78
  Let's round up to 2 and use that
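The class width arithmetic can be reproduced as follows. This is a rough sketch: rounding up to the next whole number with math.ceil is one convenient choice, not the only valid one, and the variable names are illustrative.

```python
import math

# Hours slept by caffeine drinkers (data from the slide).
hours = [1.2, 2.9, 3.1, 3.5, 4.1, 4.6, 4.8, 5.0, 5.3, 5.3, 5.4, 5.7,
         6.3, 6.7, 6.7, 6.8, 6.9, 7.0, 7.1, 7.2, 7.2, 7.4, 7.5, 7.5,
         7.8, 8.1, 8.2, 8.2, 8.5, 8.6, 9.3, 10.1]

num_classes = 5

# (highest value - lowest value) / number of classes
raw_width = (max(hours) - min(hours)) / num_classes

# Round up so that the classes are sure to cover the whole range.
class_width = math.ceil(raw_width)

print(round(raw_width, 2))  # 1.78
print(class_width)          # 2
```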
Example: Hours slept by caffeine drinkers
  Starting point: Probably 1.0 (could start at 0.0)
Example: Hours slept by caffeine drinkers
  List lower class limits
  Lower class limits: 1, 3, 5, 7, 9
Example: Hours slept by caffeine drinkers
  Add upper class limits
  Hours slept
  1 – 2.9
  3 – 4.9
  5 – 6.9
  7 – 8.9
  9 – 10.9
Example: Hours slept by caffeine drinkers
  Tally data
  Hours slept    Frequency
  1 – 2.9        2
  3 – 4.9        5
  5 – 6.9        10
  7 – 8.9        13
  9 – 10.9       2
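The remaining construction steps (list lower limits, add upper limits, tally) can be sketched in Python. This assumes a starting point of 1.0, a class width of 2, and five classes; because the data is recorded to one decimal place, each upper limit is taken to be 0.1 below the next lower limit. Variable names are illustrative.

```python
# Hours slept by caffeine drinkers (data from the slide).
hours = [1.2, 2.9, 3.1, 3.5, 4.1, 4.6, 4.8, 5.0, 5.3, 5.3, 5.4, 5.7,
         6.3, 6.7, 6.7, 6.8, 6.9, 7.0, 7.1, 7.2, 7.2, 7.4, 7.5, 7.5,
         7.8, 8.1, 8.2, 8.2, 8.5, 8.6, 9.3, 10.1]

start = 1.0        # assumed starting point
class_width = 2    # assumed class width
num_classes = 5    # assumed number of classes

# Add the class width to the previous lower limit to get the next one.
lower_limits = [start + i * class_width for i in range(num_classes)]

# Tally: a value belongs to a class if it is at least the lower limit
# and below the next lower limit.
frequencies = [0] * num_classes
for x in hours:
    for i, lower in enumerate(lower_limits):
        if lower <= x < lower + class_width:
            frequencies[i] += 1
            break

# Print the table; the upper limit shown is 0.1 below the next lower limit.
for lower, count in zip(lower_limits, frequencies):
    upper = lower + class_width - 0.1
    print(f"{lower:.0f} – {upper:.1f}: {count}")

print(frequencies)  # [2, 5, 10, 13, 2]
```

The frequencies add up to 32, the number of data values, which is a quick check that nothing was missed or counted twice.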
Relative Frequency
  Relative frequency = class frequency / sum of all frequencies
  Relative frequencies are expressed as percents
Example: Hours slept by caffeine drinkers
  Hours slept    Frequency    Relative Frequency
  1 – 2.9        2            2/32 = 6%
  3 – 4.9        5            5/32 = 16%
  5 – 6.9        10           10/32 = 31%
  7 – 8.9        13           13/32 = 41%
  9 – 10.9       2            2/32 = 6%
  Sum of frequencies: 32 = sample size
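The relative frequencies can be computed directly from the class frequencies. A minimal sketch, assuming the frequencies 2, 5, 10, 13, 2 obtained by tallying the hours-slept data into the classes 1 – 2.9 through 9 – 10.9, and rounding to whole percents:

```python
# Class frequencies from tallying the hours-slept data (assumed input).
class_labels = ["1 – 2.9", "3 – 4.9", "5 – 6.9", "7 – 8.9", "9 – 10.9"]
frequencies = [2, 5, 10, 13, 2]

# The sum of all frequencies equals the sample size.
total = sum(frequencies)

# Relative frequency = class frequency / sum of all frequencies, as a percent.
percents = [round(100 * f / total) for f in frequencies]

for label, f, p in zip(class_labels, frequencies, percents):
    print(f"{label}: {f}/{total} = {p}%")

print(percents)  # [6, 16, 31, 41, 6]
```

Here the rounded percents happen to add up to exactly 100; with other data, rounding can leave the total a point or two off.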
Cumulative Frequency Distribution
  Class limits are replaced with “less than” statements
  Each cumulative frequency is the number of data values less than the stated limit
Example: Hours slept by caffeine drinkers
  Hours slept     Cumulative Frequency
  Less than 3     2
  Less than 5     7
  Less than 7     17
  Less than 9     30
  Less than 11    32

  Hours slept    Frequency
  1 – 2.9        2
  3 – 4.9        5
  5 – 6.9        10
  7 – 8.9        13
  9 – 10.9       2
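A cumulative frequency distribution is a running total of the class frequencies. The sketch below assumes the class frequencies 2, 5, 10, 13, 2 and lower class limits 1, 3, 5, 7, 9 with class width 2; each "less than" limit is then the lower limit of the following class.

```python
from itertools import accumulate

# Assumed inputs: lower class limits, class width, and class frequencies.
lower_limits = [1, 3, 5, 7, 9]
class_width = 2
frequencies = [2, 5, 10, 13, 2]

# Each "less than" limit is the lower limit of the next class.
less_than = [lower + class_width for lower in lower_limits]

# Running total of the frequencies.
cumulative = list(accumulate(frequencies))

for limit, count in zip(less_than, cumulative):
    print(f"Less than {limit}: {count}")

print(cumulative)  # [2, 7, 17, 30, 32]
```

The last cumulative frequency always equals the sample size, here 32.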
Homework 2-2: 1, 5, 9, 15
  The answers the book gives for class boundaries will be different from what we’ve discussed in class.