Descriptive Statistics. Frequency Distributions and Their Graphs What you should learn: How to construct a frequency distribution including midpoints,

Presentation transcript:

Descriptive Statistics

Frequency Distributions and Their Graphs What you should learn: How to construct a frequency distribution including midpoints, relative frequencies, and cumulative frequencies. How to construct frequency histograms, frequency polygons, relative frequency histograms and ogives.

Frequency Distribution - Vocabulary A table that shows classes or intervals of data entries with a count of the number of entries in each class. Frequency - The number of data entries in the class.

Class width- Vocabulary The difference between two consecutive lower class limits in a frequency distribution table. Lower Class Limit - The least number that can belong to a specific class. Upper Class Limit - The greatest number that can belong to a specific class.

Guidelines for Constructing a Frequency Distribution Table
- Decide on the number of classes or intervals. This should be between 5 and 20.
- Calculate the class width: $\text{class width} = \frac{\text{maximum data entry} - \text{minimum data entry}}{\text{number of classes}}$. Round up to the next convenient number.
- Find the class limits. Use the minimum data entry as the lower limit of the first class. To find the remaining lower limits, add the class width to the lower limit of the preceding class.
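
The guidelines above can be sketched in Python. This is a minimal illustration, assuming whole-number data; the function name `frequency_distribution` and the sample data are hypothetical and not part of the original slides.

```python
import math

def frequency_distribution(data, num_classes):
    """Return (lower_limit, upper_limit, frequency) rows for whole-number data."""
    lo, hi = min(data), max(data)
    # Class width: (maximum - minimum) / number of classes, rounded up.
    width = math.ceil((hi - lo) / num_classes)
    # If the range divides evenly, go up one more so the maximum still fits.
    if lo + width * num_classes - 1 < hi:
        width += 1
    rows = []
    for i in range(num_classes):
        lower = lo + i * width      # add the class width to the preceding lower limit
        upper = lower + width - 1   # whole-number data: one less than the next lower limit
        freq = sum(lower <= x <= upper for x in data)
        rows.append((lower, upper, freq))
    return rows

data = [7, 12, 18, 21, 25, 30, 33, 41, 44, 50, 59, 62, 68, 75, 86]
rows = frequency_distribution(data, 5)
for lower, upper, freq in rows:
    print(f"{lower}-{upper}: {freq}")
```

With this sample data the class width works out to 16, and the frequencies add up to the number of data entries.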

Midpoint of a class - Vocabulary The average of the lower and upper limits of the class. Relative Frequency - The portion or percent of the data that falls in that class. Cumulative Frequency of a class - The sum of the frequency for that class and all previous classes. The cumulative frequency of the last class is equal to the sample size, n.
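
As a rough sketch of these three definitions, the function below adds a midpoint, relative frequency, and cumulative frequency to each class. The name `describe_classes` and the example classes are assumptions made for illustration only.

```python
def describe_classes(rows):
    """rows: list of (lower_limit, upper_limit, frequency) tuples."""
    n = sum(f for _, _, f in rows)   # sample size
    running = 0
    table = []
    for lower, upper, f in rows:
        running += f                 # this class plus all previous classes
        table.append({
            "midpoint": (lower + upper) / 2,
            "relative": f / n,
            "cumulative": running,
        })
    return table

example_rows = [(7, 22, 4), (23, 38, 3), (39, 54, 3), (55, 70, 3), (71, 86, 2)]
table = describe_classes(example_rows)
```

The cumulative frequency of the last class equals the sample size, and the relative frequencies sum to 1.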

Frequency Histogram - Vocabulary A bar graph that represents the frequency distribution of a data set. Properties of a Histogram
- The horizontal scale is quantitative and measures the data values.
- The vertical scale measures the frequencies of the classes.
- Consecutive bars must touch.

Sample Histogram (using midpoints)

Sample Histogram (using class boundaries)

Histogram Checklist
- Does your histogram have a title?
- Does your horizontal axis have a title?
- Does your vertical axis have a title?
- Did you number your horizontal axis?
- Did you number your vertical axis?
- Did you leave a space between your vertical axis and the first tower?
- Is your histogram attractive to the eye?

Class Boundaries- Vocabulary Numbers that separate classes without forming gaps between them. Since consecutive bars of a histogram must touch, bars must begin and end at class boundaries instead of class limits.

Guidelines for Constructing a Frequency Histogram
- Find the class boundaries by subtracting 0.5 from the lower class limits and adding 0.5 to the upper class limits.
- Use either the class boundaries or the class midpoints as the horizontal scale.
- Construct rectangular towers to represent the frequency of each class of data.
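
The first guideline can be checked with a few lines of Python. This sketch assumes whole-number class limits (so that 0.5 is the right adjustment); `class_boundaries` and the example limits are hypothetical.

```python
def class_boundaries(limits):
    """limits: list of (lower_limit, upper_limit) pairs for whole-number data."""
    return [(lower - 0.5, upper + 0.5) for lower, upper in limits]

bounds = class_boundaries([(7, 22), (23, 38), (39, 54)])

# Each class's upper boundary equals the next class's lower boundary,
# which is why bars drawn at the boundaries touch.
touching = all(bounds[i][1] == bounds[i + 1][0] for i in range(len(bounds) - 1))
```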

Other data displays based on Frequency Distribution Tables
- Relative Frequency Histogram
- Ogive (Cumulative Frequency Graph)

What will a Relative Frequency Histogram Look Like? Frequency Histogram Relative Frequency Histogram Just a different vertical scale.

Frequency Polygon
- Like a line graph, but used for grouped data.
- Serves the same purpose as a histogram.
- More practical for comparing two sets of data.

Sample Frequency Polygon (using midpoints)

Guidelines for Constructing a Frequency Polygon
- Use the class midpoints to label the horizontal axis.
- Plot the frequency of each class against the vertical axis.
- Be sure to leave space between the first midpoint and the vertical axis.
- Plot your frequencies.
- Close your polygon. Be sure that you have even spacing at the beginning and the end.

Assignment: pg. 39: 3,4,15,18,22,27,30,32

More Graphs and Displays What you should learn: How to graph quantitative data sets using stem-and-leaf plots and dot plots How to graph qualitative data sets using pie charts and Pareto charts How to graph paired data sets using scatter plots and time series charts

Stem-and-leaf plot - Vocabulary An alternative to the histogram for displaying quantitative data. Stem - The leftmost digits of the data. Leaf - The rightmost digit of the data.

Advantages of Stem-and-Leaf Plots The graph contains the original data. Provides an easy way to order the data.

Types of Stem-and-Leaf Plots
- Unordered stem-and-leaf plot
- Ordered stem-and-leaf plot
- Double rows for each stem
- Leaves on each side of the stem (Back to Back Stem-and-Leaf)

Ordered Stem-and-Leaf Plot Double Stem Stem-and-Leaf Plot Back to Back Stem-and-Leaf Plot

Constructing Stem-and-Leaf Plots
- Write the stems in a vertical column with the smallest at the top.
- Draw a vertical line at the right of the above-mentioned column.
- Write each leaf in the row to the right of its stem.
- Sort the leaves in increasing order as they move out from the stem.
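
These steps can be sketched as a short Python function. It assumes non-negative whole numbers, with the last digit as the leaf and the remaining digits as the stem; `stem_and_leaf` and the sample data are illustrative names, not from the slides.

```python
from collections import defaultdict

def stem_and_leaf(data):
    """Ordered stem-and-leaf plot for non-negative whole numbers."""
    leaves = defaultdict(list)
    for x in sorted(data):              # sorting first keeps leaves in increasing order
        leaves[x // 10].append(x % 10)  # stem = leading digits, leaf = last digit
    lines = []
    for stem in range(min(leaves), max(leaves) + 1):  # smallest stem at the top
        row = " ".join(str(leaf) for leaf in leaves.get(stem, []))
        lines.append(f"{stem} | {row}".rstrip())
    return lines

plot = stem_and_leaf([21, 40, 15, 38, 12, 21])
print("\n".join(plot))
```

Stems with no data entries are still listed, so gaps in the data remain visible.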

Dot Plot - Vocabulary An alternative to the histogram for displaying quantitative data, in which each data entry is plotted as a point above a horizontal axis. Dot plots allow you to see the distribution of the data.

Sample Dot Plot

Constructing Dot Plots Choose an appropriate scale for the horizontal axis. Plot a point above the horizontal axis to represent each data entry.

Up until this point, our data displays were all quantitative. Quantitative Data Displays
- Histograms
- Stem-and-leaf Plots
- Dot Plots

Qualitative Data Displays Pie Chart- A circle graph that shows relationships of parts to a whole. Pareto Chart- A vertical bar graph in which the height of each bar represents frequency or relative frequency. The bars are positioned in order of decreasing height, with the tallest bar positioned at the left.

Sample Pie Chart

Constructing Pie Charts Calculate the fractional part of the data that each category covers. Convert the fractional part to a percent. Round the percent to the nearest whole number. Calculate the number of degrees that the percent would represent in a circle. Using a protractor, measure the number of degrees represented by each category. Create a key and a title for your pie chart.
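
The arithmetic in these steps can be sketched as follows. The function name `pie_chart_angles` and the category counts are hypothetical; note that Python's built-in `round` rounds exact halves to the nearest even number, which may differ from hand rounding.

```python
def pie_chart_angles(counts):
    """Map each category to (whole-number percent, degrees of the circle)."""
    total = sum(counts.values())
    result = {}
    for category, count in counts.items():
        percent = round(100 * count / total)  # fractional part -> percent, rounded
        degrees = percent * 360 / 100         # share of the 360 degrees in a circle
        result[category] = (percent, degrees)
    return result

angles = pie_chart_angles({"A": 50, "B": 30, "C": 20})
```

Because each percent is rounded, the degrees may not sum to exactly 360 for every data set.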

Using a Protractor Draw a line from the center of the circle to the edge of the circle (radius). Line the center of the protractor up with the center of the circle. Be sure the 0° mark lines up on the radius of the circle. Measure the appropriate degree in a counterclockwise direction. Mark your location. Draw another radius from the center to the mark you just created. Realign your protractor so that the center of the protractor is at the center of the circle, but zero degrees is now lined up with the new radius just drawn.

Assignment: pg. 51: 3-10,15,16,21,22,23

Sample Pareto

Constructing Pareto Charts After collecting your data, determine the frequency of each data entry. Label your left vertical axis with frequency. Label your right vertical axis with percentages. Construct a bar chart tower for the most frequent piece of data. Construct a second bar chart tower for the next most frequent piece of data. Continue in this manner until all the data is represented. Then plot the cumulative percentage above each tower. Connect the plots. Be sure to title your Pareto Chart and label your towers.
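
The ordering and cumulative percentages behind a Pareto chart can be sketched like this. `pareto_table` and the defect counts are made-up illustration data, not taken from the slides.

```python
def pareto_table(counts):
    """Return (category, frequency, cumulative percent) rows, tallest bar first."""
    total = sum(counts.values())
    ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    running = 0
    table = []
    for category, freq in ordered:
        running += freq
        table.append((category, freq, 100 * running / total))
    return table

pareto = pareto_table({"scratch": 5, "dent": 12, "crack": 3})
```

The cumulative percentage for the last bar is always 100.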

Graphing Paired Data Sets One piece of data corresponds with another piece of data. Example: grams of fat vs. calorie count in a particular food product Represent the data using a scatter plot.

Scatter Plot- Vocabulary The ordered pairs are graphed as points in a coordinate plane.        

Sample Scatter Plot The scatter plot below shows the women's winning marathon times and the high temperatures on the marathon days.

Time Series Chart Vocabulary A data set that is composed of entries taken at regular intervals over a period of time.

Sample Time Series Chart Toontown is a web-based computer game. The chart below represents the number of plays over a 24-hour period.

Assignment: pg. 53: 24, 26, 30

Measures of Central Tendency What you should learn: How to find the mean, median, and mode of a population and a sample. How to find a weighted mean and the mean of a frequency distribution. How to describe the shape of a distribution as symmetric, uniform, or skewed.

Measure of Central Tendency - Vocabulary A value that represents a typical, or central, entry of a data set.

Mean - Measures of Central Tendency The sum of the data entries divided by the number of entries, more commonly known as the average. There are two different types of mean depending on the data… Population Mean Sample Mean

Formulas Population Mean: $\mu = \frac{\sum x}{N}$ Here $\mu$ is the lowercase Greek letter mu, and $\Sigma$ is the uppercase Greek letter sigma, which indicates "the sum of."

Formulas Sample Mean: $\bar{x} = \frac{\sum x}{n}$ Here $\bar{x}$ is read as "x bar," and $\Sigma$ is the uppercase Greek letter sigma, which indicates "the sum of."

Median - Measures of Central Tendency The middle data entry when the data set is sorted in ascending or descending order. If the data set has an even number of entries, the median is the mean of the two middle data entries.

Mode - Measures of Central Tendency The data entry that occurs with the greatest frequency. If no entry is repeated, the data set has no mode. If two entries occur with the same greatest frequency, each entry is a mode and the data set is called BIMODAL.
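
A short Python sketch of the three measures. The standard library's `statistics.mean` and `statistics.median` match the definitions above; the helper `modes` is a hypothetical function written here so that a data set with no repeated entry returns no mode, as the definition states.

```python
import statistics
from collections import Counter

def modes(data):
    """Return every entry that occurs with the greatest frequency."""
    counts = Counter(data)
    top = max(counts.values())
    if top == 1:
        return []   # no entry is repeated, so the data set has no mode
    return [value for value, c in counts.items() if c == top]

data = [3, 4, 4, 5, 6, 8, 8]
mean = statistics.mean(data)      # sum of the entries / number of entries
median = statistics.median(data)  # middle entry of the sorted data
mode_list = modes(data)           # 4 and 8 each occur twice: bimodal
```

For an even number of entries, `statistics.median` returns the mean of the two middle entries.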

Outlier- Measures of Central Tendency A data entry that is far removed from the other entries in the data set.

Mean of a frequency distribution - Mean of Grouped Data $\bar{x} = \frac{\sum (x \cdot f)}{n}$, where x represents the midpoint of each class, f represents the frequency of each class, and $n = \sum f$.

Guidelines for Determining the Mean of a Frequency Distribution
- Find the midpoint of each class.
- Find the sum of the products of the midpoints and the frequencies.
- Find the sum of the frequencies.
- Find the mean of the frequency distribution.
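
The four steps can be sketched as one small function. `grouped_mean` and the two example classes are hypothetical.

```python
def grouped_mean(rows):
    """rows: list of (lower_limit, upper_limit, frequency) tuples."""
    n = sum(f for _, _, f in rows)  # sum of the frequencies
    # Sum of (midpoint * frequency) over all classes.
    total = sum(((lower + upper) / 2) * f for lower, upper, f in rows)
    return total / n

# Midpoints are 3 and 8, so the mean is (3*2 + 8*3) / 5.
approx_mean = grouped_mean([(1, 5, 2), (6, 10, 3)])
```

Because midpoints stand in for the actual entries, the result approximates the mean of the original data.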

Assignment: pg. 63:13,15,18,27,28,34-35

Weighted Mean- Weighted Mean The mean of the data whose entries have varying weights.

Weighted Mean $\bar{x} = \frac{\sum (x \cdot w)}{\sum w}$, where w represents the weight of each entry x.
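
A minimal sketch of the weighted mean; the function name and the example scores and weights are made up for illustration.

```python
def weighted_mean(values, weights):
    """Sum of (entry * weight) divided by the sum of the weights."""
    return sum(x * w for x, w in zip(values, weights)) / sum(weights)

# A score of 80 with weight 1 and a score of 90 with weight 3.
result = weighted_mean([80, 90], [1, 3])
```

The more heavily weighted entry pulls the result toward itself: here the weighted mean is closer to 90 than to 80.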

Symmetric Distribution- Shape of Distributions When a vertical line can be drawn through the middle of a graph of the distribution and the resulting halves are approximately mirror images. Example:

Uniform or Rectangular Distribution- Shape of Distributions When all entries, or classes, in the distribution have equal frequencies. Also Symmetric! Example:

Skewed Distribution- Shape of Distributions When the “tail” of the graph elongates more to one side than to the other. Example: Tail…

Skewed-Left Distribution - Skewed Distributions When the “tail” of the graph elongates more to the left side. Also known as negatively skewed. Example: Tail…

Skewed-Right Distribution - Skewed Distributions When the “tail” of the graph elongates more to the right side. Also known as positively skewed. Example: Tail…

FYI The mean usually falls in the direction the distribution is skewed. For example, when a distribution is skewed left, the mean is typically to the left of the median.

Assignment: pg. 63: 5-12,37,39,45

Measures of Variation What you should learn: How to find the range of a data set. How to find the variance and standard deviation of a population and a sample. How to use the Empirical Rule and Chebychev’s Theorem to interpret standard deviation. How to approximate the sample standard deviation for grouped data.

Range - Vocabulary The difference between the maximum and minimum entries in the set. Range = maximum data entry – minimum data entry

Deviation - Vocabulary The difference between the entry and the mean, $\mu$ or $\bar{x}$, of the data set. Deviation of x = $x - \mu$ (population) or deviation of x = $x - \bar{x}$ (sample).

FYI Notice that the sum of the deviations is zero. Therefore, it is pointless to find the average of the deviations. By squaring each deviation, we can make the data useful.

Population Variance - Vocabulary The mean of the squares of the deviations: $\sigma^2 = \frac{\sum (x - \mu)^2}{N}$. The symbol $\sigma$ is the lowercase Greek letter sigma.

Standard Deviation - Vocabulary A measure of the typical amount an entry deviates from the mean. The more the data entries are spread out, the greater the standard deviation.

Population Standard Deviation - Vocabulary The square root of the population variance.

Sample Variance - Vocabulary The sum of the squares of the deviations from the sample mean, divided by n - 1 rather than n: $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$.

Sample Standard Deviation - Vocabulary The square root of the sample variance.
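
The definitions above can be compared side by side in Python. This sketch assumes the usual convention that the population variance divides by N while the sample variance divides by n - 1; the function names and data are illustrative.

```python
import math

def population_variance(data):
    mu = sum(data) / len(data)
    return sum((x - mu) ** 2 for x in data) / len(data)

def sample_variance(data):
    x_bar = sum(data) / len(data)
    return sum((x - x_bar) ** 2 for x in data) / (len(data) - 1)

data = [2, 4, 4, 4, 5, 5, 7, 9]
mean = sum(data) / len(data)
deviation_sum = sum(x - mean for x in data)    # deviations sum to zero
sigma = math.sqrt(population_variance(data))   # population standard deviation
s = math.sqrt(sample_variance(data))           # sample standard deviation
```

For the same data, the sample standard deviation is slightly larger than the population standard deviation because of the smaller divisor.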

Assignment: pg. 78: 1,3-5,8,10-13,22

Bell Shaped Distribution

Empirical Rule (AKA the 68-95-99.7 Rule) For data with a bell-shaped distribution, the standard deviation has the following characteristics. About 68% of the data lies within 1 standard deviation of the mean. About 95% of the data lies within 2 standard deviations of the mean. About 99.7% of the data lies within 3 standard deviations of the mean.

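Empirical (68-95-99.7) Rule [Figure: bell curve with the horizontal axis marked at $\bar{x} - 3SD$, $\bar{x} - 2SD$, $\bar{x} - SD$, $\bar{x}$, $\bar{x} + SD$, $\bar{x} + 2SD$, and $\bar{x} + 3SD$. About 34% of the data lies in each interval between the mean and 1 SD on either side, 13.5% in each interval between 1 SD and 2 SD, and 2.35% in each interval between 2 SD and 3 SD.]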

Chebychev’s Theorem The portion of any data set lying within k standard deviations (k > 1) of the mean is at least $1 - \frac{1}{k^2}$.

Example using Chebychev’s Theorem If k = 2: $1 - \frac{1}{2^2} = \frac{3}{4}$. Interpretation: In any data set, at least 3/4 or 75% of the data lies within 2 standard deviations of the mean.

Example using Chebychev’s Theorem If k = 3: $1 - \frac{1}{3^2} = \frac{8}{9}$. Interpretation: In any data set, at least 8/9 or 88.9% of the data lies within 3 standard deviations of the mean.

Example using Chebychev’s Theorem If k = 4: $1 - \frac{1}{4^2} = \frac{15}{16}$. Interpretation: In any data set, at least 15/16 or 93.8% of the data lies within 4 standard deviations of the mean.

Summary of Chebychev’s Theorem At least 75% of the data fall in the interval from $\mu - 2\sigma$ to $\mu + 2\sigma$. At least 88.9% of the data fall in the interval from $\mu - 3\sigma$ to $\mu + 3\sigma$. At least 93.8% of the data fall in the interval from $\mu - 4\sigma$ to $\mu + 4\sigma$.
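
The percentages in this summary follow directly from the bound 1 - 1/k². A minimal sketch, with `chebychev_bound` as a hypothetical helper name:

```python
def chebychev_bound(k):
    """Minimum portion of any data set within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebychev's Theorem requires k > 1")
    return 1 - 1 / k ** 2

for k in (2, 3, 4):
    print(f"k = {k}: at least {chebychev_bound(k):.1%}")
```

The bound applies to any distribution, which is why it is weaker than the Empirical Rule figures for bell-shaped data.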

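[Figure: Chebychev’s Theorem, marking the minimum portions 75%, 88.9%, and 93.8% of the data within 2, 3, and 4 standard deviations of the mean.]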

Chebychev’s Theorem vs. Empirical Rule The Empirical Rule only works on BELL-SHAPED distributions. Chebychev’s Theorem works for ALL distributions.

Assignment: Worksheet on Empirical Rule & Chebychev’s Theorem

Standard Deviation for Grouped Data To use this formula, you need to create a frequency distribution.

Standard Deviation for Grouped Data This formula can also be used if the data is grouped into classes. You would then use the midpoint.
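
The formula these slides refer to is not shown in the transcript. The sketch below assumes the common textbook form for the sample standard deviation of grouped data, $s = \sqrt{\frac{\sum (x - \bar{x})^2 f}{n - 1}}$, with x taken as the class midpoint and $n = \sum f$. That formula, the function name, and the example classes are assumptions, not taken from the slides.

```python
import math

def grouped_sample_std(rows):
    """rows: list of (lower_limit, upper_limit, frequency) tuples.

    Assumed formula: sqrt( sum((midpoint - mean)^2 * f) / (n - 1) ).
    """
    mids = [((lower + upper) / 2, f) for lower, upper, f in rows]
    n = sum(f for _, f in mids)
    mean = sum(x * f for x, f in mids) / n
    squares = sum((x - mean) ** 2 * f for x, f in mids)
    return math.sqrt(squares / (n - 1))

# Midpoints 3 and 8, grouped mean 6, sum of weighted squares 30, n - 1 = 4.
s_grouped = grouped_sample_std([(1, 5, 2), (6, 10, 3)])
```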

Assignment: pg. 81: 29-32

Measures of Position What you should learn: How to find the first, second, and third quartiles of a data set. How to find the interquartile range of a data set. How to represent a data set graphically using a box-and-whisker plot. How to interpret other fractiles such as percentiles.

Quartiles- Vocabulary Numbers that divide an ordered data set into four equal parts.

FYI There are 3 quartiles to a data set. About one quarter of the data falls on or below the first quartile (Q1). About one half of the data falls on or below the second quartile (Q2). About three quarters of the data falls on or below the third quartile (Q3).

Examples of Quartiles Data: 3, 4, 4, 5, 6, 8, 8. The 1st quartile (Q1), AKA the lower quartile, is 4. The 2nd quartile (Q2), AKA the median, is 5. The 3rd quartile (Q3), AKA the upper quartile, is 8.

Guidelines for Determining the Quartiles of Data
- Arrange the data in order.
- Determine the median of the data.
- Split the data into two halves.
- The first quartile is the median of the lower half of data.
- The third quartile is the median of the upper half of data.

Interquartile Range (IQR) - Vocabulary The difference between the upper quartile and the lower quartile. IQR = Q3 – Q1
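
The quartile guidelines and the IQR can be sketched as follows. This assumes that, when the number of entries is odd, the median itself is left out of both halves (one common convention; software packages may use other rules). The function name `quartiles` is hypothetical.

```python
import statistics

def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    ordered = sorted(data)                 # arrange the data in order
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # Assumption: for an odd count, the median belongs to neither half.
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return (statistics.median(lower),
            statistics.median(ordered),
            statistics.median(upper))

q1, q2, q3 = quartiles([3, 4, 4, 5, 6, 8, 8])
iqr = q3 - q1   # IQR = Q3 - Q1
```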

Box and Whisker Plots - Vocabulary An exploratory data analysis tool that highlights the important features of a data set. You need 5 key values: The minimum data entry The first quartile (Q1) The median (Q2) The third quartile (Q3) The maximum data entry

Components of Box and Whisker Plots

Example of Box and Whisker Plots

Percentiles- Vocabulary Numbers that divide an ordered data set into one hundred equal parts. Percentiles are used quite often in education and health-related fields. They are used to indicate how one individual compares with others in a group.

Percentiles It is important that you understand what a percentile means… If the weight of a six-month-old infant is at the 78th percentile, it means that the infant’s weight is greater than that of 78% of all six-month-old infants. It does not mean that the infant weighs 78% of some ideal weight.

Assignment: pg. 91: 13, 14, 21, 22

Review for Test: pg. 97: 1-3, 6-12, 15, 17-18, 22-26, 28, 30, 32-34, 37-41, 44