1 © 2008 Brooks/Cole, a division of Thomson Learning, Inc. Chapter 3 Graphical Methods for Describing Data.

Presentation transcript:

1 © 2008 Brooks/Cole, a division of Thomson Learning, Inc. Chapter 3 Graphical Methods for Describing Data

2 © 2008 Brooks/Cole, a division of Thomson Learning, Inc. 3.1 Displaying Categorical Data
Comparative Bar Charts – Two or more bar charts that use the same horizontal and vertical axes. Relative frequencies are used instead of frequencies so that the comparisons are not affected by differences in sample size. A key tells the reader which bars represent which group.
Pie Charts – The circle represents the whole data set, and the “slices”, which are proportional to the frequencies or relative frequencies, represent the different categories. Works best when there are not too many categories. Slice angle = 360° × relative frequency of category.
Segmented Bar Chart – A rectangular bar represents the whole data set, and the segments of the bar represent the categories. Segment area is proportional to the relative frequency of the category.
In addition to summarizing categorical data, these graphical methods can be used to display a quantity (instead of a frequency) associated with each category.
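The slice rule above (slice angle = 360° × relative frequency) can be sketched in a few lines of Python. This is only an illustration: the function names and the response counts are made up for the example and are not taken from the student data set.

```python
from collections import Counter


def relative_frequencies(values):
    """Map each category to its share of all observations."""
    counts = Counter(values)
    total = sum(counts.values())
    return {category: count / total for category, count in counts.items()}


def pie_slice_angles(rel_freqs):
    """Slice angle in degrees = 360 x relative frequency of the category."""
    return {category: 360.0 * rf for category, rf in rel_freqs.items()}


# Hypothetical responses (illustrative only, not the slide data).
responses = ["contacts"] * 5 + ["glasses"] * 3 + ["none"] * 2

rel = relative_frequencies(responses)
angles = pie_slice_angles(rel)

for category in rel:
    print(f"{category}: rel. freq. {rel[category]:.2f}, angle {angles[category]:.1f} degrees")
```

Because the relative frequencies sum to 1, the slice angles always sum to 360 degrees, whatever the sample size.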

3 © 2008 Brooks/Cole, a division of Thomson Learning, Inc. Frequency Distribution Example The data in the column labeled “vision” for the student data set introduced in the slides for Chapter 1 are the answers to the question, “What is your principal means of correcting your vision?” The results are tabulated below.

4 © 2008 Brooks/Cole, a division of Thomson Learning, Inc. Bar Chart Examples This comparative bar chart is based on frequencies, so it can be difficult to interpret and even misleading. Would you mistakenly interpret it to mean that females and males use contacts equally often? You shouldn’t. The picture is distorted because the numbers of males and females in the sample are not equal.

5 © 2008 Brooks/Cole, a division of Thomson Learning, Inc. Bar Chart Examples When the comparative bar chart is based on percents (or relative frequencies), so that each group adds up to 100%, we can clearly see a difference between the genders in the pattern of eye correction proportions. For this sample of students, the proportion of female students with contacts is larger than the proportion of males with contacts.
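To see why relative frequencies make groups of different sizes comparable, here is a minimal sketch. The counts are hypothetical (chosen so that both groups have the same number of contact wearers) and the function name is an assumption made for this example.

```python
def within_group_relative_frequencies(counts_by_group):
    """Convert raw counts to proportions that sum to 1 within each group."""
    result = {}
    for group, counts in counts_by_group.items():
        total = sum(counts.values())
        result[group] = {category: n / total for category, n in counts.items()}
    return result


# Hypothetical counts: equal numbers of contact wearers, unequal group sizes.
counts = {
    "female": {"contacts": 20, "glasses": 10, "none": 10},  # 40 students
    "male": {"contacts": 20, "glasses": 30, "none": 30},    # 80 students
}

rel_by_group = within_group_relative_frequencies(counts)

for group, proportions in rel_by_group.items():
    print(group, {cat: f"{p:.0%}" for cat, p in proportions.items()})
```

With these made-up numbers the raw frequencies for contacts are identical, yet the proportions are 50% for females and 25% for males, which is the kind of difference a frequency-based chart hides.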

6 © 2008 Brooks/Cole, a division of Thomson Learning, Inc. Bar Chart Examples Stacking the bar chart can also show the difference in the distribution of eye correction method. This graph clearly shows that the females have a higher proportion using contacts, and that both the no-correction and glasses groups have smaller proportions than for the males.

7 © 2008 Brooks/Cole, a division of Thomson Learning, Inc. Pie Chart - Example Using the vision correction data we have:

8 © 2008 Brooks/Cole, a division of Thomson Learning, Inc. Pie Chart - Example Using side-by-side pie charts we can compare the vision correction for males and females.

9 © 2008 Brooks/Cole, a division of Thomson Learning, Inc. Another Example These data are the grades earned by the distance learning students during one term in the winter of 2002.

10 © 2008 Brooks/Cole, a division of Thomson Learning, Inc. Pie Chart – Another Example Using the grade data from the previous slide we have:

11 © 2008 Brooks/Cole, a division of Thomson Learning, Inc. Pie Chart – Another Example Using the grade data we have: By pulling out a slice (exploding it) we can accentuate it and make it clear that A was the predominant grade for this course.

12 © 2008 Brooks/Cole, a division of Thomson Learning, Inc. 3.2 Displaying Numerical Data
Stem-and-Leaf Plots – the most compact way to summarize univariate numerical data. The plot must have a key so the reader can correctly interpret the values in the graph.
Stem – the largest place value the values have in common. The list of stems cannot skip any values.
Leaf – the place value immediately to the right of the stem. Numbers may have to be rounded. There is a leaf for every value in the data, so this type of display is not good for data sets with a large number of values.
Comparative or Back-to-Back Stem-and-Leaf Plots – used to compare two sets of data, with one set of leaves on each side of the shared stems.
This type of graph gives information about:
• a typical value
• the spread of the values
• gaps in the values
• symmetry in the distribution
• number and location of peaks
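The construction rules above (sorted stems, no skipped stems, one leaf per value) can be sketched as follows. This is a simplified illustration that assumes non-negative integers, takes the ones digit as the leaf and the remaining digits as the stem; the function name and the weights are hypothetical.

```python
def stem_and_leaf(values):
    """Return the display lines of a stem-and-leaf plot.

    Assumes non-negative integers: leaf = ones digit, stem = remaining digits.
    """
    leaves_by_stem = {}
    for value in sorted(values):
        leaves_by_stem.setdefault(value // 10, []).append(value % 10)

    lines = []
    # Walk every stem from smallest to largest so that no stem is skipped,
    # even when it has no leaves (this keeps gaps in the data visible).
    for stem in range(min(leaves_by_stem), max(leaves_by_stem) + 1):
        leaves = "".join(str(leaf) for leaf in leaves_by_stem.get(stem, []))
        lines.append(f"{stem:>3} | {leaves}".rstrip())
    return lines


# Hypothetical weights in pounds (illustrative only).
weights = [125, 103, 171, 118, 125, 132, 153, 121]

lines = stem_and_leaf(weights)
print("\n".join(lines))
print("Key: 15 | 3 = 153 lbs")
```

Printing in a fixed-width font keeps the leaves aligned, so the length of each row reflects the number of values on that stem.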

13 © 2008 Brooks/Cole, a division of Thomson Learning, Inc. Stem and Leaf For our first example, we use the weights of the 25 female students. Choosing the first two digits as the stem and the third digit as the leaf, we have the following.

14 © 2008 Brooks/Cole, a division of Thomson Learning, Inc. Stem and Leaf Typically we sort the stems in increasing order. We also note on the diagram the units for stems and leaves: 15|3 = 153 lbs. [Figure annotation: probable outliers]

15 © 2008 Brooks/Cole, a division of Thomson Learning, Inc. Stem-and-leaf – GPA example The following are the GPAs for the 20 advisees of a faculty member. If the ones digit is used as the stem, you get only three groups. You can expand this a little by using each stem twice, letting second digits 0-4 go with the first copy of the stem (L) and second digits 5-9 go with the second copy (H). The next slide gives two versions of the stem-and-leaf diagram.

16 © 2008 Brooks/Cole, a division of Thomson Learning, Inc. Stem-and-leaf – GPA example
Version with stems 1L 1H 2L 2H 3L 3H. Stem: Ones digit. Leaf: Tenths digit.
Version with stems 1L 1H 2L 2H 3L 3H. Stem: Ones digit. Leaf: Tenths and hundredths digits.
1L |
1H | 65, 75
2L | 04, 22, 26, 27
2H | 66, 69, 74, 80
3L | 09, 13, 15, 23, 38
3H | 50, 70, 72, 89, 94
Note: The characters in a stem-and-leaf diagram must all have the same width, so if typing, use a fixed-width font such as Courier.

17 © 2008 Brooks/Cole, a division of Thomson Learning, Inc. Comparative Stem and Leaf Diagram Student Weight (Comparing two groups) When it is desirable to compare two groups, back-to-back stem and leaf diagrams are useful. Here is the result for the student weights (females on one side, males on the other). From this comparative stem and leaf diagram, it is clear that the males weigh more than the females (as a group, not necessarily as individuals).

18 © 2008 Brooks/Cole, a division of Thomson Learning, Inc. Comparative Stem and Leaf Diagram Student Age (female, male) From this comparative stem and leaf diagram, it is clear that the male ages are more closely grouped than the female ages. The females also had a number of outliers.

19 © 2008 Brooks/Cole, a division of Thomson Learning, Inc. 3.3 Frequency Distributions and Histograms
Histogram – a graph of the frequency distribution of discrete numerical data. Each frequency or relative frequency is represented by a rectangle centered over the corresponding value, and the area of the rectangle is proportional to the corresponding frequency (or relative frequency).
Horizontal Scale – possible values
Vertical Scale – frequencies or relative frequencies
What to look for:
• central or typical value
• extent of spread or variation
• general shape
• location and number of peaks
• presence of gaps and outliers

20 © 2008 Brooks/Cole, a division of Thomson Learning, Inc. Frequency Distributions & Histograms When working with discrete data, the frequency tables are similar to those produced for qualitative data. For example, a survey of local law firms in a medium-sized town gave the following.

21 © 2008 Brooks/Cole, a division of Thomson Learning, Inc. Frequency Distributions & Histograms The number of lawyers in the firm has the following histogram. Clearly, the largest group is the single-member law firms, and the frequency decreases as the number of lawyers in the firm increases.

22 © 2008 Brooks/Cole, a division of Thomson Learning, Inc. Histograms for Continuous Data
Histograms for continuous data use class intervals (also called classes), where each bar represents the data from the lower boundary up to, but not including, the upper boundary.
Most histograms use class intervals of equal width, but occasionally it works better to use unequal widths: narrower intervals where the data are concentrated and wider intervals where the data are more dispersed.
Typically you want between 5 and 20 classes. A quick estimate is the square root of the number of observations. A good rule of thumb is to aim for an average of at least 5 observations per class.
Horizontal Scale – boundaries of the class intervals
Vertical Scale – frequencies or relative frequencies (or, in the case of unequal intervals, a density scale)
Density = rectangle height = (relative frequency of class interval) / (class interval width)
(The use of a density scale ensures that the area of each rectangle in the histogram is proportional to the corresponding relative frequency.)

23 © 2008 Brooks/Cole, a division of Thomson Learning, Inc. Frequency Distributions & Histograms 50 students were asked the question, “How many textbooks did you purchase last term?” The result is summarized below and the histogram is on the next slide.

24 © 2008 Brooks/Cole, a division of Thomson Learning, Inc. Frequency Distributions & Histograms “How many textbooks did you purchase last term?” The largest group of students bought 5 or 6 textbooks with 3 or 4 being the next largest frequency.

25 © 2008 Brooks/Cole, a division of Thomson Learning, Inc. Frequency Distributions & Histograms Another version with the scales produced differently.

26 © 2008 Brooks/Cole, a division of Thomson Learning, Inc. Example of Frequency Distribution Consider the student weights in the student data set. The data values fall between 103 (lowest) and 239 (highest), so the range of the data set is 239 − 103 = 136. There are 79 data values, so to have an average of at least 5 per group we need at most 79/5 ≈ 15.8 groups, that is, 15 or fewer. We need to choose a width that breaks the data into no more than that many groups. Since 136/15 ≈ 9.1, any width of 10 or larger would be reasonable.
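The arithmetic behind this choice can be checked with a short sketch. The sample size and the extreme values come from the slide; the function names are assumptions made for this example, and the square-root rule is the quick estimate mentioned earlier for histograms of continuous data.

```python
import math


def max_classes(n, min_average_per_class=5):
    """Largest number of classes that still averages at least 5 values each."""
    return n // min_average_per_class


def sqrt_rule(n):
    """Quick estimate of the number of classes: square root of n, rounded."""
    return round(math.sqrt(n))


n = 79                # number of student weights (from the slide)
low, high = 103, 239  # smallest and largest weight (from the slide)

data_range = high - low           # range of the data
k_max = max_classes(n)            # 16 classes would average just under 5 each
min_width = data_range / k_max    # narrowest equal width that keeps k <= k_max

print(f"range = {data_range}, at most {k_max} classes, width of at least {min_width:.2f}")
print(f"square-root rule suggests about {sqrt_rule(n)} classes")
```

Rounding the minimum width up to a convenient value such as 10, 15, or 20 gives class boundaries that are easy to read on the horizontal scale.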

27 © 2008 Brooks/Cole, a division of Thomson Learning, Inc. Example of Frequency Distribution Choosing a width of 15 we have the following frequency distribution.

28 © 2008 Brooks/Cole, a division of Thomson Learning, Inc. Histogram for Continuous Data The following histogram is for the frequency table of the weight data.

29 © 2008 Brooks/Cole, a division of Thomson Learning, Inc. Histogram for Continuous Data The following histogram is the Minitab output of the relative frequency histogram. Notice that the relative frequency scale is in percent.

30 © 2008 Brooks/Cole, a division of Thomson Learning, Inc. Cumulative Relative Frequency Cumulative relative frequency – used when we want to know what proportion of data falls below a certain value instead of in a particular class. It is the sum of the relative frequencies for the classes up to the value. Cumulative relative frequency plot – graph of the ordered pairs consisting of (upper endpoint of interval, cumulative relative frequency)
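As a minimal sketch of this definition, the ordered pairs for a cumulative relative frequency plot can be computed from a frequency table. The class endpoints and frequencies below are hypothetical, and the function name is an assumption made for this example.

```python
from itertools import accumulate


def cumulative_relative_frequencies(upper_endpoints, frequencies):
    """Return (upper endpoint, cumulative relative frequency) pairs."""
    total = sum(frequencies)
    running_counts = accumulate(frequencies)  # running sum of class counts
    return [(upper, count / total)
            for upper, count in zip(upper_endpoints, running_counts)]


# Hypothetical frequency table (illustrative only).
upper_endpoints = [115, 130, 145, 160]
frequencies = [2, 5, 8, 5]

points = cumulative_relative_frequencies(upper_endpoints, frequencies)

for upper, cum_rel_freq in points:
    print(f"below {upper}: {cum_rel_freq:.2f}")
```

The last pair always has a cumulative relative frequency of 1, because every observation falls below the upper endpoint of the final class.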

31 © 2008 Brooks/Cole, a division of Thomson Learning, Inc. Cumulative Relative Frequency Table If we keep track of the proportion of that data that falls below the upper boundaries of the classes, we have a cumulative relative frequency table.

32 © 2008 Brooks/Cole, a division of Thomson Learning, Inc. Cumulative Relative Frequency Plot If we graph the cumulative relative frequencies against the upper endpoint of the corresponding interval, we have a cumulative relative frequency plot.

33 © 2008 Brooks/Cole, a division of Thomson Learning, Inc. Histogram for Continuous Data Another version of a frequency table and histogram for the weight data with a class width of 20.

34 © 2008 Brooks/Cole, a division of Thomson Learning, Inc. Histogram for Continuous Data The resulting histogram.

35 © 2008 Brooks/Cole, a division of Thomson Learning, Inc. Histogram for Continuous Data The resulting cumulative relative frequency plot.

36 © 2008 Brooks/Cole, a division of Thomson Learning, Inc. Histogram for Continuous Data Yet another version of a frequency table and histogram for the weight data with a class width of 20.

37 © 2008 Brooks/Cole, a division of Thomson Learning, Inc. Histogram for Continuous Data The corresponding histogram.

38 © 2008 Brooks/Cole, a division of Thomson Learning, Inc. Histogram for Continuous Data A class width of 15 or 20 seems to work well because all of the pictures tell the same story. The bulk of the weights appear to be centered around 150 lbs, with a few values substantially larger. The distribution of the weights is unimodal and positively skewed.

39 © 2008 Brooks/Cole, a division of Thomson Learning, Inc. Histogram Shapes
Unimodal – one peak
Bimodal – two peaks
Multimodal – more than two peaks
Skewed – a unimodal histogram that is not symmetric
Positively skewed – the upper tail is longer than the lower tail
Negatively skewed – the lower tail is longer than the upper tail. (The tale is in the tail.)
Normal – symmetric, and the curve decreases from the top of the peak at a well-defined rate when moving toward either tail
Heavy-tailed – curves do not decrease as rapidly as the normal (longer tails)
Light-tailed – curves decrease more rapidly than the normal (shorter tails)
Sampling Variability – the extent to which samples differ from one another and from the population; it is a central theme in statistics.

40 © 2008 Brooks/Cole, a division of Thomson Learning, Inc. Illustrated Distribution Shapes [Figure panels: Unimodal, Bimodal, Multimodal, Skewed negatively, Symmetric, Skewed positively]

41 © 2008 Brooks/Cole, a division of Thomson Learning, Inc. Histograms with uneven class widths Consider the following frequency histogram of ages based on A with class widths of 2. Notice it is a bit choppy. Because of the positively skewed data, sometimes frequency distributions are created with unequal class widths.

42 © 2008 Brooks/Cole, a division of Thomson Learning, Inc. Histograms with uneven class widths For many reasons, either for convenience or because that is the way data was obtained, the data may be broken up in groups of uneven width as in the following example referring to the student ages.

43 © 2008 Brooks/Cole, a division of Thomson Learning, Inc. Histograms with uneven class widths If a frequency (or relative frequency) histogram is drawn with the heights of the bars equal to the frequencies (or relative frequencies), the result is distorted. Notice that it appears that there are a lot of people over 28 when there are only a few.

44 © 2008 Brooks/Cole, a division of Thomson Learning, Inc. Histograms with uneven class widths To correct the distortion, we create a density histogram. The vertical scale is called the density, and the density of a class is calculated by density = (relative frequency of class) / (class width). This choice for the density makes the area of each rectangle equal to the relative frequency of its class.
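A small sketch of the density calculation, assuming density is the relative frequency of a class divided by its width (the definition given earlier for histograms of continuous data). The age boundaries and frequencies are hypothetical and the function name is made up for the example.

```python
def class_densities(boundaries, frequencies):
    """Density of each class = relative frequency / class width."""
    total = sum(frequencies)
    densities = []
    for i, frequency in enumerate(frequencies):
        width = boundaries[i + 1] - boundaries[i]
        densities.append((frequency / total) / width)
    return densities


# Hypothetical age classes with unequal widths (illustrative only):
# 18 to <20, 20 to <22, 22 to <24, 24 to <28, 28 to <40
boundaries = [18, 20, 22, 24, 28, 40]
frequencies = [30, 25, 15, 6, 4]

densities = class_densities(boundaries, frequencies)

# Area of each rectangle = density x width, which recovers the relative frequency.
areas = [d * (boundaries[i + 1] - boundaries[i]) for i, d in enumerate(densities)]

for i, density in enumerate(densities):
    print(f"{boundaries[i]} to <{boundaries[i + 1]}: density {density:.4f}, area {areas[i]:.4f}")
```

In this made-up table the widest class (28 to <40) gets a very short bar, so the few older students no longer look like a large group, and the rectangle areas sum to 1.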

45 © 2008 Brooks/Cole, a division of Thomson Learning, Inc. Histograms with uneven class widths Continuing this example we have

46 © 2008 Brooks/Cole, a division of Thomson Learning, Inc. Histograms with uneven class widths The resulting histogram is now a reasonable representation of the data.

47 © 2008 Brooks/Cole, a division of Thomson Learning, Inc. 3.4 Displaying Bivariate Data
Scatterplot – used to represent bivariate data, where each observation is represented by an ordered pair (a point in the coordinate plane).
Time-series plot – a graph of data collected over time that can be invaluable in identifying trends or patterns. The x-axis is a scale for the time of the observations and the y-axis is the scale for the observed values.

48 © 2008 Brooks/Cole, a division of Thomson Learning, Inc. 3.5 Interpreting and Communicating the Results of Statistical Analyses
Effective communication with graphical displays:
• Be sure to select a display that is appropriate for the given type of data.
• Be sure to include scales and labels on the axes of the graphical display.
• In comparative plots, be sure to include labels or legends so that it is clear which part of the display corresponds to which group.
• The vertical axis in a bar chart, histogram, or scatterplot should always start at 0.
• Keep your graph simple; the message should be clear and straightforward.
• Keep your graphical display honest.
Include a brief discussion of the features of the data distribution:
• With categorical data, write a few sentences on the relative proportion for each category, pointing out categories that were common or rare.
• With numerical data, summarize the information that the display provides on three characteristics of the data: center, shape, and spread.
• With bivariate data, focus on the nature of the relationship between the two variables.
• With data collected over time, describe any trends or patterns in the time-series plot.

49 © 2008 Brooks/Cole, a division of Thomson Learning, Inc. 3.5 (cont.)
What to look for in published data:
• Is the chosen display appropriate for the type of data?
• With numerical data, what is the shape and what does it say about the variable?
• Are there any potential outliers? Is there a plausible explanation for them?
• Where do most of the data values fall? What is a typical value for the data set? What does this say about the variable?
• Is there much variability in the data values? What does this say about the variable?