3A-1. Describing Data Visually (Part 1): Visual Description, Dot Plots, Frequency Distributions.

Presentation transcript:

3A-1

Describing Data Visually (Part 1)
Chapter 3A
- Visual Description
- Dot Plots
- Frequency Distributions and Histograms
- Line Charts
- Bar Charts
McGraw-Hill/Irwin © 2008 The McGraw-Hill Companies, Inc. All rights reserved.

3A-3 Visual Description
Methods of organizing, exploring and summarizing data include:
- Visual (charts and graphs) provides insight into characteristics of a data set without using mathematics.
- Numerical (statistics or tables) provides insight into characteristics of a data set using mathematics.

3A-4 Visual Description
Begin with univariate data (a set of n observations on one variable) and consider the following characteristics and their interpretation:
- Measurement: What are the units of measurement? Are the data integer or continuous? Any missing observations? Any concerns with accuracy or sampling methods?
- Central Tendency: Where are the data values concentrated? What seem to be typical or middle data values?

3A-5 Visual Description
Characteristics and their interpretation (continued):
- Dispersion: How much variation is there in the data? How spread out are the data values? Are there unusual values?
- Shape: Are the data values distributed symmetrically? Skewed? Sharply peaked? Flat? Bimodal?

3A-6 Visual Description - Example: Price/Earnings Ratios
A P/E ratio is the current stock price divided by earnings per share in the last 12 months.
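The definition can be written as a one-line calculation. The sketch below is only an illustration: the function name `pe_ratio` and the sample price and earnings are hypothetical, and rejecting non-positive earnings is an assumption about how to handle cases where the ratio is not meaningful.

```python
def pe_ratio(price, eps_last_12_months):
    """Current stock price divided by earnings per share over the last 12 months."""
    if eps_last_12_months <= 0:
        # Assumption: treat zero or negative earnings as "no meaningful P/E".
        raise ValueError("P/E ratio is not meaningful for non-positive earnings")
    return price / eps_last_12_months


# Hypothetical stock: price 50.00, trailing earnings per share 2.50
example_pe = pe_ratio(50.0, 2.5)
print(example_pe)  # 20.0
```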

3A-7 Visual Description
Measurement
- Look at the data and visualize how it was collected and measured.
Sorting
- Sort the data and then summarize in a graphical display.
- A histogram graphically displays sorted data.
[Table: sorted P/E ratios]

3A-8 Visual Description - Sorting
Sorting allows you to observe central tendency, dispersion and shape as well as minimum, maximum and range.
What else do you observe?
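As a rough illustration of what sorting makes visible, the sketch below sorts a small sample and reads off the minimum, maximum, range and median. The ten values are hypothetical and are not the P/E data from the slides.

```python
import statistics

# Hypothetical sample (not the slide data)
pe = [22, 10, 35, 14, 18, 27, 16, 59, 8, 19]

pe_sorted = sorted(pe)                  # ascending order
minimum = pe_sorted[0]                  # smallest value is first after sorting
maximum = pe_sorted[-1]                 # largest value is last
data_range = maximum - minimum          # a simple view of dispersion
median = statistics.median(pe_sorted)   # a middle value (central tendency)

print(pe_sorted)
print(minimum, maximum, data_range, median)
```

In this sample most values sit between 8 and 35, with a single much larger value (59) standing apart, which is the kind of unusual value that sorting makes easy to spot.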

3A-9 Dot Plots - Steps in Making a Dot Plot
A dot plot is the simplest graphical display of n individual values of numerical data.
- Easy to understand
- Not good for large samples (e.g., > 5,000).
Steps:
1. Make a scale that covers the data range
2. Mark the axes and label them
3. Plot each data value as a dot above the scale at its approximate location
If more than one data value lies at about the same axis location, the dots are piled up vertically.
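The three steps can be sketched in plain text output, without a charting package. This is a minimal sketch that assumes small integer data; the function name `text_dot_plot` and the sample values are hypothetical.

```python
from collections import Counter


def text_dot_plot(values):
    """Return the rows of a text dot plot for integer data (top row first)."""
    lo, hi = min(values), max(values)
    positions = range(lo, hi + 1)      # step 1: a scale that covers the data range
    counts = Counter(values)           # number of dots piled up at each location
    rows = []
    # Step 3: draw from the tallest pile down so that dots stack vertically.
    for level in range(max(counts.values()), 0, -1):
        row = "".join("*" if counts[p] >= level else " " for p in positions)
        rows.append(row.rstrip())
    rows.append("-" * len(positions))  # step 2: the axis line
    gap = len(positions) - len(str(lo)) - len(str(hi))
    rows.append(f"{lo}{' ' * gap}{hi}")  # axis labels at both ends
    return rows


sample = [1, 2, 2, 3, 3, 3, 5]         # hypothetical data
plot_rows = text_dot_plot(sample)
print("\n".join(plot_rows))
```

One column is used per integer value, so this only suits data with a small range; real measurements would need rounding to the nearest scale position first.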

3A-10 Creating a dot plot in MegaStat

3A-11 Dot Plots
- Range of data shows dispersion.
- Can add annotations (text boxes) to call attention to specific features.
- Clustering shows central tendency.
- Dot plots do not tell much about the shape of the distribution.

3A-12 Dot Plots - Small Sample: Home Prices
Consider the following median home prices for nine U.S. cities.
Table columns: Metropolitan Area, Median Home Price (000)
Metropolitan areas: Akron OH, Bergen-Passaic NJ, Bradenton FL, Colorado Springs CO, Hartford CT, Milwaukee WI, Raleigh-Durham NC, San Francisco CA, Topeka KS

3A-13 Dot Plots - Small Sample: Home Prices
A dot plot is useful to realtors as they discuss patterns in home selling prices within their community.

3A-14 Dot Plots - Comparing Groups
A stacked dot plot compares two or more groups using a common X-axis scale.

3A-15 Frequency Distributions and Histograms - Bins and Bin Limits
- A frequency distribution is a table formed by classifying n data values into k classes (bins).
- Bin limits define the values to be included in each bin. Widths must all be the same.
- Frequencies are the number of observations within each bin.
- Frequencies can also be expressed as relative frequencies (frequency divided by the total) or percentages (relative frequency times 100).

3A-16 Frequency Distributions and Histograms - Constructing a Frequency Distribution
1. Sort data in ascending order (e.g., P/E ratios)
2. Choose the number of bins (k)
- k should be much smaller than n.
- Too many bins results in sparsely populated bins; too few, and dissimilar data values are lumped together.

3A-17 Frequency Distributions and Histograms - Constructing a Frequency Distribution
- Herbert Sturges proposed a rule that ties the number of bins (k) to the sample size (n): k = 1 + log2(n), which is roughly 1 + 3.3 log10(n).
[Table: Sample Size (n), Number of Bins (k)]
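Sturges' rule is commonly written as k = 1 + log2(n). The sketch below tabulates it for a few sample sizes. Rounding up to a whole number of bins is an assumption made here (some texts round to the nearest integer), and the sample sizes are chosen only for illustration.

```python
import math


def sturges_bins(n):
    """Suggested number of bins for n observations: k = 1 + log2(n), rounded up."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return math.ceil(1 + math.log2(n))


# Illustrative sample sizes (powers of two give whole numbers with no rounding)
for n in (16, 32, 64, 128, 256, 512, 1024):
    print(n, sturges_bins(n))
```

The rule is only a starting point; the slides note elsewhere that the choice of bins affects how the histogram looks, so it is worth trying nearby values of k.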

3A-18 Frequency Distributions and Histograms - Constructing a Frequency Distribution
3. Set the bin limits:
Bin width ≈ (x_max − x_min) / k
For example, for k = 7 bins, the approximate bin width is the range of the P/E data divided by 7. To obtain “nice” limits, we round the width to 10 and start the first bin at 0 to get bin limits: 0, 10, 20, 30, 40, 50, 60, 70
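A minimal sketch of this step: divide the data range by k to get a raw bin width, then pick a rounder width and starting point by hand. The minimum and maximum used here (8 and 59) are hypothetical stand-ins, not the slide's P/E data; the "nice" width of 10 and start of 0 follow the slide.

```python
def bin_limits(start, width, k):
    """Return the k + 1 limits of k equal-width bins beginning at `start`."""
    return [start + i * width for i in range(k + 1)]


x_min, x_max, k = 8, 59, 7              # hypothetical data range, 7 bins
raw_width = (x_max - x_min) / k         # about 7.29, awkward to read on an axis

# Round the width up to a "nice" value and start at a "nice" origin.
limits = bin_limits(start=0, width=10, k=k)

# The chosen limits must still cover every data value.
covers_data = limits[0] <= x_min and x_max < limits[-1]
print(round(raw_width, 2), limits, covers_data)
```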

3A-19 Frequency Distributions and Histograms - Constructing a Frequency Distribution
4. Put the data values in the appropriate bin. In general, the lower limit is included in the bin while the upper limit is excluded.
5. Create the table. You can include:
- Frequencies: counts for each bin.
- Relative frequencies: absolute frequency divided by total number of data values.
- Cumulative frequencies: accumulated relative frequency values as bin limits increase.
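Steps 4 and 5 can be sketched as a small function that counts the values in each bin (lower limit included, upper limit excluded) and derives relative and cumulative relative frequencies. The function name and the ten data values are hypothetical.

```python
def frequency_distribution(values, limits):
    """Return rows of (lower, upper, frequency, relative, cumulative relative)."""
    n = len(values)
    rows = []
    running_count = 0
    for lower, upper in zip(limits[:-1], limits[1:]):
        # Lower limit included, upper limit excluded.
        freq = sum(1 for v in values if lower <= v < upper)
        running_count += freq
        rows.append((lower, upper, freq, freq / n, running_count / n))
    return rows


data = [8, 10, 14, 16, 18, 19, 22, 27, 35, 59]     # hypothetical sample
table = frequency_distribution(data, [0, 10, 20, 30, 40, 50, 60, 70])

for lower, upper, freq, rel, cum in table:
    print(f"{lower:>2} <= x < {upper:<2}  {freq:>2}  {rel:.2f}  {cum:.2f}")
```

Accumulating the counts and dividing once per row, rather than summing the relative frequencies, keeps the last cumulative value at exactly 1 when every observation falls inside the limits.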

3A-20 Frequency Distributions and Histograms
What are the bin limits for the P/E ratio data?
Table columns: Bin Range, Frequency, Relative Frequency, Cumulative Relative Frequency
Bin ranges (one row each):
- 0 ≤ P/E Ratio < 10
- 10 ≤ P/E Ratio < 20
- 20 ≤ P/E Ratio < 30
- 30 ≤ P/E Ratio < 40
- 40 ≤ P/E Ratio < 50
- 50 ≤ P/E Ratio < 60
- 60 ≤ P/E Ratio < 70

3A-21 Frequency Distributions and Histograms - Histograms
- A histogram is a graphical representation of a frequency distribution.
- A histogram is a bar chart.
- Y-axis shows frequency within each bin.
- X-axis ticks show end points of each bin.

3A-22 Frequency Distributions and Histograms - Histograms
Consider 3 histograms for the P/E ratio data with different bin widths. What do they tell you?
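To see how bin width changes the picture, the sketch below counts the same sample with three different widths. The data values, the helper name `bin_counts` and the widths (20, 10 and 5) are all hypothetical choices for illustration, not the settings used on the slide.

```python
def bin_counts(values, start, width, k):
    """Count values in k equal-width bins; each bin includes its lower limit."""
    counts = [0] * k
    for v in values:
        index = int((v - start) // width)
        if 0 <= index < k:              # ignore values outside the bins
            counts[index] += 1
    return counts


data = [8, 10, 14, 16, 18, 19, 22, 27, 35, 59]     # hypothetical sample

wide = bin_counts(data, start=0, width=20, k=3)     # bins 0-20, 20-40, 40-60
medium = bin_counts(data, start=0, width=10, k=6)   # bins 0-10, ..., 50-60
narrow = bin_counts(data, start=0, width=5, k=12)   # bins 0-5, ..., 55-60

for label, counts in (("width 20", wide), ("width 10", medium), ("width 5", narrow)):
    print(label, counts)
```

With wide bins the sample looks like a smooth decline; with narrow bins, gaps and isolated bars appear. The data are identical in all three cases.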

3A-23 Obtaining a histogram in Excel

3A-24 Frequency Distributions and Histograms - Excel Histograms
Specify a range of cells containing the bin limits or accept Excel’s default.

3A-25 Frequency Distributions and Histograms - Excel Histograms
Once created, you can modify the resulting histogram to make it more attractive.

3A-26 In MegaStat, you can specify the interval width and lower limit of the first interval or accept the default

3A-27 Frequency Distributions and Histograms - MegaStat Histograms
MegaStat shows percents on the Y-axis instead of frequencies.

3A-28 Frequency Distributions and Histograms - MegaStat Histograms
MegaStat also provides a frequency distribution including cumulative frequencies.

3A-29 In MINITAB, choose Graphs > Histograms and accept all defaults.

3A-30 Frequency Distributions and Histograms - MINITAB Histograms
Right-click the X-axis to adjust the bins, axis tick marks, etc.

3A-31 Frequency Distributions and Histograms - Modal Class
- A modal class is a histogram bar that is higher than those on either side.
- Monomodal: a single modal class.
- Bimodal: two modal classes.
- Multimodal: more than two modal classes.
- Modal classes may be artifacts of the way bin limits are chosen.
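The definition of a modal class translates directly into a comparison of each bar with its neighbors. In this sketch the function names and the frequency lists are hypothetical, and treating an end bar as modal when it is higher than its single neighbor is an assumption.

```python
def modal_classes(freqs):
    """Return indices of bins whose bar is higher than the bars on either side."""
    peaks = []
    for i, f in enumerate(freqs):
        # Assumption: a missing neighbor (at either end) counts as lower.
        left = freqs[i - 1] if i > 0 else -1
        right = freqs[i + 1] if i < len(freqs) - 1 else -1
        if f > left and f > right:
            peaks.append(i)
    return peaks


def describe(freqs):
    """Label a frequency list by its number of modal classes."""
    count = len(modal_classes(freqs))
    if count == 1:
        return "monomodal"
    if count == 2:
        return "bimodal"
    return "multimodal" if count > 2 else "no modal class"


print(modal_classes([2, 6, 9, 4, 1]), describe([2, 6, 9, 4, 1]))
print(modal_classes([1, 5, 2, 1, 0, 1, 0]), describe([1, 5, 2, 1, 0, 1, 0]))
```

The second list shows how a lone bar of height 1 next to empty bins registers as a second modal class, which is the kind of artifact the slide warns about.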

3A-32 Frequency Distributions and Histograms - Shape
- A histogram suggests the shape of the population. The suggested shape is influenced by the number of bins and the bin limits.
- Skewness: indicated by the direction of the longer tail of the histogram.
- Left-skewed (negatively skewed): a longer left tail.
- Right-skewed (positively skewed): a longer right tail.
- Symmetric: both tail areas approximately the same.
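The slide judges skewness visually from the histogram's tails. As a numeric cross-check (a swapped-in technique, not the slide's method), the sketch below computes the moment-based skewness coefficient, whose sign matches the direction of the longer tail. The data values are hypothetical.

```python
def skewness(values):
    """Moment-based skewness: third central moment over the 1.5 power of the second."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


right_tail = [8, 10, 14, 16, 18, 19, 22, 27, 35, 59]   # hypothetical, long right tail
left_tail = [-v for v in right_tail]                    # mirror image, long left tail
symmetric = [1, 2, 3, 4, 5]

print(skewness(right_tail) > 0)    # positive: right-skewed
print(skewness(left_tail) < 0)     # negative: left-skewed
print(skewness(symmetric))         # zero for perfectly symmetric data
```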

3A-33

3A-34 Line Charts - Simple Line Charts
- Used to display a time series or spot trends, or to compare time periods.
- Can display several variables at once.

3A-35 Line Charts - Simple Line Charts
Two-scale line chart: used to compare variables that differ in magnitude or are measured in different units.

3A-36 Line Charts - Grid Lines
A line graph usually has no vertical grid lines. Horizontal lines can be added to make it easier to establish the y value. Which is easier to read?

3A-37 Line Charts - Log Scales
- Arithmetic scale: distances on the Y-axis are proportional to the magnitude of the variable being displayed.
- Logarithmic scale (ratio scale): equal distances represent equal ratios.
- Use a log scale for the vertical axis when data vary over a wide range, say, by more than an order of magnitude.
- This will reveal more detail for small data values.

3A-38 Line Charts - Log Scales
- Log scale is only suited for positive data values.
- Reveals whether the quantity is growing at an increasing percent (concave upward), constant percent (straight line), or declining percent (concave downward).
[Figures: Arithmetic scale, Log scale]
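The "constant percent gives a straight line" claim can be checked numerically: for a series growing at a fixed rate, the steps between successive values get larger on an arithmetic scale but are equal after taking logarithms. The starting value of 100 and the 10% growth rate below are hypothetical.

```python
import math

# Hypothetical series growing 10% per period from a starting value of 100
series = [100 * 1.10 ** t for t in range(6)]

# Arithmetic scale: the vertical step between periods keeps growing (curve bends upward).
arithmetic_steps = [b - a for a, b in zip(series, series[1:])]

# Log scale: the vertical step between periods is constant (a straight line).
logs = [math.log10(y) for y in series]
log_steps = [b - a for a, b in zip(logs, logs[1:])]

print([round(s, 2) for s in arithmetic_steps])
print([round(s, 4) for s in log_steps])
```

Each log step equals log10(1.10), so on a log-scale chart the points fall on a straight line whose slope reflects the growth rate.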

3A-39 Line Charts - Example: U.S. Trade
What does the log scale graph tell you about growth rate for both series?
[Figures: Arithmetic scale, Log scale]

3A-40 Line Charts - When to Use Log Scales
Useful for:
- time series data that might be expected to grow at a compound annual percentage rate (e.g., GDP, national debt, future income)
- financial charts that cover long periods of time
- data that grow rapidly (e.g., revenues)

3A-41 Line Charts - Tips for Effective Line Charts
1. Line charts are used for time series data (never for cross-sectional data).
2. Y-axis shows numerical variable while X-axis shows time units with time increasing left to right.
3. Use a zero origin on the Y-axis unless more detail is needed.

3A-42 Line Charts - Tips for Effective Line Charts
4. Omit numerical labels on a line chart to avoid clutter. Use gridlines if needed.
5. Use data markers (squares, triangles, circles) if they don’t clutter the graph.
6. Don’t make lines too thick.

3A-43 Bar Charts - Plain Bar Charts
Most common way to display attribute data.
- Bars represent categories or attributes.
- Lengths of bars represent frequencies.
[Figures: Vertical Bar Chart, Horizontal Bar Chart]

3A-44 Bar Charts - 3-D and Novelty Bar Charts
[Figures: 3-D Bar Chart, Pyramid Chart]

3A-45 Bar Charts - Pareto Charts
- Special type of bar chart used in quality management to display the frequency of defects or errors of different types.
- Categories are displayed in descending order of frequency.
- Focus on significant few (i.e., few categories that account for most defects or errors).
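The ordering behind a Pareto chart can be sketched as a sort by frequency plus a running percentage, which makes the "significant few" easy to read off. The defect categories and counts below are hypothetical, as is the function name `pareto_table`.

```python
def pareto_table(counts):
    """Return (category, count, cumulative percent) rows in descending order of count."""
    total = sum(counts.values())
    ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    rows = []
    running = 0
    for category, count in ordered:
        running += count
        rows.append((category, count, 100 * running / total))
    return rows


# Hypothetical defect counts by type
defects = {"scratch": 12, "dent": 40, "misalignment": 8, "paint": 30, "other": 10}
rows = pareto_table(defects)

for category, count, cum_pct in rows:
    print(f"{category:<14}{count:>4}{cum_pct:>8.1f}%")
```

In this made-up sample the first two categories account for 70% of all defects, so they would be the natural place to focus improvement effort.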

3A-46 Bar Charts - Stacked Bar Chart
Bar height is the sum of several subtotals. Areas may be compared by color to show patterns in the subgroups and total.

3A-47 Bar Charts - Bar Charts for Time Series Data
Bar charts can be used for time series data although it may be harder to compare trends.
[Figures: Line Chart, Bar Chart]

3A-48 Bar Charts - Tips for Effective Bar Charts
1. Show the numerical variable of interest with vertical bars on the Y-axis, category labels on the X-axis.
2. For time series quantities, display the category labels on the horizontal X-axis with time increasing from left to right.
3. The height or length of each bar should be proportional to the quantity displayed.
4. Put numerical values at the top of each bar, except if too cluttered.