Exploring, Displaying, and Examining Data

Slides:



Advertisements
Similar presentations
Chapter 2 Exploring Data with Graphs and Numerical Summaries
Advertisements

Copyright © 2013, 2009, and 2007, Pearson Education, Inc. Chapter 2 Exploring Data with Graphs and Numerical Summaries Section 2.2 Graphical Summaries.
Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Lecture Slides Elementary Statistics Eleventh Edition and the Triola.
Chapter 2 Presenting Data in Tables and Charts
Chapter Sixteen EXPLORING, DISPLAYING, AND EXAMINING DATA
ISE 261 PROBABILISTIC SYSTEMS. Chapter One Descriptive Statistics.
Exploring, Displaying, and Examining Data
B a c kn e x t h o m e Classification of Variables Discrete Numerical Variable A variable that produces a response that comes from a counting process.
CHAPTER 1: Picturing Distributions with Graphs
5 Number Summary Box Plots. The five-number summary is the collection of The smallest value The first quartile (Q 1 or P 25 ) The median (M or Q 2 or.
Descriptive Statistics  Summarizing, Simplifying  Useful for comprehending data, and thus making meaningful interpretations, particularly in medium to.
© 2005 The McGraw-Hill Companies, Inc., All Rights Reserved. Chapter 12 Describing Data.
Chapter 16 Exploring, Displaying, and Examining Data McGraw-Hill/Irwin Copyright © 2011 by The McGraw-Hill Companies, Inc. All Rights Reserved.
Descriptive Statistics  Summarizing, Simplifying  Useful for comprehending data, and thus making meaningful interpretations, particularly in medium to.
Let’s Review for… AP Statistics!!! Chapter 1 Review Frank Cerros Xinlei Du Claire Dubois Ryan Hoshi.
Census A survey to collect data on the entire population.   Data The facts and figures collected, analyzed, and summarized for presentation and.
©The McGraw-Hill Companies, Inc. 2008McGraw-Hill/Irwin Describing Data: Displaying and Exploring Data Unit 1: One Variable Statistics CCSS: N-Q (1-3);
ITEC6310 Research Methods in Information Technology Instructor: Prof. Z. Yang Course Website: c6310.htm Office:
Sta220 - Statistics Mr. Smith Room 310 Class #3. Section
2011 Summer ERIE/REU Program Descriptive Statistics Igor Jankovic Department of Civil, Structural, and Environmental Engineering University at Buffalo,
Chapter 2 Describing Data.
1 STAT 500 – Statistics for Managers STAT 500 Statistics for Managers.
Data Once the data starts to flow, our attention turns to data analysis –Data preparation – includes editing, coding and data entry –Exploring, displaying.
Business Statistics, A First Course (4e) © 2006 Prentice-Hall, Inc. Chap 2-1 Chapter 2 Presenting Data in Tables and Charts Statistics For Managers 4 th.
McGraw-Hill/Irwin Business Research Methods, 10eCopyright © 2008 by The McGraw-Hill Companies, Inc. All Rights Reserved. Chapter 16 Exploring, Displaying,
Engineering Statistics KANCHALA SUDTACHAT. Statistics  Deals with  Collection  Presentation  Analysis and use of data to make decision  Solve problems.
Chapter 3: Organizing Data. Raw data is useless to us unless we can meaningfully organize and summarize it (descriptive statistics). Organization techniques.
Exploring, Displaying, and Examining Data
Class Two Before Class Two Chapter 8: 34, 36, 38, 44, 46 Chapter 9: 28, 48 Chapter 10: 32, 36 Read Chapters 1 & 2 For Class Three: Chapter 1: 24, 30, 32,
Chapter 5: Organizing and Displaying Data. Learning Objectives Demonstrate techniques for showing data in graphical presentation formats Choose the best.
Graphs with SPSS Aravinda Guntupalli. Bar charts  Bar Charts are used for graphical representation of Nominal and Ordinal data  Height of the bar is.
(Unit 6) Formulas and Definitions:. Association. A connection between data values.
1 By maintaining a good heart at every moment, every day is a good day. If we always have good thoughts, then any time, any thing or any location is auspicious.
McGraw-Hill/Irwin © 2003 The McGraw-Hill Companies, Inc.,All Rights Reserved. Part Four ANALYSIS AND PRESENTATION OF DATA.
COMPLETE BUSINESS STATISTICS
Exploring, Displaying, and Examining Data
UNIT ONE REVIEW Exploring Data.
Exploratory Data Analysis
Part Four ANALYSIS AND PRESENTATION OF DATA
Chapter 1.1 Displaying Distributions with graphs.
Exploring Data: Summary Statistics and Visualizations
ISE 261 PROBABILISTIC SYSTEMS
Chapter 3 Describing Data Using Numerical Measures
BUSINESS MATHEMATICS & STATISTICS.
Chapter 2: Methods for Describing Data Sets
Lesson 8 Introduction to Statistics
Unit 6 Day 2 Vocabulary and Graphs Review
CHAPTER 3 Data Description 9/17/2018 Kasturiarachi.
Statistical Reasoning
Description of Data (Summary and Variability measures)
Laugh, and the world laughs with you. Weep and you weep alone
Graphical Descriptive Techniques
Chapter 3 Describing Data Using Numerical Measures
Chapter 2 Presenting Data in Tables and Charts
CHAPTER 1: Picturing Distributions with Graphs
Topic 5: Exploring Quantitative data
Numerical Measures: Skewness and Location
Warmup What five numbers need to be mentioned in the complete sentence you write when the data distribution is skewed?
Exploring, Displaying, and Examining Data
Displaying Distributions with Graphs
Displaying and Summarizing Quantitative Data
Good Morning AP Stat! Day #2
Basic Practice of Statistics - 3rd Edition
Basic Practice of Statistics - 3rd Edition
Day 52 – Box-and-Whisker.
CHAPTER 1: Picturing Distributions with Graphs
Honors Statistics Review Chapters 4 - 5
Organizing, Displaying and Interpreting Data
Biostatistics Lecture (2).
Presentation transcript:

Exploring, Displaying, and Examining Data Chapter 16 Exploring, Displaying, and Examining Data This chapter presents the use of charts to present data and the initial exploration of data using tools like cross-tabulation.

Learning Objectives Understand . . . That exploratory data analysis techniques provide insights and data diagnostics by emphasizing visual representations of the data. How cross-tabulation is used to examine relationships involving categorical variables, serves as a framework for later statistical testing, and makes an efficient tool for data visualization and later decision-making.

Exploratory Data Analysis Confirmatory In exploratory data analysis, the researcher has the flexibility to respond to the patterns revealed in the preliminary analysis of the data. Patterns in the collected data guide the data analysis or suggest revisions to the preliminary data analysis plan. This flexibility is an important attribute of this approach. When the researcher is attempting to show causation, confirmatory data analysis is required. Confirmatory data analysis is an analytical process guided by classical statistical inference in its use of significance and confidence.

Data Exploration, Examination, and Analysis in the Research Process Exhibit 16-1 Exhibit 16-1 reminds one of the importance of data visualization as an integral element in the data analysis process and as a necessary step prior to hypothesis testing.

Frequency: Appropriate Social Networking Age Exhibit 16-2 A frequency table is a simple device for arraying data. It arrays category codes from lowest value to highest value, with columns for count (frequency), percent, valid percent (percent when missing data is extracted), and cumulative percent. It arrays data by assigned numerical value, with columns for percent, valid percent (percent adjusted for missing data), and cumulative percent. This nominal variable describes the perceived desirable minimum age to be permitted to own a social networking account. The same data are presented in Exhibit 16-3 using a pie chart and a bar chart. The values and percentages are more readily understood in this graphic format.

Bar Chart Exhibit 16-3, bottom In this slide, the same data are presented in the form of a bar chart.

Pie Chart Exhibit 16-3, top This portion of Exhibit 16-3 illustrates the observations of ad recall in the form of a pie chart. Data may be more readily understood when presented graphically.

Frequency Table Exhibit 16-4 When the variable of interest is measured on an interval-ratio scale and is one with many potential values, these techniques are not particularly informative. Exhibit 16-4, shown in the slide, is a condensed frequency table of the average annual purchases of PrimeSell’s top 50 customers. Only two values, 59.9 and 66, have a frequency greater than 1. Thus, the primary contribution of this table is an ordered list of values. If the table were converted to a bar chart, it would have 48 bars of equal length and two bars with two occurrences.

Histogram Exhibit 16-5 The histogram is the conventional solution for the display of interval-ratio data. Histograms are used when it is possible to group the variable’s values into intervals. A histogram is a graphical bar chart that groups continuous data values into equal intervals, with one bar for each interval. Data analysts find histograms useful for 1) displaying all intervals in a distribution, even those without observed values, and 2) examining the shape of the distribution for skewness, kurtosis, and the modal pattern. The values for the average annual purchases variable presented in Exhibit 16-4 were measured on a ratio scale and are easily grouped. Histograms are not useful for nominal variables like ad recall that has no order to its categories.

Pareto Diagram Exhibit 16-7 Pareto diagrams represent frequency data as a bar chart, ordered from most to least, overlayed with a line graph denoting the cumulative percentage at each variable level. The percentages sum to 100 percent. The data are derived from a multiple-choice-single-response scale, a multiple-choice-multiple-response scale, or frequency counts of words or themes from content analysis. Exhibit 16-7, shown in the slide, depicts an analysis of MindWriter customer complaints as a Pareto diagram

Boxplot Components Exhibit 16-8 The boxplot, or box-and-whisker plot, is another technique used frequently in exploratory data analysis. A boxplot reduces the detail of the stem-and-leaf display and provides a different visual image of the distribution’s location, spread, shape, tail length, and outliers. Boxplots are extensions of the five-number summary of a distribution. This summary consists of the median, the upper and lower quartiles, and the largest and smallest observations. The median and quartiles are used because they are particularly resistant statistics. Resistance is a characteristic that provides insensitivity to localized misbehavior in data. The mean and standard deviation are considered nonresistant statistics, because they are susceptible to the effects of extreme values in the tails of the distribution and do not represent typical values well under conditions of asymmetry. Boxplots may be constructed easily by hand or by computer programs. The ingredients of the plot are The rectangular plot that encompasses 50% of the data values, A center line--marking the median and going through the width of the box, The edges of the box, called hinges, and The whiskers that extend from the right and left hinges to the largest and smallest values. These values may be found within 1.5 times the interquartile range (IQR) from either edge of the box.

SPSS Cross-Tabulation Exhibit 16-11 Cross-tabulation is a technique for comparing data from two or more categorical variables. It is used with demographic variables and the study’s target variables. The technique uses tables having rows and columns that correspond to the levels or code values of each variable’s categories. Exhibit 16-11 is an example of a computer-generated cross-tabulation. This table has two rows for gender and two columns for assignment selection. The combination produces four cells. Depending on what you request for each cell, it can contain a count of the cases of the joint classification and also the row, column, and/or the total percentages. The number of row cells and column cells is often used to designate the size of the table, as in this 2 x 2 table. Row and column totals, called marginals, appear at the bottom and right “margins” of the table. When tables are constructed for statistical testing, we call them contingency tables and the test determines if the classification variables are independent of each other. This is discussed in Chapter 20.

Exploratory Data Analysis This Booth Research Services ad suggests that the researcher’s role is to make sense of data displays. Great data exploration and analysis delivers insight from data.

Key Terms Confirmatory data analysis Control variable Cross-tabulation Exploratory data analysis (EDA) Frequency table Histogram Outliers