Part Four ANALYSIS AND PRESENTATION OF DATA
McGraw-Hill/Irwin © 2003 The McGraw-Hill Companies, Inc., All Rights Reserved.
Chapter Sixteen EXPLORING, DISPLAYING, AND EXAMINING DATA
Types of Data Analysis
- Exploratory data analysis
  - The data guide the choice of analysis, or a revision of the planned analysis
- Confirmatory data analysis
  - Closer to classical statistical inference in its use of significance and confidence
  - May use information from a closely related data set, or validate findings by gathering and analyzing new data
Techniques to Display and Examine Distributions
- Frequency Table
- Visual Displays
  - Histograms
  - Stem-and-leaf display
  - Box-plot
- Crosstabulation of Variables
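As a rough illustration of two of the displays listed above, the sketch below builds a frequency table and a stem-and-leaf display using only the Python standard library. The function names and the sample ratings and ages are hypothetical, and the stem-and-leaf version assumes non-negative integers with the last digit as the leaf.

```python
from collections import Counter

def frequency_table(values):
    """Return rows of (value, count, percent, cumulative percent)."""
    counts = Counter(values)
    total = len(values)
    rows = []
    cumulative = 0.0
    for value in sorted(counts):
        percent = 100.0 * counts[value] / total
        cumulative += percent
        rows.append((value, counts[value], percent, cumulative))
    return rows

def stem_and_leaf(values):
    """Return display lines; assumes non-negative integers, last digit = leaf."""
    stems = {}
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        stems.setdefault(stem, []).append(leaf)
    lines = []
    # Walk every stem in the range so that empty stems still appear.
    for stem in range(min(stems), max(stems) + 1):
        leaves = "".join(str(leaf) for leaf in stems.get(stem, []))
        lines.append(f"{stem:>2} | {leaves}")
    return lines

ratings = [3, 4, 4, 5, 2, 4, 3, 5, 4, 1]   # hypothetical survey ratings
for value, count, percent, cumulative in frequency_table(ratings):
    print(f"{value:>5} {count:>5} {percent:>7.1f} {cumulative:>7.1f}")

ages = [21, 24, 25, 38, 39, 52]            # hypothetical ages
print("\n".join(stem_and_leaf(ages)))
```

Unlike the frequency table, the stem-and-leaf display keeps every individual value visible, which is why it is often used in the exploratory phase.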
Techniques to Display and Examine Distributions
- Histograms
  - Display all intervals in a distribution, even those without observed values
  - Examine the shape of the distribution for skewness, kurtosis, and the modal pattern
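The two histogram points above can be sketched in a few lines: counting observations per interval while keeping the empty intervals, and summarizing shape with moment-based skewness and excess kurtosis. This is a minimal sketch; the function names, the interval layout, and the sample data are assumptions, and the moment formulas used are the simple population versions.

```python
def histogram_counts(values, low, width, n_bins):
    """Count values in n_bins equal-width intervals starting at `low`.

    Intervals with no observations stay in the result with a count of 0.
    """
    counts = [0] * n_bins
    for v in values:
        index = int((v - low) // width)
        if 0 <= index < n_bins:
            counts[index] += 1
    return counts

def central_moment(values, order):
    mean = sum(values) / len(values)
    return sum((v - mean) ** order for v in values) / len(values)

def skewness(values):
    """Population skewness: positive means a longer right tail."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def excess_kurtosis(values):
    """Population kurtosis minus 3, so a normal curve scores about 0."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0

data = [1, 2, 2, 9]                     # hypothetical observations
counts = histogram_counts(data, low=0, width=2, n_bins=5)
for i, count in enumerate(counts):
    # Empty intervals print as a bare label, so gaps stay visible.
    print(f"[{2 * i:>2}, {2 * i + 2:>2})  {'*' * count}")
print("skewness:", skewness(data))
```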
Techniques to Display and Examine Distributions (cont.)
- Box-plot (box-and-whisker plot)
  - Rectangular plot encompasses 50% of the data values
  - Edges of the box are called hinges
  - Center line through the width of the box marks the median
  - Whiskers extend from the right and left hinges to the largest and smallest values
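The five values a box-plot draws can be computed directly. The sketch below assumes Tukey's definition of hinges (the medians of the lower and upper halves, with the overall median included in both halves when the count is odd); other quartile conventions give slightly different box edges. The function name and data are hypothetical.

```python
from statistics import median

def five_number_summary(values):
    """Return (smallest, lower hinge, median, upper hinge, largest)."""
    data = sorted(values)
    n = len(data)
    half = (n + 1) // 2          # includes the median itself when n is odd
    lower_half = data[:half]
    upper_half = data[n - half:]
    return (data[0], median(lower_half), median(data),
            median(upper_half), data[-1])

scores = [1, 2, 3, 4, 5, 6, 7, 8, 9]     # hypothetical data
smallest, lower, mid, upper, largest = five_number_summary(scores)
print("box from", lower, "to", upper, "with median line at", mid)
print("whiskers reach", smallest, "and", largest)
```

The box spans the two hinges, so it covers roughly the middle 50% of the values, and the whiskers here run to the extremes as described above.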
Techniques to Display and Examine Distributions (cont.)
- Transformation
  - To improve interpretation and compatibility with other data sets
  - To enhance symmetry and stabilize spread
  - To improve linear relationships between and among variables
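To make the symmetry goal concrete, the sketch below applies a base-10 logarithm to a strongly right-skewed set of values and compares skewness before and after. The choice of a log transformation and the sample values are assumptions for illustration; the slide does not name a specific transformation.

```python
import math

def skewness(values):
    """Population skewness: third central moment over sd cubed."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

raw = [1, 10, 100, 1000, 10000]          # hypothetical, right-skewed values
logged = [math.log10(v) for v in raw]    # log transform (requires v > 0)

print("skewness before:", round(skewness(raw), 3))
print("skewness after: ", round(skewness(logged), 3))
```

The log transform only applies to positive values; data containing zeros or negatives would need a different transformation or a shift first.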
Improvement & Control Analysis
- Statistical process control
  - Uses statistical tools to analyze, monitor, and improve process performance
- Total Quality Management
- Control chart
  - Displays sequential measurements of a process together with a center line and control limits
  - Upper control limit
  - Lower control limit
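A control chart's three reference lines can be sketched as follows. This simplified version assumes the common convention of placing the limits three standard deviations from the center line and estimates the spread with the sample standard deviation of a baseline run; neither choice is stated on the slide, and production charts usually estimate spread from subgroup ranges instead. All names and data are hypothetical.

```python
from statistics import mean, stdev

def control_limits(baseline, k=3.0):
    """Return (lower control limit, center line, upper control limit).

    Assumption: limits sit k sample standard deviations from the mean.
    """
    center = mean(baseline)
    spread = stdev(baseline)
    return center - k * spread, center, center + k * spread

def out_of_control(measurements, baseline, k=3.0):
    """Return (position, value) for each point outside the control limits."""
    lcl, _, ucl = control_limits(baseline, k)
    return [(i, x) for i, x in enumerate(measurements) if x < lcl or x > ucl]

baseline = [10, 10, 10, 12, 8]           # hypothetical in-control readings
new_readings = [10, 15, 9]               # hypothetical sequential measurements

lcl, center, ucl = control_limits(baseline)
print(f"LCL={lcl:.2f}  center={center:.2f}  UCL={ucl:.2f}")
print("signals:", out_of_control(new_readings, baseline))
```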
Types of Control Charts
- Variables data (ratio or interval measurements)
  - X-bar charts
  - R-charts
  - s-charts
- Pareto Diagrams
  - Bar chart whose percentages sum to 100 percent
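The numbers behind a Pareto diagram are a sorted frequency table with a running total. The sketch below orders categories from most to least frequent and reports each bar's percentage plus the cumulative percentage, which reaches 100 at the last bar. The function name and the complaint categories are hypothetical.

```python
def pareto_table(counts):
    """Return rows of (category, count, percent, cumulative percent),
    ordered from the largest count to the smallest."""
    total = sum(counts.values())
    rows = []
    running = 0
    ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    for category, count in ordered:
        running += count
        rows.append((category, count,
                     100.0 * count / total,
                     100.0 * running / total))
    return rows

complaints = {"late delivery": 45, "damaged": 30, "wrong item": 20, "other": 5}
for category, count, percent, cumulative in pareto_table(complaints):
    print(f"{category:<14} {count:>3} {percent:>6.1f}% {cumulative:>6.1f}%")
```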
Geographic Information Systems
- Systems of hardware, software, and procedures that capture, store, manipulate, integrate, and display spatially referenced data
Geographic Information Systems
- Minimum of four components
  - Integrating information from various sources
  - Capturing data
  - Projection and restructuring
  - Modeling
Crosstabulation
- A technique for comparing two classification variables
- Cells
- Marginals
- Contingency tables
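The terms above map directly onto a small data structure: cells are the counts for each combination of the two variables, and marginals are the row and column totals. The following is a minimal sketch with hypothetical function names and observations.

```python
from collections import Counter

def crosstab(pairs):
    """Cross-tabulate (row category, column category) pairs.

    Returns (cells, row marginals, column marginals, grand total).
    """
    pairs = list(pairs)
    cells = Counter(pairs)
    row_totals = Counter(row for row, _ in pairs)
    col_totals = Counter(col for _, col in pairs)
    return cells, row_totals, col_totals, len(pairs)

def row_percentages(cells, row_totals):
    """Express each cell as a percentage of its row marginal."""
    return {(row, col): 100.0 * count / row_totals[row]
            for (row, col), count in cells.items()}

# Hypothetical observations: (gender, purchased?)
observations = [("male", "yes"), ("male", "no"),
                ("female", "yes"), ("female", "yes")]
cells, row_totals, col_totals, grand_total = crosstab(observations)

for row in sorted(row_totals):
    counts = [cells[(row, col)] for col in sorted(col_totals)]
    print(row, counts, "row total:", row_totals[row])
print("column totals:", dict(col_totals), "grand total:", grand_total)
```

Percentaging by row or by column answers different questions, so the direction should follow the variable treated as the presumed cause.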
Percentaging Errors
- Averaging percentages without weighting
- Using too-large percentages (>100%)
- Using percentages with very small samples
- Citing a percentage decrease exceeding 100 percent
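Two of these errors are easy to show with arithmetic. The sketch below contrasts a plain average of percentages with one weighted by each group's base size, and shows why a decrease cannot exceed 100 percent while an increase can. The function names and figures are hypothetical.

```python
def naive_average(groups):
    """Average the percentages, ignoring how many cases each is based on."""
    return sum(percent for percent, _ in groups) / len(groups)

def weighted_average(groups):
    """Average the percentages, weighting each by its base size."""
    total = sum(size for _, size in groups)
    return sum(percent * size for percent, size in groups) / total

def percent_change(old, new):
    """Percentage change from old to new, relative to the old value."""
    return 100.0 * (new - old) / old

# Hypothetical groups: (percent satisfied, number of respondents)
groups = [(50, 10), (10, 90)]
print("unweighted:", naive_average(groups))   # treats both groups as equal
print("weighted:  ", weighted_average(groups))

# A drop from 200 to 50 is a 75 percent decrease, not 300 percent;
# only the reverse move, from 50 to 200, is a 300 percent increase.
print(percent_change(200, 50), percent_change(50, 200))
```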
Other Table-based Analysis
- Automatic Interaction Detection (AID)
  - Sequential partitioning procedure that uses a dependent variable and a set of predictors
  - Searches among up to 300 variables for the best single division of the data into subsets according to each predictor variable
  - Chooses one division approach
  - Splits the sample using chi-square tests to create multi-way splits
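One step of the partitioning procedure above might be sketched as follows: compute a chi-square statistic for each predictor against the dependent variable, choose the predictor with the largest statistic, and split the sample on its categories. This is a simplified illustration, not the AID program itself. It ranks predictors by the raw statistic, which is only reasonable when they have the same number of categories (real implementations compare significance levels), and all names and records are hypothetical.

```python
from collections import Counter

def chi_square(pairs):
    """Pearson chi-square statistic for (predictor, dependent) pairs."""
    n = len(pairs)
    cells = Counter(pairs)
    rows = Counter(row for row, _ in pairs)
    cols = Counter(col for _, col in pairs)
    statistic = 0.0
    for row in rows:
        for col in cols:
            expected = rows[row] * cols[col] / n
            statistic += (cells[(row, col)] - expected) ** 2 / expected
    return statistic

def best_split(records, dependent, predictors):
    """Pick the predictor with the largest chi-square and split on it."""
    scores = {
        name: chi_square([(rec[name], rec[dependent]) for rec in records])
        for name in predictors
    }
    best = max(scores, key=scores.get)
    subsets = {}
    for rec in records:
        # One subset per category of the chosen predictor (multi-way split).
        subsets.setdefault(rec[best], []).append(rec)
    return best, scores, subsets

# Hypothetical sample: region separates buyers perfectly, gender does not.
records = [
    {"region": "N", "gender": "F", "buys": "yes"},
    {"region": "N", "gender": "M", "buys": "yes"},
    {"region": "S", "gender": "F", "buys": "no"},
    {"region": "S", "gender": "M", "buys": "no"},
]
best, scores, subsets = best_split(records, "buys", ["region", "gender"])
print("split on:", best, "scores:", scores)
```

Applying the same step to each resulting subset, and stopping when no split is significant or the subsets become too small, gives the sequential character described above.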