Lecture 7 Sections 2.3 – 2.4 Objectives: More Detailed Summary Quantities − Quartiles and IQR − Boxplots − Quantile Plots.



More Detailed Summary Quantities Percentiles The median divides a data set into two equal parts. A finer partition can be obtained by dividing a data set into more than two parts. The (100p)th percentile separates the smallest 100p% of the data or distribution from the remaining values.
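For a distribution (rather than a sample), the percentile definition above can be written as an equation. This is only a sketch, assuming a continuous distribution; the symbols F (cumulative distribution function), f (density), and \eta(p) (the (100p)th percentile) are notation introduced here and do not come from the slides.

```latex
p \;=\; F\bigl(\eta(p)\bigr) \;=\; \int_{-\infty}^{\eta(p)} f(y)\,dy, \qquad 0 < p < 1
```

Setting p = 0.25, 0.5, and 0.75 gives the first quartile, the median, and the third quartile, respectively.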

Quartiles and the Interquartile Range Certain percentiles are particularly important. The quartiles (first quartile, median, third quartile) separate a data set or distribution into four equal parts: the 25th percentile is the first (lower) quartile, denoted by Q1; the 50th percentile is the median; the 75th percentile is the third (upper) quartile, denoted by Q3. Sample quartiles: Separate the n ordered sample observations into a lower half and an upper half. If n is odd, include the median in each half. Then Q1 = median of the lower half of the data and Q3 = median of the upper half of the data. Note that there are several different sensible ways to define the sample quartiles. R implements several of these definitions, so its output can differ from the values obtained by the rule above.
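The median-of-halves rule for sample quartiles can be sketched in Python as follows. The function name `sample_quartiles` and the data values are hypothetical, chosen only for illustration; other software may use a different quartile definition and return slightly different numbers.

```python
from statistics import median


def sample_quartiles(data):
    """Return (Q1, median, Q3) using the median-of-halves rule."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    if n % 2 == 1:
        # n odd: the median is included in both halves
        lower, upper = xs[:half + 1], xs[half:]
    else:
        # n even: the halves do not overlap
        lower, upper = xs[:half], xs[half:]
    return median(lower), median(xs), median(upper)


# Hypothetical data, n = 7 (odd)
values = [7, 1, 5, 3, 9, 11, 13]
q1, med, q3 = sample_quartiles(values)
print(q1, med, q3)  # 4.0 7 10.0
```

Here the ordered data are 1, 3, 5, 7, 9, 11, 13, so the lower half is 1, 3, 5, 7 and the upper half is 7, 9, 11, 13, with the median 7 counted in both.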

Examples Example. n = Find Q1, median, and Q3. Example. n = Find Q1, median, and Q3.

Population Quartiles

IQR and Outlier Detection Interquartile range (IQR): IQR = Q3 - Q1. The IQR is resistant to the effect of outliers and is useful for estimating variability when the distribution is skewed. Determining outliers: Suspected (mild) outlier: an observation is a suspected outlier if it is farther than 1.5 IQR from the closest quartile (i.e., it falls below Q1 - 1.5 IQR or above Q3 + 1.5 IQR). Highly suspected (extreme) outlier: an observation is an extreme outlier if it is farther than 3 IQR from the nearest quartile (i.e., it falls below Q1 - 3 IQR or above Q3 + 3 IQR).
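The outlier rule can be sketched in Python as shown here. The helper names and the data are hypothetical, and the quartiles are computed with the median-of-halves rule from the lecture (median included in both halves when n is odd); a different quartile definition could flag different points.

```python
from statistics import median


def quartiles(data):
    """Return (Q1, Q3) using the median-of-halves rule."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half + 1] if n % 2 == 1 else xs[:half]
    upper = xs[half:]
    return median(lower), median(upper)


def classify_outliers(data):
    """Return (IQR, mild outliers, extreme outliers)."""
    q1, q3 = quartiles(data)
    iqr = q3 - q1
    mild, extreme = [], []
    for x in sorted(data):
        # Distance beyond the closest quartile (0 for points between Q1 and Q3)
        dist = max(q1 - x, x - q3, 0)
        if dist > 3 * iqr:
            extreme.append(x)
        elif dist > 1.5 * iqr:
            mild.append(x)
    return iqr, mild, extreme


# Hypothetical data: Q1 = 5.5, Q3 = 11.5, IQR = 6
values = [2, 4, 5, 6, 7, 8, 9, 10, 11, 12, 22, 40]
print(classify_outliers(values))  # (6.0, [22], [40])
```

With these numbers the mild limits are -3.5 and 20.5 and the extreme limits are -12.5 and 29.5, so 22 is a mild outlier and 40 is an extreme outlier.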

Boxplots A boxplot is a visual display of data based on the following five-number summary: Min, Q1, Median, Q3, Max. Note: Boxplots are drawn either vertically (bottom to top) or horizontally (left to right). A central box spans Q1 to Q3, and a line in the box marks the median. Outliers are marked with “o”. The upper whisker extends to the largest data value within the upper limit, Q3 + 1.5 IQR, and the lower whisker extends to the smallest data value within the lower limit, Q1 - 1.5 IQR.
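The quantities needed to draw such a boxplot can be sketched in Python as follows. This computes the numbers only and does not draw anything; the function name `boxplot_stats`, the dictionary keys, and the data are hypothetical, and the quartiles again follow the median-of-halves rule.

```python
from statistics import median


def boxplot_stats(data):
    """Return the box, whisker ends, and outliers for a boxplot."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half + 1] if n % 2 == 1 else xs[:half]
    upper = xs[half:]
    q1, med, q3 = median(lower), median(xs), median(upper)
    iqr = q3 - q1
    low_limit = q1 - 1.5 * iqr
    high_limit = q3 + 1.5 * iqr
    # Whiskers stop at the most extreme data values inside the limits
    inside = [x for x in xs if low_limit <= x <= high_limit]
    outliers = [x for x in xs if x < low_limit or x > high_limit]
    return {
        "q1": q1,
        "median": med,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": outliers,
    }


# Hypothetical data
values = [2, 4, 5, 6, 7, 8, 9, 10, 11, 12, 22, 40]
stats = boxplot_stats(values)
print(stats["whisker_low"], stats["whisker_high"], stats["outliers"])  # 2 12 [22, 40]
```

Note that the whiskers end at actual data values (2 and 12 here), not at the limits -3.5 and 20.5 themselves. Plotting libraries may compute quartiles by a different rule, so their boxes can differ slightly from these values.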

Boxplot Examples Example. Ultrasound was used to gather the accompanying corrosion data on the thickness of the floor plate of an aboveground tank used to store crude oil (“Statistical Analysis of UT Corrosion Data from Floor Plates of a Crude Oil Aboveground Storage Tank”, Material Eval., 1994: ). Each observation is the largest pit depth in the plate, expressed in milli-in. Find the five-number summary and draw the boxplot.
Example. The effects of partial discharges on the degradation of insulation cavity material have important implications for the lifetimes of high-voltage components. Consider the following sample of n=25 pulse widths from slow discharges in a cylindrical cavity made of polyethylene: Find the five-number summary and draw the boxplot.

Comparative Boxplots A comparative boxplot (or side-by-side boxplot) is a very effective way of revealing similarities and differences between two or more data sets consisting of observations on the same variable. Example. The article “Compression of Single-Wall Corrugated Shipping Containers Using Fixed and Floating Test Platens” (J. of Testing and Evaluation, 1992: ) describes an experiment in which several different types of boxes were compared with respect to compression strength. Consider the following observations on four different types of boxes: Type of Box Compression Strength (lb)
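A comparative boxplot amounts to computing the same five-number summary for each group and drawing the boxes on a common scale. The sketch below does the computation only. The group labels "A" and "B" and their values are hypothetical and are not the compression-strength data from the article.

```python
from statistics import median


def five_number_summary(data):
    """Return (Min, Q1, Median, Q3, Max) using the median-of-halves rule."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half + 1] if n % 2 == 1 else xs[:half]
    upper = xs[half:]
    return xs[0], median(lower), median(xs), median(upper), xs[-1]


# Hypothetical groups measured on the same variable
groups = {
    "A": [10, 12, 14, 16, 18],
    "B": [11, 15, 19, 23, 27],
}

summaries = {name: five_number_summary(obs) for name, obs in groups.items()}
for name, summary in summaries.items():
    print(name, summary)
# A (10, 12, 14, 16, 18)
# B (11, 15, 19, 23, 27)
```

Reading the summaries side by side shows that group B has both a higher center (median 19 versus 14) and a larger spread (IQR 8 versus 4), which is what the side-by-side boxes would display.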

Quantile Plots An investigator frequently wishes to know whether the data were selected from a particular type of population distribution (e.g., a normal distribution). For one thing, many inferential procedures are based on the assumption that the underlying distribution is of a specified type. The use of such procedures is inappropriate if the actual distribution differs greatly from the assumed type. Additionally, understanding the underlying distribution can sometimes give insight into the physical mechanisms involved in generating the data. An effective way to check a distributional assumption is to construct a quantile plot (or probability plot). Idea: Plot the sample quantiles vs. the theoretical quantiles (population quantiles). If the data come from the assumed distribution, the points in the plot will fall close to a straight line. If the actual distribution is quite different from the one used to construct the plot, the points will depart substantially from a linear pattern.

Normal Quantile Plot A normal quantile plot is a plot of the (z quantile, sample quantile) pairs. Example. The accompanying sample consisting of n=20 observations on dielectric breakdown voltage of a piece of epoxy resin appeared in the article “Maximum Likelihood Estimation in the 3-Parameter Weibull Distribution” (IEEE Trans on Dielectrics and Elec. Insul., 1996: 43-55). Is the population distribution of dielectric breakdown voltage normal?
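The (z quantile, sample quantile) pairs can be computed with the Python standard library, as sketched here. The sketch assumes that the i-th smallest observation is paired with the standard normal quantile at probability (i - 0.5)/n; this plotting position is one common convention and is an assumption here, since the slides do not state which one is used. The function name and the sample values are hypothetical.

```python
from statistics import NormalDist


def normal_quantile_pairs(sample):
    """Return a list of (z quantile, sample quantile) pairs."""
    xs = sorted(sample)
    n = len(xs)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Assumed plotting position for the i-th smallest value: (i - 0.5) / n
    return [
        (std_normal.inv_cdf((i - 0.5) / n), x)
        for i, x in enumerate(xs, start=1)
    ]


# Hypothetical sample, not the breakdown-voltage data
sample = [26.1, 24.5, 27.3, 25.0, 28.9, 26.6, 25.8]
for z, x in normal_quantile_pairs(sample):
    print(f"{z:7.3f}  {x}")
```

Plotting the second column against the first gives the normal quantile plot; points lying close to a straight line are consistent with a normal population, while strong curvature suggests otherwise.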

Review of Concepts