Published by Φρίξος Μοσχοβάκης. Modified over 6 years ago.
1
Exploratory data analysis: numerical summaries
CIS
Based on textbook: A Modern Introduction to Probability and Statistics, 2007
Slides: QUINCY R WALKER
Modified by the instructor: Dr. Longin Jan Latecki
Chapter 16: Exploratory data analysis: numerical summaries
2
16.1 The Center of the Data Set
Center of the data = sample mean: $\bar{x}_n = \frac{x_1 + x_2 + \cdots + x_n}{n}$, where n = the sample size.
Example: the sample mean of the following data is 44.7:
43, 43, 41, 41, 41, 42, 43, 58, 58, 41, 41
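The example can be checked with a few lines of Python. This is a minimal sketch; the function name `sample_mean` is chosen here for illustration and does not come from the slides.

```python
# Data set from the slide.
data = [43, 43, 41, 41, 41, 42, 43, 58, 58, 41, 41]


def sample_mean(xs):
    """Return the sample mean (x_1 + ... + x_n) / n."""
    if not xs:
        raise ValueError("sample mean is undefined for an empty sample")
    return sum(xs) / len(xs)


# The sum is 492 and n = 11, so the mean rounds to 44.7.
print(round(sample_mean(data), 1))
```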
3
Outliers
An outlier is an observation that is numerically distant from the rest of the data. The sample median is more robust than the sample mean in the presence of outliers.
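A small sketch of this robustness, reusing the sample from the mean example. The second list, `extreme`, is a hypothetical variant introduced only for illustration: the two observations of 58 are replaced by 580 to exaggerate the outliers.

```python
from statistics import mean, median

# Sample from the mean example in Section 16.1.
data = [43, 43, 41, 41, 41, 42, 43, 58, 58, 41, 41]

# Hypothetical variant (not from the slides): push the two largest
# observations much further away from the rest of the data.
extreme = [580 if x == 58 else x for x in data]

for name, xs in (("original", data), ("extreme", extreme)):
    # The mean is dragged towards the outliers; the median is not.
    print(f"{name}: mean = {mean(xs):.1f}, median = {median(xs)}")
```

In both lists the middle element of the ordered sample is the same, so the median does not move, while the mean more than triples.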
4
Variability in A Data Set
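Variance: $s_n^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x}_n)^2$
Standard deviation: $s_n = \sqrt{s_n^2}$
where n is the number of observations in the sample. Why we choose the factor 1/(n−1) instead of 1/n will be explained later (in Chapter 19).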
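As a rough illustration, the sample variance with the factor 1/(n−1) and the corresponding standard deviation can be computed as follows. The function names are chosen here for illustration; the result is compared against `statistics.variance` from the Python standard library, which also divides by n−1.

```python
import math
import statistics


def sample_variance(xs):
    """Sample variance with the factor 1/(n-1)."""
    n = len(xs)
    if n < 2:
        raise ValueError("sample variance needs at least two observations")
    x_bar = sum(xs) / n
    return sum((x - x_bar) ** 2 for x in xs) / (n - 1)


def sample_std(xs):
    """Sample standard deviation: the square root of the sample variance."""
    return math.sqrt(sample_variance(xs))


# Sample from the mean example in Section 16.1.
data = [43, 43, 41, 41, 41, 42, 43, 58, 58, 41, 41]

print(sample_variance(data), sample_std(data))
# Cross-check against the standard library (also uses n-1).
print(math.isclose(sample_variance(data), statistics.variance(data)))
```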
5
Variability cont.
Median of absolute deviations (MAD): the median of the absolute deviations of a sample, $\mathrm{MAD}_n = \mathrm{Med}(|x_1 - \mathrm{Med}_n|, \ldots, |x_n - \mathrm{Med}_n|)$, where $\mathrm{Med}_n$ = median of the sample.
Absolute deviation: the absolute value of the distance of a point $x_i$ in the data set from the median, $|x_i - \mathrm{Med}_n|$.
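A minimal sketch of the MAD computation: take the median of the sample, then the median of the absolute deviations from it. The function name `mad` is an assumption made for this example.

```python
from statistics import median


def mad(xs):
    """Median of the absolute deviations from the sample median."""
    med = median(xs)
    return median([abs(x - med) for x in xs])


# Sample from the mean example in Section 16.1.
data = [43, 43, 41, 41, 41, 42, 43, 58, 58, 41, 41]

# The median is 42; the absolute deviations are 0 (once), 1 (eight times)
# and 16 (twice), so their median is 1.
print(mad(data))
```

Like the median itself, the MAD is barely affected by the two large observations, whereas the standard deviation is inflated by them.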
6
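Empirical quantiles
The order statistics consist of the same elements as the original dataset $x_1, x_2, x_3, \ldots, x_n$, but in ascending order. Denote by $x_{(k)}$ the kth element in the ordered list. Then the pth empirical quantile is, in the textbook's definition: $q_n(p) = x_{(k)} + \alpha\,(x_{(k+1)} - x_{(k)})$, where $k = \lfloor p(n+1) \rfloor$ and $\alpha = p(n+1) - k$.
The pth empirical quantile corresponds to the pth quantile of a cdf: $F^{\mathrm{inv}}(p)$, where F is the cumulative distribution function.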
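A minimal sketch of an empirical quantile, assuming the interpolation rule q_n(p) = x_(k) + α(x_(k+1) − x_(k)) with k = ⌊p(n+1)⌋ and α = p(n+1) − k. The function name `empirical_quantile` and the choice to raise an error when p(n+1) falls outside the ordered list are assumptions made for this example.

```python
import math


def empirical_quantile(xs, p):
    """Interpolate between order statistics at position p * (n + 1)."""
    ordered = sorted(xs)          # order statistics x_(1) <= ... <= x_(n)
    n = len(ordered)
    h = p * (n + 1)
    k = math.floor(h)             # index of the lower order statistic (1-based)
    alpha = h - k                 # fractional part used for interpolation
    if k < 1 or k > n or (k == n and alpha > 0):
        raise ValueError("p * (n + 1) must lie between 1 and n")
    if alpha == 0:
        return ordered[k - 1]     # lands exactly on x_(k)
    return ordered[k - 1] + alpha * (ordered[k] - ordered[k - 1])


# Sample from the mean example in Section 16.1 (n = 11, so n + 1 = 12).
data = [43, 43, 41, 41, 41, 42, 43, 58, 58, 41, 41]

for p in (0.25, 0.5, 0.75):
    print(p, empirical_quantile(data, p))
```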
8
Quartiles
Lower quartile: $q_n(0.25)$
Median (middle quartile): $q_n(0.50)$
Upper quartile: $q_n(0.75)$
Interquartile range (IQR): IQR = $q_n(0.75) - q_n(0.25)$
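The quartiles and the IQR can be sketched with the Python standard library. `statistics.quantiles` with `method="exclusive"` interpolates at position p(n+1) in the ordered sample; that this matches the convention behind q_n in these slides is an assumption.

```python
from statistics import quantiles

# Sample from the mean example in Section 16.1.
data = [43, 43, 41, 41, 41, 42, 43, 58, 58, 41, 41]

# n=4 asks for the three cut points that split the data into quarters.
lower, middle, upper = quantiles(data, n=4, method="exclusive")
iqr = upper - lower

print("lower quartile:", lower)
print("median:", middle)
print("upper quartile:", upper)
print("IQR:", iqr)
```

For this sample the IQR is small even though two observations lie far from the rest, because the quartiles depend only on the middle of the ordered data.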
9
The box-and-whisker plot
Advantages: good representation of statistical data; shows quartiles, median, and outliers.
Disadvantages: poor graphical display of a single dataset; a histogram or a kernel density estimate is a more informative display of a single dataset.
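The quantities a boxplot displays can be sketched numerically. The slide text does not state how whiskers and outliers are determined; the example below assumes the common convention that points more than 1.5 × IQR beyond the quartiles are drawn as outliers and that whiskers extend to the most extreme points inside those fences. The function name `boxplot_stats` and its `whisker` parameter are hypothetical.

```python
from statistics import quantiles


def boxplot_stats(xs, whisker=1.5):
    """Summaries behind a boxplot, assuming the 1.5 * IQR outlier rule."""
    ordered = sorted(xs)
    q1, med, q3 = quantiles(ordered, n=4, method="exclusive")
    iqr = q3 - q1
    low_fence = q1 - whisker * iqr
    high_fence = q3 + whisker * iqr
    inside = [x for x in ordered if low_fence <= x <= high_fence]
    outliers = [x for x in ordered if x < low_fence or x > high_fence]
    return {
        "q1": q1,
        "median": med,
        "q3": q3,
        "whisker_low": inside[0],    # smallest point inside the fences
        "whisker_high": inside[-1],  # largest point inside the fences
        "outliers": outliers,
    }


# Sample from the mean example in Section 16.1.
data = [43, 43, 41, 41, 41, 42, 43, 58, 58, 41, 41]
stats = boxplot_stats(data)
print(stats)
```

Under this assumed rule, the two observations of 58 are flagged as outliers and the whiskers coincide with the quartiles.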
10
Using boxplots to compare several datasets
Boxplots become useful if we want to compare several sets of data in a simple graphical display.