
1 Slide 1 Statistics Workshop Tutorial 6 Measures of Relative Standing Exploratory Data Analysis

2 Slide 2 Copyright © 2004 Pearson Education, Inc. Created by Tom Wegleitner, Centreville, Virginia Section 2-6 Measures of Relative Standing

3 Slide 3 Copyright © 2004 Pearson Education, Inc. Definition: z score (or standard score): the number of standard deviations that a given value x is above or below the mean.

4 Slide 4 Copyright © 2004 Pearson Education, Inc. Measures of Position: z score. Sample: $z = \frac{x - \bar{x}}{s}$. Population: $z = \frac{x - \mu}{\sigma}$. Round z to 2 decimal places.
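As a minimal sketch of the sample formula on this slide, the following Python snippet computes a z score rounded to 2 decimal places. The function name `z_score` and the data set are hypothetical, chosen only for illustration.

```python
from statistics import mean, stdev

def z_score(x, values):
    """Sample z score: (x - sample mean) / sample standard deviation."""
    x_bar = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 in the denominator)
    return round((x - x_bar) / s, 2)

# Hypothetical scores, made up for illustration.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
z = z_score(9, scores)
print(z)
```

For population data, `statistics.pstdev` would take the place of `stdev`, matching the population formula with σ.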

5 Slide 5 Copyright © 2004 Pearson Education, Inc. Interpreting Z Scores: Whenever a value is less than the mean, its corresponding z score is negative. Ordinary values: z score between –2 and 2. Unusual values: z score < –2 or z score > 2. FIGURE 2-14
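The rule on this slide can be sketched as a small check. The function name `is_unusual` is hypothetical, and treating a z score of exactly –2 or 2 as ordinary is an assumption, since the slide does not say how to handle the boundary.

```python
def is_unusual(z, cutoff=2.0):
    """Return True when a z score lies more than `cutoff` standard deviations from the mean."""
    # Assumption: values exactly at the cutoff count as ordinary.
    return abs(z) > cutoff

# Hypothetical z scores, made up for illustration.
labels = {z: is_unusual(z) for z in (-2.5, -1.0, 0.0, 1.87, 2.5)}
print(labels)
```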

6 Slide 6 Copyright © 2004 Pearson Education, Inc. Definition: Q1 (First Quartile) separates the bottom 25% of sorted values from the top 75%. Q2 (Second Quartile) is the same as the median; it separates the bottom 50% of sorted values from the top 50%. Q3 (Third Quartile) separates the bottom 75% of sorted values from the top 25%.

7 Slide 7 Copyright © 2004 Pearson Education, Inc. Quartiles: Q1, Q2, Q3 divide ranked scores into four equal parts. [Diagram: (minimum) | 25% | Q1 | 25% | Q2 (median) | 25% | Q3 | 25% | (maximum)]

8 Slide 8 Copyright © 2004 Pearson Education, Inc. Percentiles: Just as there are quartiles separating data into four parts, there are 99 percentiles, denoted P1, P2, ..., P99, which partition the data into 100 groups.

9 Slide 9 Copyright © 2004 Pearson Education, Inc. Finding the Percentile of a Given Score: $\text{percentile of value } x = \frac{\text{number of values less than } x}{\text{total number of values}} \cdot 100$
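A minimal sketch of this formula in Python follows. The function name `percentile_of` and the data are hypothetical. The result is left unrounded here because the slide does not state a rounding rule.

```python
def percentile_of(x, values):
    """Percentile of x: 100 * (number of values less than x) / (total number of values)."""
    below = sum(1 for v in values if v < x)
    return 100 * below / len(values)

# Hypothetical scores, made up for illustration.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
p = percentile_of(7, scores)  # 6 of the 8 values are less than 7
print(p)
```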

10 From Percentile to Data Value: What score is at the kth percentile? (1) Rank the data from lowest to highest. (2) Find the locator L = (k/100) · n, where n is the number of values. (a) If L is not a whole number, round it up and take the score in that position. (b) If L is a whole number, take the average of the scores in positions L and L+1.
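The locator procedure on this slide can be sketched as follows. The function name `value_at_percentile` and the data are hypothetical, and the sketch assumes k is a whole number from 1 to 99 so that integer arithmetic can decide whether L is whole.

```python
def value_at_percentile(k, values):
    """Score at the kth percentile using the locator L = (k / 100) * n."""
    ranked = sorted(values)                     # step 1: rank lowest to highest
    whole, remainder = divmod(k * len(ranked), 100)
    if remainder != 0:
        # L is not a whole number: round up and take the score in that position.
        position = whole + 1
        return ranked[position - 1]             # positions are 1-based
    # L is a whole number: average the scores in positions L and L + 1.
    return (ranked[whole - 1] + ranked[whole]) / 2

# Hypothetical scores, made up for illustration.
scores = [9, 4, 2, 5, 4, 7, 4, 5]
q1 = value_at_percentile(25, scores)   # L = 2, so average positions 2 and 3
p30 = value_at_percentile(30, scores)  # L = 2.4, so round up to position 3
print(q1, p30)
```

Other textbooks and software packages use different percentile conventions, so results for small data sets may not match this rule exactly.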

11 Slide 11 Copyright © 2004 Pearson Education, Inc. Some Other Statistics: Interquartile Range (or IQR): $Q_3 - Q_1$. 10 - 90 Percentile Range: $P_{90} - P_{10}$. Semi-interquartile Range: $\frac{Q_3 - Q_1}{2}$. Midquartile: $\frac{Q_3 + Q_1}{2}$.
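As a short illustration, these four statistics follow directly from the quartiles and percentiles once those are known. The function name `other_statistics` and the input values are hypothetical.

```python
def other_statistics(q1, q3, p10, p90):
    """Compute the statistics listed on this slide from given quartiles and percentiles."""
    return {
        "iqr": q3 - q1,
        "percentile_range_10_90": p90 - p10,
        "semi_interquartile_range": (q3 - q1) / 2,
        "midquartile": (q3 + q1) / 2,
    }

# Hypothetical quartile and percentile values, made up for illustration.
stats = other_statistics(q1=4.0, q3=6.0, p10=2.0, p90=9.0)
print(stats)
```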

12

13 Slide 13 Copyright © 2004 Pearson Education, Inc. Created by Tom Wegleitner, Centreville, Virginia Section 2-7 Exploratory Data Analysis (EDA)

14 Slide 14 Copyright © 2004 Pearson Education, Inc. Definition: Exploratory Data Analysis is the process of using statistical tools (such as graphs, measures of center, and measures of variation) to investigate data sets in order to understand their important characteristics.

15 Outliers: An outlier is a very high or very low value that stands apart from the rest of the data. Outliers may come from data collection errors, data entry errors, or simply valid but unusual data values. Always identify and examine outliers to determine whether they are in error.

16 Slide 16 Copyright © 2004 Pearson Education, Inc. Important Principles: An outlier can have a dramatic effect on the mean. An outlier can have a dramatic effect on the standard deviation. An outlier can have a dramatic effect on the scale of the histogram, so that the true nature of the distribution is totally obscured.
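A small numerical illustration of the first two principles, using hypothetical data: adding a single extreme value to a set of eight scores shifts both the mean and the sample standard deviation substantially.

```python
from statistics import mean, stdev

# Hypothetical scores, made up for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]
with_outlier = data + [90]  # one extreme value added

mean_before, mean_after = mean(data), mean(with_outlier)
sd_before, sd_after = stdev(data), stdev(with_outlier)

print(f"mean: {mean_before:.2f} -> {mean_after:.2f}")
print(f"standard deviation: {sd_before:.2f} -> {sd_after:.2f}")
```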

17 Slide 17 Copyright © 2004 Pearson Education, Inc. Definitions: For a set of data, the 5-number summary consists of the minimum value; the first quartile, Q1; the median (or second quartile, Q2); the third quartile, Q3; and the maximum value. A boxplot (or box-and-whisker diagram) is a graph of a data set that consists of a line extending from the minimum value to the maximum value, and a box with lines drawn at the first quartile, Q1; the median; and the third quartile, Q3.
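A minimal sketch of the 5-number summary, assuming the quartiles are found with the locator rule from slide 10 (other quartile conventions can give slightly different values). The function names and the data are hypothetical.

```python
def value_at_percentile(k, values):
    """Score at the kth percentile using the locator L = (k / 100) * n (k a whole number, 1 to 99)."""
    ranked = sorted(values)
    whole, remainder = divmod(k * len(ranked), 100)
    if remainder != 0:
        return ranked[whole]                    # L rounded up, as a 1-based position
    return (ranked[whole - 1] + ranked[whole]) / 2

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum)."""
    return (
        min(values),
        value_at_percentile(25, values),
        value_at_percentile(50, values),
        value_at_percentile(75, values),
        max(values),
    )

# Hypothetical scores, made up for illustration.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
summary = five_number_summary(scores)
print(summary)
```

These five values are the ones a boxplot draws: the ends of the line, the edges of the box, and the line inside the box.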

18 Slide 18 Copyright © 2004 Pearson Education, Inc. Boxplots Figure 2-16

19 Outliers: A data point is considered an outlier if it lies more than 1.5 times the interquartile range above the 75th percentile or more than 1.5 times the interquartile range below the 25th percentile. In other words, outliers are numbers outside the interval [Q1 - 1.5*IQR, Q3 + 1.5*IQR].
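The interval test on this slide can be sketched as follows. The function name `find_outliers` and the data are hypothetical. The quartiles are passed in as arguments, and the values used here (Q1 = 4, Q3 = 7) assume the locator rule from slide 10 applied to these nine scores.

```python
def find_outliers(values, q1, q3):
    """Return the values outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR]."""
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return [v for v in values if v < lower_fence or v > upper_fence]

# Hypothetical scores, made up for illustration.
scores = [2, 4, 4, 4, 5, 5, 7, 9, 90]
outliers = find_outliers(scores, q1=4, q3=7)  # fences are -0.5 and 11.5
print(outliers)
```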

20 Box Plots and Histograms: When looking at one variable, it’s a good idea to look at the box plot and histogram together. Box plots complement histograms by providing more specific information about the center, the quartiles, and outliers.

21 Slide 21 Copyright © 2004 Pearson Education, Inc. Figure 2-17 Boxplots

22 Shape, Center and Spread: What should you report about a quantitative variable? Always report the shape, center, and spread. If the distribution is skewed, report the median and IQR. If the distribution is symmetric, report the mean and standard deviation. If there are any clear outliers and you are reporting the mean and standard deviation, report them both with and without the outliers.

23 Slide 23 Now we are ready for Part 21 of Day 1

