1
Numerical Measures: Skewness and Location
PSYSTA1 – Week 6
2
Measure of Skewness: a statistical measure used to describe the distribution of the data relative to symmetry.
Goal: quantify the degree of asymmetry (e.g., location of tails, difference between "centers", etc.) in a data set.
Sample Skewness: $sk_x = \dfrac{3(\bar{x} - \mathrm{med}_x)}{s}$
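As an illustration, the coefficient can be computed with a minimal Python sketch. It assumes the coefficient is three times the difference between mean and median, divided by the sample standard deviation (n − 1 denominator). The function name `pearson_skewness` and the data values are hypothetical, not taken from the slides.

```python
import statistics

def pearson_skewness(data):
    """3 * (mean - median) / s, with s the sample standard deviation (assumed n - 1)."""
    mean = statistics.mean(data)
    median = statistics.median(data)
    s = statistics.stdev(data)  # sample standard deviation
    return 3 * (mean - median) / s

hours = [1, 2, 2, 3, 3, 3, 4, 10]  # hypothetical data with a long right tail
print(pearson_skewness(hours))     # positive value: mean lies above the median
```

A positive result suggests a right (positive) skew, a negative result a left skew, and a value near zero an approximately symmetric data set.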
3
Some PROPERTIES In relation to histograms (i.e., locating the centers):
4
Example 1 Compute the coefficient of skewness for the data given below. Then, describe the skewness of the data based on the computed coefficient.
5
Measures of Location: statistical measures used to describe the (relative) standing or location of an observation relative to the rest of the data.
Goal: locate the observation relative to the rest of the observations.
Most commonly used measures: percentiles (including deciles and quartiles) and z-scores.
6
Percentiles: defined as the values on the measurement scale below which a specified percentage of the scores in the distribution fall. Denoted by $P_k$, they divide the ranked data set into 100 equal parts.
A percentile $P_k$ indicates that at least $k\%$ of the data is less than or equal to the value of $P_k$ (thus, at most $(100-k)\%$ of the data is greater than $P_k$).
7
Percentiles Calculating Percentiles:
The (approximate) value of the $k$th percentile, denoted by $P_k$, is
$P_k \approx$ value of the $\left(\dfrac{kn}{100}\right)$th term in a ranked set,
where $k$ denotes the number of the percentile and $n$ represents the sample size.
8
Percentiles Note: position of the term in the ranked set $= \dfrac{kn}{100}$
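The position rule can be sketched in Python as follows. The slides do not say here how a fractional position is handled, so this sketch assumes it is rounded up to the next whole position; other textbooks interpolate or average neighbouring terms. The function name `percentile_value` and the data are hypothetical.

```python
import math

def percentile_value(data, k):
    """Approximate P_k as the (kn/100)th term of the ranked data."""
    ranked = sorted(data)
    n = len(ranked)
    position = k * n / 100               # 1-based position kn/100
    index = max(1, math.ceil(position))  # assumption: round fractional positions up
    return ranked[index - 1]             # convert to 0-based list index

scores = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]  # hypothetical, unranked data
print(percentile_value(scores, 50))
```

Sorting inside the function keeps the caller from having to rank the data first.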
9
Percentiles Some Special Percentiles:
Deciles - divide the data set into ten equal parts
Quartiles - divide the data set into four equal parts
10
Percentile Rank: defined as the percentage of scores with values lower than the score in question.
Finding the Percentile Rank of a Value $x$:
$\text{percentile rank of } x = \dfrac{\text{number of values less than } x}{\text{total number of values in the data set}} \times 100$
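A short Python sketch of the percentile rank formula, counting only values strictly less than the score in question, as in the definition above. The function name `percentile_rank` and the data are illustrative assumptions.

```python
def percentile_rank(data, x):
    """Percentage of values in data that are strictly less than x."""
    below = sum(1 for v in data if v < x)
    return below * 100 / len(data)

scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]  # hypothetical data
print(percentile_rank(scores, 8))          # 7 of the 10 values lie below 8
```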
11
Some PROPERTIES
$\mathrm{med}_x = P_{50} = D_5 = Q_2$ [median]
$P_k$ in relation to probability (particularly cumulative %): $P(X \le P_k) \approx \dfrac{k}{100}$, i.e., the cumulative percentage of observations with values less than or equal to $P_k$ is approximately $k\%$.
A more (statistically) robust measure of variability can be defined using quartiles, i.e., the interquartile range (IQR), defined as $\mathrm{IQR} = Q_3 - Q_1$.
12
Example 2 Consider the following data set, which again relates to the student's number of hours studied each day over a 2-week period. Compute, and interpret whenever appropriate, the following: a.) and b.) two percentiles $P_k$; c.) a decile $D_k$; d.), e.), and f.) quartiles $Q_k$; g.) $\mathrm{IQR}$; h.) the percentile rank of a given value.
13
BoxPlot (Box-and-Whiskers Plot)
a graphical representation of a summary of five important values: the minimum value, the first quartile, the median (or the second quartile), the third quartile, and the maximum value [i.e., the five-number summary]
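For illustration, a five-number summary can be obtained with Python's standard library. This sketch assumes the "inclusive" quartile method of `statistics.quantiles`, which is only one of several conventions and may give slightly different quartiles than the hand method used in the course. The function name and data are hypothetical.

```python
import statistics

def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum)."""
    # Assumption: "inclusive" interpolation; other quartile conventions exist.
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return min(data), q1, q2, q3, max(data)

sample = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # hypothetical data
print(five_number_summary(sample))
```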
14
BoxPlot (Box-and-Whiskers Plot)
Steps in Constructing a Boxplot:
Rank the data in increasing order and calculate the values of the median ($Q_2$), first quartile ($Q_1$), and third quartile ($Q_3$). Also find the interquartile range (IQR).
Find the lower and upper inner fences:
$\text{Lower Inner Fence} = Q_1 - 1.5 \times \mathrm{IQR}$
$\text{Upper Inner Fence} = Q_3 + 1.5 \times \mathrm{IQR}$
Determine the smallest and the largest values in the given data set within the two inner fences.
15
BoxPlot (Box-and-Whiskers Plot)
Draw a horizontal line and mark the levels on it such that all the values in the given data set are covered. Above or below the horizontal line, draw a box with its left side at the position of the first quartile and the right side at the position of the third quartile. Inside the box, draw a vertical line at the position of the median. By drawing two lines, join the points of the smallest and the largest values within the two inner fences to the box. These two lines are called whiskers.
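The numerical part of these steps (inner fences at 1.5 × IQR beyond the quartiles, and the whisker endpoints) can be sketched in Python. The quartiles are passed in as arguments so the sketch does not commit to a quartile convention. The function names and data are hypothetical, and values lying exactly on a fence are assumed to count as inside.

```python
def inner_fences(q1, q3):
    """Lower and upper inner fences: Q1 - 1.5*IQR and Q3 + 1.5*IQR."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def whisker_ends(data, q1, q3):
    """Smallest and largest data values lying within the inner fences."""
    lower, upper = inner_fences(q1, q3)
    inside = [v for v in data if lower <= v <= upper]  # assumption: fence values count as inside
    return min(inside), max(inside)

data = [2, 3, 4, 5, 6, 7, 8, 30]  # hypothetical data
q1, q3 = 3.5, 7.5                 # hypothetical quartiles for this data
print(inner_fences(q1, q3))
print(whisker_ends(data, q1, q3))
```

The whiskers then run from the box edges to the two values returned by `whisker_ends`.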
16
BoxPlot (Box-and-Whiskers Plot)
The observations that fall outside the two inner fences are called outliers. They are either mild or extreme outliers. To determine which, find the lower and upper outer fences:
$\text{Lower Outer Fence} = Q_1 - 3.0 \times \mathrm{IQR}$
$\text{Upper Outer Fence} = Q_3 + 3.0 \times \mathrm{IQR}$
Values outside the inner fences but inside the outer fences (yellow card zone) are referred to as mild outliers. Values outside both fences (red card zone) are referred to as extreme outliers.
17
BoxPlot (Box-and-Whiskers Plot)
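The mild/extreme distinction can be sketched in Python, assuming inner fences at 1.5 × IQR and outer fences at 3.0 × IQR beyond the quartiles. The function name `classify`, the labels, and the example quartiles are hypothetical. Values lying exactly on a fence are treated as inside it, which is an assumption.

```python
def classify(value, q1, q3):
    """Label a value relative to the inner (1.5*IQR) and outer (3.0*IQR) fences."""
    iqr = q3 - q1
    if q1 - 1.5 * iqr <= value <= q3 + 1.5 * iqr:
        return "not an outlier"
    if q1 - 3.0 * iqr <= value <= q3 + 3.0 * iqr:
        return "mild outlier"      # between inner and outer fences
    return "extreme outlier"       # beyond the outer fences

q1, q3 = 3.5, 7.5  # hypothetical quartiles
for v in (8, 15, 30):
    print(v, classify(v, q1, q3))
```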
18
Some PROPERTIES In relation to skewness (i.e., characterizing asymmetry):
19
Example 3 Construct a boxplot for the data given below.