Numerical Measures: Skewness and Location

1 Numerical Measures: Skewness and Location
PSYSTA1 – Week 6

2 Measure of Skewness
A statistical measure used to describe the distribution of the data relative to symmetry.
Goal: quantify the degree of asymmetry (e.g., location of tails, difference between "centers", etc.) in a data set.
Sample skewness: $SK_x = \dfrac{3(\bar{x} - \mathrm{med}_x)}{s}$
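As a rough illustration of the sample skewness formula, the sketch below computes $3(\bar{x} - \mathrm{med}_x)/s$ in Python. It assumes $s$ is the sample standard deviation (divisor $n - 1$), and the function name `pearson_skewness` and the data are hypothetical; the values are not the data set from Example 1.

```python
import statistics

def pearson_skewness(data):
    """Median-based skewness coefficient: 3 * (mean - median) / s."""
    mean = statistics.mean(data)
    median = statistics.median(data)
    s = statistics.stdev(data)  # assumption: sample standard deviation (n - 1 divisor)
    if s == 0:
        raise ValueError("skewness is undefined when all values are equal")
    return 3 * (mean - median) / s

# Hypothetical data, chosen so that one large value pulls the mean to the right.
hours = [1, 2, 2, 3, 3, 3, 4, 10]
sk = pearson_skewness(hours)
print(f"SK = {sk:.3f}")
```

A positive coefficient means the mean lies above the median (longer right tail), a negative one means it lies below, and a value near zero suggests rough symmetry.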

3 Some PROPERTIES In relation with histograms (i.e., locating the centers):

4 Example 1
Compute the coefficient of skewness for the data given below. Then, describe the skewness of the data based on the computed coefficient.

5 Measures of Location
Statistical measures used to describe the (relative) standing or location of an observation relative to the rest of the data.
Goal: locate the observation relative to the rest of the observations.
Most commonly used measures:
Percentiles (including deciles and quartiles)
z-Scores

6 Percentiles
Defined as the value on the measurement scale below which a specified percentage of the scores in the distribution fall.
Denoted by $P_k$, they divide the ranked data set into 100 equal parts.
A percentile $P_k$ indicates that at least $k\%$ of the data is less than or equal to the value of $P_k$ (thus, at most $(100 - k)\%$ of the data is greater than $P_k$).

7 Percentiles
Calculating percentiles:
The (approximate) value of the $k$th percentile, denoted by $P_k$, is
$P_k \approx \text{value of the } \left(\dfrac{kn}{100}\right)\text{th term in a ranked set}$
where $k$ denotes the number of the percentile and $n$ represents the sample size.

8 Percentiles
Note: $p = \dfrac{kn}{100}$
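The position rule $p = kn/100$ can be sketched in Python as follows. The transcript does not say what to do when $p$ is or is not a whole number, so the sketch assumes one common textbook convention: if $p$ is a whole number, average the $p$th and $(p+1)$th ranked values; otherwise round $p$ up. The function name `percentile` and the 14 values are hypothetical, not the data from the examples.

```python
def percentile(data, k):
    """Approximate k-th percentile from the position p = k * n / 100.

    Assumed convention (not stated in the slides): a whole-number p averages
    the p-th and (p+1)-th ranked values; any other p is rounded up.
    """
    if not data:
        raise ValueError("data must not be empty")
    if not 0 < k < 100:
        raise ValueError("k must be strictly between 0 and 100")
    ranked = sorted(data)
    # whole is the integer part of p; remainder is zero exactly when p is whole.
    whole, remainder = divmod(k * len(ranked), 100)
    whole = int(whole)
    if remainder == 0:
        # 1-based positions p and p + 1 are 0-based indices whole - 1 and whole.
        return (ranked[whole - 1] + ranked[whole]) / 2
    # Rounding p up gives 1-based position whole + 1, i.e. 0-based index whole.
    return ranked[whole]

# Hypothetical hours studied per day over 14 days.
hours = [2.5, 3.0, 1.5, 4.0, 3.8, 2.0, 5.0, 3.5, 4.5, 1.0, 2.8, 3.2, 4.2, 6.0]
for k in (25, 33, 50, 85):
    print(f"P_{k} = {percentile(hours, k)}")
```

Other conventions (for example, linear interpolation between neighbouring ranks) give slightly different values for small samples, which is why the slides call the result approximate.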

9 Percentiles
Some special percentiles:
Deciles - divide the data set into ten equal parts
Quartiles - divide the data set into four equal parts

10 Percentile Rank ($k$)
Defined as the percentage of scores with values lower than the score in question.
Finding the percentile rank of a value $x$:
$k = \dfrac{\text{number of values less than } x}{\text{total number of values in the data set}} \times 100$
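The percentile rank formula translates almost directly into code. The sketch below counts values strictly less than $x$, as in the definition; the function name `percentile_rank` and the data are hypothetical.

```python
def percentile_rank(data, x):
    """Percentage of values in data that are strictly less than x."""
    if not data:
        raise ValueError("data must not be empty")
    below = sum(1 for value in data if value < x)
    return below / len(data) * 100

# Hypothetical hours studied per day over 14 days.
hours = [2.5, 3.0, 1.5, 4.0, 3.8, 2.0, 5.0, 3.5, 4.5, 1.0, 2.8, 3.2, 4.2, 6.0]
rank = percentile_rank(hours, 3.8)
print(f"percentile rank of 3.8: {rank:.1f}")
```

Here 8 of the 14 values fall below 3.8, so the rank is $8/14 \times 100$, or about 57.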

11 Some PROPERTIES
$\mathrm{med}_x = P_{50} = D_5 = Q_2$ [median]
$P_k$ in relation with probability (particularly cumulative %): $P(X \le P_k) \approx \dfrac{k}{100}$, i.e., the cumulative percentage of observations with values less than or equal to $P_k$ is approximately $k\%$.
A more (statistically) robust measure of variability can be defined using quartiles, i.e., the interquartile range (IQR), defined as $\mathrm{IQR} = Q_3 - Q_1$.
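These properties can be checked numerically with Python's standard `statistics.quantiles` function. Note the assumption: its default method interpolates between ranked values, so its quartiles can differ slightly from those obtained with the $kn/100$ position rule. The data are hypothetical.

```python
import statistics

# Hypothetical hours studied per day over 14 days.
hours = [2.5, 3.0, 1.5, 4.0, 3.8, 2.0, 5.0, 3.5, 4.5, 1.0, 2.8, 3.2, 4.2, 6.0]

# quantiles(data, n=m) returns the m - 1 cut points that split the data into m parts.
q1, q2, q3 = statistics.quantiles(hours, n=4)      # quartiles Q1, Q2, Q3
deciles = statistics.quantiles(hours, n=10)        # D1 .. D9
percentiles = statistics.quantiles(hours, n=100)   # P1 .. P99

# med = P50 = D5 = Q2: all four describe the middle of the ranked data.
print(statistics.median(hours), percentiles[49], deciles[4], q2)

# IQR = Q3 - Q1: the spread of the middle half of the data.
iqr = q3 - q1
print(f"Q1 = {q1}, Q3 = {q3}, IQR = {iqr}")
```

Because the IQR depends only on the middle half of the ranked data, changing the largest value from 6.0 to 60.0 would leave it unchanged, whereas the standard deviation would grow considerably.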

12 Example 2
Consider the following data set, which again relates to the number of hours a student studied each day over a 2-week period.
Compute, and interpret whenever appropriate, the following:
a.) $P_{33}$
b.) $P_{85}$
c.) $D_{?}$ [decile index illegible in the source]
d.) $Q_1$
e.) $Q_2$
f.) $Q_3$
g.) $\mathrm{IQR}$
h.) percentile rank of 3.8

13 BoxPlot (Box-and-Whiskers Plot)
a graphical representation of a summary of five important values: the minimum value, the first quartile, the median (or the second quartile), the third quartile, and the maximum value [i.e., the five-number summary]

14 BoxPlot (Box-and-Whiskers Plot)
Steps in constructing a boxplot:
Rank the data in increasing order and calculate the values of the median ($Q_2$), first quartile ($Q_1$), and third quartile ($Q_3$). Also find the interquartile range (IQR).
Find the lower and upper inner fences.
$\text{Lower Inner Fence} = Q_1 - 1.5 \times \mathrm{IQR}$
$\text{Upper Inner Fence} = Q_3 + 1.5 \times \mathrm{IQR}$
Determine the smallest and the largest values in the given data set within the two inner fences.

15 BoxPlot (Box-and-Whiskers Plot)
Draw a horizontal line and mark the levels on it such that all the values in the given data set are covered.
Above or below the horizontal line, draw a box with its left side at the position of the first quartile and its right side at the position of the third quartile.
Inside the box, draw a vertical line at the position of the median.
Draw two lines joining the sides of the box to the smallest and the largest values within the two inner fences. These two lines are called whiskers.
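The numerical part of these steps (inner fences and whisker ends) can be sketched as below. The quartiles are passed in as arguments so that any quartile convention can be used; the function names, the data, and the quartile values are hypothetical. Treating a value that lies exactly on a fence as inside it is also an assumption.

```python
def inner_fences(q1, q3):
    """Return (lower, upper) inner fences: Q1 - 1.5 * IQR and Q3 + 1.5 * IQR."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def whisker_ends(data, q1, q3):
    """Smallest and largest data values lying within the inner fences."""
    lower, upper = inner_fences(q1, q3)
    # Assumption: a value exactly on a fence counts as inside.
    inside = [x for x in data if lower <= x <= upper]
    return min(inside), max(inside)

# Hypothetical ranked data with one low and one high extreme value.
data = [2, 10, 11, 12, 13, 14, 15, 16, 30]
q1, q3 = 11, 15  # quartiles taken as given for this illustration

print(inner_fences(q1, q3))        # (5.0, 21.0)
print(whisker_ends(data, q1, q3))  # (10, 16)
```

The whiskers stop at 10 and 16 rather than at the minimum 2 and maximum 30, because those two values lie outside the inner fences.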

16 BoxPlot (Box-and-Whiskers Plot)
The observations that fall outside the two inner fences are called outliers. They are either mild or extreme outliers. To tell which, find the lower and upper outer fences.
$\text{Lower Outer Fence} = Q_1 - 3.0 \times \mathrm{IQR}$
$\text{Upper Outer Fence} = Q_3 + 3.0 \times \mathrm{IQR}$
Values outside the inner fences but inside the outer fences (yellow card zone) are referred to as mild outliers.
Values outside the outer fences as well (red card zone) are referred to as extreme outliers.
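The mild/extreme distinction can be sketched as a small classifier. The function name `classify`, the quartile values, and the test values are hypothetical, and a value lying exactly on a fence is assumed to count as inside it.

```python
def classify(x, q1, q3):
    """Label x using the inner (1.5 * IQR) and outer (3.0 * IQR) fences."""
    iqr = q3 - q1
    inner_low, inner_high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outer_low, outer_high = q1 - 3.0 * iqr, q3 + 3.0 * iqr
    # Check the outer fences first: beyond them is the "red card zone".
    if x < outer_low or x > outer_high:
        return "extreme outlier"
    # Beyond the inner fences but within the outer ones: "yellow card zone".
    if x < inner_low or x > inner_high:
        return "mild outlier"
    return "not an outlier"

# Hypothetical quartiles: IQR = 4, inner fences (5, 21), outer fences (-1, 27).
q1, q3 = 11, 15
for value in (2, 12, 24, 30):
    print(value, classify(value, q1, q3))
```

With these quartiles, 2 and 24 fall between the inner and outer fences (mild), 30 lies beyond the upper outer fence (extreme), and 12 is inside the box.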

17 BoxPlot (Box-and-Whiskers Plot)

18 Some PROPERTIES In relation with skewness (i.e., characterizing asymmetry):

19 Example 3 Construct a boxplot for the data given below

