Published by Janel Powers. Modified over 9 years ago.
1
Asian School of Business PG Programme in Management (2005-06) Course: Quantitative Methods in Management I Instructor: Chandan Mukherjee Session 2: Summarising a Distribution
2
Modern/EDA Terminology | Classical Terminology
Cluster, Level, Centre | Central Tendency, Location
Scatter, Spread | Dispersion
Shape | Skewness
Tails | Kurtosis
3
Numerical Summaries (Descriptive Statistics)
Feature | Mean based summary | Order based summary
Level | Arithmetic Mean | Median
Spread | Standard Deviation | Midspread
The order based summaries are resistant to extreme values, i.e. they are not unduly influenced by a small part of the data. That is why they are called Resistant Summaries.
4
Numerical Summaries (Descriptive Statistics)
Both the mean based and the order based summaries of the spread of a distribution (Standard Deviation and Midspread) are scale dependent: they change with the unit of measurement. To neutralise the scale effect, we divide each by its respective summary of the Centre.
Coefficient of Variation = Standard Deviation / Mean
Relative Midspread = Midspread / Median
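The scale effect can be checked numerically. The sketch below uses made-up lengths (an assumption for illustration, not data from the course), expresses them in metres and in centimetres, and compares the Standard Deviation with the Coefficient of Variation. It assumes the population form of the Standard Deviation (dividing by n).

```python
import statistics

# Hypothetical lengths in metres (illustrative values, not course data).
metres = [2.0, 4.0, 4.0, 5.0, 10.0]
# The same lengths expressed in centimetres.
centimetres = [100 * x for x in metres]


def coefficient_of_variation(values):
    """Population standard deviation divided by the arithmetic mean."""
    return statistics.pstdev(values) / statistics.mean(values)


sd_m = statistics.pstdev(metres)
sd_cm = statistics.pstdev(centimetres)
cv_m = coefficient_of_variation(metres)
cv_cm = coefficient_of_variation(centimetres)

# The SD is 100 times larger in centimetres; the CV is the same in both units.
print("SD:", sd_m, sd_cm)
print("CV:", cv_m, cv_cm)
```

The Relative Midspread behaves the same way, because a change of unit multiplies the Midspread and the Median by the same factor.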
5
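Definitions
Variance = Average Squared Distance from the Mean = $\frac{1}{n}\sum_{i=1}^{n}(X_i - \bar{X})^2$, where $\bar{X}$ is the Mean of the $n$ data values.
Serial No. | Data (X) | X – Mean | (X – Mean)^2
1 | 6 | -4 | 16
2 | 7 | -3 | 9
3 | 8 | -2 | 4
4 | 9 | -1 | 1
5 | 20 | 10 | 100
Total | 50 | 0 | 130
Average | 10 | | 26 = Variance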
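The worked example above can be reproduced in a few lines. This is a minimal sketch that follows the "average squared distance" definition literally, so it divides by n; the variable names are illustrative.

```python
import math

# The five data values from the worked example.
data = [6, 7, 8, 9, 20]

n = len(data)
mean = sum(data) / n
# Distance of each value from the mean, and its square.
deviations = [x - mean for x in data]
squared = [d ** 2 for d in deviations]

# Variance as the average squared distance from the mean (divides by n).
variance = sum(squared) / n
# Standard deviation is the square root of the variance.
sd = math.sqrt(variance)

print("Total of X:", sum(data))
print("Total of (X - Mean):", sum(deviations))
print("Total of (X - Mean)^2:", sum(squared))
print("Variance:", variance)
print("SD:", sd)
```

Note that some library routines divide by n − 1 instead: in Python's standard library, `statistics.pvariance` divides by n, while `statistics.variance` divides by n − 1.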
6
DEFINITION (contd.)
Standard Deviation (SD) = Square root of Variance
Coefficient of Variation = SD / Mean = 0.0517 (or 5.17%)
7
DEFINITION (contd.)
Median = The value that divides the ordered data values into two equal halves
To compute the Median:
1. Sort the data in ascending order.
2. Divide the number of observations (data values) by 2.
3. If the result is an integer, say 9, then the median is the average of the 9th and 10th observations.
4. If the result is not an integer, say 9.5, then round it up to the next integer, i.e. 10 in this case. The 10th observation is the median.
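The steps above can be sketched as a short function. The name `median_by_rule` is illustrative, and the check for empty input is an added assumption.

```python
import math


def median_by_rule(values):
    """Median computed by the sort / divide-by-2 / round-up rule."""
    if not values:
        raise ValueError("median of an empty data set is undefined")
    ordered = sorted(values)      # sort in ascending order
    half = len(ordered) / 2       # divide the number of observations by 2
    if half.is_integer():
        k = int(half)
        # Average of the k-th and (k+1)-th observations (1-based positions).
        return (ordered[k - 1] + ordered[k]) / 2
    # Otherwise round up; that observation is the median.
    k = math.ceil(half)
    return ordered[k - 1]


print(median_by_rule([6, 7, 8, 9, 20]))  # odd number of observations
print(median_by_rule([4, 1, 3, 2]))      # even number of observations
```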
8
Example: Finding the Median
Average of the 26th and 27th observations = 61.80
9
DEFINITION (contd.)
Quartiles = The three values that divide the ordered observations into four equal parts
25% of the observations lie below the First (Lower) Quartile
50% of the observations lie below the Second (Middle) Quartile
75% of the observations lie below the Third (Upper) Quartile
The Second or the Middle Quartile is the Median.
10
DEFINITION (contd.)
To compute the Lower and the Upper Quartile:
1. Sort the data in ascending order.
2. Divide the total number of observations by 4.
3. If the result is an integer, say 12, then take the average of the 12th and the 13th observations, counting from the lowest observation, as the Lower Quartile. Similarly, take the average of the 12th and the 13th observations, counting from the highest observation, as the Upper Quartile.
4. If the result is not an integer, say 12.75, then round it up to the next integer, i.e. 13 in this case. The Lower Quartile is the 13th observation from the lowest, and the Upper Quartile is the 13th observation from the highest.
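A minimal sketch of this quartile rule follows. The function name `quartiles_by_rule` and the sample data are illustrative assumptions. Library routines often use other interpolation conventions, so their results may differ slightly.

```python
import math


def quartiles_by_rule(values):
    """Return (lower quartile, upper quartile) by the divide-by-4 rule."""
    if not values:
        raise ValueError("quartiles of an empty data set are undefined")
    ordered = sorted(values)      # ascending: count from the lowest
    descending = ordered[::-1]    # descending: count from the highest
    quarter = len(ordered) / 4
    if quarter.is_integer():
        k = int(quarter)
        # Average of the k-th and (k+1)-th observations from each end.
        lower = (ordered[k - 1] + ordered[k]) / 2
        upper = (descending[k - 1] + descending[k]) / 2
    else:
        # Round up and take that single observation from each end.
        k = math.ceil(quarter)
        lower = ordered[k - 1]
        upper = descending[k - 1]
    return lower, upper


# Hypothetical data: 8 observations (8 / 4 is an integer).
print(quartiles_by_rule([1, 2, 3, 4, 5, 6, 7, 8]))
# Hypothetical data: 10 observations (10 / 4 = 2.5, rounded up to 3).
print(quartiles_by_rule([1, 2, 3, 4, 5, 6, 7, 8, 9, 10]))
```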
11
DEFINITION (contd.)
Example: Finding the Lower & Upper Quartiles
Lower Quartile = (39.67 + 40.40)/2 = 40.04
Upper Quartile = (126.54 + 128.40)/2 = 127.47
12
DEFINITION (contd.)
Midspread = Upper Quartile – Lower Quartile
The range that holds the middle 50% of the observations
Relative Midspread = Midspread / Median = (127.47 – 40.04)/61.80 = 1.41
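The arithmetic can be checked directly from the quartiles and the median quoted in the example. The variable names are illustrative.

```python
# Values quoted in the example above.
lower_quartile = 40.04
upper_quartile = 127.47
median = 61.80

# Midspread: the width of the range holding the middle 50% of observations.
midspread = upper_quartile - lower_quartile
# Relative Midspread: Midspread scaled by the Median, so it is unit free.
relative_midspread = midspread / median

print(round(midspread, 2), round(relative_midspread, 2))
```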
13
Five Number Summary
Five numbers can comprehensively summarise the features of a distribution without being unduly affected by a small part of the data:
Minimum (MN)
Lower Quartile (LQ)
Median (MD)
Upper Quartile (UQ)
Maximum (MX)
14
Five Number Summary is Comprehensive: The Grand Summary of a Distribution
[Figure: distribution with the Lower Tail and Upper Tail marked]
15
Identifying the Extreme Values: Who Are The Outliers?
Here is a rule of thumb (based on theory):
Step = 1.5 times Midspread
Lower Fence = Lower Quartile – Step
Upper Fence = Upper Quartile + Step
All observations below the Lower Fence are Negative Outliers
All observations above the Upper Fence are Positive Outliers
16
Cotton Blended Yarn Companies: Five Number Summary & Fences
MN | 0.07
LQ | 40.04
MD | 61.80
UQ | 127.47
MX | 445.63
Midspread | 87.43
Step | 131.14
Lower Fence | -91.10
Upper Fence | 258.61
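The fence rule can be applied to these figures with a short sketch. The function names `fences` and `classify` are illustrative assumptions. The computed fences agree with the values above up to rounding in the last digit.

```python
def fences(lower_quartile, upper_quartile, multiplier=1.5):
    """Return (lower fence, upper fence) using Step = multiplier * Midspread."""
    midspread = upper_quartile - lower_quartile
    step = multiplier * midspread
    return lower_quartile - step, upper_quartile + step


def classify(value, lower_fence, upper_fence):
    """Label a single observation relative to the fences."""
    if value < lower_fence:
        return "negative outlier"
    if value > upper_fence:
        return "positive outlier"
    return "inside the fences"


# Quartiles from the five number summary above.
lower_fence, upper_fence = fences(40.04, 127.47)
print(round(lower_fence, 3), round(upper_fence, 3))

# Check the two extremes of the summary: MN = 0.07 and MX = 445.63.
print("MN:", classify(0.07, lower_fence, upper_fence))
print("MX:", classify(445.63, lower_fence, upper_fence))
```

Because the Lower Fence is negative while the minimum is 0.07, the rule flags no negative outliers here; the maximum lies above the Upper Fence and is flagged as a positive outlier.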
17
Box & Whisker Plot
[Figure: box and whisker plot labelled MN, LQ, MD, UQ, MX, Lower Fence and Upper Fence, with the outliers marked]
18
Comparing Two Distributions of Gross Fixed Asset: Fabrics and Yarn Companies
19
Summary Of The Points