Numerical Descriptive Measures

Presentation on theme: "Numerical Descriptive Measures"— Presentation transcript:

1 Numerical Descriptive Measures
Chapter 2 Borrowed from

2 Chapter Topics
Measures of central tendency: mean, median, mode, midrange. Quartiles. Measures of variation: range, interquartile range, variance and standard deviation, coefficient of variation. Shape: symmetric, skewed (+/-). Coefficient of correlation.

3 Summary Measures
Central tendency: arithmetic mean, median, mode, geometric mean. Quartiles. Variation: range, variance, standard deviation, coefficient of variation.

4 Measures of Central Tendency
Average (mean), median, mode.

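5 Mean (Arithmetic Mean)
Mean (arithmetic mean) of data values. Sample mean: $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$, where $n$ is the sample size. Population mean: $\mu = \frac{1}{N}\sum_{i=1}^{N} X_i$, where $N$ is the population size.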

6 Mean (Arithmetic Mean)
(continued) The most common measure of central tendency. Affected by extreme values (outliers). Example (two data sets): Mean = 5, Mean = 6.

7 Median
Robust measure of central tendency. Not affected by extreme values. In an ordered array, the median is the "middle" number: if n or N is odd, the median is the middle number; if n or N is even, the median is the average of the two middle numbers. Example (two data sets): Median = 5, Median = 5.
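The contrast between the mean and the median can be sketched in a few lines of Python. The two data sets below are illustrative assumptions (the slide's own data did not survive in the transcript); they were chosen so that the means are 5 and 6 while the median stays at 5.

```python
from statistics import mean, median

# Hypothetical data, not taken from the slides: the second list swaps the
# largest value for an extreme one.
without_outlier = [1, 3, 5, 7, 9]
with_outlier = [1, 3, 5, 7, 14]

for label, data in [("no outlier", without_outlier), ("outlier", with_outlier)]:
    # The mean moves with the extreme value; the median does not.
    print(label, "mean =", mean(data), "median =", median(data))
```

Only the mean changes when the largest value is pulled out to 14, which is what "robust" means for the median.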

8 Mode
A measure of central tendency. Value that occurs most often. Not affected by extreme values. Used for either numerical or categorical data. There may be no mode. There may be several modes. Examples: No Mode; Mode = 9.
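A minimal sketch of finding modes, including the "no mode" and "several modes" cases. The helper name `modes` and the convention that "no mode" means every distinct value occurs equally often are assumptions made for illustration; the sample lists are hypothetical.

```python
from statistics import multimode

def modes(values):
    """Return the list of modes, or [] when all distinct values tie."""
    found = multimode(values)  # most frequent values, in first-seen order
    distinct = set(values)
    # If every distinct value is "most frequent", no value stands out.
    if len(distinct) > 1 and len(found) == len(distinct):
        return []
    return found

print(modes([0, 1, 2, 3, 4, 5, 6]))      # no value repeats
print(modes([9, 9, 9, 10, 11, 12]))      # a single mode
print(modes([1, 1, 2, 2, 3]))            # several modes
print(modes(["red", "blue", "red"]))     # categorical data works too
```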

9 Quartiles
Split ordered data into 4 quarters (25% each). Position of the i-th quartile: $\text{position of } Q_i = \frac{i(n+1)}{4}$. $Q_1$ and $Q_3$ are measures of noncentral location; $Q_2$ = median, a measure of central tendency. Data in Ordered Array:
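The quartile positions can be sketched as below, assuming the common position rule i(n+1)/4 on a 1-based ordered array. Linear interpolation between neighbouring values for fractional positions is an assumption here (some textbooks round the position instead), and the data are hypothetical.

```python
def quartile(sorted_values, i):
    """i-th quartile (i = 1, 2, 3) of an ordered list with at least 3 values."""
    n = len(sorted_values)
    position = i * (n + 1) / 4          # 1-based position in the ordered array
    whole = int(position)
    fraction = position - whole
    if fraction == 0:
        return sorted_values[whole - 1]
    # Fractional position: interpolate between the two neighbouring values.
    below, above = sorted_values[whole - 1], sorted_values[whole]
    return below + fraction * (above - below)

data = [11, 12, 13, 16, 16, 17, 17, 18, 21]   # hypothetical ordered array, n = 9
print(quartile(data, 1), quartile(data, 2), quartile(data, 3))
```

With n = 9 the positions are 2.5, 5 and 7.5, so the first and third quartiles fall halfway between two observations and the second quartile is the middle observation.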

10 Measures of Variation
Range. Interquartile range. Variance: population variance ($\sigma^2$), sample variance ($S^2$). Standard deviation: population standard deviation ($\sigma$), sample standard deviation ($S$). Coefficient of variation.

11 Range
Measure of variation. Difference between the largest and the smallest observations: $\text{Range} = X_{\text{largest}} - X_{\text{smallest}}$. Ignores the way in which data are distributed. Example (two data sets): Range = 5, Range = 5.

12 Interquartile Range
Measure of variation. Also known as midspread: the spread in the middle 50%. Difference between the third and first quartiles: $\text{IQR} = Q_3 - Q_1$. Not affected by extreme values. Data in Ordered Array:
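A short sketch comparing the range and the interquartile range on hypothetical data. It relies on `statistics.quantiles` with its default "exclusive" method, which places quartiles at positions i(n+1)/4; the helper names are illustrative.

```python
from statistics import quantiles

def value_range(values):
    """Largest observation minus smallest observation."""
    return max(values) - min(values)

def iqr(values):
    """Third quartile minus first quartile."""
    q1, _, q3 = quantiles(values, n=4)   # default method="exclusive"
    return q3 - q1

data = [11, 12, 13, 16, 16, 17, 17, 18, 21]           # hypothetical
with_outlier = [11, 12, 13, 16, 16, 17, 17, 18, 100]  # largest value made extreme

print(value_range(data), iqr(data))
print(value_range(with_outlier), iqr(with_outlier))
```

The extreme value changes the range from 10 to 89 but leaves the interquartile range at 5, since only the middle 50% of the data enters the IQR.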

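13 Variance
Important measure of variation. Shows variation about the mean. Sample variance: $S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$. Population variance: $\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$.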

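14 Standard Deviation
Most important measure of variation. Shows variation about the mean. Has the same units as the original data. Sample standard deviation: $S = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$. Population standard deviation: $\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}}$.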
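The sample and population versions differ only in the divisor (n - 1 versus N). A minimal from-scratch sketch, with hypothetical data and illustrative function names:

```python
import math

def sample_variance(xs):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(xs)
    x_bar = sum(xs) / n
    return sum((x - x_bar) ** 2 for x in xs) / (n - 1)

def population_variance(xs):
    """Sum of squared deviations from the population mean, divided by N."""
    big_n = len(xs)
    mu = sum(xs) / big_n
    return sum((x - mu) ** 2 for x in xs) / big_n

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical; mean 5, squared deviations sum to 32
print(sample_variance(data), math.sqrt(sample_variance(data)))
print(population_variance(data), math.sqrt(population_variance(data)))
```

Taking the square root returns the measure to the units of the original data, which is why the standard deviation is usually the one reported.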

15 Comparing Standard Deviations
Data A: Mean = 15.5, S = 3.338
Data B: Mean = 15.5, S = 0.9258
Data C: Mean = 15.5, S = 4.57
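The raw data behind these three summaries did not survive in the transcript. The lists below are assumptions: illustrative values chosen so that each set has mean 15.5 and a sample standard deviation close to the figure quoted for it. They show how sets with the same centre can differ widely in spread.

```python
from statistics import mean, stdev

# Hypothetical data sets (assumed, not recovered from the slide).
data_a = [11, 12, 13, 16, 16, 17, 18, 21]   # moderately spread
data_b = [14, 15, 15, 15, 16, 16, 16, 17]   # tightly clustered
data_c = [11, 11, 11, 12, 19, 20, 20, 20]   # piled up at both ends

for name, data in [("A", data_a), ("B", data_b), ("C", data_c)]:
    # stdev() is the sample standard deviation (divisor n - 1).
    print(name, "mean =", mean(data), "S =", round(stdev(data), 4))
```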

16 Coefficient of Variation
Measures relative variation. Always in percentage (%). Shows variation relative to the mean: $CV = \left(\frac{S}{\bar{X}}\right) \cdot 100\%$. Used to compare two or more sets of data measured in different units.

17 Comparing Coefficient of Variation
Stock A: average price last year = $50, standard deviation = $5. Stock B: average price last year = $100. Coefficient of variation: for Stock A, CV = (5 / 50) · 100% = 10%.
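As a small worked sketch, the coefficient of variation for Stock A follows directly from its mean and standard deviation. The function name is illustrative; Stock B is left out because its standard deviation is not given in the transcript.

```python
def coefficient_of_variation(mean_value, std_dev):
    """Standard deviation as a percentage of the mean."""
    if mean_value == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return 100 * std_dev / mean_value

stock_a_cv = coefficient_of_variation(mean_value=50, std_dev=5)
print(f"Stock A: CV = {stock_a_cv}%")
```

Because the result is a unit-free percentage, it can be compared across series priced at different levels or measured in different units.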

18 Shape of a Distribution
Describes how data are distributed. Measures of shape: symmetric or skewed. Left-skewed (-): Mean < Median < Mode. Symmetric: Mean = Median = Mode. Right-skewed (+): Mode < Median < Mean.
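The ordering of mean and median gives a quick, rough indication of skew. The sketch below is only a heuristic under that assumption (equal mean and median do not by themselves prove symmetry), and the function name and data are hypothetical.

```python
from statistics import mean, median

def describe_shape(values):
    """Classify shape by comparing the mean with the median (rough heuristic)."""
    m, med = mean(values), median(values)
    if m > med:
        return "right-skewed (+)"   # long tail of large values pulls the mean up
    if m < med:
        return "left-skewed (-)"    # long tail of small values pulls the mean down
    return "symmetric"

print(describe_shape([1, 2, 2, 3, 10]))
print(describe_shape([1, 8, 9, 9, 10]))
print(describe_shape([1, 2, 3, 4, 5]))
```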

19 Coefficient of Correlation
Measures the strength of the linear relationship between two quantitative variables

20 Features of Correlation Coefficient
Unit free. Ranges between -1 and 1. The closer to -1, the stronger the negative linear relationship. The closer to 1, the stronger the positive linear relationship. The closer to 0, the weaker any linear relationship.
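These features can be checked with a from-scratch sketch of the sample (Pearson) correlation coefficient. The function name and the three small data sets are assumptions for illustration.

```python
import math

def pearson_r(xs, ys):
    """Sample correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    # Undefined (division by zero) if either variable is constant.
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]
print(pearson_r(x, [3, 5, 7, 9, 11]))   # perfect positive linear relationship
print(pearson_r(x, [5, 4, 3, 2, 1]))    # perfect negative linear relationship
print(pearson_r(x, [2, 1, 3, 1, 2]))    # no linear relationship
```

Rescaling either variable (for example, changing its units) leaves the result unchanged, which is the sense in which the coefficient is unit free.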

21 Scatter Plots of Data with Various Correlation Coefficients
[Figure: scatter plots of Y against X for r = -1, r = -0.6, r = 0, r = 0.6, and r = 1]

22 Chapter Summary
Described measures of central tendency: mean, median, mode, midrange. Discussed quartiles. Described measures of variation: range, interquartile range, variance and standard deviation, coefficient of variation. Illustrated shape of distribution: symmetric, skewed. Discussed correlation coefficient.

23 Stem-n-leaf plot
A class took a test. The students got the following scores. First, let us draw the stem-and-leaf plot. This makes it easy to see that the mode, the most common score, is 85, since there are three of those scores and all of the other scores have frequencies of either one or two. It will also make it easier to rank the scores.

24 Stem-n-leaf
This will make it easier to find the median and the quartiles. There are as many scores above the median as below. Since there are ten scores, and ten is an even number, we can divide the scores into two equal groups with no scores left over. To find the median, we divide the scores into the upper five and the lower five; the median is the average of the largest score in the lower group and the smallest score in the upper group.
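The procedure can be sketched as follows. The ten scores are hypothetical (the class's actual scores are not in the transcript); they were chosen to include three 85s so that the mode matches the description.

```python
from collections import defaultdict
from statistics import median

scores = [85, 62, 78, 97, 85, 74, 91, 83, 78, 85]   # hypothetical test scores

def stem_and_leaf(values):
    """Map each stem (tens digit) to its ordered leaves (units digits)."""
    stems = defaultdict(list)
    for value in sorted(values):
        stems[value // 10].append(value % 10)
    return dict(stems)

plot = stem_and_leaf(scores)
for stem in sorted(plot):
    print(stem, "|", " ".join(str(leaf) for leaf in plot[stem]))

# Ten scores: split the ranked list into a lower five and an upper five,
# then average the two values that meet in the middle.
ranked = sorted(scores)
lower, upper = ranked[:5], ranked[5:]
middle = (lower[-1] + upper[0]) / 2
print("median =", middle)
```

Reading the leaves off row by row gives the ranked scores directly, which is why the plot makes the median and quartiles easy to locate.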

25 Box Plot
The five-number summary is an abbreviated way to describe a sample. It is a list of the following numbers: minimum, first (lower) quartile, median, third (upper) quartile, maximum. The five-number summary leads to a graphical representation of a distribution called the boxplot. Boxplots are ideal for comparing two nearly-continuous variables. To draw a boxplot (see the example in the figure below), follow these simple steps: The ends of the box (hinges) are at the quartiles, so that the length of the box is the interquartile range (IQR).

26 Box plots
The median is marked by a line within the box. The two vertical lines (called whiskers) outside the box extend to the smallest and largest observations within 1.5 × IQR of the quartiles. Observations that fall outside of 3 × IQR are called extreme outliers and are marked, for example, with an open circle. Observations between 1.5 × IQR and 3 × IQR are called mild outliers and are distinguished by a different mark, e.g., a closed circle.
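A minimal sketch of the numbers needed to draw such a plot, assuming the common convention of fences at 1.5 × IQR (mild) and 3 × IQR (extreme) beyond the quartiles. The function name, dictionary keys and data are hypothetical; quartiles come from `statistics.quantiles` with its default method.

```python
from statistics import quantiles

def boxplot_summary(values):
    """Quartiles, whisker ends, and mild/extreme outliers for a boxplot."""
    data = sorted(values)
    q1, med, q3 = quantiles(data, n=4)
    iqr = q3 - q1
    inner_low, inner_high = q1 - 1.5 * iqr, q3 + 1.5 * iqr   # mild fences
    outer_low, outer_high = q1 - 3 * iqr, q3 + 3 * iqr       # extreme fences
    inside = [x for x in data if inner_low <= x <= inner_high]
    return {
        "q1": q1, "median": med, "q3": q3,
        # Whiskers stop at the most extreme observations inside the inner fences.
        "whiskers": (inside[0], inside[-1]),
        "mild": [x for x in data
                 if outer_low <= x < inner_low or inner_high < x <= outer_high],
        "extreme": [x for x in data if x < outer_low or x > outer_high],
    }

summary = boxplot_summary([5, 7, 8, 9, 10, 11, 12, 13, 14, 25, 40])
print(summary)
```

For this data the box runs from 8 to 14 (IQR = 6), so the fences sit at -1 and 23 (mild) and at -10 and 32 (extreme): 25 is a mild outlier and 40 an extreme one.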

27 Example of boxplot
The Density of Nitrogen: A Comparison of Two Samples. Lord Rayleigh was one of the earliest scientists to study the density of nitrogen. In his studies, he noticed something peculiar: the density of nitrogen produced from chemical compounds tended to be smaller than the density of nitrogen produced from the air. However, he was working with fairly small samples, and the question is, was he correct in his conjecture? Lord Rayleigh's measurements, which first appeared in Proceedings, Royal Society (London, 55, 1894 pp ), are reproduced below. The units are the mass of nitrogen filling a certain flask under specified pressure and temperature. Calculate the summary statistics for each set of data. Construct side-by-side box plots.

28 Chemical Atmospheric 2.2989 2.3101 2.2994

29 Box Plot
The atmospheric sample has 9 observations and the chemical sample has 10 values. Use the following five points to draw the box plots and compare the two variables: Max; Min; Q1; Median 2.3101; Q3.

