Central Tendency & Variability


1 Central Tendency & Variability
Heibatollah Baghi, Mastaneh Badii, and Farrokh Alemi, Ph.D. This lecture was organized by Dr. Alemi. It is based on the work done by Dr. Baghi and Mastaneh Badii. The purpose of these slides is to define measures of central tendency and dispersion and to help you select the appropriate measures for a particular dataset. Graphs may be useful, but the information they offer is often inexact. A frequency distribution provides many details, but often we want to condense a distribution further. Two concepts can help: measures of central tendency, and measures of variability or scatter.

2 Measures of Central Tendency: Mean
Data: 23, 23, 24, 25, 25, 25, 26, 26, 27, 28. The mean is the sum of the observations divided by the count of observations: 252 / 10 = 25.2. The mean describes the center, or balance point, of a frequency distribution.
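As a minimal sketch, the same calculation can be written in Python. The variable names are illustrative only; the data are the ten observations from the slide.

```python
from statistics import mean

# The ten observations shown on the slide.
observations = [23, 23, 24, 25, 25, 25, 26, 26, 27, 28]

total = sum(observations)    # add up the observations: 252
count = len(observations)    # count of observations: 10
manual_mean = total / count  # 252 / 10 = 25.2

print(manual_mean)
print(mean(observations))    # the standard library gives the same value
```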

3 Measures of Central Tendency: Mode
Data: 20, 21, 21, 22, 22, 22, 22, 23, 23, 24. The most frequent value or category in a distribution is called the mode. For this set of values the mode is 22, which occurs four times.
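One way to check the mode is to count how often each value occurs. The sketch below uses Python's standard library; the names are illustrative.

```python
from collections import Counter
from statistics import mode

# The ten values shown on the slide.
values = [20, 21, 21, 22, 22, 22, 22, 23, 23, 24]

# Count occurrences of each value and take the most frequent one.
counts = Counter(values)
most_common_value, frequency = counts.most_common(1)[0]

print(most_common_value, frequency)  # the value 22 occurs 4 times
print(mode(values))                  # the standard library agrees
```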

4 Measures of Central Tendency: Median
Data: 21, 22, 22, 23, 24, 26, 26, 27, 28, 29. The median is the middle value of a set of ordered numbers: 50% of the data are above it and 50% below it. The calculation depends on whether we have an even or odd number of data points. The data shown here have an even number of points (ten), so the median is the average of the two middle values, 24 and 26. The median is 25.

5 Median
Data: 22, 23, 23, 24, 25, 26, 27, 27, 28. Here are data with an odd number of data points (nine). The median is the middle (fifth) value of the ordered data. The median is 25.
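The even and odd cases can be sketched in a few lines of Python. The helper name `middle_value` is hypothetical and exists only for this illustration; `statistics.median` applies the same rule.

```python
from statistics import median

even_data = [21, 22, 22, 23, 24, 26, 26, 27, 28, 29]  # ten points
odd_data = [22, 23, 23, 24, 25, 26, 27, 27, 28]       # nine points

def middle_value(values):
    """Return the median: the middle value, or the mean of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # odd n: the single middle value
    return (ordered[mid - 1] + ordered[mid]) / 2  # even n: average the two middle values

print(middle_value(even_data), median(even_data))  # both are 25.0
print(middle_value(odd_data), median(odd_data))    # both are 25
```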

6 Comparison of Measures of Central Tendency
Mode: most frequently occurring value. Used with nominal, ordinal, and (sometimes) interval/ratio-level data.
Median: exact center (when N is odd) of rank-ordered data, or the average of the two middle values (when N is even). Used with ordinal-level data and interval/ratio-level data (particularly when skewed).
Mean: arithmetic average (sum of Xs / N). Used with interval/ratio-level data.
One of the common ways of describing data is to describe the central tendency of the data. We want to answer the question, “Around what value does the data tend to cluster?” Which statistic we use to answer that question depends on the level of the data. With nominal-level data, the only appropriate measure of central tendency is the mode, or the most frequently occurring value. The mode may also be used with ordinal-level data, and sometimes with interval-level data.

7 Comparison of Measures of Central Tendency
With ordinal-level data, the most common measure of central tendency is the median, or the value of the middle case in rank-ordered data. Put another way, the median is the value at the 50th percentile. Note that, with highly skewed, interval-level data, the median might be more useful than the mean. We’ll return to that notion later.

8 Comparison of Measures of Central Tendency
With interval-level data, the mean is commonly used. This is the arithmetic average, arrived at by adding up all the values and dividing by the number of cases.

9 Comparison of Measures of Central Tendency in Normal Distribution
In a normal distribution the mean, median and mode are the same, and the shape of the distribution is symmetric.

10 Comparison of Measures of Central Tendency in Bimodal Distribution
In this distribution the mean and median are the same, but there are two modes, both different from the mean and median.
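A small made-up dataset can illustrate the bimodal case. The numbers below are hypothetical, chosen only so that the mean and median coincide while two other values tie for most frequent.

```python
from statistics import mean, median, multimode

# Hypothetical symmetric data with two peaks (not taken from the slides).
bimodal = [1, 2, 2, 3, 4, 5, 6, 6, 7]

print(mean(bimodal))       # 4
print(median(bimodal))     # 4
print(multimode(bimodal))  # [2, 6]: two modes, both away from the center
```

`statistics.multimode` requires Python 3.8 or later and returns every value that ties for the highest frequency.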

11 Negatively Skewed (Skewed to Left)
Outliers pull the mean away from the median. When the mean, median and mode are different, and the mode is greater than the median and the median is greater than the mean (Mode > Median > Mean), the distribution is skewed to the left.

12 Positively Skewed (Skewed to Right)
Outliers pull the mean away from the median. The mean, median and mode are different, with Mean > Median > Mode.
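To see the ordering Mean > Median > Mode in numbers, here is a sketch with hypothetical right-skewed data (a few large values in the right tail). The dataset is invented for illustration and does not come from the slides.

```python
from statistics import mean, median, mode

# Hypothetical right-skewed data: most values are small, one is large.
right_skewed = [2, 2, 2, 3, 3, 4, 5, 6, 8, 15]

m = mean(right_skewed)      # 50 / 10 = 5: pulled toward the large value
md = median(right_skewed)   # (3 + 4) / 2 = 3.5
mo = mode(right_skewed)     # 2

print(m, md, mo)
print(m > md > mo)          # True for this dataset
```

Mirroring the data (for example, negating every value) reverses the ordering, which corresponds to the left-skewed case.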

13 Uniform
In a uniform distribution, as in a normal distribution, the mean and median are at the same point. Every value is equally frequent, so there is no single mode.

14 Exponential
Here the mode is at the extreme left and the mean is to the right of the median.

15 Measures of Variability or Scatter
Same average, different variability. Reporting only an average without an accompanying measure of variability may misrepresent a set of data. Two datasets can have the same average but very different variability.
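A quick sketch makes the point concrete. The two datasets below are hypothetical; they share a mean of 50 but differ greatly in spread.

```python
from statistics import mean, stdev

# Two hypothetical datasets with the same average.
tight = [48, 49, 50, 51, 52]
spread = [10, 30, 50, 70, 90]

print(mean(tight), mean(spread))  # both 50
print(round(stdev(tight), 2))     # about 1.58
print(round(stdev(spread), 2))    # about 31.62
```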

16 Measures of Variability or Scatter: Range
Data: 110, 120, 130, 140, 150, 160, 170, 180, 190. Range is the difference between the highest and lowest score. It is easy to calculate but highly unstable. For the data displayed here, the highest value (max) is 190 and the lowest value (min) is 110, so the range is 190 - 110 = 80.
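The range calculation, and its instability, can be sketched as follows. The extra value of 400 is a hypothetical outlier added only to show how a single extreme observation changes the result.

```python
scores = [110, 120, 130, 140, 150, 160, 170, 180, 190]

data_range = max(scores) - min(scores)  # 190 - 110
print(data_range)                       # 80

# One hypothetical extreme observation changes the range dramatically.
with_outlier = scores + [400]
outlier_range = max(with_outlier) - min(with_outlier)
print(outlier_range)                    # 290
```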

17 Measures of Variability or Scatter: Semi-Interquartile Range
SQR = (Q3 - Q1) / 2. The semi-interquartile range (SQR) is half of the difference between the 75th percentile (Q3) and the 25th percentile (Q1). It is more stable than the range.
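A sketch of the SQR in Python follows, reusing the nine values from the range example. Quartile conventions differ between textbooks and software; this example assumes the "inclusive" method of `statistics.quantiles`, which may not match the convention the lecture intends.

```python
from statistics import quantiles

scores = [110, 120, 130, 140, 150, 160, 170, 180, 190]

# Assumption: the "inclusive" quartile method; other methods can give other cut points.
q1, q2, q3 = quantiles(scores, n=4, method="inclusive")

sqr = (q3 - q1) / 2  # half the distance between the 75th and 25th percentiles
print(q1, q3, sqr)   # 130.0 170.0 20.0
```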

18 Sum of Squares & Variance
The sum of squares is $SS = \sum (X - M)^2$. The sample variance is the sum of squared differences between the observations and their mean, divided by n - 1: $s^2 = SS / (n - 1)$.

19 Standard Deviation
Standard deviation is the square root of the variance.

20 Sum of Squares, Variance, Standard Deviation
Sum of squares is key in the calculation of the sample variance and the sample standard deviation.

21 Calculating Standard Deviation
Data    X - M    (X - M)^2
110     -40      1600
120     -30      900
130     -20      400
140     -10      100
150     0        0
160     10       100
170     20       400
180     30       900
190     40       1600
Total            6000 (SS)
N - 1: 8 (nine observations). Sample variance: 6000 / 8 = 750. Standard deviation: 27.4.
SS is the key to many statistics. Calculate the difference, square the difference, sum it, and divide it by n minus 1 to get the sample variance. The square root of the sample variance is the standard deviation.
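The steps above can be sketched in Python as a check on the arithmetic. With nine observations, n - 1 is 8, which gives a sample variance of 750 and a standard deviation of about 27.4. Dividing by n = 9 instead gives the population variance (about 667) and a standard deviation of about 25.8. Variable names are illustrative.

```python
from math import sqrt
from statistics import mean, variance, pvariance

data = [110, 120, 130, 140, 150, 160, 170, 180, 190]

m = mean(data)                      # 150
deviations = [x - m for x in data]  # X - M for each observation
ss = sum(d ** 2 for d in deviations)  # sum of squares: 6000

n = len(data)                       # 9 observations
sample_variance = ss / (n - 1)      # 6000 / 8 = 750
sample_sd = sqrt(sample_variance)   # about 27.39

population_variance = ss / n        # 6000 / 9, about 666.67
population_sd = sqrt(population_variance)  # about 25.82

print(ss, sample_variance, round(sample_sd, 2))
print(round(population_variance, 2), round(population_sd, 2))
```

The standard library's `variance` and `pvariance` functions apply the n - 1 and n divisors respectively, so they can be used to confirm both results.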

22 Formula Variations
Defining and calculating formulas for the sum of squares, the variance, and the standard deviation.
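The formulas on this slide did not survive in the transcript. As a hedged reconstruction, the standard textbook forms are given below, using the same symbols as the earlier slides (X for an observation, M for the mean, n for the number of observations). These are the conventional forms and may differ in notation from the original slide.

```latex
% Sum of squares
\text{Defining formula:}\quad SS = \sum (X - M)^2
\qquad
\text{Calculating formula:}\quad SS = \sum X^2 - \frac{\left(\sum X\right)^2}{n}

% Sample variance
s^2 = \frac{SS}{n - 1}

% Sample standard deviation
s = \sqrt{\frac{SS}{n - 1}}
```

For the nine values 110 through 190, the calculating formula gives 208500 - 1350^2 / 9 = 208500 - 202500 = 6000, which matches the defining formula.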

23 6 Standard Deviations partition most data in Normal Distribution
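The slide title refers to the span from three standard deviations below the mean to three above it. As a sketch, assuming an exactly normal distribution, the share of data within 1, 2, and 3 standard deviations of the mean can be computed with Python's `statistics.NormalDist`. The helper name `coverage` is hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

def coverage(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD of the mean: {coverage(k):.4f}")
# Roughly 0.6827, 0.9545, and 0.9973
```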

24 Standardized Scores: Z Scores
Standard score: Z = (x - m) / s. The mean and standard deviation are also used to compute standard scores. Standard scores have distributions that are published and available on the web, so one can easily look up how frequently an observation occurs.

25 Standardized Scores: Z Scores
Z = (x - m) / s, so Z = (140 - 110) / 10 = 3. For example, suppose we want to calculate the standard score for a blood pressure of 140 when the sample mean is 110 and the standard deviation is 10. The standard score is 3. We can now use the standard score to look up on the web how often a blood pressure of 140 occurs in our distribution.
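Instead of a published table, the lookup can be sketched with Python's `statistics.NormalDist`. This assumes blood pressure is normally distributed, which the slides imply but do not state outright; the variable names are illustrative.

```python
from statistics import NormalDist

x = 140  # observed blood pressure
m = 110  # sample mean
s = 10   # sample standard deviation

z = (x - m) / s  # (140 - 110) / 10 = 3.0
print(z)

# Assumption: values follow a normal distribution.
# Proportion of observations at or above a z score of 3.
upper_tail = 1 - NormalDist().cdf(z)
print(round(upper_tail, 5))  # about 0.00135
```

Under that assumption, a reading of 140 or higher would be expected in roughly 1 of every 740 observations.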

26 Standard deviations allow comparison of the observed distribution to the expected distribution

27 Measures of Central Tendency & Variability Can Describe the Distribution of Data

