Measures of dispersion


Measure of Dispersion
A measure of central tendency gives a single value that represents the whole data set; however, central tendency alone cannot describe the observations fully. A measure of dispersion helps us study the variability of the items. With dispersion measures we quantify the variation of the items among themselves and the variation around the average.

Measure of Dispersion
Dispersion also lets you judge the reliability of the average: the smaller the dispersion, the better the average represents the data.

Understanding dispersion
Consider the soil carbon study:
TC_g.kg: µ = 33.90, median = 11.73
Log(TC_g.kg): µ = 2.73, median = 2.46

Describing dispersion - range
The range is the difference between the largest and the smallest value. It is the simplest measure of dispersion.
Example data: 12, 5, 10, 7, 12, 1, 18, 7, 8, 7
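As a minimal sketch of the definition above, the range of the ten example values can be computed in Python; the variable names are illustrative only.

```python
# Example values from the slide
values = [12, 5, 10, 7, 12, 1, 18, 7, 8, 7]

# Range = largest value minus smallest value
data_range = max(values) - min(values)

print(f"min = {min(values)}, max = {max(values)}, range = {data_range}")
```

Here the largest value is 18 and the smallest is 1, so the range is 17.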

Describing dispersion - range
TC_g.kg: Min. = 1.327; Max. = 523.300
Log(TC_g.kg): Min. = 0.283; Max. = 6.260
Variable       Min.     1st Qu.  Median   Mean     3rd Qu.  Max.
TC_g.kg        1.327    7.800    11.730   33.900   22.760   523.300
Log(TC_g.kg)   0.2831   2.0540   2.4620   2.7300   3.1250   6.2600

Describing dispersion - quartiles
Quartiles are the values that divide the ordered list of numbers into quarters.
First quartile (Q1): about 25% of the values in the data set lie below Q1 and about 75% lie above it.
Third quartile (Q3): about 75% of the values in the data set lie below Q3 and about 25% lie above it.
Interquartile range: the difference between the upper and lower quartiles, Q3 - Q1. The quartile deviation is half of the interquartile range.

Describing dispersion - quartiles
TC_g.kg: Min. = 1.327; Max. = 523.300; Q1 = 7.800; Q3 = 22.760
Log(TC_g.kg): Min. = 0.283; Max. = 6.260; Q1 = 2.054; Q3 = 3.125
The interval from Q1 to Q3 encompasses the central 50% of the data (the 50% around the median, which is the midpoint of the data). This central 50% is not influenced by the extreme values, so it gives a better estimate of the variability to expect.
Variable       Min.     1st Qu.  Median   Mean     3rd Qu.  Max.
TC_g.kg        1.327    7.800    11.730   33.900   22.760   523.300
Log(TC_g.kg)   0.2831   2.0540   2.4620   2.7300   3.1250   6.2600
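The claim that the central 50% is not influenced by extreme values can be checked with a small sketch. It reuses the ten example values from the range slide and appends one extreme value, 500, which is a made-up number for illustration and not part of the soil carbon data. The helper names `value_range` and `iqr` are likewise hypothetical, and the quartile method ("inclusive") is an assumption; other conventions give slightly different quartiles.

```python
from statistics import quantiles

def value_range(xs):
    """Largest value minus smallest value."""
    return max(xs) - min(xs)

def iqr(xs):
    """Interquartile range Q3 - Q1, using the 'inclusive' quartile method."""
    q1, _q2, q3 = quantiles(xs, n=4, method="inclusive")
    return q3 - q1

values = [12, 5, 10, 7, 12, 1, 18, 7, 8, 7]
with_outlier = values + [500]  # hypothetical extreme value, for illustration only

print("range:", value_range(values), "->", value_range(with_outlier))
print("IQR:  ", iqr(values), "->", iqr(with_outlier))
```

With these numbers the range jumps from 17 to 499, while the interquartile range only moves from 4.5 to 5.0.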

Quartiles
Example data: 12, 5, 10, 7, 12, 1, 18, 7, 8, 7
Marked on the slide: Q1, Q2, Q1 – Q2
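For the example data on this slide, the quartiles can be sketched with Python's standard library. The choice of `method="inclusive"` is an assumption: it interpolates linearly between ordered values, and other quartile conventions (including the one used on the original slide, which is not stated) may give slightly different results.

```python
from statistics import quantiles

values = [12, 5, 10, 7, 12, 1, 18, 7, 8, 7]

# Sort first so the quarters are easy to see by eye
ordered = sorted(values)

# Three cut points that split the ordered data into four quarters
q1, q2, q3 = quantiles(values, n=4, method="inclusive")
interquartile_range = q3 - q1

print("ordered:", ordered)
print(f"Q1 = {q1}, Q2 (median) = {q2}, Q3 = {q3}")
print(f"interquartile range = {interquartile_range}")
```

Under this convention Q1 = 7.0, Q2 = 7.5 and Q3 = 11.5, so the interquartile range is 4.5.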

The Variance
An objective measure of how the data cluster around the mean:
$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}$$
Degrees of freedom (n-1): used to account for the estimates made in the calculation. In this case $\bar{x}$ is an estimate of µ.
An issue with the variance is its unit of measurement: e.g. if the data are in min, the variance is in min².

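Standard deviation
$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}}$$
Now the answer has the same units as the measurement.
For normally distributed data, about 68% of the values lie within 1s of the mean, 95% within 2s and 99.7% within 3s (shown for $\bar{x}$ = 0, s = 1).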
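The two formulas can be followed step by step in a short sketch. It uses the ten example values from the range slide rather than the soil data, and checks the manual result against Python's `statistics.variance` and `statistics.stdev`, which also use the n-1 divisor.

```python
import math
from statistics import mean, stdev, variance

values = [12, 5, 10, 7, 12, 1, 18, 7, 8, 7]

n = len(values)
x_bar = mean(values)  # sample mean

# Squared deviation of each value from the mean
squared_devs = [(x - x_bar) ** 2 for x in values]

# Sample variance: sum of squared deviations divided by n - 1
s2 = sum(squared_devs) / (n - 1)

# Standard deviation: square root of the variance, back in the original units
s = math.sqrt(s2)

print(f"mean = {x_bar:.2f}, variance = {s2:.4f}, sd = {s:.4f}")
print("matches library:", math.isclose(s2, variance(values)), math.isclose(s, stdev(values)))
```

For these values the mean is 8.7, the variance is about 21.34 and the standard deviation is about 4.62.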

Standard deviation vs Variance
Both are derived from the mean.
VAR: $$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}$$
SD: $$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}}$$

Standard deviation vs Variance
The variance measures the average squared distance of each point from the mean. The SD is simply the square root of the variance.
Why does the calculation of the variance use squares?
$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}$$

Standard deviation vs Variance
The calculation of the variance uses squares because squaring weights outliers more heavily than data near the mean.
Example: 1, 2, 5
With squares: $$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}$$
Without squares: $$\frac{\sum_{i=1}^{n} (x_i - \bar{x})}{n-1}$$

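Standard deviation vs Variance
Squaring also prevents differences above the mean from cancelling out those below. Without squaring, the deviations from the mean always sum to zero, which would give a "variance" of zero regardless of the spread.
Example: 1, 2, -3 (sum = 0)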
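Both effects of squaring can be seen in a short sketch using the ten example values from the range slide. The variable names are illustrative only.

```python
values = [12, 5, 10, 7, 12, 1, 18, 7, 8, 7]

x_bar = sum(values) / len(values)
deviations = [x - x_bar for x in values]

# Raw deviations cancel: positives and negatives sum to (numerically) zero
sum_of_deviations = sum(deviations)

# Squared deviations cannot cancel, and large deviations dominate the total
squared = [d ** 2 for d in deviations]
abs_devs = [abs(d) for d in deviations]

share_squared = max(squared) / sum(squared)     # share contributed by the value 18
share_absolute = max(abs_devs) / sum(abs_devs)  # same value, without squaring

print(f"sum of raw deviations = {sum_of_deviations:.10f}")
print(f"largest deviation's share: squared = {share_squared:.2f}, absolute = {share_absolute:.2f}")
```

The value furthest from the mean (18) accounts for roughly 45% of the sum of squared deviations, compared with roughly 27% of the sum of absolute deviations.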

Standard deviation vs Variance
However, because of this squaring, the variance is no longer in the same unit of measurement as the original data.
$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}$$
$$s^2 = \frac{\sum_{i=1}^{n} (\text{meters} - \text{meters})^2}{n-1} = \text{m}^2$$

Standard deviation vs Variance
Taking the square root of the variance restores the SD to the original unit of measurement.
$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}}$$
$$s = \sqrt{\frac{\sum_{i=1}^{n} (\text{meters} - \text{meters})^2}{n-1}} = \text{m}$$

Comparing distributions
$\bar{x}$ = 10; black: s = 1; blue: s = 1.7
Which distribution has more variability in the data?

Population
                     Statistic    Parameter
Mean                 $\bar{x}$    $\mu$
Variance             $s^2$        $\sigma^2$
Standard deviation   $s$          $\sigma$
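The distinction in the table can be sketched numerically. The slides give only the sample formula (divisor n-1); the population formula with divisor N used below is the standard definition and is an addition here, not taken from the slides. The data are the ten example values from the range slide, treated first as a sample and then as if they were the whole population.

```python
from statistics import pstdev, pvariance, stdev, variance

values = [12, 5, 10, 7, 12, 1, 18, 7, 8, 7]

# Treated as a sample: statistics s^2 and s (divisor n - 1)
s2 = variance(values)
s = stdev(values)

# Treated as the whole population: parameters sigma^2 and sigma (divisor N)
sigma2 = pvariance(values)
sigma = pstdev(values)

print(f"sample:     s^2 = {s2:.4f}, s = {s:.4f}")
print(f"population: sigma^2 = {sigma2:.4f}, sigma = {sigma:.4f}")
```

The sample versions are slightly larger because they divide the same sum of squares by n-1 instead of N.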

Exercise
Provide the range and the 1st and 3rd quartiles for your two data sets.
Manually calculate in Excel the standard deviation for your sample of soil values.
Using the STDEV formula in Excel, calculate the standard deviations for all the sample sets.
How variable are the standard deviations?
Which of the data sets comes closest to the population standard deviation (70.27 g.kg⁻¹)?