Chapter 4 Fundamental statistical characteristics II: Dispersion and form measurements.



Fundamental statistical characteristics
Group indexes: central tendency, variability (dispersion), asymmetry (skewness), peakedness (kurtosis).
Individual indexes: position (centiles Ci, percentiles Pi, quartiles Qi), raw scores (Xi), deviation scores (xi), standard scores (Zi).

How are the data arranged with respect to the center of the distribution? How far apart or how close together are the data? Variability or dispersion indexes: $A_T$, $Q$, $S^2$, $S$, CV.

How are the data arranged with respect to each other? Are the data piled up at one end? Asymmetry (skewness) index: g1.

What shape does the distribution have? Is it flat or peaked? Peakedness (kurtosis) index: g2.

Variability or dispersion indexes

Example
A: 4 10 12 14 20
B: 10 11 12 13 14
C: 104 110 112 114 120

Variability quantifiers
When all the data are equal (for example, every value is 5), there is no variability: V = 0.
[Figure: the data set 1, 2, 5, 8, 9, 10.]

Indexes: $A_T$, total amplitude; $Q$, semi-interquartile amplitude; $S^2$, variance; $S$, standard deviation.

a) The total amplitude (or range)
It is the distance between the maximum and the minimum value of a data set: $A_T = X_{max} - X_{min}$.
Advantage: it is easy to calculate.
Disadvantages: it uses only two values of the sample, so it is very sensitive to extreme values and insensitive to the intermediate values. It is not stable. It depends on the sample size ($A_T$ values obtained from samples of different sizes are not directly comparable).

Total amplitude calculation
A) 3 7 8 9 10 11 12 13: $A_T$ = 13 - 3 = 10
B) 7 7 8 9 10 11 12 13: $A_T$ = 13 - 7 = 6
C) 7 10 10 10 10 10 10 13: $A_T$ = 13 - 7 = 6
B = C; A > B and C.
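The range calculation can be checked with a few lines of Python. This is only a minimal sketch: the function name `total_amplitude` is an illustrative choice, and the three data sets are the ones read from the slide.

```python
def total_amplitude(data):
    """Return the total amplitude (range): maximum minus minimum."""
    return max(data) - min(data)

# Data sets A, B and C as read from the slide.
a = [3, 7, 8, 9, 10, 11, 12, 13]
b = [7, 7, 8, 9, 10, 11, 12, 13]
c = [7, 10, 10, 10, 10, 10, 10, 13]

for name, data in (("A", a), ("B", b), ("C", c)):
    print(name, total_amplitude(data))
```

B and C give the same result even though their intermediate values are very different, which is the weakness of an index that looks only at the two extreme values.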

b) The semi-interquartile amplitude
It is half the distance between quartile 3 and quartile 1: $Q = \frac{Q_3 - Q_1}{2}$.
It is usually calculated:
When we only want to consider the central scores of the distribution.
When we cannot use the mean.

Calculation example. Two statements about data analysis: A) It unsettles the Psychology student. B) It is an essential tool for Psychology.

A: Unsettles the student. B: Essential tool.
X       fA      FA      fB      FB
1       35      35      5       5
2       40      75      20      25
3       75      150     40      65
4       30      180     80      145
5       20      200     55      200
Calculate Q.

A: Data analysis unsettles the Psychology student
Centile   Position   Cumulative frequency   Value
25        50.25      Between 50 and 51      2 + 0.25(2 - 2) = 2
75        150.75     Between 150 and 151    3 + 0.75(4 - 3) = 3.75
Q = (3.75 - 2) / 2 = 0.875

B: Data analysis is an essential tool for Psychology
Centile   Position   Cumulative frequency   Value
25        50.25      Between 50 and 51      3 + 0.25(3 - 3) = 3
75        150.75     Between 150 and 151    5 + 0.75(5 - 5) = 5
Q = (5 - 3) / 2 = 1
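The procedure used in these two tables can be sketched in Python as follows. It assumes the position rule k(n + 1)/100 with linear interpolation, which is what the positions 50.25 and 150.75 suggest for n = 200. The function names are illustrative, and the frequency 40 for X = 3 in statement B is inferred from the cumulative frequencies rather than read directly.

```python
def centile(values, freqs, k):
    """Centile k from a frequency table, using the position k * (n + 1) / 100
    and linear interpolation between neighbouring ordered scores."""
    # Expand the table into the ordered list of individual scores.
    scores = [x for x, f in zip(values, freqs) for _ in range(f)]
    n = len(scores)
    position = k * (n + 1) / 100   # 1-based position, may be fractional
    lower = int(position)          # integer part of the position
    fraction = position - lower    # decimal part of the position
    # Clamp the positions so that extreme centiles stay inside the data.
    lower = min(max(lower, 1), n)
    upper = min(lower + 1, n)
    x_low, x_up = scores[lower - 1], scores[upper - 1]
    return x_low + fraction * (x_up - x_low)


def semi_interquartile(values, freqs):
    """Q = (Q3 - Q1) / 2, with Q1 = centile 25 and Q3 = centile 75."""
    return (centile(values, freqs, 75) - centile(values, freqs, 25)) / 2


x = [1, 2, 3, 4, 5]
f_a = [35, 40, 75, 30, 20]   # statement A
f_b = [5, 20, 40, 80, 55]    # statement B (40 is inferred, see above)

print(centile(x, f_a, 25), centile(x, f_a, 75), semi_interquartile(x, f_a))
print(centile(x, f_b, 25), centile(x, f_b, 75), semi_interquartile(x, f_b))
```

Other software may use a different interpolation rule for centiles, so small discrepancies with library functions are to be expected.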

Calculate the semi-interquartile deviation (Q) in the following distribution (? = value to be completed):
Xi      fri     Fi      %ai
21      ?       ?       ?
23      ?       40      ?
24      0.12    ?       32
25      ?       110     ?
26      0.20    ?       ?
29      ?       ?       ?
30      ?       ?       100

c) The variance and d) the standard deviation
How far are individuals from the representative individual of the population? We are interested in the approximate average distance between each subject and the representative individual (the mean).

The deviations above the mean balance those below it, so their sum is zero: to solve this problem, we square them. The sum of squared deviations grows as more data are added: to remove this influence of n, we divide by n.
$S^2 = \frac{\sum (X_i - \bar{X})^2}{n}$

Other ways of calculating the variance, derived from this formula: $S^2 = \frac{\sum X_i^2}{n} - \bar{X}^2$

Type I distribution: small data set
3 - 6 - 9 - 12 - 15 (mean = 9)
Xi      Xi - Mean       (Xi - Mean)^2
3       3 - 9 = -6      36
6       6 - 9 = -3      9
9       9 - 9 = 0       0
12      12 - 9 = 3      9
15      15 - 9 = 6      36
Total                   90

The variance is expressed in squared units, which are not normally used: 2 onions are fine, but 2 squared onions make no sense! So that the dispersion is expressed in the same units as the original variable, we take the square root (obtaining the standard deviation).

Variance: $S^2 = 90 / 5 = 18$
Standard deviation: $S = \sqrt{18} \approx 4.24$

With another derived formula
Xi      Xi^2
3       9
6       36
9       81
12      144
15      225
Total   495
$S^2 = 495 / 5 - 9^2 = 99 - 81 = 18$
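Both ways of computing the variance of the small data set can be compared with a short Python sketch. The variable names are illustrative, and the variance is the descriptive one (dividing by n).

```python
from math import sqrt

data = [3, 6, 9, 12, 15]
n = len(data)
mean = sum(data) / n

# Definition: mean of the squared deviations from the mean.
var_def = sum((x - mean) ** 2 for x in data) / n

# Derived formula: mean of the squares minus the square of the mean.
var_raw = sum(x ** 2 for x in data) / n - mean ** 2

sd = sqrt(var_def)

print(mean, var_def, var_raw, round(sd, 2))
```

The two expressions are algebraically equivalent. The derived formula avoids computing each deviation, which is convenient when working by hand.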

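Type II distribution: big data set
Xi      fi      fiXi    Xi - Mean       (Xi - Mean)^2   fi(Xi - Mean)^2
3       1       3       3 - 5 = -2      4               4
4       7       28      4 - 5 = -1      1               7
5       6       30      5 - 5 = 0       0               0
6       3       18      6 - 5 = 1       1               3
7       3       21      7 - 5 = 2       4               12
Total   20      100                     10              26
Mean = 100 / 20 = 5; $S^2 = 26 / 20 = 1.3$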


Xi      fi      fiXi    fiXi^2
3       1       3       9
4       7       28      112
5       6       30      150
6       3       18      108
7       3       21      147
Total   20      100     526
$S^2 = 526 / 20 - 5^2 = 26.3 - 25 = 1.3$
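For data grouped by frequency, the same two routes to the variance can be sketched in Python as below. The function names are illustrative assumptions, and each value is weighted by its frequency.

```python
def grouped_mean(values, freqs):
    """Mean of a frequency table: sum(f * x) / n."""
    n = sum(freqs)
    return sum(f * x for x, f in zip(values, freqs)) / n


def grouped_variance(values, freqs):
    """Variance by definition: sum(f * (x - mean)^2) / n."""
    n = sum(freqs)
    mean = grouped_mean(values, freqs)
    return sum(f * (x - mean) ** 2 for x, f in zip(values, freqs)) / n


def grouped_variance_raw(values, freqs):
    """Variance by the derived formula: sum(f * x^2) / n - mean^2."""
    n = sum(freqs)
    mean = grouped_mean(values, freqs)
    return sum(f * x ** 2 for x, f in zip(values, freqs)) / n - mean ** 2


xi = [3, 4, 5, 6, 7]
fi = [1, 7, 6, 3, 3]

print(grouped_mean(xi, fi))
print(grouped_variance(xi, fi))
print(grouped_variance_raw(xi, fi))
```

Agreement between the two functions is a useful check on the column totals of a hand-built table.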

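The variance and the standard deviation
The variance is the mean of the squared differences with respect to the mean: $S^2 = \frac{\sum (X_i - \bar{X})^2}{n}$
If the data are grouped by frequency (big data set): $S^2 = \frac{\sum f_i (X_i - \bar{X})^2}{n}$
The standard deviation is its square root: $S = \sqrt{S^2}$
They provide an absolute measure of variability.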

In a data set we have calculated the distances to the mean and squared them. The result is: 4 – 9 – 4 – 1 – 0 – 25 – 9
A) Calculate the standard deviation.
B) Reproduce the original distances.
C) Supposing that the mean is 9, build the frequency table.
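As a hint for part A, the squared distances are already the numerators of the variance, so only the averaging and the square root remain. A minimal Python sketch, with illustrative variable names:

```python
from math import sqrt

# Squared distances to the mean, as given in the exercise.
squared_distances = [4, 9, 4, 1, 0, 25, 9]

variance = sum(squared_distances) / len(squared_distances)
sd = sqrt(variance)

print(round(variance, 2), round(sd, 2))
```

For part B, note that each squared distance admits two signs and that the original distances must add up to zero.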

Other dispersion indexes

e) The quasivariance and f) the standard quasideviation
They are used when we want a more accurate estimate of the variance and the standard deviation of the population. When the sample is small, the difference from the variance and the standard deviation is noticeable.

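The quasivariance
Formula: $S_{n-1}^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1}$
If we have a frequency table: $S_{n-1}^2 = \frac{\sum f_i (X_i - \bar{X})^2}{n - 1}$
If we have the variance: $S_{n-1}^2 = \frac{n}{n - 1} S^2$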

The standard quasideviation
Formula: $S_{n-1} = \sqrt{S_{n-1}^2}$

From the previous example (3 - 6 - 9 - 12 - 15):
The variance goes from 18 to 22.5 (quasivariance).
The standard deviation goes from 4.24 to 4.74 (standard quasideviation).
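These figures can be reproduced with Python's standard `statistics` module, where `pvariance` and `pstdev` divide by n, while `variance` and `stdev` divide by n - 1. A minimal sketch, with illustrative variable names:

```python
import statistics

data = [3, 6, 9, 12, 15]
n = len(data)

variance = statistics.pvariance(data)        # divides by n
quasivariance = statistics.variance(data)    # divides by n - 1
sd = statistics.pstdev(data)
quasi_sd = statistics.stdev(data)

# Conversion from the variance to the quasivariance.
converted = n / (n - 1) * variance

print(variance, quasivariance, converted)
print(round(sd, 2), round(quasi_sd, 2))
```

With only five data points the correction factor n / (n - 1) is 1.25. With 200 data points it would be about 1.005, which is why the difference matters mainly for small samples.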

Variance and standard deviation properties
1. Both $S^2$ and $S$ are non-negative values: $S^2 \geq 0$, $S \geq 0$.
2. They should not be calculated when the mean cannot be calculated or is not considered a good measure of central tendency.

Variance and standard deviation properties
3. The standard deviation is expressed in the same units as the data.
4. Both the variance ($S^2$) and the standard deviation ($S$) of the sample tend to be lower than the variance ($\sigma^2$) and the standard deviation ($\sigma$) of the population: $S^2 < \sigma^2$, $S < \sigma$.

g) Pearson’s variation coefficient
Example: age. A difference of 2 years of age may be a lot or a little:
Age range 78-80
Age range 1-3
Does this difference of two years have the same connotations? There is a psychological difference that the numbers cannot detect. We, as psychologists (in the near future), have to interpret it.

Pearson’s variation coefficient
Also called the ‘relative variability coefficient’. Symbolically: $CV = \frac{S}{\bar{X}} \cdot 100$

The CV is preferable to S when we want to compare the dispersions of two or more data distributions. Larger units produce larger differences, and this is reflected in the mean. When the means are similar, comparing in terms of S is simpler and equally valid (calculating the CV adds nothing new).

a) S1 = 2 b) S2 = 2 c) S3 = 10
What do you think about the degree of variability of these groups?

a) 2-4-5-6-8, mean = 5
b) 47-49-50-51-53, mean = 50
c) 35-45-50-55-65, mean = 50
Larger or smaller units are reflected in the means.
a) CV1 = 40 b) CV2 = 4 c) CV3 = 20

Group   Data              Mean    S     CV
a)      2-4-5-6-8         5       2     40
b)      47-49-50-51-53    50      2     4
c)      35-45-50-55-65    50      10    20
When the means are equal, the CV does not add anything (same conclusions).
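The three coefficients can be recomputed with a short Python sketch. It assumes the CV is the standard deviation (dividing by n) over the mean, times 100, which matches the values on the slide. The function name `cv` is illustrative.

```python
from statistics import mean, pstdev


def cv(data):
    """Pearson's variation coefficient: 100 * S / mean, with S dividing by n."""
    return 100 * pstdev(data) / mean(data)


groups = {
    "a": [2, 4, 5, 6, 8],
    "b": [47, 49, 50, 51, 53],
    "c": [35, 45, 50, 55, 65],
}

for name, data in groups.items():
    print(name, mean(data), pstdev(data), round(cv(data), 1))
```

Groups a and b share S = 2 but differ in CV because their means differ. Groups b and c share the mean, so comparing S or CV leads to the same conclusion.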

Example 1: We performed an experiment on reaction times to two stimuli, A and B, in a sample of subjects. The results were as follows:
        Mean    Stand. dev.
A       50      5
B       600     6
Which of A or B shows more variation?
CV_A = 5 / 50 × 100 = 10; CV_B = 6 / 600 × 100 = 1. In A there is more variation.

Example 2: We gave the same test to two groups of students, A and B. The results were:
          Mean    Stand. dev.
Group A   38      7
Group B   53      9
Which group has the higher dispersion?
CV_A = 7 / 38 × 100 ≈ 18.4; CV_B = 9 / 53 × 100 ≈ 17.0. In group A there is more variation.

Form measurements

Asymmetry (skewness) measurements
Two distributions with the same mean and the same dispersion can be totally different in shape. These measures tell us on which side of the distribution the data are more spread out.

g1 < 0: negative asymmetry
g1 = 0: symmetric
g1 > 0: positive asymmetry

Peakedness (kurtosis) measurements
They refer to how pointed or slender the distribution is. A high kurtosis indicates a marked contrast between the high central frequency and the rest.

Kurtosis measurements (g2)
g2 < 0: platykurtic
g2 = 0: mesokurtic
g2 > 0: leptokurtic
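The transcript does not preserve the formulas for g1 and g2. The sketch below assumes the usual moment-based definitions, g1 = m3 / S^3 and g2 = m4 / S^4 - 3, where m3 and m4 are the third and fourth central moments. These are consistent with g2 = 0 for a mesokurtic distribution, but the original slides may define them differently. The second data set is hypothetical and serves only to show a positive g1.

```python
def shape_indexes(data):
    """Return (g1, g2) under the assumed moment-based definitions:
    g1 = m3 / m2**1.5 and g2 = m4 / m2**2 - 3, all moments dividing by n."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n   # variance
    m3 = sum((x - mean) ** 3 for x in data) / n   # third central moment
    m4 = sum((x - mean) ** 4 for x in data) / n   # fourth central moment
    g1 = m3 / m2 ** 1.5
    g2 = m4 / m2 ** 2 - 3
    return g1, g2


symmetric = [3, 6, 9, 12, 15]        # small data set used earlier
right_tail = [1, 1, 1, 2, 2, 3, 8]   # hypothetical data with a long right tail

for name, data in (("symmetric", symmetric), ("right_tail", right_tail)):
    g1, g2 = shape_indexes(data)
    print(name, round(g1, 3), round(g2, 3))
```

The evenly spaced data set has g1 = 0 and a negative g2 (flatter than the normal curve), while the hypothetical set with one large value has g1 > 0.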

Example 1. The following data set corresponds to a symmetric distribution with a mean of 5. Replace each X with its value: 1 3 4 X

Solution
Xi      fi      fiXi
1       1       1
3       3       9
4       2       8
6       2       12
7       3       21
9       1       9
Total   12      60
Mean = 60 / 12 = 5

Example 2: Two distributions with equal means and standard deviations do not necessarily have the same shape. Xi fi 12 40 11 10 13 30 20 14 15

Symmetry (it is not necessary)

Example Xi fi 1 2 3 4 6 7