Descriptive Statistics

Presentation on theme: "Descriptive Statistics"— Presentation transcript:

1 CHAPTER 1 Basic Concepts CHAPTER 2 Describing and Exploring Data Part B

2 Descriptive Statistics
Measures of Central Tendency

3 Measures of Central Tendency
Mean Interval or Ratio scale Polygon The sum of the values divided by the number of values--often called the "average." μ=ΣX/N Add all of the values together. Divide by the number of values to obtain the mean. Example: X 7 12 24 20 19 ????

4 Descriptive Statistics
The mean is: μ = ΣX/N = (7 + 12 + 24 + 20 + 19)/5 = 82/5 = 16.4.
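The arithmetic on this slide can be checked with a few lines of Python. This is a minimal sketch; the variable names are illustrative and not part of the slides.

```python
# Scores from the example on the previous slide.
scores = [7, 12, 24, 20, 19]

total = sum(scores)          # sum of X: 82
mean = total / len(scores)   # sum of X divided by N: 82 / 5

print(total, mean)           # 82 16.4
```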

5 The Characteristics of Mean
1. Changing a score in a distribution will change the mean.
2. Introducing or removing a score from the distribution will change the mean.
3. Adding or subtracting a constant from each score will change the mean.
4. Multiplying or dividing each score by a constant will change the mean.
5. Adding a score that is the same as the mean will not change the mean.
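These characteristics can be tried out on the five scores from the mean example. The sketch below assumes Python's standard `statistics` module; the constants 3 and 2 and the replacement score 17 are arbitrary choices made for illustration.

```python
from statistics import mean

scores = [7, 12, 24, 20, 19]
m = mean(scores)                              # 16.4

# 1. Changing one score (7 becomes 17, a hypothetical change) moves the mean.
changed = mean([17, 12, 24, 20, 19])

# 3. Adding a constant to every score adds the same constant to the mean.
shifted = mean([x + 3 for x in scores])

# 4. Multiplying every score by a constant multiplies the mean by it.
scaled = mean([x * 2 for x in scores])

# 5. Adding a score equal to the mean leaves the mean where it was.
with_mean_added = mean(scores + [m])

print(m, changed, shifted, scaled, with_mean_added)
```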

6 Measures of Central Tendency
Median/MiddleOrdinal ScaleBar/Histogram Divides the values into two equal halves, with half of the values being lower than the median and half higher than the median. Sort the values into ascending order. If you have an odd number of values, the median is the middle value. If you have an even number of values, the median is the arithmetic mean (see above) of the two middle values. Example: The median of the same five numbers (7, 12, 24, 20, 19) is ???.

7 Measures of Central Tendency
The median is 19.
Mode. Scale: nominal. Graph: bar chart/histogram.
The most frequently occurring value (or values). Calculate the frequencies for all of the values in the data. The mode is the value (or values) with the highest frequency.
Example: For individuals having the following ages (18, 18, 19, 20, 20, 20, 21, and 23), the mode is ????
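Both procedures can be sketched with Python's standard `statistics` module. The four-score list is a hypothetical addition to show the even-length case; the other data come from the slides.

```python
from statistics import median, multimode

scores = [7, 12, 24, 20, 19]
print(sorted(scores))        # [7, 12, 19, 20, 24]
print(median(scores))        # odd count: the middle value, 19

even_scores = [7, 12, 19, 20]   # hypothetical even-length example
print(median(even_scores))      # mean of the two middle values, 15.5

ages = [18, 18, 19, 20, 20, 20, 21, 23]
print(multimode(ages))       # list of the most frequent value(s): [20]
```

`multimode` returns a list because a distribution can have more than one mode.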

8 CHARACTERISTICS OF MODE
Nominal scale; discrete variable; describing shape

9 WHEN TO USE WHICH MEASURE
Measure of Central Tendency | Level of Measurement | Use When | Examples
Mode | Nominal | Data are categorical | Eye color, party affiliation
Median | Ordinal | Data include extreme scores | Rank in class, birth order, income
Mean | Interval and ratio | You can, and the data fit | Speed of response, age in years

10 Variability

11 MEASURES OF VARIABILITY
Variability: the degree of spread or dispersion in a set of scores
Range: the difference between the highest and lowest score, plus 1
Standard Deviation: the average difference of each score from the mean

12

13

14

15

16 Variability
Variability is a measure of the dispersion, or spread, of scores around the mean. It has 2 purposes:
1. It describes the distribution. (Continued on the next slide.)

17 Variability
2. It shows how well an individual score (or group of scores) represents the entire distribution (e.g., the z-score).
Example: In inferential statistics we collect information from a small sample and then generalize the results obtained from the sample to the entire population.

18 Range, Interquartile Range, Semi-Interquartile Range, Standard Deviation, and Variance are the Measures of Variability
The Range: the highest number minus the lowest number, plus 1.
2, 4, 7, 8, and … → discrete numbers
2, 4.6, 7.3, 8.4, and … → continuous numbers
In terms of real limits: the difference between the upper real limit of the highest number and the lower real limit of the lowest number.
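The two definitions of the range agree once the measurement unit is taken into account. The sketch below is an illustration only: `inclusive_range` is a hypothetical helper, and both data sets are made up because the last value of each example on the slide is not available.

```python
def inclusive_range(scores, unit=1.0):
    """Upper real limit of the highest score minus lower real limit of the lowest.

    `unit` is the measurement precision: 1 for whole numbers, 0.1 for tenths.
    Real limits are assumed to lie half a unit on either side of a score.
    """
    upper_real_limit = max(scores) + unit / 2
    lower_real_limit = min(scores) - unit / 2
    return upper_real_limit - lower_real_limit


whole_numbers = [2, 4, 7, 8, 10]        # hypothetical whole-number scores
tenths = [2.0, 4.6, 7.3, 8.4, 9.5]      # hypothetical scores measured to 0.1

# For whole numbers this equals highest - lowest + 1.
print(inclusive_range(whole_numbers))                 # 10.5 - 1.5 = 9.0
print(round(inclusive_range(tenths, unit=0.1), 2))    # 9.55 - 1.95 = 7.6
```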

19 Interquartile Range (IQR)
Assesses the distance between the scores at the 75th and 25th percentile ranks: IQR = Q3 − Q1. (See next slide.)

20

21 Interquartile Range (IQR)
IQR is the range covered by the middle 50% of the distribution. IQR is the distance between the 3rd Quartile and 1st Quartile.

22

23

24

25 Interquartile Range (IQR)
In descriptive statistics, the Interquartile Range (IQR), also called the midspread or middle fifty, is a measure of statistical dispersion, being equal to the difference between the upper and lower quartiles: IQR = Q3 − Q1
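A minimal sketch of the IQR calculation, assuming Python's `statistics.quantiles` with its default "exclusive" method. Textbooks and software use slightly different quartile conventions, so hand-computed quartiles may differ a little. The scores are hypothetical.

```python
from statistics import quantiles

scores = [2, 4, 6, 8, 10, 12, 14, 16]    # hypothetical data

# n=4 asks for the three cut points that split the data into quarters.
q1, q2, q3 = quantiles(scores, n=4)
iqr = q3 - q1

print(q1, q3, iqr)    # 4.5 13.5 9.0
```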

26

27 Semi-Interquartile Range (SIQR)
Assesses the distance between the scores at the 75th and 25th percentile ranks divided by 2. SIQR = (Q3-Q1)/2

28 Semi-Interquartile Range (SIQR)
SIQR is ½, or half, of the Interquartile Range. It is used when our data are open ended (i.e., in research we continue to receive more extreme numbers). It is the lowest measure of variability. SIQR = (Q3 − Q1)/2
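One way to see why the SIQR suits open-ended data is to replace the top score with a far more extreme one and compare. This is a sketch with hypothetical data; `siqr` is an illustrative helper, and the quartiles follow the default convention of Python's `statistics.quantiles`.

```python
from statistics import quantiles


def siqr(scores):
    """Semi-interquartile range: half the distance between Q3 and Q1."""
    q1, _, q3 = quantiles(scores, n=4)
    return (q3 - q1) / 2


base = [2, 4, 6, 8, 10, 12, 14, 16]        # hypothetical scores
extreme = [2, 4, 6, 8, 10, 12, 14, 1000]   # same scores, top value made extreme

print(siqr(base), siqr(extreme))                          # 4.5 4.5
print(max(base) - min(base), max(extreme) - min(extreme)) # 14 998
```

The simple high-minus-low difference jumps from 14 to 998, while the SIQR does not move, because it depends only on the middle half of the scores.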

29 Variability SS, Standard Deviations and Variances
SS: sum of squared deviations from the mean
SS = Σ(X − μ)² (definitional formula)
SS = ΣX² − (ΣX)²/N (computational formula)
Population: variance σ² = SS/N; standard deviation σ = √(SS/N)
Sample: variance s² = SS/(n − 1) = SS/df; standard deviation s = √(SS/df)
Variance (σ²) is the mean of the squared deviations (MS), which is used in ANOVA.

30 Practical Implication for Test Construction
Variance and covariance measure the quality of each item in a test. Reliability and validity measure the quality of the entire test.
Variance is the degree of variability of scores from the mean: σ² = SS/N (used for one set of data).
Covariance (COVxy or Sxy) is a number that reflects the degree to which 2 variables vary together: COVxy = SP/(N − 1) (used for 2 sets of data).
Correlation is based on covariance: r = SP/√(SSx·SSy)

31 Variance
X: 1, 2, 4, 5
Population: σ² = SS/N
Sample: s² = SS/(n − 1) = SS/df
SS = Σ(X − μ)² = ΣX² − (ΣX)²/N (sum of squared deviations from the mean)
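The definitional and computational formulas for SS give the same result, which can be checked numerically. The sketch assumes that the values 1, 2, 4, 5 appearing on this slide are the X column; the variable names are illustrative.

```python
from math import sqrt

scores = [1, 2, 4, 5]     # assumed to be the X column of this slide
n = len(scores)
mean = sum(scores) / n    # 3.0

# Definitional formula: sum of squared deviations from the mean.
ss_definitional = sum((x - mean) ** 2 for x in scores)              # 10.0

# Computational formula: sum of X squared minus (sum of X) squared over N.
ss_computational = sum(x * x for x in scores) - sum(scores) ** 2 / n  # 46 - 36

pop_variance = ss_definitional / n             # SS/N  = 2.5
sample_variance = ss_definitional / (n - 1)    # SS/df = 3.33...
pop_sd = sqrt(pop_variance)
sample_sd = sqrt(sample_variance)

print(ss_definitional, ss_computational)
print(pop_variance, round(sample_variance, 2))
print(round(pop_sd, 2), round(sample_sd, 2))
```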

32 COMPUTING THE STANDARD DEVIATION
List scores and compute mean
X: 13, 14, 15, 12, 13, 14, 13, 16, 15, 9
X̄ = 13.4

33 COMPUTING THE STANDARD DEVIATION
List scores and compute mean
Subtract mean from each score (deviation)
X | (X − X̄)
13 | -0.4
14 | 0.6
15 | 1.6
12 | -1.4
13 | -0.4
14 | 0.6
13 | -0.4
16 | 2.6
15 | 1.6
9 | -4.4
X̄ = 13.4; Σ(X − X̄) = 0

34 COMPUTING THE STANDARD DEVIATION
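List scores and compute mean
Subtract mean from each score (deviation)
Square each deviation
X | (X − X̄) | (X − X̄)²
13 | -0.4 | 0.16
14 | 0.6 | 0.36
15 | 1.6 | 2.56
12 | -1.4 | 1.96
13 | -0.4 | 0.16
14 | 0.6 | 0.36
13 | -0.4 | 0.16
16 | 2.6 | 6.76
15 | 1.6 | 2.56
9 | -4.4 | 19.36
X̄ = 13.4; Σ(X − X̄) = 0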

35 COMPUTING THE STANDARD DEVIATION
List scores and compute mean
Subtract mean from each score
Square each deviation
Sum squared deviations
X | (X − X̄) | (X − X̄)²
13 | -0.4 | 0.16
14 | 0.6 | 0.36
15 | 1.6 | 2.56
12 | -1.4 | 1.96
13 | -0.4 | 0.16
14 | 0.6 | 0.36
13 | -0.4 | 0.16
16 | 2.6 | 6.76
15 | 1.6 | 2.56
9 | -4.4 | 19.36
X̄ = 13.4; Σ(X − X̄) = 0; Σ(X − X̄)² = 34.4

36 COMPUTING THE STANDARD DEVIATION
1. List scores and compute mean
2. Subtract mean from each score
3. Square each deviation
4. Sum squared deviations
5. Divide sum of squared deviations by n − 1: 34.4/9 = 3.82 (= s²)
6. Compute square root of step 5: √3.82 = 1.95
X | (X − X̄) | (X − X̄)²
13 | -0.4 | 0.16
14 | 0.6 | 0.36
15 | 1.6 | 2.56
12 | -1.4 | 1.96
13 | -0.4 | 0.16
14 | 0.6 | 0.36
13 | -0.4 | 0.16
16 | 2.6 | 6.76
15 | 1.6 | 2.56
9 | -4.4 | 19.36
X̄ = 13.4; Σ(X − X̄) = 0; Σ(X − X̄)² = 34.4
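The six steps translate directly into code. This sketch assumes the ten scores implied by the slide's own totals (mean 13.4, sum of squared deviations 34.4, n − 1 = 9): 13, 14, 15, 12, 13, 14, 13, 16, 15, 9. The last line cross-checks against Python's `statistics.stdev`.

```python
from math import sqrt
from statistics import stdev

# Assumed data: the ten scores consistent with the totals on the slide.
scores = [13, 14, 15, 12, 13, 14, 13, 16, 15, 9]
n = len(scores)

mean = sum(scores) / n                        # step 1: 13.4
deviations = [x - mean for x in scores]       # step 2: X minus the mean
squared = [d ** 2 for d in deviations]        # step 3: square each deviation
ss = sum(squared)                             # step 4: about 34.4
variance = ss / (n - 1)                       # step 5: about 3.82
sd = sqrt(variance)                           # step 6: about 1.955

print(round(mean, 1), round(ss, 1), round(variance, 2), round(sd, 3))
print(round(stdev(scores), 3))                # same value from the library
```

The slide reports 1.95 because it rounds the variance to 3.82 before taking the square root; without that intermediate rounding the result is about 1.955.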

37 Suppose you earned a score of
X = 54 on an exam. Which set of parameters would give you the highest grade?
a. μ = 50 and σ² = 4 (σ = 2)
b. μ = 50 and σ² = 16 (σ = 4)
c. μ = 54 and σ² = 4 (σ = 2)
d. μ = 54 and σ² = 16 (σ = 4)

38 Suppose you earned a score of
X = 46 on an exam. Which set of parameters would give you the highest grade?
a. μ = 50 and σ² = 4 (σ = 2)
b. μ = 50 and σ² = 16 (σ = 4)
c. μ = 54 and σ² = 4 (σ = 2)
d. μ = 54 and σ² = 16 (σ = 4)
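Both questions can be approached by converting the score to a z-score, z = (X − μ)/σ, under each set of parameters and looking for the largest value. The sketch below assumes the listed values 4 and 16 are variances, so σ is their square root; the ranking of the options comes out the same if they are read as standard deviations. The helper names are illustrative.

```python
from math import sqrt


def z_score(x, mu, variance):
    """Standardized position of x: (x - mu) / sigma, with sigma = sqrt(variance)."""
    return (x - mu) / sqrt(variance)


# (mu, variance) for each option, as listed on the slides.
options = {"a": (50, 4), "b": (50, 16), "c": (54, 4), "d": (54, 16)}


def best_option(x):
    """Label of the option under which x has the highest z-score."""
    return max(options, key=lambda label: z_score(x, *options[label]))


for x in (54, 46):
    for label, (mu, variance) in options.items():
        print(f"X={x} option {label}: z = {z_score(x, mu, variance):+.2f}")
    print(f"X={x}: highest z under option {best_option(x)}")
```

A score above the mean benefits from a small standard deviation, and a score below the mean benefits from a large one.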

39 Covariance
Correlation is based on a statistic called covariance (COVxy or Sxy): COVxy = SP/(N − 1)
Correlation: r = SP/√(SSx·SSy)
Covariance is a number that reflects the degree to which 2 variables vary together.
Original Data: X, Y
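The two formulas can be worked through on a small data set. The X and Y values below are hypothetical, since the slide's original data table is not available; SP is taken to be the sum of the products of the paired deviations.

```python
from math import sqrt

# Hypothetical paired scores (not the slide's original data).
x = [1, 2, 4, 5]
y = [2, 3, 5, 8]
n = len(x)

mean_x = sum(x) / n                                   # 3.0
mean_y = sum(y) / n                                   # 4.5

# SP: sum of products of the paired deviations from each mean.
sp = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))   # 14.0
ss_x = sum((xi - mean_x) ** 2 for xi in x)            # 10.0
ss_y = sum((yi - mean_y) ** 2 for yi in y)            # 21.0

cov_xy = sp / (n - 1)              # covariance: SP/(N - 1)
r = sp / sqrt(ss_x * ss_y)         # correlation: SP/sqrt(SSx * SSy)

print(round(cov_xy, 2), round(r, 3))    # 4.67 0.966
```

Covariance depends on the units of X and Y, while r rescales it to lie between −1 and +1.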

41 Covariance

42 Descriptive Statistics for Non-dichotomous Variables

43 X6 X7 5 1 Calculate the Variance 5 1

44 Descriptive Statistics for Dichotomous Data (students assignments)

45 X6 X7 0 1 Calculate the Variance 0 1

46 Descriptive Statistics for Dichotomous Data Item Variance & Covariance

47 FACTORS THAT AFFECT VARIABILITY
1. Extreme Scores: e.g., 1, 3, 8, 11, 1,000,000. We can't use the range in this situation, but we can use the other measures of variability.
2. Sample Size: Increasing the sample size will change the range, so we can't use the range in this situation, but we can use the other measures of variability (e.g., IQR).
3. Stability Under Sampling (see next slide): The S and S² for all samples of a population should be the same because they come from the same population (all slices of a pizza should taste the same).
4. Open-Ended Distribution: When we don't know the highest and lowest scores in a distribution (because we keep adding to the sample), we use the SIQR.

48

49 Variability

50

51 DATA ENTRY

52 Sample Survey Questionnaire

53 Bootstrap
Bootstrapping is a method for deriving robust estimates of standard errors and confidence intervals for estimates such as the mean, median, proportion, odds ratio, correlation coefficient or regression coefficient.
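A minimal sketch of a percentile bootstrap for the mean, using only Python's standard library. The scores, the number of resamples (2000), and the fixed seed are illustrative assumptions; `bootstrap_mean` is a hypothetical helper, and statistical packages typically offer more refined interval methods.

```python
import random
from statistics import mean, quantiles, stdev


def bootstrap_mean(scores, n_resamples=2000, seed=0):
    """Bootstrap standard error and 95% percentile interval for the mean."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    n = len(scores)
    # Resample with replacement, same size as the original sample each time.
    means = [mean(rng.choices(scores, k=n)) for _ in range(n_resamples)]
    standard_error = stdev(means)
    # n=40 gives cut points at 2.5%, 5%, ..., 97.5%; keep the outer two.
    cuts = quantiles(means, n=40)
    return standard_error, cuts[0], cuts[-1]


scores = [13, 14, 15, 12, 13, 14, 13, 16, 15, 9]   # illustrative scores
se, lower, upper = bootstrap_mean(scores)
print(f"mean = {mean(scores):.2f}, SE = {se:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

The same loop works for the median or a correlation coefficient by swapping the statistic computed on each resample.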

54 Please read the sample review questions and take the Quiz 2.

