Univariate Descriptive Statistics

1 Univariate Descriptive Statistics
Lecture 16: Univariate Descriptive Statistics

2 Agenda
Results of Data Search
Using Descriptive Statistics

3 Probability and Statistics
Statistics deals with what we observe and how it compares to what might be expected by chance. For now, we especially care about the normal (Gaussian) distribution.

4 The Normal Curve
Several properties:
Unimodal
Mean = median = mode
Curve is symmetric
Range is infinite

5 Simple frequencies… quick and dirty way to see if you should even care about a “variable”
Example in GSS data: (tab attend)
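
A simple frequency table of the kind described above can be sketched in Python as follows. This is only an illustration: the `attend` codes are made-up values, not actual GSS data, and `frequency_table` is a hypothetical helper name, not a Stata or library function.

```python
from collections import Counter

def frequency_table(values):
    """Return (value, count, percent) rows sorted by value."""
    counts = Counter(values)
    total = len(values)
    return [(v, c, 100.0 * c / total) for v, c in sorted(counts.items())]

# Hypothetical attendance codes (illustrative only, not real GSS data)
attend = [0, 0, 1, 2, 2, 2, 4, 4, 8, 8]

rows = frequency_table(attend)
for value, count, percent in rows:
    print(f"{value:>5} {count:>5} {percent:6.1f}%")
```

If one category holds nearly all the cases, the variable has little variation to explain, which is the "should you even care" check.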

6 Describing Simple Distributions of Data
Central Tendency: some way of “typifying” a distribution of values, scores, etc.
Mean (sum of scores divided by number of scores)
Median (middle score, as found by rank)
Mode (most common value from set of values)
In a normal distribution, all 3 measures are equal.
What are the relative advantages of mean, median and mode?
Mode = good for qualitative (categorical) variables, where a mean is not a meaningful description
Median = not pulled by big outliers; tells you what lies in the middle
Mean = ‘best’ overall measure of the middle when there are no extreme outliers; all values have equal weight in computing it
STATA example: pull up GSS data; demonstration of summary statistics for different types of variables
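
The effect of an outlier on the three measures can be sketched with Python's standard `statistics` module. The scores are hypothetical, chosen only to make the contrast visible.

```python
import statistics

# Hypothetical scores (illustrative only)
scores = [2, 3, 3, 4, 5, 6, 7]
with_outlier = scores + [90]  # same scores plus one extreme value

for label, data in [("original", scores), ("with outlier", with_outlier)]:
    print(
        label,
        "mean =", round(statistics.mean(data), 2),
        "median =", statistics.median(data),
        "mode =", statistics.mode(data),
    )
```

Adding the single extreme score moves the mean a long way, while the median shifts only slightly and the mode does not change.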

7 Special Features of the Mean
Sum of the deviations from the mean of all scores = zero. The mean is the point in a distribution where the two halves are balanced.
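
A quick numerical check of this property, using hypothetical scores: the deviations below the mean and those above it cancel exactly.

```python
import math

scores = [2, 4, 6, 8, 15]  # hypothetical scores
mean = sum(scores) / len(scores)

deviations = [x - mean for x in scores]
below = sum(d for d in deviations if d < 0)  # total pull from scores below the mean
above = sum(d for d in deviations if d > 0)  # total pull from scores above the mean
total_deviation = sum(deviations)

print("mean:", mean)
print("below:", below, "above:", above, "total:", total_deviation)
```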

8 Using Central Tendencies in Recoding
“Splitting” metrics into binary variables
High/Low (mean)
Most common, least common (mode)
“Collapsing” variables
Groups of scores in different ranges above and below the mean
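
Both kinds of recoding can be sketched as below. The function names, the `ages` values, and the choice of one standard deviation as the band width for collapsing are all assumptions made for illustration; the slide does not specify cutoffs.

```python
import statistics

def split_high_low(values):
    """Recode each value as "high" (above the mean) or "low" (at or below it)."""
    cutoff = statistics.mean(values)
    return ["high" if v > cutoff else "low" for v in values]

def collapse(values):
    """Collapse values into three groups around the mean.

    Assumption: the bands are one (population) standard deviation wide.
    """
    m = statistics.mean(values)
    s = statistics.pstdev(values)
    groups = []
    for v in values:
        if v < m - s:
            groups.append("well below mean")
        elif v > m + s:
            groups.append("well above mean")
        else:
            groups.append("near mean")
    return groups

ages = [18, 22, 25, 31, 40, 44, 58, 82]  # hypothetical values
print(split_high_low(ages))
print(collapse(ages))
```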

9 Dispersion
Range: Difference between highest value and the lowest value.
Standard Deviation: A statistic that describes how tightly the values are clustered around the mean.
Variance: A measure of spread. Computed as the average squared deviation of each value from its mean.
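
The three measures can be computed directly from their definitions. The sketch below uses hypothetical scores and divides by the number of scores, matching the "average squared deviation" wording; note that sample-based formulas divide by n - 1 instead, which is what `statistics.variance` does.

```python
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores

value_range = max(scores) - min(scores)
mean = statistics.mean(scores)

# Average squared deviation from the mean (divides by n)
variance = sum((x - mean) ** 2 for x in scores) / len(scores)
std_dev = variance ** 0.5

print("range:", value_range)
print("variance:", variance)
print("standard deviation:", std_dev)
```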

10 Properties of Standard Deviation
Variance is just the square of the S.D. (or, the S.D. is the square root of the variance)
If a constant is added to all scores, it has no impact on the S.D.
If all scores are multiplied by a constant, the dispersion (S.D. and variance) changes
S = standard deviation
X = individual score
M = mean of all scores
n = sample size (number of scores)
Standard deviation and variance: basically the exact same thing, but… why would we want to square the S.D.?
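
The formula that the symbols S, X, M and n refer to is not preserved in this transcript. A standard form consistent with those symbols is sketched below; dividing by n is an assumption based on the "average squared deviation" definition of variance, and many texts divide by n - 1 when estimating from a sample. The two properties stated above follow directly from it.

```latex
% Standard deviation (population form; sample form uses n - 1 in the denominator)
S = \sqrt{\frac{\sum_{i=1}^{n} (X_i - M)^2}{n}}, \qquad \text{variance} = S^2

% Adding a constant c: the mean shifts by c, so every deviation is unchanged
X_i' = X_i + c \;\Rightarrow\; M' = M + c,\quad X_i' - M' = X_i - M \;\Rightarrow\; S' = S

% Multiplying by a constant c: every deviation is scaled by c
X_i' = c\,X_i \;\Rightarrow\; M' = c\,M,\quad X_i' - M' = c\,(X_i - M) \;\Rightarrow\; S' = |c|\,S,\quad S'^2 = c^2 S^2
```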

11 Common Data Representations
Histograms (hist)
Simple graphs of the density or frequency
With density, the area of each bar is the share of observations in that bin, and total area = 100%
Box Plots (graph box)
Yet another way of displaying dispersion.
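
The numbers behind both displays can be sketched without any plotting library. The `histogram` helper, the bin settings, and the `ages` values are hypothetical; quartile conventions also differ between packages, so the quartiles here may not match another program's box plot exactly.

```python
import math
import statistics

def histogram(values, bin_width, start):
    """Return (left_edge, count, density) per bin.

    Assumes every value is >= start. Densities are scaled so that
    the bar areas (density * bin_width) sum to 1, i.e. 100%.
    """
    n = len(values)
    n_bins = int((max(values) - start) // bin_width) + 1
    counts = [0] * n_bins
    for v in values:
        counts[int((v - start) // bin_width)] += 1
    return [(start + i * bin_width, c, c / (n * bin_width))
            for i, c in enumerate(counts)]

ages = [18, 22, 25, 31, 40, 44, 58, 82]  # hypothetical values
bin_width = 20
bins = histogram(ages, bin_width=bin_width, start=10)
total_area = sum(density * bin_width for _, _, density in bins)

for left, count, density in bins:
    print(f"[{left}, {left + bin_width}) {'#' * count}  density={density:.5f}")
print("total area:", total_area)

# Box plot ingredients: quartiles plus the extremes
q1, med, q3 = statistics.quantiles(ages, n=4)
print("min:", min(ages), "Q1:", q1, "median:", med, "Q3:", q3, "max:", max(ages))
```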

13 Issues with Normal Distributions
Skewness
Kurtosis
Skewness in a normal distribution = 0
Kurtosis = 3 (though some texts use excess kurtosis, which is 0 for a normal distribution)
A distribution is platykurtic if it is flatter than the corresponding normal curve and leptokurtic if it is more peaked than the normal curve.
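
Skewness and kurtosis can be sketched as ratios of central moments. The functions below use the simple population (divide by n) moment formulas, which is an assumption; statistical packages may apply small-sample corrections and report slightly different values. The data are hypothetical.

```python
def central_moment(values, k):
    """k-th central moment: the average of (x - mean) ** k."""
    m = sum(values) / len(values)
    return sum((x - m) ** k for x in values) / len(values)

def skewness(values):
    """Third central moment scaled by the S.D. cubed; 0 for symmetric data."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def kurtosis(values):
    """Fourth central moment scaled by the variance squared; 3 for a normal curve."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2

flat = [1, 2, 3, 4, 5]               # symmetric and flat (hypothetical)
right_skewed = [1, 1, 2, 2, 3, 12]   # long right tail (hypothetical)

print("flat:", skewness(flat), kurtosis(flat))
print("right-skewed:", skewness(right_skewed), kurtosis(right_skewed))
```

The flat data set has kurtosis below 3, so it counts as platykurtic; subtracting 3 from either result gives the excess kurtosis.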

14 In-Class Examples in STATA
(Using GSS_96small.dta)

