1
Univariate Descriptive Statistics
Lecture 15: Univariate Descriptive Statistics
2
Agenda
Finding Data for Quantitative Analysis
Using Descriptive Statistics
3
Finding Data for Quantitative Analysis
Berkeley Survey Research Center
UC Data (access to ICPSR)
Data on the Net
4
Thinking about Data Sources for Final Exam
5
Probability and Statistics
Statistics deal with what we observe and how it compares to what might be expected by chance.
A set of probabilities, one for each possible value of some variable X, forms a probability distribution.
For now, we especially care about the normal (Gaussian) distribution.
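To make the idea of a probability distribution concrete, here is a minimal Python sketch (Python is used here for illustration; the lecture itself works in Stata). It assumes a standard normal distribution with mean 0 and standard deviation 1, and the helper name `share_within` is hypothetical.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1 (an assumption for illustration).
standard_normal = NormalDist(mu=0.0, sigma=1.0)

def share_within(k: float) -> float:
    """Probability that a value falls within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD of the mean: {share_within(k):.4f}")
```

The three printed shares are roughly 68%, 95% and 99.7% of all values.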
6
The Normal Curve
7
Simple frequencies… quick and dirty way to see if you should even care about a “variable”
Example in GSS data: (tab attend)
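A simple frequency table like the one Stata's `tab` produces can be sketched in Python as follows. The responses are made up for illustration and are not real GSS values; `frequency_table` is a hypothetical helper.

```python
from collections import Counter

# Hypothetical responses; these are NOT real GSS data.
responses = ["never", "weekly", "never", "monthly", "weekly",
             "never", "yearly", "weekly", "never", "monthly"]

def frequency_table(values):
    """Return (value, count, percent) rows, most frequent value first."""
    counts = Counter(values)
    total = len(values)
    return [(value, count, 100.0 * count / total)
            for value, count in counts.most_common()]

table = frequency_table(responses)
for value, count, percent in table:
    print(f"{value:<10}{count:>5}{percent:>8.1f}%")
```

A variable whose responses nearly all fall in one category shows up immediately in a table like this.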
8
Describing Simple Distributions of Data
Central Tendency: some way of “typifying” a distribution of values, scores, etc.
Mean (sum of scores divided by number of scores)
Median (middle score, as found by rank)
Mode (most common value in a set of values)
In a normal distribution, all three measures are equal.
Example: GSS data
What are the relative advantages of the mean, median and mode?
Mode = good for qualitative descriptions, when a mean value is not a good descriptor
Median = good when you don’t want big outliers to have an effect and just want to know what lies in the middle
Mean = the ‘best’ overall summary of the middle when every value should carry equal weight in the computation
STATA example: pull up GSS data; demonstration of summary statistics for different types of variables
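The trade-off between the three measures can be seen in a small Python sketch (used here in place of the Stata demonstration). The scores are hypothetical, and `summarize` is an illustrative helper.

```python
from statistics import mean, median, mode

# Hypothetical scores, then the same scores plus one extreme value.
scores = [2, 3, 3, 4, 5, 6, 7]
with_outlier = scores + [90]

def summarize(values):
    """Return the three measures of central tendency for a list of numbers."""
    return {"mean": mean(values), "median": median(values), "mode": mode(values)}

print(summarize(scores))
print(summarize(with_outlier))
```

Adding the single outlier pulls the mean from about 4.3 up to 15, while the median only moves from 4 to 4.5 and the mode stays at 3.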
9
Special Features of the Mean
The sum of the deviations of all scores from the mean is zero.
Unlike the median, which splits the number of scores in half, the mean is the point in a distribution where the deviations above and below balance.
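A quick Python check of this balancing property, using hypothetical scores:

```python
from statistics import mean, median

scores = [2, 4, 4, 5, 10]  # hypothetical scores

m = mean(scores)
med = median(scores)

# Deviation of each score from the mean and from the median.
deviations_from_mean = [x - m for x in scores]
deviations_from_median = [x - med for x in scores]

print(sum(deviations_from_mean))    # deviations around the mean cancel out
print(sum(deviations_from_median))  # deviations around the median need not cancel
```

Here the mean is 5 and the median is 4: the deviations around the mean sum to zero, while those around the median sum to 5.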
10
Using Central Tendencies in Recoding
“Splitting” metrics into binary variables
“Collapsing” variables
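Both kinds of recoding can be sketched in Python. The ages, the band boundaries, and the choice to code values strictly above the median as 1 are all assumptions made for illustration.

```python
from statistics import median

# Hypothetical respondent ages; not taken from any real dataset.
ages = [19, 23, 31, 35, 42, 58, 64]

def split_at_median(values):
    """Recode a metric variable as 0/1: 1 if above the median, else 0."""
    cut = median(values)
    return [1 if v > cut else 0 for v in values]

def collapse_age(age):
    """Collapse an age in years into one of three illustrative bands."""
    if age < 30:
        return "under 30"
    if age < 50:
        return "30-49"
    return "50+"

above_median = split_at_median(ages)
age_bands = [collapse_age(a) for a in ages]
print(above_median)
print(age_bands)
```

Whether a score exactly at the median belongs in the low or the high group is a judgment call; this sketch puts it in the low group.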
11
Dispersion: Range, Standard Deviation, Variance
Range: The difference between the highest value and the lowest value.
Standard Deviation: A statistic that describes how tightly the values are clustered around the mean.
Variance: A measure of how much spread a distribution has, computed as the average squared deviation of each value from the mean.
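The three dispersion measures can be computed in a few lines of Python. This sketch takes "average squared deviation" literally and divides by the number of scores, which is what `statistics.pvariance` and `statistics.pstdev` do; `statistics.variance` and `statistics.stdev` divide by n - 1 instead. The scores are hypothetical.

```python
from statistics import mean, pstdev, pvariance

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores

value_range = max(scores) - min(scores)

# Variance by hand: average squared deviation from the mean.
m = mean(scores)
variance_by_hand = sum((x - m) ** 2 for x in scores) / len(scores)

print(value_range)
print(variance_by_hand, pvariance(scores))
print(pstdev(scores))
```

For these scores the range is 7, the variance is 4, and the standard deviation is 2.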
12
Properties of Standard Deviation
Variance is just the square of the S.D. (or, the S.D. is the square root of the variance).
If a constant is added to all scores, it has no impact on the S.D.
If all scores are multiplied by a constant, the dispersion changes: the S.D. is multiplied by the absolute value of the constant, and the variance by its square.
Notation: S = standard deviation; X = individual score; M = mean of all scores; n = sample size (number of scores)
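These properties can be checked numerically. The sketch below assumes the population form of the formula, S = sqrt(sum((X - M)^2) / n); whether the lecture divides by n or by n - 1 is not stated, and the properties hold either way. The scores and the helper name `sd` are hypothetical.

```python
import math

def sd(scores):
    """Standard deviation, assuming the population form (divide by n)."""
    n = len(scores)
    m = sum(scores) / n
    return math.sqrt(sum((x - m) ** 2 for x in scores) / n)

scores = [2, 4, 4, 4, 5, 5, 7, 9]      # hypothetical scores
shifted = [x + 10 for x in scores]     # add a constant to every score
scaled = [3 * x for x in scores]       # multiply every score by a constant
flipped = [-3 * x for x in scores]     # multiply by a negative constant

print(sd(scores), sd(shifted))   # adding a constant leaves the S.D. unchanged
print(sd(scaled), sd(flipped))   # multiplying by c scales the S.D. by |c|
print(sd(scaled) ** 2)           # the variance is scaled by c squared
```

With an S.D. of 2 for the original scores, the shifted scores keep an S.D. of 2, while both multiplied versions have an S.D. of 6 and a variance of 36.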
13
Why Variance Matters…
In many ways, explaining variance is the purpose of many statistical tests: they account for the variance in a dependent variable through one or more independent variables.
14
Common Data Representations
Histograms: Simple graphs of the frequency of groups of scores.
Stem-and-Leaf Displays: Another way of displaying dispersion, particularly useful when you do not have large amounts of data.
Box Plots: Yet another way of displaying dispersion. The box spans the 25th to 75th percentile range, the line within the box shows the median, and the “whiskers” show the range of values (min and max).
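The numbers behind a stem-and-leaf display and a box plot can be sketched in Python without any plotting library. The scores are hypothetical, the stem-and-leaf helper assumes non-negative whole numbers, and the quartiles use the "inclusive" method of `statistics.quantiles` (other software may interpolate percentiles slightly differently).

```python
from collections import defaultdict
from statistics import quantiles

scores = [12, 15, 21, 23, 23, 30, 34, 38, 41]  # hypothetical scores

def stem_and_leaf(values):
    """Map each stem (tens) to its sorted leaves (ones digits)."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)
    return dict(stems)

def five_number_summary(values):
    """Min, 25th percentile, median, 75th percentile, max: the parts of a box plot."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return min(values), q1, q2, q3, max(values)

for stem, leaves in stem_and_leaf(scores).items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")

print(five_number_summary(scores))
```

The stem-and-leaf rows keep every individual score visible, which is why the display works best with small amounts of data.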