Lecture 8
Data Analysis: Univariate Analysis and Data Description
Research Methods and Statistics
Quantitative Data Analysis
Descriptive statistics: the use of statistics to summarize, describe, or explain the essential characteristics of a data set.
-Frequency Distributions
-Measures of Central Tendency
-Measures of Variability
Inferential statistics: the use of statistics to make generalizations or inferences about the characteristics of a population using data from a sample.
-Estimation
-Hypothesis Testing
Frequency Distributions
A frequency distribution describes the number or percentage of occurrences of each value of a variable.
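A frequency distribution can be tabulated in a few lines. The sketch below uses only Python's standard library; the responses are hypothetical and are there only to show how counts and percentages are obtained.

```python
from collections import Counter

# Hypothetical responses to one survey item (not data from the lecture).
responses = ["agree", "neutral", "agree", "disagree", "agree", "neutral"]


def frequency_distribution(values):
    """Return {value: (count, percent)} for each distinct value."""
    counts = Counter(values)
    total = len(values)
    return {
        value: (count, 100 * count / total)
        for value, count in counts.most_common()
    }


table = frequency_distribution(responses)
for value, (count, percent) in table.items():
    print(f"{value:<10}{count:>3}{percent:>8.1f}%")
```

The percentages sum to 100 because every case is assigned to exactly one value of the variable.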
Normal Distribution
[Figure: normal curve; horizontal axis runs from low values to high values, vertical axis shows frequency]
Characteristics of a Normal Distribution
-Symmetrical
-Unimodal
-e.g., standardized test scores, physical and psychological variables
Normality Assumption
-Necessary for conducting many inferential statistical procedures
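The two characteristics listed above can be checked numerically. This is a minimal sketch using `statistics.NormalDist` from Python's standard library; the mean of 100 and SD of 15 are assumed values chosen for illustration, not figures from the lecture.

```python
from statistics import NormalDist

# Hypothetical test-score distribution: mean 100, SD 15 (assumed values).
scores = NormalDist(mu=100, sigma=15)

# Symmetry: points equally far below and above the mean have equal density.
density_below = scores.pdf(85)
density_above = scores.pdf(115)

# Unimodality: the density peaks at the mean and falls away on both sides.
density_at_mean = scores.pdf(100)

# Symmetry also implies that half of the distribution lies below the mean.
share_below_mean = scores.cdf(100)

print(density_below, density_above, density_at_mean, share_below_mean)
```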
Non-Normal Distributions
Distributions that lack symmetry are skewed.
Distributions that have two frequently occurring values are bimodal.
[Figure panels: Positively Skewed Distribution; Negatively Skewed Distribution; Bimodal Distribution]
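Skewness and bimodality can also be detected from the numbers rather than from a plot. The sketch below computes the moment-based (Fisher-Pearson) skewness coefficient by hand and uses `statistics.multimode` to list tied modes. All data sets are hypothetical, and statistical packages often apply a small-sample adjustment, so their skewness values may differ slightly from this one.

```python
from statistics import fmean, multimode


def skewness(values):
    """Moment-based skewness: third central moment / (second ** 1.5)."""
    mean = fmean(values)
    m2 = fmean([(x - mean) ** 2 for x in values])
    m3 = fmean([(x - mean) ** 3 for x in values])
    return m3 / m2 ** 1.5


# Hypothetical data sets, chosen only to show the sign of the coefficient.
symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 2, 2, 3, 3, 3, 4, 10]    # long tail toward high values
left_tailed = [-x for x in right_tailed]    # mirror image: tail toward low values
bimodal = [1, 2, 2, 2, 3, 4, 5, 5, 5, 6]    # two equally frequent values

print(skewness(symmetric))      # zero: no skew
print(skewness(right_tailed))   # positive: positively skewed
print(skewness(left_tailed))    # negative: negatively skewed
print(multimode(bimodal))       # the two modes
```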
Measures of Central Tendency
Measures of central tendency provide information about the single numerical value that is most typical of the values of a variable.
Mean (average): the sum of the values of all cases divided by the total number of cases
Median: the center point in an ordered set of values of a variable
Mode: the most frequently occurring value of a variable

Case      Annual Salary
1         $20,100
2         $22,700
3         $25,600
4         $26,400
5         $27,900
6         $32,600
7         $38,400
8         $42,600
9         $55,700
10        $60,000
11        $550,000
Total     $902,000
Mean      $82,000
Median    $32,600
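The summary rows of the salary table can be reproduced with Python's `statistics` module. The variable names are illustrative; the salary figures are the eleven cases listed above.

```python
from statistics import mean, median, multimode

# Annual salaries for the eleven cases in the table.
salaries = [20100, 22700, 25600, 26400, 27900, 32600,
            38400, 42600, 55700, 60000, 550000]

total = sum(salaries)          # sum of the values of all cases
average = mean(salaries)       # total divided by the number of cases
middle = median(salaries)      # 6th of the 11 ordered values
modes = multimode(salaries)    # most frequent value(s)

print(total, average, middle)  # 902000 82000 32600
print(len(modes))              # 11: every salary occurs exactly once
```

Because no salary repeats, every value ties for the mode, so the mode says nothing useful about this variable.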
Central Tendency & Normal Distributions
The mean and the median are affected by skewness, or lack of symmetry in the data. The mean is pulled furthest toward the tail, the median less so.
[Figure: three frequency curves. Negatively skewed distribution: mean < median < mode. Normal distribution: mean, median, and mode coincide. Positively skewed distribution: mode < median < mean.]
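The annual salary figures from the central tendency slide give a concrete case of positive skew. As a rough sketch, dropping the single extreme salary and recomputing shows how much more the mean moves than the median; the variable names are illustrative.

```python
from statistics import mean, median

salaries = [20100, 22700, 25600, 26400, 27900, 32600,
            38400, 42600, 55700, 60000, 550000]
without_top = salaries[:-1]    # same cases minus the $550,000 salary

mean_all, mean_trimmed = mean(salaries), mean(without_top)
median_all, median_trimmed = median(salaries), median(without_top)

print(mean_all, mean_trimmed)        # 82000 35200
print(median_all, median_trimmed)    # 32600 30250.0

# Size of the change caused by one extreme case.
mean_shift = mean_all - mean_trimmed
median_shift = median_all - median_trimmed
print(mean_shift, median_shift)      # 46800 2350.0
```

One case in the upper tail moves the mean by $46,800 but the median by only $2,350, which is why the median is usually the better summary for a skewed variable such as income.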
Measures of Variability
Measures of variability provide information about how "spread out" the values of a variable are. Ex. standard deviation (SD), variance
Range: the difference between the highest and lowest values
Standard Deviation (SD): how far the values tend to vary from the mean

Case #    School A    School B
Mean 70
Median 70
SD 15.145
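Two groups can share the same mean and median and still differ widely in spread. The sketch below uses hypothetical scores for two schools (not the figures from the slide) and `statistics.stdev`, which computes the sample standard deviation with an n - 1 denominator; `statistics.pstdev` would give the population version.

```python
from statistics import mean, median, stdev

# Hypothetical test scores: same centre, different spread.
school_a = [60, 65, 70, 75, 80]
school_b = [40, 55, 70, 85, 100]


def spread(scores):
    """Summarise the centre and variability of one group of scores."""
    return {
        "mean": mean(scores),
        "median": median(scores),
        "range": max(scores) - min(scores),
        "sd": stdev(scores),    # sample SD (n - 1 denominator)
    }


summary_a = spread(school_a)
summary_b = spread(school_b)
print(summary_a)
print(summary_b)
```

Both groups have a mean and median of 70, but the range and SD of the second group are three times those of the first, so the central value alone would hide the difference.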
Measures of Variability
Percentile: a value below which a certain percent of the ordered observations in a distribution are located.
Inter-quartile range: the range of values within which the middle 50 percent of the observations fall, i.e. the distance between the first and third quartiles
-The first quartile: value below which 25 percent of the cases are found
-The second quartile: value below which 50 percent of the cases are found (the median)
-The third quartile: value below which 75 percent of the cases are found
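Quartiles and the inter-quartile range can be computed with `statistics.quantiles`. This sketch reuses the annual salary figures from the central tendency slide and the function's default "exclusive" method; other software may interpolate differently, so the quartiles it reports can differ slightly.

```python
from statistics import quantiles

salaries = [20100, 22700, 25600, 26400, 27900, 32600,
            38400, 42600, 55700, 60000, 550000]

# n=4 splits the ordered data into four parts and returns the three cut points.
q1, q2, q3 = quantiles(salaries, n=4)
iqr = q3 - q1    # width of the middle 50 percent of cases

print(q1, q2, q3)    # 25600.0 32600.0 55700.0
print(iqr)           # 30100.0
```

The second quartile equals the median ($32,600), and the inter-quartile range is unaffected by the $550,000 salary, unlike the range or the standard deviation.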
Describing Single Variables: Univariate Analysis

Variable Type                                Measure of Central Tendency    Measure of Variability
Nominal (e.g. gender)                        Mode                           n/a
Ordinal (e.g. Ed degree)                     Median                         Range
Ordinal (e.g. Likert scale)                  Mean                           Standard Deviation
Interval/Ratio (e.g. income, test scores)    Mean                           Standard Deviation
Interval/Ratio (e.g. income, test scores)    Median                         Range
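The table above can be read as a lookup from variable type to summary statistics. The sketch below is one possible encoding, not a prescribed procedure: the function name, the level labels, and the sample data are all hypothetical, and it applies only the first-listed option for each variable type (median and range for ordinal data, mean and SD for interval/ratio data).

```python
from statistics import mean, median, multimode, stdev


def describe(values, level):
    """Summarise one variable according to its level of measurement.

    `level` is "nominal", "ordinal", or "interval" (hypothetical labels;
    "interval" covers both interval and ratio variables).
    """
    if level == "nominal":
        return {"mode": multimode(values)}          # no measure of variability
    if level == "ordinal":
        return {"median": median(values),
                "lowest": min(values), "highest": max(values)}
    if level == "interval":
        return {"mean": mean(values), "sd": stdev(values)}
    raise ValueError(f"unknown level of measurement: {level!r}")


# Hypothetical examples for each variable type.
print(describe(["F", "M", "F", "F"], "nominal"))
print(describe([1, 2, 2, 3, 4], "ordinal"))         # coded degree levels
print(describe([60, 65, 70, 75, 80], "interval"))   # test scores
```

Ordinal values are assumed to be coded as ordered numbers, and the range is reported as the lowest and highest codes because the distance between ordinal codes has no fixed meaning.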
Class Exercise
Use “bodytemp.sav”
-Form a frequency distribution of the variable “body_temp”
-Include a table that shows the mean, standard deviation, skewness, and the values of the distribution’s 25th, 50th, and 75th percentiles.
-Plot the values of the variable using a histogram that has a normal distribution superimposed over it.
-Based on your output, is the distribution of the variable “body_temp” approximately normal?