Statistics Intro: Univariate Analysis, Central Tendency, Dispersion
Review of Descriptive Statistics
• Descriptive statistics are used to present quantitative descriptions in a manageable form.
• They work by reducing large amounts of data to a simpler summary.
• Examples:
  – Batting average in baseball
  – Cornell’s grade-point system
Univariate Analysis
• Univariate analysis examines one variable at a time across cases.
• Frequency distributions are used to group data.
• One may set up margins (category boundaries) that group cases into categories.
• Examples include:
  – Age categories
  – Price categories
  – Temperature categories
Distributions
• There are two ways to describe a univariate distribution:
  – A table
  – A graph (histogram, bar chart)
Distributions (cont.)
• Distributions may also be displayed using percentages.
• For example, one could use percentages to describe the share of people:
  – Under the poverty level
  – Over a certain age
  – Over a certain score on a standardized test
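Grouping cases into categories and reporting each category as a percentage can be sketched in a few lines of Python. This is only an illustration: the ages and the category margins below are invented for the example and do not come from the slides, and `categorize` is a hypothetical helper name.

```python
from collections import Counter

# Hypothetical ages, invented purely for illustration.
ages = [22, 37, 41, 29, 65, 34, 58, 47, 19, 72]

# Hypothetical margins: (label, lower bound inclusive, upper bound exclusive).
categories = [
    ("Under 35", 0, 35),
    ("35-64", 35, 65),
    ("65 and over", 65, 200),
]

def categorize(age):
    """Return the label of the category whose margins contain the age."""
    for label, low, high in categories:
        if low <= age < high:
            return label
    raise ValueError(f"age {age} falls outside every category")

# Frequency distribution: how many cases land in each category.
counts = Counter(categorize(age) for age in ages)

# The same distribution expressed as percentages of all cases.
percents = {label: 100 * counts[label] / len(ages) for label, _, _ in categories}

for label, _, _ in categories:
    print(f"{label:<12} {counts[label]:>3} {percents[label]:>6.1f}%")
```

Making the lower bound inclusive and the upper bound exclusive is one possible convention; what matters is that every case falls into exactly one category.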
Distributions (cont.)
A Frequency Distribution Table
Category    Percent
Under 35    9%
Distributions (cont.)
[Figure: A Histogram]
Central Tendency
• An estimate of the “center” of a distribution
• Three different types of estimates:
  – Mean
  – Median
  – Mode
Mean
• The mean is the most commonly used measure of central tendency.
• To compute it, total all the results and then divide by the number of units, or “n”, of the sample.
• Example: The HSS 292 Quiz 1 mean was the sum of all the scores divided by the number of students taking the quiz.
Working Example (Mean)
• Let’s take the set of scores: 15, 20, 21, 20, 36, 15, 25, 15
• The mean would be 167/8 = 20.875
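The same calculation can be checked in Python. This is a minimal sketch using the scores from the slide; the variable names are illustrative, and `statistics.mean` from the standard library is shown only as a cross-check on the manual arithmetic.

```python
from statistics import mean

scores = [15, 20, 21, 20, 36, 15, 25, 15]

total = sum(scores)        # sum of all the scores
n = len(scores)            # number of scores, the "n" of the sample
manual_mean = total / n    # total divided by n

# The standard library performs the same calculation.
library_mean = mean(scores)

print(total, n, manual_mean, library_mean)
```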
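Median
• The median is the score found at the exact middle of the set.
• List all scores in numerical order, then locate the score in the center of the sample.
• Example: If there are 500 scores in the list, the median is the average of scores #250 and #251. With an odd number of scores, the median is the single middle score.
• Because it depends only on the middle of the ordered list, the median is much less affected by outliers than the mean.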
Working Example (Median)
• Let’s take the set of scores: 15, 20, 21, 20, 36, 15, 25, 15
• First line up the scores in order.
• 15, 15, 15, 20, 20, 21, 25, 36
• There are 8 scores, so scores #4 and #5 mark the halfway point. Both are 20, so the median is 20.
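As a rough sketch, the procedure of sorting and then locating the middle can be written out in Python. The variable names are illustrative; the even-count branch averages the two middle scores, and `statistics.median` is included only to confirm the manual result.

```python
from statistics import median

scores = [15, 20, 21, 20, 36, 15, 25, 15]

ordered = sorted(scores)   # line up the scores in numerical order
n = len(ordered)

if n % 2 == 1:
    # Odd count: a single score sits at the exact middle.
    manual_median = ordered[n // 2]
else:
    # Even count: average the two scores on either side of the halfway point.
    manual_median = (ordered[n // 2 - 1] + ordered[n // 2]) / 2

library_median = median(scores)
print(ordered, manual_median, library_median)
```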
Mode
• The mode is the most frequently repeated score in the set of results.
• Let’s take the set of scores: 15, 20, 21, 20, 36, 15, 25, 15
• Again we first line up the scores.
• 15, 15, 15, 20, 20, 21, 25, 36
• 15 appears three times, more than any other score, and is therefore the mode.
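Counting repeats is easy to get wrong by eye, so here is a small Python sketch of the same tally. The variable names are illustrative; `collections.Counter` does the counting and `statistics.mode` serves as a cross-check.

```python
from collections import Counter
from statistics import mode

scores = [15, 20, 21, 20, 36, 15, 25, 15]

# Tally how many times each score occurs.
counts = Counter(scores)

# most_common(1) returns a list holding the single (score, count) pair
# with the highest count.
most_common_score, frequency = counts.most_common(1)[0]

library_mode = mode(scores)
print(most_common_score, frequency, library_mode)
```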
Central Tendency
• If the distribution is normal (i.e., bell-shaped), the mean, median and mode are all equal.
• In our analyses, we’ll use the mean.
Dispersion
• Two types of estimates:
  – Range
  – Standard deviation
• The standard deviation is the more accurate and detailed of the two: it uses every score, whereas the range depends only on the two extremes, so a single outlier can greatly extend it.
Range
• The range is the difference between the highest and lowest scores.
• Let’s take the set of scores: 15, 20, 21, 20, 36, 15, 25, 15
• The range would be 36 - 15 = 21. This means that 21 points separate the highest score from the lowest.
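A short Python sketch of the range calculation follows. The second half is an added illustration, not part of the slides: it drops the highest score (36) to show how strongly a single extreme value can influence the range.

```python
scores = [15, 20, 21, 20, 36, 15, 25, 15]

highest = max(scores)
lowest = min(scores)
score_range = highest - lowest   # distance between the two extremes

# Illustration only: remove the single highest score and recompute.
without_highest = [s for s in scores if s != highest]
range_without_highest = max(without_highest) - min(without_highest)

print(score_range, range_without_highest)
```

Removing one score out of eight cuts the range roughly in half here, which is why the range alone is a fragile measure of dispersion.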
Standard Deviation
• The standard deviation is a value that shows how individual scores relate to the mean of the sample, that is, how far scores typically fall from the mean.
• If scores are standardized to a normal curve, several statistical manipulations can be performed to analyze the data set.
Standard Dev. (cont.)
• Assumptions may be made about the percentage of scores that fall within a given distance of the mean.
• If scores are normally distributed, approximately 68% of the scores in the sample fall within one standard deviation of the mean.
• Approximately 95% of the scores fall within two standard deviations of the mean.
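These percentages follow from the shape of the normal curve and can be computed rather than memorized. The sketch below uses the standard identity that the fraction of a normal distribution within k standard deviations of the mean equals erf(k / sqrt(2)); `fraction_within` is a hypothetical helper name introduced for this example.

```python
import math

def fraction_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

within_one = fraction_within(1)   # roughly 0.683
within_two = fraction_within(2)   # roughly 0.954

print(f"within 1 SD: {within_one:.1%}")
print(f"within 2 SD: {within_two:.1%}")
```

These figures hold only to the extent that the scores really are normally distributed; a skewed set of scores can depart from them noticeably.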
Standard Dev. (cont.)
• To calculate the standard deviation, sum the squared deviations of all the scores from the mean, divide that sum by the number of scores, and take the square root of the result.
• Squaring the deviations keeps positive and negative deviations from canceling each other out.
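Written as a formula, the verbal description above looks like the following. The symbols are assumptions introduced here for illustration, not taken from the slides: x_i is an individual score, \bar{x} is the mean, n is the number of scores, and \sigma is the standard deviation.

```latex
\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2}
```

Dividing by n, as the slides do, gives what is usually called the population standard deviation. Many textbooks and software packages divide by n - 1 instead when the scores are a sample drawn from a larger population, which yields a slightly larger value.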
Working Example (Standard Dev.)
• Let’s take the set of scores: 15, 20, 21, 20, 36, 15, 25, 15
• The mean of this sample was found to be 20.875. Round up to 21.
• Again we first line up the scores.
• 15, 15, 15, 20, 20, 21, 25, 36
• Subtract the mean from each score: 15-21=-6, 15-21=-6, 15-21=-6, 20-21=-1, 20-21=-1, 21-21=0, 25-21=4, 36-21=15.
Working Example (Standard Dev., cont.)
• Square these values.
• 36, 36, 36, 1, 1, 0, 16, 225
• Total these values: 351.
• Divide 351 by 8, the number of scores: 43.875.
• Take the square root of 43.875.
• 6.62 is your standard deviation.
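The worked example can be reproduced step by step in Python. This is a sketch under two stated assumptions: it follows the slides in rounding the mean to 21 and in dividing by the number of scores (8). For comparison it also calls `statistics.pstdev`, which uses the exact mean and divides by n, and `statistics.stdev`, which divides by n - 1.

```python
import math
from statistics import pstdev, stdev

scores = [15, 20, 21, 20, 36, 15, 25, 15]

rounded_mean = 21  # the slides round 20.875 up to 21

# Deviation of each score from the (rounded) mean.
deviations = [x - rounded_mean for x in scores]

# Squaring removes the sign of each deviation.
squared = [d ** 2 for d in deviations]
sum_of_squares = sum(squared)

# Divide by the number of scores, then take the square root.
slide_sd = math.sqrt(sum_of_squares / len(scores))

# Library cross-checks.
population_sd = pstdev(scores)   # exact mean, divides by n
sample_sd = stdev(scores)        # exact mean, divides by n - 1

print(sum_of_squares, round(slide_sd, 2), round(population_sd, 2), round(sample_sd, 2))
```

Rounding the mean to 21 changes the result only in the third decimal place here, so the slide's figure and `pstdev` agree to two decimals. The n - 1 version comes out noticeably larger because the sample is so small.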