SPSS Review CENTRAL TENDENCY & DISPERSION

Slides:



Advertisements
Similar presentations
Descriptive Statistics
Advertisements

Review of Basics. REVIEW OF BASICS PART I Measurement Descriptive Statistics Frequency Distributions.
Review of Basics. REVIEW OF BASICS PART I Measurement Descriptive Statistics Frequency Distributions.
Calculating & Reporting Healthcare Statistics
Descriptive Statistics – Central Tendency & Variability Chapter 3 (Part 2) MSIS 111 Prof. Nick Dedeke.
B a c kn e x t h o m e Parameters and Statistics statistic A statistic is a descriptive measure computed from a sample of data. parameter A parameter is.
PSY 307 – Statistics for the Behavioral Sciences
Descriptive Statistics
SOC 3155 SPSS CODING/GRAPHS & CHARTS CENTRAL TENDENCY & DISPERSION.
Statistical Analysis SC504/HS927 Spring Term 2008 Week 17 (25th January 2008): Analysing data.
Intro to Descriptive Statistics
Slides by JOHN LOUCKS St. Edward’s University.
Measures of Variability
Edpsy 511 Homework 1: Due 2/6.
Central Tendency & Variability Dec. 7. Central Tendency Summarizing the characteristics of data Provide common reference point for comparing two groups.
Data observation and Descriptive Statistics
Central Tendency and Variability Chapter 4. Central Tendency >Mean: arithmetic average Add up all scores, divide by number of scores >Median: middle score.
Measures of Central Tendency
Today: Central Tendency & Dispersion
Measures of Central Tendency
Measures of Central Tendency
Lecture 4 Dustin Lueker.  The population distribution for a continuous variable is usually represented by a smooth curve ◦ Like a histogram that gets.
Describing Data: Numerical
BIOSTATISTICS II. RECAP ROLE OF BIOSATTISTICS IN PUBLIC HEALTH SOURCES AND FUNCTIONS OF VITAL STATISTICS RATES/ RATIOS/PROPORTIONS TYPES OF DATA CATEGORICAL.
Descriptive Statistics Used to describe the basic features of the data in any quantitative study. Both graphical displays and descriptive summary statistics.
JDS Special Program: Pre-training1 Basic Statistics 01 Describing Data.
© Copyright McGraw-Hill CHAPTER 3 Data Description.
Statistics Recording the results from our studies.
BUS250 Seminar 4. Mean: the arithmetic average of a set of data or sum of the values divided by the number of values. Median: the middle value of a data.
PPA 501 – Analytical Methods in Administration Lecture 5a - Counting and Charting Responses.
Measures of Central Tendency and Dispersion Preferred measures of central location & dispersion DispersionCentral locationType of Distribution SDMeanNormal.
Central Tendency Introduction to Statistics Chapter 3 Sep 1, 2009 Class #3.
Descriptive Statistics: Numerical Methods
1 PUAF 610 TA Session 2. 2 Today Class Review- summary statistics STATA Introduction Reminder: HW this week.
Chapter 8 Quantitative Data Analysis. Meaningful Information Quantitative Analysis Quantitative analysis Quantitative analysis is a scientific approach.
1 1 Slide © 2007 Thomson South-Western. All Rights Reserved.
Central Tendency and Variability Chapter 4. Variability In reality – all of statistics can be summed into one statement: – Variability matters. – (and.
An Introduction to Statistics. Two Branches of Statistical Methods Descriptive statistics Techniques for describing data in abbreviated, symbolic fashion.
1 Univariate Descriptive Statistics Heibatollah Baghi, and Mastee Badii George Mason University.
Copyright © 2014 by Nelson Education Limited. 3-1 Chapter 3 Measures of Central Tendency and Dispersion.
INVESTIGATION 1.
Dr. Serhat Eren 1 CHAPTER 6 NUMERICAL DESCRIPTORS OF DATA.
A way to organize data so that it has meaning!.  Descriptive - Allow us to make observations about the sample. Cannot make conclusions.  Inferential.
SOC 3155 SPSS CODING/GRAPHS & CHARTS CENTRAL TENDENCY & DISPERSION h458 student
Chapter 2 Means to an End: Computing and Understanding Averages Part II  igma Freud & Descriptive Statistics.
SOC 3155 SPSS Review CENTRAL TENDENCY & DISPERSION.
Chapter 3: Averages and Variation Section 2: Measures of Dispersion.
Central Tendency & Dispersion
Lecture 4 Dustin Lueker.  The population distribution for a continuous variable is usually represented by a smooth curve ◦ Like a histogram that gets.
BASIC STATISTICAL CONCEPTS Chapter Three. CHAPTER OBJECTIVES Scales of Measurement Measures of central tendency (mean, median, mode) Frequency distribution.
Edpsy 511 Exploratory Data Analysis Homework 1: Due 9/19.
Statistical Analysis of Data. What is a Statistic???? Population Sample Parameter: value that describes a population Statistic: a value that describes.
LIS 570 Summarising and presenting data - Univariate analysis.
Chapter 2 Describing and Presenting a Distribution of Scores.
Summation Notation, Percentiles and Measures of Central Tendency Overheads 3.
Descriptive Statistics(Summary and Variability measures)
A way to organize data so that it has meaning!.  Descriptive - Allow us to make observations about the sample. Cannot make conclusions.  Inferential.
Chapter 6: Descriptive Statistics. Learning Objectives Describe statistical measures used in descriptive statistics Compute measures of central tendency.
Educational Research Descriptive Statistics Chapter th edition Chapter th edition Gay and Airasian.
Central Tendency and Variability Chapter 4. Variability In reality – all of statistics can be summed into one statement: – Variability matters. – (and.
SOC 3155 SPSS CODING/GRAPHS & CHARTS CENTRAL TENDENCY & DISPERSION h458 student
Describing Data: Summary Measures. Identifying the Scale of Measurement Before you analyze the data, identify the measurement scale for each variable.
©2013, The McGraw-Hill Companies, Inc. All Rights Reserved Chapter 2 Describing and Presenting a Distribution of Scores.
Chapter 4: Measures of Central Tendency. Measures of central tendency are important descriptive measures that summarize a distribution of different categories.
Lecture 8 Data Analysis: Univariate Analysis and Data Description Research Methods and Statistics 1.
SPSS CODING/GRAPHS & CHARTS CENTRAL TENDENCY & DISPERSION
B.A. –III Paper-III Unit-III Tabulation and Interpretation of DATA Measures of Central Tendency Measures of Dispersion By- Dr. Ankita Gupta Asstt. Professor.
Numerical Measures: Centrality and Variability
Descriptive Statistics
Central Tendency & Variability
Presentation transcript:

SPSS Review CENTRAL TENDENCY & DISPERSION SOC 3155 SPSS Review CENTRAL TENDENCY & DISPERSION

Survey Items Nominal Measurement Good Not as good What college (e.g., CLA, CHESP) are you enrolled in? Not as good Are you a freshman, sophomore, junior, or senior? What is your college major? Political affiliation/Religion

Survey Items Ordinal Interval/Ratio Mostly Likert-type Freshman, sophomore…. How far do you live from campus (0-5 miles…) Interval/Ratio GPA Hours study per week Number of credits carried

Review Class Exercise: Professor Numbscull surveys a random sample of Duluth residents. He wants to know if age predicts the support for banning so-called “synthetic marijuana” products (very supportive, somewhat supportive, not supportive). What are the IV and DV—what is their level of measurement?

Ratios/Proportions/Percents What percent of people enjoy fishing? What is the ratio of Ford F-150 owners to Toyota Prius Owners? What proportion of those who do not enjoy fishing drive a Honda Accord? Type of Vehicle Enjoy Fishing Do not enjoy Fishing TOTAL Honda Accord 11 14 25 Ford Escort 33 10 43 Toyota Prius 3 8 Ford F-150 Truck 41 2 TOTALS 88 34 122

SPSS CODING ALWAYS do “recode into different variable” INPUT MISSING DATA CODES Variable labels Check results with original variable Useful to have both numbers and variable labels on tables EDITOPTIONSOUTPUTPIVOT TABLES Variable values in label shown as values and labels

Class Exercise Recode a variable from GSS Create a variable called “married_r” where: 0 = not married 1 = married Be sure to label your new variable! Run a frequency distribution for the original married variable and “married_r”

Distribution of Scores All the observations for any particular sample or population Examples:

Distribution (GSS Histogram)

Exercise Create your own histogram from the GSS data using the variable “TVHOURS” Concept of “skew” Positive vs. negative Skew is only possible for ordered data, and typically relevant for interval/ratio data

Measures of Central Tendency Purpose is to describe a distribution’s typical case – do not say “average” case Mode Median Mean (Average)

Measures of Central Tendency Mode Value of the distribution that occurs most frequently (i.e., largest category) Only measure that can be used with nominal-level variables Limitations: Some distributions don’t have a mode Most common score doesn’t necessarily mean “typical” Often better off using proportions or percentages

Measures of Central Tendency

A more ambiguous mode

Measures of Central Tendency 2. Median value of the variable in the “middle” of the distribution same as the 50th percentile When N is odd #, median is middle case: N=5: 2 2 6 9 11 median=6 When N is even #, median is the score between the middle 2 cases: N=6: 2 2 5 9 11 15 median=(5+9)/2 = 7

MEDIAN: EQUAL NUMBER OF CASES ON EACH SIDE

Measures of Central Tendency 3. Mean The arithmetic average Amount each individual would get if the total were divided among all the individuals in a distribution Symbolized as X Formula: X = (Xi ) N

Measures of Central Tendency Characteristics of the Mean: It is the point around which all of the scores (Xi) cancel out. Example: X (Xi – X) 3 3 – 7 -4 6 6 – 7 -1 9 9 – 7 2 11 11- 7 4 X = 35 (Xi – X) =

Mean as the “Balancing Point” Think of the values in the previous slide as weights Think of the mean as the point at which the weights would be balanced. X

Measures of Central Tendency Characteristics of the Mean: 2. Every score in a distribution affects the value of the mean As a result, the mean is always pulled in the direction of extreme scores Example of why it’s better to use MEDIAN family income POSITIVELY SKEWED NEGATIVELY SKEWED

Measures of Central Tendency In-class exercise: Find the mode, median & mean of the following numbers: 8 4 10 2 5 1 6 2 11 2 Does this distribution have a positive or negative skew?

Measures of Central Tendency Levels of Measurement Nominal Mode only (categories defy ranking) Often, percent or proportion better Ordinal Mode or Median (typically, median preferred) Interval/Ratio Mode, Median, or Mean Mean if skew/outlier not a big problem (judgment call)

Measures of Dispersion provide information about the amount of variety or heterogeneity within a distribution of scores Necessary to include them w/measures of central tendency when describing a distribution

Measures of Dispersion Range (R) The scale distance between the highest and lowest score R = (high score-low score) Simplest and most straightforward measure of dispersion Limitation: even one extreme score can throw off our understanding of dispersion

Measures of Dispersion 2. Interquartile Range (Q) The distance from the third quartile to the first quartile (the middle 50% of cases in a distribution) Q = Q3 – Q1 Q3 = 75% quartile Q1 = 25% quartile Example: Prison Rates (per 100k), 2001: R = 795 (Louisiana) – 126 (Maine) = 669 Q = 478 (Arizona) – 281 (New Mexico) = 197 25% 50% 75% 281 478 126 366 795

MEASURES OF DISPERSION Problem with both R & Q: Calculated based on only 2 scores

MEASURES OF DISPERSION Standard deviation Uses every score in the distribution Measures the standard or typical distance from the mean Deviation score = Xi - X Example: with Mean= 50 and Xi = 53, the deviation score is 53 - 50 = 3

The Problem with Summing Deviations From Mean 2 parts to a deviation score: the sign and the number Deviation scores add up to zero Because sum of deviations is always 0, it can’t be used as a measure of dispersion X Xi - X 8 +5 1 -2 3 0 0 -3 12 0 Mean = 3 

Average Deviation (using absolute value of deviations) Works OK, but… AD =  |Xi – X| N X |Xi – X| 8 5 1 2 3 0 0 3 12 10 AD = 10 / 4 = 2.5 X = 3 Absolute Value to get rid of negative values (otherwise it would add to zero)

Variance & Standard Deviation Purpose: Both indicate “spread” of scores in a distribution Calculated using deviation scores Difference between the mean & each individual score in distribution To avoid getting a sum of zero, deviation scores are squared before they are added up. Variance (s2)=sum of squared deviations / N Standard deviation Square root of the variance Xi (Xi – X) (Xi - X)2 5 1 2 -2 4 6  = 20  = 0  = 14

Terminology “Sum of Squares” = Sum of Squared Deviations from the Mean =  (Xi - X)2 Variance = sum of squares divided by sample size =  (Xi - X)2 = s2 N Standard Deviation = the square root of the variance = s

Calculation Exercise Number of classes a sample of 5 students is taking: Calculate the mean, variance & standard deviation mean = 20 / 5 = 4 s2 (variance)= 14/5 = 2.8 s= 2.8 =1.67 Xi (Xi – X) (Xi - X)2 5 1 2 -2 4 6  = 20 14

Calculating Variance, Then Standard Deviation Number of credits a sample of 8 students is are taking: Calculate the mean, variance & standard deviation Xi (Xi – X) (Xi - X)2 10 -4 16 9 -5 25 13 -1 1 17 3 15 2 4 14 18  = 112 72 AGAIN, A SMALL NUMBER OF CASES. IN THIS CASE, N=8 Calculate the variance & standard deviation of these 8 scores. MEAN = 14 (112 / 8) VARIANCE = SUM OF SQUARED DEVIATIONS/# OF SCORES 72 / 8 = 9 Standard deviation = sq. rt. of variance

Summary Points about the Standard Deviation Uses all the scores in the distribution Provides a measure of the typical, or standard, distance from the mean Increases in value as the distribution becomes more heterogeneous Useful for making comparisons of variation between distributions Becomes very important when we discuss the normal curve (Chapter 5, next) 1 – Like the mean in this sense 2 – Good index of variation 3 – Another form of standardization (putting things into a common metric, just like %s or rates) Becomes very important when we discuss the normal curve (Chapter 5, next) 4 – Read Chapter 5 for next time

Mean & Standard Deviation Together Tell us a lot about the typical score & how the scores spread around that score Useful for comparisons of distributions: Example: Class A: mean GPA 2.8, s = 0.3 Class B: mean GPA 3.3, s = 0.6 Mean & Standard Deviation Applet