VARIABILITY

Case no.  Age  Height  M/F
1         23   68      M
2         22   64      F
3         23   69      F
4         25   71      M
5         27   64      F
6         22   72      M
7         24   65      F
8         23   66      M
9         23   66      F
10        25   68      F
11        21   68      M
12        21   62      F
13        24   71      M
14        27   66      F
15        21   62      F
16        25   56      F

Summary statistics
- Age: mean = 24
- Height: mean = 67
- Sex: 39% M, 61% F

Review: Distribution

An arrangement of cases according to their score or value on one or more variables
- Categorical variable
- Continuous variable
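As a rough illustration of the definition above, the sketch below arranges cases by their value on one categorical variable (sex) and one continuous variable (age). It assumes only the 16 case rows that are legible in this transcript, and the names `cases`, `sex_distribution` and `age_distribution` are illustrative. The slide's summary statistics may cover additional cases, so the counts here need not match them.

```python
from collections import Counter

# (case no., age, height, sex) for the 16 rows legible in the transcript.
cases = [
    (1, 23, 68, "M"), (2, 22, 64, "F"), (3, 23, 69, "F"), (4, 25, 71, "M"),
    (5, 27, 64, "F"), (6, 22, 72, "M"), (7, 24, 65, "F"), (8, 23, 66, "M"),
    (9, 23, 66, "F"), (10, 25, 68, "F"), (11, 21, 68, "M"), (12, 21, 62, "F"),
    (13, 24, 71, "M"), (14, 27, 66, "F"), (15, 21, 62, "F"), (16, 25, 56, "F"),
]

# A distribution arranges cases by their value on a variable:
# count how many cases fall at each value.
sex_distribution = Counter(sex for _, _, _, sex in cases)
age_distribution = Counter(age for _, age, _, _ in cases)

print("Sex:", dict(sex_distribution))
for age in sorted(age_distribution):
    # One asterisk per case at this age (a crude text histogram).
    print(f"Age {age}: {'*' * age_distribution[age]}")
```

The same counting step works for either kind of variable; only the way the result is graphed would normally differ.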

Dispersion and the mean

Dispersion: how scores or values arrange themselves around the mean

If most scores cluster about the mean, the shape of the distribution is peaked
- This is the so-called “normal” distribution
- In social science the scores or values for many variables are normally or near-normally distributed
- This allows use of the mean to describe the dataset (that’s why it’s called a “summary statistic”)

When scores are more dispersed, a distribution’s shape is flatter
- Distance between most scores and the mean is greater
- Many scores are at a considerable distance from the mean
- The mean loses value as a summary statistic

[Figure: two distributions of arrests. Normal distribution: the mean (3.0) is a good descriptor. “Flat” distribution: the mean (3.65) is a poor descriptor.]
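The contrast between a peaked and a flat distribution can be sketched with two small sets of scores that share the same mean. Both lists below are hypothetical (they are not the data behind the slide's figure), and `share_near_mean` is an illustrative helper, not a standard statistic.

```python
from statistics import mean, stdev

# Hypothetical scores on a 1-5 scale (not taken from the slides).
peaked = [2, 3, 3, 3, 3, 3, 3, 3, 3, 4]
flat = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]


def share_near_mean(scores, distance=1.0):
    """Fraction of scores lying within `distance` of the mean."""
    m = mean(scores)
    return sum(abs(x - m) <= distance for x in scores) / len(scores)


for name, scores in (("peaked", peaked), ("flat", flat)):
    print(name, "mean:", mean(scores),
          "SD:", round(stdev(scores), 2),
          "share within 1 point of mean:", share_near_mean(scores))
```

Both samples have the same mean, but far fewer of the flat sample's scores sit close to it, which is why the mean describes that sample less well.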

Measuring dispersion

Average deviation: $\frac{\sum |x - \bar{x}|}{n}$
- Average distance between the mean and the values (scores) for each case
- Uses absolute distances (no + or -)
- Affected by extreme scores

Variance (s²): a sample’s cumulative dispersion
$s^2 = \frac{\sum (x - \bar{x})^2}{n}$ (use n-1 for small samples)

Standard deviation (s): a standardized form of variance, comparable between samples
$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}$ (use n-1 for small samples)
- Square root of the variance
- Expresses dispersion in units of equal size for that particular distribution
- Less affected by extreme scores
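The three measures can be written out directly from their definitions. This is a minimal sketch: the function names, the `small_sample` flag and the five example scores are assumptions made for illustration, and the final lines only cross-check the results against Python's standard `statistics` module.

```python
import math
import statistics


def average_deviation(scores):
    """Mean absolute distance between each score and the mean."""
    m = sum(scores) / len(scores)
    return sum(abs(x - m) for x in scores) / len(scores)


def variance(scores, small_sample=True):
    """Sum of squared deviations divided by n-1 (small samples) or n."""
    m = sum(scores) / len(scores)
    sum_of_squares = sum((x - m) ** 2 for x in scores)
    divisor = len(scores) - 1 if small_sample else len(scores)
    return sum_of_squares / divisor


def standard_deviation(scores, small_sample=True):
    """Square root of the variance."""
    return math.sqrt(variance(scores, small_sample))


scores = [1, 2, 3, 4, 5]  # hypothetical example data

print("Average deviation:", average_deviation(scores))
print("Variance (n-1):", variance(scores))
print("Variance (n):", variance(scores, small_sample=False))
print("Standard deviation (n-1):", round(standard_deviation(scores), 4))

# Cross-check: statistics.variance divides by n-1, statistics.pvariance by n.
assert math.isclose(variance(scores), statistics.variance(scores))
assert math.isclose(variance(scores, small_sample=False), statistics.pvariance(scores))
```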

Variability exercise

Random sample of patrol officers, each scored 1-5 on a cynicism scale

Sample 1 (n=10)

Officer  Score  Mean  Diff.  Sq.
____________________________________________________
Sum                                           8.90
Variance (sum of squares / n-1), s²            .99
Standard deviation (sq. root of variance), s   .99

This is not an acceptable graph - it’s only to illustrate dispersion

Sample 2 (n=10)

Another random sample of patrol officers, each scored 1-5 on a cynicism scale. Compute...

Officer  Score  Mean  Diff.  Sq.
1        2      ___   ___    ___
2        1      ___   ___    ___
3        1      ___   ___    ___
4        2      ___   ___    ___
5        3      ___   ___    ___
6        3      ___   ___    ___
7        3      ___   ___    ___
8        3      ___   ___    ___
9        4      ___   ___    ___
10       2      ___   ___    ___

Sum                     ____
Variance, s²            ____
Standard deviation, s   ____

Two random samples of patrol officers, each scored 1-5 on a cynicism scale

                                              Sample 1 (n=10)   Sample 2 (n=10)
Sum of squares                                8.90              8.40
Variance (sum of squares / n-1), s²           .99               .93
Standard deviation (sq. root of variance), s  .99               .97

These are not acceptable graphs - they’re only used here to illustrate how the scores disperse around the mean
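The manual procedure (score, mean, difference, squared difference, then sum, variance and square root) can be traced step by step for Sample 2. The scores are the ten listed in the exercise; the variable names and table layout are illustrative only.

```python
import math

# Sample 2 scores for officers 1-10, as listed in the exercise.
sample_2 = [2, 1, 1, 2, 3, 3, 3, 3, 4, 2]

n = len(sample_2)
mean = sum(sample_2) / n

# One row per officer: (officer, score, difference from mean, squared difference).
rows = [(officer, score, score - mean, (score - mean) ** 2)
        for officer, score in enumerate(sample_2, start=1)]

sum_sq = sum(row[3] for row in rows)   # sum of squares
variance = sum_sq / (n - 1)            # n-1 because the sample is small
sd = math.sqrt(variance)               # standard deviation

print("Officer  Score  Mean  Diff.   Sq.")
for officer, score, diff, sq in rows:
    print(f"{officer:>7}  {score:>5}  {mean:>4.1f}  {diff:>5.1f}  {sq:>4.2f}")
print(f"Sum of squares: {sum_sq:.2f}")
print(f"Variance (s2): {variance:.2f}")
print(f"Standard deviation (s): {sd:.2f}")
```

Rounded to two decimals, the results agree with the values reported for Sample 2 (8.40, .93 and .97).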

Normal distributions

Characteristics:
- Unimodal and symmetrical: shapes on both sides of the mean are identical
- About two-thirds of the area “under” the curve - meaning about two-thirds of the cases - falls within one “standard deviation” (+/- 1 SD) from the mean
- NOTE: The fact that a distribution is “normal” or “near-normal” does NOT imply that the mean is of any particular value. All it implies is that scores distribute themselves around the mean “normally”. Means depend on the data. In this distribution the mean could be any value.

By definition, the standard deviation score that corresponds with the mean of a normal distribution - whatever that score might be - is zero.

[Figure: a normal curve marked “Mean (whatever it is)” and “Standard deviation (always 0 at the mean)”.]
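Both points can be checked numerically: a score equal to the mean always has a standard deviation score (z-score) of zero, and the share of a normal curve within one SD of the mean is fixed regardless of the mean. The sketch assumes an arbitrary mean of 50 and SD of 10, and `z_score` is an illustrative helper. `statistics.NormalDist` is part of the Python standard library (3.8 and later).

```python
from statistics import NormalDist


def z_score(x, mean, sd):
    """Distance of x from the mean, expressed in standard deviations."""
    return (x - mean) / sd


# Hypothetical normal distribution; the mean and SD are arbitrary choices.
mean, sd = 50.0, 10.0

print("z at the mean:", z_score(mean, mean, sd))
print("z one SD above the mean:", z_score(mean + sd, mean, sd))

# Share of the area under the curve between -1 SD and +1 SD.
dist = NormalDist(mu=mean, sigma=sd)
share_within_1_sd = dist.cdf(mean + sd) - dist.cdf(mean - sd)
print("Share within 1 SD:", round(share_within_1_sd, 4))
```

For an exactly normal distribution the share comes out at roughly 68 percent, which the worked ticket examples approximate as 66 percent.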

How well do means represent (summarize) a sample?

13 officers scored on numbers of tickets written in one week
- Officer A: 1 ticket
- Officers B & C: 2 tickets each
- Officers D & E: 3 tickets each
- Officers F & G: 4 tickets each
- Officers H & I: 5 tickets each
- Officer J: 6 tickets
- Officers K & L: 7 tickets each
- Officer M: 9 tickets

Mean = 4.46, SD = 2.33 (computed from the scores above, using n-1)

[Figure: frequency of officers A-M by number of tickets, with lines marking -1 SD, the mean and +1 SD.]

In a normal distribution about 66% of cases fall within 1 SD of the mean: .66 X 13 cases = 8.58, or about 9 cases. But here only 7 cases (Officers D-J) fall within 1 SD of the mean. Six officers wrote very few or very many tickets, making the distribution considerably more dispersed than “normal.”

So…for this sample, the mean does NOT seem to be a good summary statistic. It is NOT a good shortcut for describing how officers in this sample performed.

If the variable “no. of tickets” were “normally” distributed, most cases would fall inside the bell-shaped curve. Here they don’t.
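The check described on this slide can be reproduced from the listed ticket counts. This is a sketch: the dictionary layout and variable names are illustrative, the SD uses the n-1 divisor from the earlier slides, and the .66 benchmark is the slide's own approximation.

```python
from statistics import mean, stdev

# Tickets written in one week, keyed by officer (from the slide).
tickets = {"A": 1, "B": 2, "C": 2, "D": 3, "E": 3, "F": 4, "G": 4,
           "H": 5, "I": 5, "J": 6, "K": 7, "L": 7, "M": 9}

scores = list(tickets.values())
m = mean(scores)
sd = stdev(scores)  # sample standard deviation (divides by n-1)

# Officers whose ticket count lies within one SD of the mean.
within = [officer for officer, count in tickets.items() if abs(count - m) <= sd]

expected = 0.66 * len(scores)  # the slide's benchmark for a normal distribution

print(f"Mean = {m:.2f}, SD = {sd:.2f}")
print(f"Within 1 SD: {len(within)} officers ({', '.join(within)})")
print(f"Expected if normal: about {expected:.2f} cases")
```

The count of officers inside the band falls short of the benchmark, which is the slide's reason for calling the mean a poor summary of this sample.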

13 officers scored on numbers of tickets written in one week
- Officer A: 1 ticket
- Officer B: 2 tickets
- Officer C: 3 tickets
- Officers D, E, F: 4 tickets each
- Officers G, H, I: 5 tickets each
- Officers J & K: 6 tickets each
- Officer L: 7 tickets
- Officer M: 9 tickets

Mean = 4.69, SD = 2.1

[Figure: frequency of officers A-M by number of tickets, with lines marking -1 SD, the mean and +1 SD.]

In a normal distribution 66 percent of the cases fall within 1 SD of the mean: .66 X 13 = 8.58, or about 9 cases. Here, 9 of the 13 cases (Officers C-K) do fall within 1 SD of the mean. The distribution is normal because most officers wrote close to the same number of tickets, so the cases “clustered” around the mean.

So, for this sample the mean is a good summary statistic - a good shortcut for describing officer performance.

If the variable “no. of tickets” were “normally” distributed, most cases would fall inside the bell-shaped curve. Here they do!
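A text version of the frequency plot makes the clustering visible. The sketch below builds the frequency distribution for this second set of ticket counts and flags the values that lie within one SD of the mean; the variable names and the output layout are illustrative.

```python
from collections import Counter
from statistics import mean, stdev

# Tickets written by Officers A-M, in order (from the slide).
tickets = [1, 2, 3, 4, 4, 4, 5, 5, 5, 6, 6, 7, 9]

m = mean(tickets)
sd = stdev(tickets)           # sample standard deviation (divides by n-1)
frequency = Counter(tickets)  # how many officers wrote each number of tickets

n_within = 0
for value in range(min(tickets), max(tickets) + 1):
    count = frequency[value]  # Counter returns 0 for values nobody scored
    inside = abs(value - m) <= sd
    if inside:
        n_within += count
    marker = "within 1 SD" if inside else ""
    print(f"{value:>2} tickets | {'*' * count:<3} {marker}")

print(f"Mean = {m:.2f}, SD = {sd:.1f}, cases within 1 SD = {n_within}")
```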

Going beyond description…

As we’ve seen, when variables are normally or near-normally distributed, the mean, variance and standard deviation can help describe datasets.

But they are also useful in explaining why things change; that is, in testing hypotheses.

For example, assume that patrol officers in the XYZ police dept. were tested for effectiveness, and that on a scale of 1 (least eff.) to 5 (most eff.) their mean score was 3.2, distributed about normally.

You want to use XYZ P.D. to test the hypothesis that college-educated cops are more effective: college → greater effectiveness
- Independent variable: college (Y/N)
- Dependent variable: effectiveness (scale 1-5)

You draw two officer samples (we’ll cover this later in the term) and compare their mean effectiveness scores
- 10 college grads (mean 3.7)
- 10 non-college (mean 2.8)

The difference between means is in the hypothesized direction. But does that “prove” that college grads are more effective?

Each group’s variance is used in a test that determines whether the difference in mean scores is large enough to be “statistically significant.” Don’t worry - we’ll cover this later!

[Figure: effectiveness scores of college grads and non-college grads. Are college-educated cops more effective?]
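To give a sense of how each group's variance enters such a test, here is a minimal sketch. The slides do not name the test, so Welch's two-sample t statistic is used as one common choice, and the individual scores are hypothetical values picked only so that the group means match the 3.7 and 2.8 quoted above.

```python
import math
from statistics import mean, variance

# Hypothetical effectiveness scores (1-5); only the means come from the slide.
college = [4, 3, 4, 5, 3, 4, 3, 4, 4, 3]      # mean 3.7
non_college = [3, 2, 3, 4, 2, 3, 3, 2, 3, 3]  # mean 2.8


def welch_t(a, b):
    """Difference in means divided by its standard error.

    The standard error combines each group's variance (n-1 divisor)
    scaled by that group's sample size.
    """
    standard_error = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


t = welch_t(college, non_college)
print(f"Difference in means: {mean(college) - mean(non_college):.1f}")
print(f"t statistic: {t:.2f}")
```

The sketch stops at the statistic: deciding whether it is "statistically significant" requires comparing it with a critical value, which is the part the course covers later. The more dispersed the scores in each group, the larger the standard error and the smaller the statistic for the same difference in means.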

Exam information

- You must bring a regular, non-scientific calculator with no functions beyond a square root key.
- You need to understand the concept of a distribution.
- You will be given data and asked to create graph(s) depicting the distribution of a single variable.
- You will compute basic statistics, including mean, median, mode and standard deviation.
- All computations must be shown on the answer sheet.
- You will be given the formula for variance (s²).
- You must use and display the procedure described in the slides and practiced in class for manually calculating variance (s²) and its square root, known as standard deviation (s).
- This is a relatively brief exam. You will have one hour to complete it. We will then take a break and move on to the next topic.