PS 366 3
Measurement Related to reliability, validity: Bias and error – Is something wrong with the instrument? – Is something up with the thing being measured?
Measurement Bias & error with the instrument – Random? – Systematic?
Measurement Bias & error with the thing being measured – Random? e.g., failure to understand a survey question – Systematic? e.g., does the person have something to hide?
Measurement Example: – Reliability, validity, error & bias in measuring unemployment – Census survey [also hiring reports, claims filed w/ government, state data to feds...] – What sources of bias?
Measurement Unemployment [employment status]: – Fully employed – Part time – looking for work, + part time – looking for work, no job – lost job, not looking for work – retired
Measurement Example: – Reliability, validity, error & bias in measuring victims of violent crime – Census surveys, police records, FBI UCR – What sources of bias?
Measurement How do we ask people questions about attitudes and behavior that aren’t socially accepted? – Prejudice – Racism – Feelings toward gays & lesbians – Shoplifting
Measurement: Item Count Technique Here are 3 things that sometimes make people angry or upset. After reading these, record how many of them upset you. Not which ones, just how many. – federal govt increasing the gas tax – professional athletes getting million dollar salaries – large corporations polluting the environment
Measurement: Item Count Technique Control list (3 items): – federal govt increasing the gas tax – professional athletes getting million dollar salaries – large corporations polluting the environment Treatment list (4 items): – federal govt increasing the gas tax – professional athletes getting million dollar salaries – large corporations polluting the environment – a black family moving next door
Measurement: Item Count Technique – Randomly assign ½ of subjects to the 3-item list – Randomly assign ½ of subjects to the 4-item list – Difference in mean # of items reported between groups = % upset by sensitive item – (treatment mean – control mean) * 100 = %
Item Count Columns: Control, Treatment, % upset. Rows: Non-South, South. Worked difference: treatment mean – 1.95 (control mean) = 0.42; 0.42 * 100 = 42%
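The item count arithmetic can be sketched in a few lines of Python. This is a minimal illustration, not the original study's code: the function name `item_count_estimate` and the two response lists are hypothetical, and only the final 0.42 figure comes from the slide.

```python
from statistics import mean


def item_count_estimate(control_counts, treatment_counts):
    """Estimated % upset by the sensitive item:
    (treatment mean - control mean) * 100."""
    return (mean(treatment_counts) - mean(control_counts)) * 100


# Hypothetical data: how many items each subject said upset them.
control = [2, 1, 3, 2, 2]     # subjects shown the 3-item list
treatment = [3, 2, 3, 2, 2]   # subjects shown the 4-item list

estimate = item_count_estimate(control, treatment)
print(round(estimate, 1))

# The slide's own worked difference: 0.42 * 100
slide_pct = round(0.42 * 100)
print(slide_pct)
```

No individual respondent reveals which items upset them. The sensitive item's share is recovered only from the gap between the group means.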
Item Count – Using poll information Control list (3 items): 1) The candidate graduated from a prestigious college 2) The candidate ran a business 3) The candidate’s family background Treatment list (4 items): 1) The candidate graduated from a prestigious college 2) The candidate ran a business 3) The candidate’s family background 4) The candidate is ahead in polls
Use poll info Columns: Control, Treatment, % use poll. Rows: All, Young. Is it significant? – Depends: how well does the mean reflect the group? How much variation is there around the mean?
Central Tendency Statistics that describe the ‘average’ or ‘typical’ value of a variable – Mean – Median – Mode
Central Tendency Why median vs. mean? – Household income – Home prices
Median vs Mean HH Income
median    mean
60,667    63,809
49,847    61,187
66,875    74,653
67,005    71,443
45,735    66,662
63,472    73,648
44,891    60,250
50,262    59,688
39,930    60,495
65,885    80,581
76,917    85,837
61,146    64,526
56,815    59,781
62,244    78,289
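A quick check of the household income figures shows the pattern the slide is pointing at: in every row the mean sits above the median, which is what a right-skewed income distribution produces. The sketch below uses only the slide's numbers. The variable names are illustrative, and the pairing of each median with its mean is assumed from the order in which they appear.

```python
# (median, mean) household income pairs, in slide order.
pairs = [
    (60667, 63809), (49847, 61187), (66875, 74653), (67005, 71443),
    (45735, 66662), (63472, 73648), (44891, 60250), (50262, 59688),
    (39930, 60495), (65885, 80581), (76917, 85837), (61146, 64526),
    (56815, 59781), (62244, 78289),
]

# Gap between mean and median in each row.
gaps = [avg - med for med, avg in pairs]

# True if the mean exceeds the median in every row.
mean_always_higher = all(gap > 0 for gap in gaps)

print(mean_always_higher)
print(min(gaps), max(gaps))
```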
Median vs Mean Price Seattle median: $400K Seattle mean: higher!
Central Tendency Mean – sum = 864 – mean = sum X / N = 864 / 8 = 108 – Is this representative?
Central Tendency Mean – sum = 1092 – mean = sum X / N = 1092 / 8 = 136.5 – Is this representative?
Central Tendency Median – median position = (N + 1) / 2 = (8 + 1) / 2 = 9 / 2 = 4.5th observation – the 4.5th observation falls between 120 and 125, so median = 122.5 – Is this representative?
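The mean and median steps on these slides can be written as small helper functions. This is a sketch under stated assumptions: the function names are hypothetical, and because the full list of 8 observations is not available here, the median helper is shown only on the two middle values the slide gives (120 and 125).

```python
def mean_from_sum(total, n):
    """Mean = sum of X divided by N."""
    return total / n


def median_position(n):
    """Position of the median in a sorted list: (N + 1) / 2."""
    return (n + 1) / 2


def median_from_sorted(values):
    """Median of an already-sorted list; averages the two middle
    values when the number of observations is even."""
    n = len(values)
    mid = n // 2
    if n % 2 == 1:
        return values[mid]
    return (values[mid - 1] + values[mid]) / 2


print(mean_from_sum(864, 8))
print(mean_from_sum(1092, 8))
print(median_position(8))
# Only the 4th and 5th observations are known from the slide.
print(median_from_sorted([120, 125]))
```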
Central Tendency Example $120,000 $60,000 $40,000 $30,000 Mean = $50,000 Mdn = $40,000 Mo = $30,000 Which is most representative?
The Distribution Where is mean, median, mode if – Normal – Left skew – Right skew
Variation How are observations distributed around the central point? Is there one central point, or more than one? – unimodal – bimodal
Variation Which is unimodal, which is bimodal: – Mass public ideology (very conservative, conservative, moderate, liberal, very liberal) – Members of Congress ideology – What does the mean mean?
Distribution How spread out are the observations? Single peak – not much variation Flat? – lots of variation; what does mean mean?
Variation Standard deviation Information about variation around the mean
Variation mean = 108 Variance = sum of squared distances of each observation from the mean, divided by # of observations
Variance mean = 108 Column: (x – mean)
Variance mean = 108 Columns: (x – mean), (x – mean)² sum of squares = 2938
Variance & Std. deviation Variance does not tell us much – mean = 108 – variance = 2938 / 8 = 367.25 – Standard deviation = square root of variance – sd = sqrt(367.25) = 19.2
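The variance and standard deviation steps can be sketched as follows. The slide divides by N rather than N – 1, so the sketch does the same. The function names and the small `demo` list are hypothetical. The figures 2938 and 8 are the slide's sum of squares and number of observations.

```python
import math


def variance_from_sum_sq(sum_sq, n):
    """Variance = sum of squared distances from the mean, over N."""
    return sum_sq / n


def population_variance(values):
    """Same calculation starting from raw observations."""
    m = sum(values) / len(values)
    sum_sq = sum((x - m) ** 2 for x in values)
    return sum_sq / len(values)


# Slide figures: sum of squares = 2938, N = 8.
variance = variance_from_sum_sq(2938, 8)
sd = math.sqrt(variance)
print(variance, round(sd, 1))

# Hypothetical raw data, only to exercise the second function.
demo = [2, 4, 4, 4, 5, 5, 7, 9]
print(population_variance(demo), math.sqrt(population_variance(demo)))
```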
Variation – Range (lo to hi) – Variance: (sum of squared distances from mean) / n – Standard deviation: square root of variance – Bigger # for each = more variation – Standard deviation expresses variation around the mean in ‘standardized’ units – Allows us to compare apples to oranges
Standard Deviation Total convictions – mean = 178, s.d. = 199 Per capita convictions (per 10,000 people) – mean = .357, s.d. = .197
Standard Deviation Low s.d. relative to mean vs. high s.d. relative to mean
Standard Deviation Distribution of total convictions: mean 187; s.d. 199
Standard Deviation Mean .357, s.d. .197
Standard Deviation Turnout by state: mean = .62; s.d. = .07
Standard Deviation Tells even more if distribution is ‘normal’ and data are interval What about a state that has 50% turnout and .7 corruption convictions per 10,000? Where is it in each distribution?
Standard Deviation Mean .357, s.d. .197 [X marks the state's position]
Standard Deviation Turnout by state: mean = .62; s.d. = .07 [X marks the state's position]
Standard Deviation & z-scores State’s position on turnout = z – z = (score – mean) / s.d. – = (score – .62) / .07 – = -.09 / .07 ≈ -1.29 – about 1.29 standard deviations below mean on turnout
Standard Deviation & z-scores State’s position on corruption = z – z = (score – mean) / s.d. – = (score – mean) / .19 – = +.35 / .19 ≈ +1.84 – about 1.84 standard deviations above mean on corruption
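The z-score formula on these two slides is a one-line function. In the sketch below, `z_score` is a hypothetical helper name. The numerators -.09 and +.35 and the standard deviations .07 and .19 are taken from the slides as printed. The 0.69 turnout value in the last call is invented purely to show the function on a raw score.

```python
def z_score(score, mean, sd):
    """z = (score - mean) / s.d."""
    return (score - mean) / sd


# Slide numerators (score - mean) used directly.
turnout_z = -0.09 / 0.07       # turnout: below the mean
corruption_z = 0.35 / 0.19     # corruption: above the mean
print(round(turnout_z, 2), round(corruption_z, 2))

# Hypothetical state with 69% turnout, one s.d. above the slide's
# turnout mean of .62 (s.d. = .07).
example_z = z_score(0.69, 0.62, 0.07)
print(round(example_z, 2))
```

Because both results are in standard deviation units, the turnout and corruption positions can be compared directly even though the raw variables are on different scales.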
Std Dev & Normal Curve
Standard Deviation & z-scores Apples: Turnout Oranges: Corruption Z = 0 is the mean Z = 3 is 3 s.d. above the mean: very rare
Z scores and Normal Curve How many states between mean & z = 1.84? How many above z = 1.84? See Appendix C in text – below mean = 50% – between mean and z = 1.84 = 46.7% – beyond z = 1.84 = 3.3% [1.5 states if normal]
Z scores and Normal Curve How many states between mean & the state's turnout z? How many below that z? See Appendix C in text – above mean = 50% – between mean and z = 39.9% – beyond z = 10.1% [about 5 states if normal]
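The areas read from a normal table such as Appendix C can also be computed from the error function in Python's standard library. This is a sketch, not the textbook's method: the function names are hypothetical, and `N_STATES = 50` is an assumption about how many states the slides are counting.

```python
import math


def area_mean_to_z(z):
    """Share of a normal distribution between the mean and z."""
    return 0.5 * math.erf(abs(z) / math.sqrt(2))


def area_beyond_z(z):
    """Share of a normal distribution further from the mean than z."""
    return 0.5 - area_mean_to_z(z)


N_STATES = 50  # assumption: one observation per U.S. state

between = area_mean_to_z(1.84)
beyond = area_beyond_z(1.84)
print(round(between * 100, 1), round(beyond * 100, 1))

# Expected number of states beyond z = 1.84 if the data were normal.
print(beyond * N_STATES)
```

Both functions take the absolute value of z because the normal curve is symmetric, so the same call works for a z-score below the mean.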