R. G. Bias | School of Information | UTA 5.424 | Phone: 512 471 7046 | i 1 INF 397C Introduction to Research in Information Studies.


1 R. G. Bias | School of Information | UTA 5.424 | Phone: 512 471 7046 | rbias@ischool.utexas.edu
INF 397C Introduction to Research in Information Studies, Fall 2009, Day 2

2 Standard Deviation
σ = SQRT(Σ(X - µ)² / N)
(Does that give you a headache?)

3 USA Today has come out with a new survey - apparently, three out of every four people make up 75% of the population. –David Letterman

4 Statistics: The only science that enables different experts using the same figures to draw different conclusions. –Evan Esar (1899 - 1995), US humorist

5 Didja hear the one about... the three statisticians who went hunting?

6 Critical Skepticism
Remember the Rabbit Pie example from last week? The “critical consumer” of statistics asked, “What do you mean by ‘50/50’?”

7 Remember... I do NOT want you to become cynical. Not all “media bias” (nor bad research) is intentional. Just be sensible, critical, skeptical. As you “consume” statistics, ask some questions...

8 Ask yourself...
–Who says so? (A Zest commercial is unlikely to tell you that Irish Spring is best.)
–How does he/she know? (That Zest is “the best soap for you.”)
–What’s missing? (One year, 33% of female grad students at Johns Hopkins married faculty.)
–Did somebody change the subject? (“Camrys are bigger than Accords.” “Accords are bigger than Camrys.”)
–Does it make sense? (“Study in NYC: Working woman with family needed $40.13/week for adequate support.”)

9 What were... some claims you all heard this week?

10 Last week... We learned about frequency distributions. I asserted that a frequency distribution, and/or a histogram (a graphical representation of a frequency distribution), was a good way to summarize a collection of data. And I asserted there’s another, even shorter-hand way.

11 Measures of Central Tendency
Mode
–Most frequent score (or scores – a distribution can have multiple modes)
Median
–“Middle score”
–50th percentile
Mean - µ (“mu”)
–“Arithmetic average”
–ΣX/N
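The three measures can be sketched in a few lines of Python. This is only an illustration: the scores are hypothetical (not from the class data), chosen so that two values tie for most frequent, and the standard-library `statistics` module is assumed.

```python
from statistics import mean, median, multimode

# Hypothetical scores (not from the slides); 2 and 5 each occur twice.
scores = [1, 2, 2, 3, 5, 5, 9]

modes = multimode(scores)   # every most-frequent value, so ties are kept
middle = median(scores)     # the "middle score", i.e. the 50th percentile
mu = mean(scores)           # arithmetic average, ΣX / N

print("mode(s):", modes)
print("median:", middle)
print("mean:", round(mu, 2))
```

Using `multimode` rather than `mode` is a deliberate choice here, since a distribution can have more than one mode.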

12 A quiz about averages
1 – If one score in a distribution changes, will the mode change? __Yes __No __Maybe
2 – How about the median? __Yes __No __Maybe
3 – How about the mean? __Yes __No __Maybe
4 – True or false: In a normal distribution (bell curve), the mode, median, and mean are all the same? __True __False
What if we ADDED one score?

13 More quiz
5 – (This one is tricky.) If the mode=mean=median, then the distribution is necessarily a bell curve? __True __False
6 – I have a distribution of 10 scores. There was an error, and really the highest score is 5 points HIGHER than previously thought.
a) What does this do to the mode? __Increases it __Decreases it __Nothing __Can’t tell
b) What does this do to the median? __Increases it __Decreases it __Nothing __Can’t tell
c) What does this do to the mean? __Increases it __Decreases it __Nothing __Can’t tell
7 – Which of the following must be an actual score from the distribution? a) Mean b) Median c) Mode d) None of the above
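One way to explore question 6 is to try it on data. The sketch below uses a hypothetical set of 10 scores (an assumption, not class data) and raises only the highest score by 5. It shows what happens for this one data set; what happens to the mode in general depends on which scores are the most frequent.

```python
from statistics import mean, median, multimode

# Hypothetical distribution of 10 scores (not from the slides).
before = [1, 2, 2, 3, 4, 5, 6, 7, 8, 10]
# Same scores, but the highest one is corrected upward by 5 points.
after = sorted(before[:-1] + [before[-1] + 5])

for label, data in (("before", before), ("after", after)):
    print(label, "mode(s):", multimode(data),
          "median:", median(data), "mean:", mean(data))
```

With N = 10, adding 5 to a single score adds 5/N = 0.5 to the mean, while the two middle scores that determine the median are untouched.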

14 OK, so which do we use?
Means allow further arithmetic/statistical manipulation. But... It depends on:
–The type of scale of your data
  Can’t use means with nominal or ordinal scale data
  With nominal data, must use mode
–The distribution of your data
  Tend to use medians with distributions bounded at one end but not the other (e.g., salary). (Look at our “Number of MLB games” distribution.)
–The question you want to answer
  “Most popular score” vs. “middle score” vs. “middle of the see-saw”
“Statistics can tell us which measures are technically correct. It cannot tell us which are ‘meaningful’” (Tal, 2001, p. 52).

15
Name          X (# of MLB games seen)   µ
Wenbin         0
Daniel         0
Stephen        0
Christopher    2
Geoff          3
Clarke         3
Justin         4
Erik          15
Randolph      27
Total

16 Scales (last week)

Nominal              Ordinal               Interval             Ratio
Name                 =                     =                    =
Mutually-exclusive   =                     =                    =
                     Ordered               =                    =
                                           Equal interval       = + abs. 0
Gender, Yes/No       Class rank, ratings   Days of wk., temp.   Inches, dollars

17 Scales (which measure of CT?)

Nominal (mode)       Ordinal (mode, median)   Interval (any)       Ratio (any)
Name                 =                        =                    =
Mutually-exclusive   =                        =                    =
                     Ordered                  =                    =
                                              Equal interval       = + abs. 0
Gender, Yes/No       Class rank, ratings      Days of wk., temp.   Inches, dollars

18 Mean – “see saw” (from Tal, 2001)

19 Have sidled up to SHAPES of distributions
–Symmetrical
–Skewed – positive and negative
–Flat
–Multi-modal

20 Now, let’s add to freq dist

Score   Raw Freq   Cumu Freq   Relative Freq   Cumu Rel Freq
  0        2           2            .2              .2
  1        3           5            .3              .5
  2        1           6            .1              .6
  3        1           7            .1              .7
  4        1           8            .1              .8
  5        1           9            .1              .9
 13        1          10            .1             1.0

21 When you... add relative frequency and cumulative relative frequency to your frequency distribution, it will help you calculate percentiles (and, therefore, the median).
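The extended frequency distribution can be built mechanically. The sketch below assumes the ten scores implied by the frequency table on slide 20 (two 0s, three 1s, and one each of 2, 3, 4, 5, and 13); that reading of the table is an assumption. It also shows one simple way to read off a percentile: the first score whose cumulative relative frequency reaches the target.

```python
from collections import Counter
from itertools import accumulate
from statistics import median

# Scores assumed from the slide-20 frequency table (N = 10).
scores = [0, 0, 1, 1, 1, 2, 3, 4, 5, 13]

counts = Counter(scores)
values = sorted(counts)                  # distinct scores, ascending
raw = [counts[v] for v in values]        # raw frequency
cum = list(accumulate(raw))              # cumulative frequency
n = len(scores)
rel = [f / n for f in raw]               # relative frequency
cum_rel = [c / n for c in cum]           # cumulative relative frequency

print("Score  Freq  CumFreq  RelFreq  CumRelFreq")
for v, f, c, r, cr in zip(values, raw, cum, rel, cum_rel):
    print(f"{v:5d}  {f:4d}  {c:7d}  {r:7.1f}  {cr:10.1f}")

# First score whose cumulative relative frequency reaches 0.5.
score_at_50 = next(v for v, cr in zip(values, cum_rel) if cr >= 0.5)
print("score at the 50th percentile (this convention):", score_at_50)
print("median (averaging the two middle scores):", median(scores))
```

Percentile conventions differ. With an even N, the usual median averages the two middle scores, so it need not equal the score read straight off the cumulative column.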

22 “Pulling up the mean”

23 Why... isn’t a “measure of central tendency” all we need to characterize a distribution of scores/numbers/data/stuff?
“The price for using measures of central tendency is loss of information” (Tal, 2001, p. 49).
–Remember the see-saw example. Same measure of central tendency – widely varying distribution of scores.

24 Didja hear the one about... the Aggies who were on a march and came to a river? The Aggie captain asked the farmer how deep the river was. “Oh, it averages two feet deep.” All the Aggies drowned.

25 Note... We started with a bunch of specific scores. We put them in order. We drew their distribution. Now we can report their central tendency. So, we’ve moved AWAY from specifics, to a summary. But with Central Tendency, alone, we’ve ignored the specifics altogether.
–Note MANY distributions could have a particular central tendency!
If we went back to ALL the specifics, we’d be back at square one.
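A quick illustration of “many distributions, one central tendency”: the three sets of scores below are hypothetical (not class data) and were picked so that they share a mean of 6 while looking nothing alike.

```python
from statistics import mean

# Three hypothetical sets of scores (not from the slides) with the same mean.
tight = [6, 6, 6, 6, 6]
moderate = [4, 5, 6, 7, 8]
wide = [0, 0, 0, 3, 27]

for name, data in (("tight", tight), ("moderate", moderate), ("wide", wide)):
    print(f"{name:9s} {data}  mean = {mean(data)}")
```

Reporting only the mean would make these three indistinguishable.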

26 Measures of Dispersion
Range
Semi-interquartile range
Standard deviation
–σ (sigma)

27 Range
Highest score minus the lowest score. Like the mode...
–Easy to calculate
–Potentially misleading
–Doesn’t take EVERY score into account.
What we need to do is calculate one number that will capture HOW spread out our numbers are from that measure of Central Tendency.
–‘Cause MANY different distributions of scores can have the same central tendency!
–“Standard Deviation”
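To see why the range can mislead, here is a small sketch. `score_range` is a hypothetical helper name, and both data sets are made up: they share the same lowest and highest scores but differ everywhere in between.

```python
def score_range(scores):
    """Highest score minus the lowest score."""
    return max(scores) - min(scores)

# Hypothetical data (not from the slides): same endpoints, different middles.
clustered = [0, 13, 13, 14, 14, 14, 27]
spread_out = [0, 4, 9, 13, 18, 23, 27]

print("clustered range: ", score_range(clustered))
print("spread-out range:", score_range(spread_out))
```

Only two scores ever enter the calculation, so everything between them is ignored.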

28 Back to our data – MLB games
Let’s take just the men in this class xls spreadsheet. Measures of central tendency. Go with mean. (‘Cause we can – ratio scale data!) So, how much do the actual scores deviate from the mean?

29 First – just for grins – mode, median, mean?

Name          X (# of MLB games seen)   µ
Wenbin         0
Daniel         0
Stephen        0
Christopher    2
Geoff          3
Clarke         3
Justin         4
Erik          15
Randolph      27
Total         54
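The “just for grins” question can be checked with the standard-library `statistics` module. The scores are the nine values from the table; the variable names are only illustrative.

```python
from statistics import mean, median, mode

# Number of MLB games seen, from the table above (N = 9, total = 54).
games = [0, 0, 0, 2, 3, 3, 4, 15, 27]

games_mode = mode(games)       # most frequent score
games_median = median(games)   # middle (5th of 9) score
games_mean = mean(games)       # 54 / 9

print("mode:", games_mode, "median:", games_median, "mean:", games_mean)
```

The two big scores pull the mean above the median, which is the pattern to expect when a distribution is bounded at one end only.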

30 So... Add up all the deviations and we should have a feel for how dispersed, how spread, how deviant, our distribution is. Let’s calculate the Standard Deviation. As always, start inside the parentheses. (X - µ)

31 So, find distance of each score from the mean

Name          X (# of MLB games seen)   µ   X - µ
Wenbin         0                        6    -6
Daniel         0                        6    -6
Stephen        0                        6    -6
Christopher    2                        6    -4
Geoff          3                        6    -3
Clarke         3                        6    -3
Justin         4                        6    -2
Erik          15                        6     9
Randolph      27                        6    21
Total         54

32 So, find distance of each score from the mean

Name          X (# of MLB games seen)   µ   X - µ
Wenbin         0                        6    -6
Daniel         0                        6    -6
Stephen        0                        6    -6
Christopher    2                        6    -4
Geoff          3                        6    -3
Clarke         3                        6    -3
Justin         4                        6    -2
Erik          15                        6     9
Randolph      27                        6    21
Total         54                              0

33 Damn! OK, let’s try it on a smaller set of numbers.

X
2
3
5
6

34 Damn! (cont’d.) OK, let’s try it on a smaller set of numbers.

X        X - µ
2         -2
3         -1
5          1
6          2
Σ = 16   Σ = 0
µ = 4    Hmm.
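The “adds up to zero” result is not a fluke of these numbers. Since µ = ΣX/N, it follows that Σ(X - µ) = ΣX - Nµ = ΣX - ΣX = 0 for any set of scores. A minimal sketch, where `deviations` is a hypothetical helper name:

```python
from statistics import mean

def deviations(scores):
    """Distance of each score from the mean, X - µ."""
    mu = mean(scores)
    return [x - mu for x in scores]

small = [2, 3, 5, 6]                      # the smaller set from the slide
games = [0, 0, 0, 2, 3, 3, 4, 15, 27]     # the MLB games data

print(deviations(small), "sum =", sum(deviations(small)))
print(deviations(games), "sum =", sum(deviations(games)))
```

Because the positive and negative deviations always cancel, their plain sum cannot serve as a measure of spread.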

35 OK... so mathematicians at this point do one of two things. Take the absolute value or square ‘em. We square ‘em. Σ(X - µ)²

36 So, find distance of each score from the mean

Name          X (# of MLB games seen)   µ   X - µ   (X - µ)²
Wenbin         0                        6    -6        36
Daniel         0                        6    -6        36
Stephen        0                        6    -6        36
Christopher    2                        6    -4        16
Geoff          3                        6    -3         9
Clarke         3                        6    -3         9
Justin         4                        6    -2         4
Erik          15                        6     9        81
Randolph      27                        6    21       441
Total         54                              0       668

37 Standard Deviation (cont’d.) Then take the average of the squared deviations. Σ(X - µ)² / N
–668/9 = 74.2
But this number is so BIG!

38 Remember... We had to SQUARE all the deviation scores (X - µ) to get around the addin’-up-to-zero problem... So now we take the square root, to get us back in the same ballpark: SQRT(74.2) = 8.6.

39 Sooooo... How many MLB games have the males in our class seen live:
0, 3, 0, 27, 15, 0, 3, 4, 2 (ugh)
0, 0, 0, 2, 3, 3, 4, 15, 27 (hmm)
50th percentile (median) = 3 (now we’re talkin’)
µ = 6 (I’m with ya’)
µ = 6, σ = 8.6 (NOW I have a pretty clear picture. I know YOU don’t, yet!)

40 OK... take the square root (to make up for squaring the deviations earlier).
σ = SQRT(Σ(X - µ)² / N)
Now this doesn’t give you a headache, right? I said “right”?
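The formula translates almost word for word into code. The sketch below follows the steps on the slides (deviations, squares, average, square root). `population_sd` is a hypothetical helper name, and the standard library’s `statistics.pstdev` is used only as a cross-check.

```python
from math import sqrt
from statistics import pstdev

def population_sd(scores):
    """σ = SQRT(Σ(X - µ)² / N), computed step by step."""
    n = len(scores)
    mu = sum(scores) / n                             # mean
    squared_devs = [(x - mu) ** 2 for x in scores]   # (X - µ)² for each score
    variance = sum(squared_devs) / n                 # average squared deviation
    return sqrt(variance)                            # back to the original units

games = [0, 0, 0, 2, 3, 3, 4, 15, 27]   # the MLB games data
sum_sq = sum((x - 6) ** 2 for x in games)   # µ = 54 / 9 = 6
sigma = population_sd(games)

print("sum of squared deviations:", sum_sq)
print("sigma:", round(sigma, 1))
print("pstdev cross-check:", round(pstdev(games), 1))
```

Running it on the nine scores is a handy check on the hand arithmetic. Note that it divides by N, as the slides do, rather than by N - 1.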

41 Hmmm...

Mode      Range
Median    ?????
Mean      Standard Deviation

42 We need... A measure of spread that is NOT sensitive to every little score, just as median is not.
SIQR: Semi-interquartile range. (Q3 – Q1)/2
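A sketch of the SIQR on the MLB games data. One caveat: there are several conventions for locating Q1 and Q3, and they can disagree on small data sets. The code below assumes the “inclusive” method of `statistics.quantiles` (Python 3.8+), so treat the result as one reasonable answer rather than the only one. `siqr` is a hypothetical helper name.

```python
from statistics import quantiles

def siqr(scores):
    """Semi-interquartile range, (Q3 - Q1) / 2, using inclusive quartiles."""
    q1, _q2, q3 = quantiles(scores, n=4, method="inclusive")
    return (q3 - q1) / 2

games = [0, 0, 0, 2, 3, 3, 4, 15, 27]   # the MLB games data
games_siqr = siqr(games)
print("SIQR:", games_siqr)

# The two largest scores lie above Q3, so inflating the top score changes nothing.
print("SIQR with a much larger top score:", siqr(games[:-1] + [270]))
```

That second line is the point of the slide: like the median, the SIQR does not react to every extreme score.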

43 To summarize

Mode        Range     -Easy to calculate. -May be misleading.
Median      SIQR      -Capture the center. -Not influenced by extreme scores.
Mean (µ)    SD (σ)    -Take every score into account. -Allow later manipulations.

44 Who wants to guess... what I think is the most important sentence in S, Z, & Z (2006), Chapter 2?

45 p. 32 Penultimate paragraph, first sentence: “Scientists seek to determine whether any differences in their observations of the dependent variable are caused by the different conditions of the independent variable.”

46 http://highered.mcgraw-hill.com/sites/0072494468/student_view0/statistics_primer.html
Click on Statistics Primer.

47 Practice Problems

48 Homework
LOTS of reading. See syllabus. Send a table/graph/chart that you’ve read this past week. Send email to Garrett by noon, Friday. See you next week.

