1 Reliability and Validity Research Methods in AD/PR COMM Fall 2007 Nan Yu
2 Exam 1 Multiple choice (70%) Short answer (30%) Time: 9/27 (Thursday), 3:35-5:30p Place: 143 Stuckeman No make-up exams will be given if you miss the exam without prior notice or a verifiable excuse.
3 Overview of Last Class Levels of measurement Nominal Ordinal Interval Ratio Types of measurement Open-ended question Likert-type scale Thurstone scale Semantic Differential scale
4 How to differentiate interval, ratio, and ordinal variables? How many hours do you watch TV every day? ________hours How many hours do you watch TV every day? hourhours How many hours do you watch TV every day? more than 11 hours
5 Nominal variable Categories are mutually exclusive Categories are exhaustive: all possible responses are provided (One individual case should fit in at least one category.) Ethnicity Caucasian African American Native American Asian Hispanic Other
6 Likert-type scale A neutral point is always provided Listening to heavy metal music makes one prone to violent acts. __Strongly agree __Agree __Neutral __Disagree __Strongly disagree How would you rate the quality of Daily Collegian? __ poor__ unsatisfactory __neither unsatisfactory nor satisfactory __ satisfactory __ excellent Listening to heavy metal music makes one prone to violent acts. Strongly disagree Neutral Strongly agree
7 Semantic Differential Scales (bipolar) If the scale looks like this: Not at all Very much Likable Good Pleasant Then what type of measurement is it now?
8 Corrections y1/survey1.html A few corrections 0-25% 26%-50% 51%-75% 76%-100% ordinal, closed question Gender 1=male, 2=female nominal, closed question
9 Reliability The extent to which measurements are consistent, stable, dependable, predictable Suggests that the same thing is repeated or recurs under identical or similar conditions Every time you measure, you get the same or similar data Why is reliability important? Guarantee the quality of your data Replicability E.g., the degree to which replicating a study using the same procedures, the same instruments, etc. will lead to the same results
10 Factors that reduce reliability Instrumental error Double-barreled question Do you like to watch and play basketball? Application error Instrument is used improperly Random error Unpredictable error
11 Types of Reliability Test-retest reliability Degree of matching between measurement results when the measurements are repeated E.g., They are taken more than once, for the same object of measurement
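One common way to put a number on test-retest reliability is to correlate the scores from the first and second measurement of the same people. The sketch below is only an illustration: the function name pearson_r and the TV-hours data are hypothetical, not taken from the course materials.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares around the means
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

# Hypothetical data: daily TV hours reported by the same five people,
# measured once and then again two weeks later.
time_1 = [2, 4, 1, 6, 3]
time_2 = [2, 5, 1, 6, 3]

test_retest_r = pearson_r(time_1, time_2)
print(round(test_retest_r, 2))
```

A correlation close to 1 means respondents kept nearly the same rank order and spacing across the two measurements, which is what a stable measure should produce.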
12 Types of Reliability Measurement item reliability Internal consistency of several items The degree to which a set of items hang together. E.g., happiness measured by answering questions such as “How thrilled are you?” “How happy are you?” “How cheerful are you?” Answers to these three questions should be similar; that would mean the happiness scale is reliable, i.e., it has internal consistency Cronbach’s alpha: (interval variable)
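Cronbach's alpha can be computed by hand from the item variances and the variance of the summed scale: alpha = k/(k-1) * (1 - sum of item variances / variance of total score), where k is the number of items. The sketch below assumes five hypothetical respondents answering the three happiness items on a 1-5 scale; the function name and the data are made up for illustration.

```python
from statistics import variance

def cronbach_alpha(items):
    """items: list of k lists, one per item, each holding every respondent's score."""
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    # Total scale score for each respondent (sum across items)
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))

# Hypothetical answers from five respondents (1 = not at all, 5 = very much)
thrilled = [5, 4, 2, 1, 3]
happy = [5, 5, 2, 1, 3]
cheerful = [4, 4, 1, 2, 3]

alpha = cronbach_alpha([thrilled, happy, cheerful])
print(round(alpha, 2))
```

When the three items move together across respondents, the variance of the total score is large relative to the individual item variances, and alpha approaches 1.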
13 Types of Reliability Inter-coder reliability When a researcher has to code or interpret open-ended answers of the respondents, news stories, etc., his or her interpretation might be subjective and therefore not completely reliable. One way of dealing with this problem is to ask other individuals to “code” the same material. The degree to which they agree on the results of coding is inter-coder reliability 90% agreement, Cohen’s kappa, Scott’s pi.
14 Inter-coder reliability example Imagine that three coders are asked to code the amount of violence on a certain televised program; they are given a coding sheet explaining what they should consider violent. However, they do not always agree that certain acts or behaviors fit one of the violence descriptions. If they agree 87 percent of the time, the inter-coder reliability is 87% (0.87).
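Percent agreement and Cohen's kappa can both be computed directly from the coders' decisions. The sketch below simplifies the example to two coders (Cohen's kappa is defined for a pair of coders) and uses hypothetical codes for ten scenes; the function names are illustrative only. Kappa corrects the observed agreement for the agreement expected by chance.

```python
from collections import Counter

def percent_agreement(coder_a, coder_b):
    """Share of units that both coders placed in the same category."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("coders must code the same non-empty set of units")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)

def cohen_kappa(coder_a, coder_b):
    """Cohen's kappa: (observed - chance agreement) / (1 - chance agreement)."""
    n = len(coder_a)
    p_observed = percent_agreement(coder_a, coder_b)
    counts_a = Counter(coder_a)
    counts_b = Counter(coder_b)
    # Chance agreement: product of each coder's marginal proportions, per category
    p_chance = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return (p_observed - p_chance) / (1 - p_chance)

# Hypothetical codes for ten scenes: "v" = violent, "n" = not violent
coder_a = ["v", "v", "n", "n", "v", "n", "v", "n", "v", "n"]
coder_b = ["v", "v", "n", "v", "v", "n", "v", "n", "v", "n"]

agreement = percent_agreement(coder_a, coder_b)
kappa = cohen_kappa(coder_a, coder_b)
print(agreement, round(kappa, 2))
```

Here the coders disagree on one scene out of ten, so raw agreement is 90%, while kappa is lower because with only two categories a fair amount of agreement would occur by chance alone.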
15 How to improve reliability Clearly conceptualize all constructs; reliability increases when a single construct or sub-dimension of a construct is measured Increase the level of measurement; more precise levels of measurement are more likely to be reliable than less precise measures because the latter pick up less detailed information Age Young Middle-aged Old Age Age What is your age?__________
16 How to improve reliability Use multiple indicators of a variable; multiple-indicator measures tend to be more stable than measures with one item Sadness gloom sorrow grief unhappiness Use pretests, pilot studies, and replication Use trained observers/coders
17 Validity Degree to which a measure “measures” what it is supposed to measure (e.g., degree of matching between the concept and the measurement).
18 Content and Face Validity Whether a measure captures the meaning of the variable being measured.
19 Face and Content Validity Face validity In face validity, you look at the operationalization and see whether "on its face" it seems like a good translation of the construct. Examples of measures lacking face validity Use a ruler to measure weight Use shoe size to measure intelligence Content validity Very similar to face validity Needs careful operationalization of the concept. Researchers are the judges of measurement validity (face and content).
20 Criterion Validity Criterion validity Uses some standard or criterion to indicate a construct accurately. Two types: “concurrent validity” and “predictive validity”
21 Types of Criterion Validity Concurrent validity: a measure (indicator) must be associated with a preexisting one that is judged to be valid. E.g. GRE Predictive validity: an indicator predicts future events that are logically related to a construct. E.g. SAT and college academic performance
22 Construct validity Construct validity refers to the degree to which inferences can legitimately be made from the operationalizations in your study to the theoretical constructs on which those operationalizations were based. Convergent validity Discriminant validity (Divergent validity)
23 Construct validity Convergent validity (multiple-item measures) Applies when multiple indicators converge or are associated with one another. You should be able to show a correspondence or convergence between similar constructs. Convergent validity means that multiple measures of the same construct hang together or operate in similar ways.
24 Convergent validity (multi-item validity)
25 Discriminant validity Discriminant validity Measures of constructs that theoretically should not be related to each other are, in fact, observed not to be related to each other. You should be able to discriminate between dissimilar constructs. You are measuring what you want to measure, not something else.
26 Discriminant validity
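Convergent and discriminant validity are often checked by inspecting correlations: items meant to tap the same construct should correlate highly, while measures of unrelated constructs should correlate near zero. The sketch below uses hypothetical data (two happiness items and shoe size for five respondents) and a hand-written Pearson correlation; none of the numbers come from the course materials.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

# Hypothetical data for five respondents
happy = [5, 4, 2, 1, 3]       # "How happy are you?" (1-5)
cheerful = [4, 4, 1, 2, 3]    # "How cheerful are you?" (1-5)
shoe_size = [8, 9, 9, 8, 10]  # theoretically unrelated to happiness

# Convergent: two indicators of the same construct should be strongly related
r_convergent = pearson_r(happy, cheerful)

# Discriminant: a happiness item should be unrelated to shoe size
r_discriminant = pearson_r(happy, shoe_size)

print(round(r_convergent, 2), round(r_discriminant, 2))
```

In this made-up data the two happiness items correlate strongly, while the happiness item and shoe size are essentially uncorrelated, which is the pattern that convergent and discriminant validity call for.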
27 Internal Validity Internal validity Degree to which one can prove causation Practically, degree to which you can eliminate third variables or confounds.
28 External Validity External validity Degree to which one can generalize the conclusions of the study from the sample used in the study to the overall population.
29 Relationship between reliability and validity (p ) “Reliability must be present or validity is impossible.” Reliability is a necessary condition for validity, but not a sufficient one. Even if a measure turns out to be reliable, it may not measure what you want to measure -- lack of validity
30 Relationship between reliability and validity
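A small numeric illustration of "reliable but not valid": a miscalibrated bathroom scale can give almost identical readings every time (high consistency) while being consistently wrong about the true weight. The true weight and readings below are hypothetical values chosen for the illustration.

```python
from statistics import mean, pstdev

# Hypothetical scenario: a person who truly weighs 70 kg steps on
# a miscalibrated scale five times in a row.
true_weight = 70.0
readings = [75.0, 75.1, 74.9, 75.0, 75.0]

# Small spread across repeated readings indicates a consistent (reliable) measure
spread = pstdev(readings)

# A large gap between the average reading and the true value indicates
# the measure is systematically off target (not valid)
bias = mean(readings) - true_weight

print(round(spread, 2), round(bias, 2))
```

The readings barely vary, so the scale is reliable, yet every reading is about 5 kg too high, so it is not a valid measure of weight. Consistency alone does not show that the measure captures the intended concept.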
31 In-Class Demo 1 and 2 Download the files “in-class demo 1” and “in-class demo 2” in the “week 5” folder on ANGEL Complete both of them Submit your answers to the corresponding drop boxes in the “week 5” folder on ANGEL Answers are also in the “week 5” folder.