Validity and Reliability in Instrumentation
Research I: Basics
Dr. Leonard
February 24, 2010
Recap
- Research design can be:
  - experimental or non-experimental (or quasi-experimental)
  - basic or applied research
  - laboratory or field setting
  - quantitative or qualitative data collection
- Research must be based in solid theory and testable hypotheses
- Research must include clear conceptual and operational definitions
Quasi-experimental
- Occurs more commonly in psychology
- Applies experimental principles (e.g., cause and effect, group comparison) to field or less controlled settings
- More like correlational research
- Less control over extraneous variables, but can take place outside the lab, which may make the setting feel less artificial
- Interpretation of results is not as clean as in experimental research, but closer to "real world" application
Scientific method
1. Formulate theories √
2. Develop testable hypotheses (operational definitions) √
3. Conduct research, gather data √
4. Evaluate hypotheses based on data
5. Cautiously draw conclusions
Next steps… gather data
- Once you have explicitly clear conceptual and operational definitions to guide the research, you must develop your measures for collecting data
- The operational definition proposes the type of measure
- Instrumentation is the process of selecting or creating measures for a study (the measure is your instrument)
- Two overarching goals for instrumentation:
  - Validity: the extent to which a measure (as operationally defined) taps the concept it is designed to measure and not some other concept
  - Reliability: the consistency or stability of a measure, i.e., the same results are obtained if the measure is used again (see the sketch below)
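One common way to estimate this kind of consistency (not named in the slide, but standard practice) is test-retest reliability: administer the same measure twice and correlate the two sets of scores. A minimal sketch in Python, with made-up scores for illustration:

```python
import numpy as np

# Hypothetical scores for 8 participants on the same measure,
# administered at two time points (illustrative data only).
time1 = np.array([24, 31, 18, 27, 35, 22, 29, 26])
time2 = np.array([25, 30, 20, 26, 34, 21, 31, 27])

# Test-retest reliability estimated as the correlation between
# the two administrations (closer to 1 = more stable measure).
reliability = np.corrcoef(time1, time2)[0, 1]
print(f"Test-retest reliability estimate: {reliability:.2f}")
```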
Caveats
- We can never be certain of the validity (or reliability) of our instruments, so we try to estimate the degree of validity
- We might claim "modest" or "partial" validity
- It is hard to capture the true essence of a concept/construct, and some concepts/constructs are more elusive than others!
- An estimate of the validity of our measures depends on the purpose of the study
- Keep focused on the hypotheses and operational definitions!
- Two types of validity we estimate:
  - Judgmental validity
  - Empirical validity
Types of validity: Judgmental
- Content validity: whether the concept being measured is a real concept AND whether the measurement being used is the most appropriate one
- Is our operationally defined variable (concrete) really capturing the hypothetical concept (abstract) we are interested in studying? Are we capturing the central meaning?
- [Diagram: Concept <-> Variable/Measure]
Types of validity: Judgmental
- Content validity, or any other single type of validity on its own, is never enough to determine whether our measure is valid, so we consider other types…
- Face validity: the measure seems valid because it makes sense; on the surface, it appears to tap into the construct of interest
- Face validity is neither sufficient nor absolutely necessary for overall validity, but it is a helpful clue
- A measure could have high face validity but low content validity!
Good face validity? Rosenberg Self-Esteem Scale
1 = Strongly Disagree, 7 = Strongly Agree
_____ 1. I feel that I am a person of worth, at least on an equal basis with others.
_____ 2. I feel that I have a number of good qualities.
_____ 3. All in all, I am inclined to think that I am a failure.*
_____ 4. I am able to do things as well as most people.
_____ 5. I feel that I do not have much to be proud of.*
_____ 6. I take a positive attitude towards myself.
_____ 7. On the whole, I am satisfied with myself.
_____ 8. I wish I could have more respect for myself.*
_____ 9. I certainly feel useless at times.*
_____ 10. At times I think I am no good at all.*
*Reverse scored
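To make the reverse scoring concrete, here is a minimal scoring sketch in Python. It assumes the 1–7 response format shown above and one respondent's answers (the responses themselves are made up for illustration):

```python
# Made-up responses to items 1-10 on the 1-7 scale shown above.
responses = {1: 6, 2: 7, 3: 2, 4: 6, 5: 3, 6: 5, 7: 6, 8: 4, 9: 3, 10: 2}

# Items marked * above are reverse scored: on a 1-7 scale a response r
# becomes 8 - r, so higher totals always mean higher self-esteem.
reverse_items = {3, 5, 8, 9, 10}

total = sum(8 - r if item in reverse_items else r
            for item, r in responses.items())
print(f"Total self-esteem score: {total}")
```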
Types of validity: Empirical
- Criterion-related validity: the extent to which your measure of a concept relates to a theoretically meaningful criterion for that concept, a "gold standard" for that concept
- Predictive validity: the measure should be able to predict future behavior related to the concept
  - E.g., a job skills test and future ratings of job performance
- Concurrent (convergent) validity: the measure should be meaningfully related to (correlated with) some other measure of the behavior
  - E.g., scores on two different job skills tests
- Predictive or concurrent validity coefficient: a number (0-1) based on a correlation that quantifies whether the measure is in fact related to other measures it should be related to (see the sketch below)
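Since the slide defines the validity coefficient as a correlation, a minimal sketch is just a correlation between the measure and its criterion. The job skills scores and later performance ratings below are hypothetical:

```python
import numpy as np

# Hypothetical predictive validity check: job skills test scores at hiring,
# and supervisor performance ratings collected later (illustrative data).
test_scores = np.array([55, 62, 48, 70, 66, 51, 59, 73, 44, 68])
performance = np.array([3.1, 3.8, 2.9, 4.2, 3.9, 3.0, 3.5, 4.4, 2.6, 4.0])

# The predictive validity coefficient is the correlation between
# the measure and the criterion it is supposed to predict.
validity = np.corrcoef(test_scores, performance)[0, 1]
print(f"Predictive validity coefficient: {validity:.2f}")
```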
Types of validity: Judgmental-Empirical
- Construct validity represents a combined approach to estimating validity, using:
  1) a subjective prediction about what other concepts (indicators) the concept being measured should relate to, and
  2) an empirical test of whether the concept is in fact related to those other indicators
- E.g., depression should be linked to disengagement from schoolwork among college students, so test the relationship between depression scores and GPA in a sample of students (see the sketch below)
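For the depression/GPA example, the empirical half of this approach boils down to checking whether the observed relationship matches the judgmental prediction (here, a negative one). A brief sketch with invented scores:

```python
import numpy as np

# Hypothetical depression scores and GPAs for 10 students (invented data).
depression = np.array([12, 30, 8, 22, 27, 15, 35, 10, 18, 25])
gpa        = np.array([3.6, 2.8, 3.9, 3.1, 2.9, 3.4, 2.5, 3.8, 3.3, 3.0])

# Step 1 (judgmental): theory predicts higher depression -> lower GPA.
# Step 2 (empirical): test whether the observed correlation is negative.
r = np.corrcoef(depression, gpa)[0, 1]
print(f"r = {r:.2f}; consistent with the prediction: {r < 0}")
```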