Introduction to Measurement

1 Chapter 5: Introduction to Measurement

2 5.1 Foundations of Measurement
Research topics are often abstract
These topics must be translated into measurable constructs
No measurement is perfect
Example: your weight may differ from one reading to the next, even on the same scale!

3 5.1a Levels of Measurement
Figure 5.3 Relationship between attributes and values in a measure. For nominal data, such as party affiliation, we need to assign values (codes) to each category, so we can analyze the data numerically.
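A minimal sketch of nominal coding (plain Python; the party labels and code values are illustrative). The numbers only name the categories, so counting frequencies is meaningful but arithmetic on the codes is not:

```python
from collections import Counter

# Arbitrary numeric codes for a nominal variable (party affiliation).
# The numbers merely label categories; their order and spacing mean nothing.
party_codes = {"Democrat": 1, "Republican": 2, "Independent": 3}

responses = ["Independent", "Democrat", "Democrat", "Republican"]
coded = [party_codes[r] for r in responses]
print(coded)               # [3, 1, 1, 2]

# Frequencies are a meaningful summary for nominal data; a mean is not.
print(Counter(responses))  # Counter({'Democrat': 2, 'Independent': 1, 'Republican': 1})
```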

4 5.1a The Hierarchy of Levels of Measurement
Figure 5.4 The hierarchy of levels of measurement.
Nominal level of measurement: measuring a variable by assigning a number arbitrarily in order to name it numerically, so that it might be distinguished from other objects. The jersey numbers in most sports are measured at a nominal level.
Ordinal level of measurement: measuring a variable using rankings. Class rank is a variable measured at an ordinal level.
Interval level of measurement: measuring a variable on a scale where the distance between numbers is interpretable. For instance, temperature in Fahrenheit or Celsius is measured on an interval level.
Ratio level of measurement: measuring a variable on a scale where the distance between numbers is interpretable and there is an absolute zero value. For example, weight is a ratio measurement.
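A small illustration (plain Python; the values are invented) of why the interval/ratio distinction matters: differences are interpretable at both levels, but ratios only make sense when the scale has an absolute zero:

```python
# Interval level: Celsius has no absolute zero, so ratios mislead.
temp_a, temp_b = 10.0, 20.0
print(temp_b - temp_a)      # 10.0 -> a meaningful difference
print(temp_b / temp_a)      # 2.0  -> NOT "twice as hot"

# Ratio level: weight has a true zero, so ratios are interpretable.
weight_a, weight_b = 40.0, 80.0
print(weight_b / weight_a)  # 2.0  -> genuinely twice as heavy
```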

5 5.2 Quality of Measurement
Two key concepts:
Reliability
Validity

6 5.2a Reliability
True score theory
Measurement error: random error and systematic error
Pay attention to potential bias
True score theory: maintains that every observed score is the sum of two components: the true ability (or true level) of the respondent on that measure, plus random error. The true score is essentially the score a person would have received if the measurement were perfectly accurate.
Random error: a component of the value of a measure that varies entirely by chance. Random error adds noise to a measure and obscures the true value.
Bias: a systematic error in an estimate. A bias can be the result of any factor that leads to an incorrect estimate. In measurement, a bias can be either systematic and consistent or random. In either case, when bias exists, the measured values do not accurately reflect the true value.
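In symbols (a standard textbook formulation of true score theory; the X, T, and e notation is conventional rather than taken from the slide):

```latex
% Observed score = true score + random error
X = T + e_{\mathrm{random}}

% When bias is present, a systematic component also shifts every score:
X = T + e_{\mathrm{random}} + e_{\mathrm{systematic}}
```

Because random error averages to zero over many observations, it adds noise but cancels out in the long run; a systematic error does not cancel, and therefore biases the estimate of the true score.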

7 5.2a How to Reduce Measurement Error
Pilot test instruments
When using observers as data collectors, make sure they are trained
Double-check the data
Use statistical procedures to adjust for measurement error
Triangulate: combine multiple measurements

8 5.2b Theories of Reliability
Reliable means a measurement is repeatable and consistent
Reliability is a ratio: the proportion of "truth" in your observation, that is, the variance of the true scores divided by the variance of the observed scores
Determined using a group of individuals, not a single observation
Computed from variances and standard deviations
The higher the correlation between repeated measurements, the more reliable the measure (see the sketch below)
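A minimal simulation sketch (NumPy; the sample size, means, and variances are illustrative assumptions, not from the slides). It generates true scores plus random error and shows that the correlation between two parallel measurements of the same people recovers the reliability ratio:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000                                   # illustrative sample size

true = rng.normal(loc=50, scale=10, size=n)  # true scores, var(T) = 100
err1 = rng.normal(loc=0, scale=5, size=n)    # random error, var(e) = 25
err2 = rng.normal(loc=0, scale=5, size=n)

obs1 = true + err1                           # two parallel measurements
obs2 = true + err2                           # of the same true scores

# Reliability as a ratio: var(T) / var(X) = 100 / 125 = 0.8
print(true.var() / obs1.var())

# The correlation between the two parallel measurements estimates the
# same ratio, which is why correlations are used to index reliability.
print(np.corrcoef(obs1, obs2)[0, 1])         # roughly 0.8
```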

9 5.2c Types of Reliability: Inter-rater
Inter-rater or inter-observer reliability
Cohen's kappa
Figure 5.17 If only there were consensus!
Inter-rater or inter-observer reliability is used to assess the degree to which different raters/observers give consistent estimates of the same phenomenon.
Cohen's kappa: a statistical estimate of inter-rater agreement or reliability that is more robust than percent agreement because it adjusts for the probability that some agreement is due to random chance.
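A small sketch of the computation (plain Python; the two observers' label lists are invented for illustration). It implements the standard formula kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance:

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning categorical labels."""
    n = len(rater_a)
    # Observed agreement: proportion of items both raters labeled the same.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: for each category, the probability that both raters
    # would pick it independently, summed over all categories.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum((freq_a[c] / n) * (freq_b[c] / n) for c in freq_a | freq_b)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical ratings of ten behaviors by two observers.
a = ["on-task", "off-task", "on-task", "on-task", "off-task",
     "on-task", "on-task", "off-task", "on-task", "on-task"]
b = ["on-task", "off-task", "on-task", "off-task", "off-task",
     "on-task", "on-task", "on-task", "on-task", "on-task"]
print(cohens_kappa(a, b))  # ~0.52: moderate agreement beyond chance
```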

10 5.2c Types of Reliability: Test-Retest
Figure 5.19 Test-retest reliability. You estimate test-retest reliability when you administer the same test to the same (or a similar) sample on two different occasions.
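The estimate itself is just the correlation between the two administrations, as in this minimal sketch (NumPy; the score arrays are invented):

```python
import numpy as np

# Hypothetical scores for the same six respondents on two occasions.
time1 = np.array([12, 18, 15, 22, 9, 17])
time2 = np.array([14, 17, 16, 21, 11, 18])

# Test-retest reliability estimate: the correlation across occasions.
r = np.corrcoef(time1, time2)[0, 1]
print(round(r, 3))  # close to 1.0 here, suggesting a stable measure
```

The same correlation computation, applied to scores from two different forms of the instrument rather than two occasions, yields the parallel-forms estimate on the next slide.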

11 5.2c Types of Reliability: Parallel-Forms Reliability
Figure 5.20 Parallel-forms reliability. Two forms of the instrument are created. The correlation between the two parallel forms is the estimate of reliability.

12 5.2c Types of Reliability: Internal Consistency
Average inter-item correlation
Average item-total correlation
Split-half reliability
Cronbach's alpha (α)
Average inter-item correlation: uses all of the items on your instrument that are designed to measure the same construct. You compute the correlation between each pair of items and then average those correlations.
Average item-total correlation: also uses the inter-item correlations. In addition, you compute a total score for the items and treat it in the analysis like an additional variable.
Split-half reliability: you randomly divide all items that measure the same construct into two sets. You administer the entire instrument to a sample and calculate the total score for each randomly divided half; the split-half reliability estimate is simply the correlation between these two total scores.
Cronbach's alpha (α): one specific method of estimating the reliability of a measure. Although it is not calculated this way, Cronbach's alpha can be thought of as analogous to the average of all possible split-half correlations.
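A sketch of the alpha computation (NumPy; the Likert-style response matrix is invented). It applies the standard formula α = k/(k−1) · (1 − Σ var(itemᵢ) / var(total)):

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha; rows are respondents, columns are items."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)      # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)  # variance of total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical 1-5 Likert responses: six respondents by four items.
responses = [
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [3, 3, 2, 3],
    [4, 4, 4, 5],
    [1, 2, 2, 1],
]
print(round(cronbach_alpha(responses), 3))  # high alpha: items hang together
```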

13 5.2d Validity
Construct validity
Operationalization: the act of translating a construct into its manifestation, for example, translating the idea of your treatment or program into the actual program, or translating the idea of what you want to measure into the real measure. The result is also referred to as an operationalization; that is, you might describe your actual program as an operationalized program.

14 5.2e Construct Validity and Other Measurement Validity Labels
Translation validity
Face validity
Content validity
Criterion-related validity
Predictive validity
Concurrent validity
Convergent validity
Discriminant validity
Translation validity: a type of construct validity related to how well you translated the idea of your measure into its operationalization.
Face validity: a check that "on its face" the operationalization seems like a good translation of the construct.
Content validity: a check of the operationalization against the relevant content domain for the construct.
Criterion-related validity: the validation of a measure based on its relationship to another, independent measure, as predicted by your theory of how the measures should behave.
Predictive validity: a type of construct validity based on the idea that your measure is able to predict what it theoretically should be able to predict.
Concurrent validity: an operationalization's ability to distinguish between groups that it should theoretically be able to distinguish between.
Convergent validity: the degree to which the operationalization is similar to (converges on) other operationalizations to which it theoretically should be similar.
Discriminant validity: the degree to which concepts that should not be related theoretically are, in fact, not interrelated in reality.

15 5.2e Construct Validity: Putting It Together
Figure 5.29 Convergent and discriminant validity correlations in a single table or correlation matrix.
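A sketch of what such a matrix looks like in code (NumPy; the two constructs, the item structure, and all numbers are invented for illustration). Items measuring the same construct should correlate highly (convergent validity); items from different constructs should correlate weakly (discriminant validity):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200  # illustrative sample size

# Two unrelated latent constructs, each measured by two items.
self_esteem = rng.normal(size=n)
math_skill = rng.normal(size=n)
items = np.column_stack([
    self_esteem + 0.4 * rng.normal(size=n),  # self-esteem item 1
    self_esteem + 0.4 * rng.normal(size=n),  # self-esteem item 2
    math_skill + 0.4 * rng.normal(size=n),   # math-skill item 1
    math_skill + 0.4 * rng.normal(size=n),   # math-skill item 2
])

# 4x4 correlation matrix: high within-construct blocks (convergent),
# near-zero cross-construct blocks (discriminant).
print(np.corrcoef(items, rowvar=False).round(2))
```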

16 5.2f Threats to Construct Validity
Inadequate preoperational explication of constructs
Mono-operation and mono-method bias
Interaction of different treatments, or of testing and treatment
Restricted generalizability across constructs
Confounding constructs and levels of constructs
Inadequate preoperational explication of constructs: failing to define what you mean by the construct before you try to translate it into a measure or program.
Mono-operation bias: a threat to construct validity that occurs when you rely on only a single implementation of your independent variable, cause, program, or treatment in your study.
Mono-method bias: a threat to construct validity that occurs because you use only a single method of measurement.
Interaction of different treatments: when participants are involved in more than one course of treatment, you cannot be sure it was your treatment that caused a change.
Interaction of testing and treatment: sometimes the act of being tested can itself cause participants' performance to change on a given construct.
Restricted generalizability across constructs: is spending money on fun things the same as spending money on bills? The two constructs are different, and therefore generalizing is restricted.
Confounding constructs and levels: is spending $5 the same as spending $500? The five-dollar and five-hundred-dollar categories are levels of the construct "spending." A different level may have a greater or lesser effect on the dependent variable.

17 5.2g The Social Threats to Construct Validity
Hypothesis guessing
Evaluation apprehension
Researcher expectations
Hypothesis guessing: a threat to construct validity and a source of bias in which participants in a study guess the purpose of the study and adjust their responses based on that guess.
Evaluation apprehension: when the act of being evaluated or tested causes anxiety in participants, thus altering their natural performance or behavior.
Researcher expectations: a researcher can unconsciously and subtly influence the behavior of participants with body language, appearance, etc.

18 5.3 Integrating Reliability and Validity

19 Discuss and Debate
Why are reliability and validity in our measurements important?
How does test-retest reliability work? Illustrate with an example.
Is it better to administer just one measurement of a construct? Why or why not?
Reliability and validity in measurement are the cornerstones of any research project. If your measurement is not reliable and valid, then your study results will be flawed.
Test-retest reliability is estimated by giving the same test to a respondent twice and then looking at the correlation between the scores. A low correlation indicates that the respondent has changed his or her responses; when this happens, the test may be flawed and is not considered reliable.
When a researcher relies on only one form of measurement, a mono-method bias can skew the results of the study.

