Reliability and Validity: Criteria of Measurement Quality. How do we judge the relative success (or failure) in measuring various concepts?


Reliability and Validity

Criteria of Measurement Quality
How do we judge the relative success (or failure) in measuring various concepts?
Reliability – consistency of measurement
Validity – confidence in measures and design

Reliability and Validity
Reliability focuses on measurement.
Validity also extends to:
Precision in the design of the study – the ability to isolate causal agents while controlling other factors (internal validity)
The ability to generalize from unique and idiosyncratic settings, procedures, and participants to other populations and conditions (external validity)

Reliability
Consistency of measurement:
Reproducibility over time
Consistency between different coders/observers
Consistency among multiple indicators
Estimates of reliability: statistical coefficients that tell us how consistently we measured something

Measurement Validity
Are we really measuring the concept we defined?
Is it a valid way to measure the concept?
There are many different approaches to validation, with judgmental as well as empirical aspects.

Key to Reliability and Validity
Concept explication: thorough meaning analysis
Conceptual definition: defining what a concept means
Operational definition: spelling out how we are going to measure the concept

Four Aspects of Reliability
1. Stability
2. Reproducibility
3. Homogeneity
4. Accuracy

1. Stability
Consistency across time: repeating a measure at a later time to examine its consistency
Compare time 1 and time 2
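A test-retest check of stability is usually summarized as the correlation between the two administrations. A minimal sketch in plain Python, using Pearson's r (the coefficient choice and the scores below are illustrative assumptions, not from the slides):

```python
def pearson_r(x, y):
    """Pearson correlation between two score lists of equal length."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# hypothetical scores for five respondents, measured twice
time1 = [10, 12, 8, 15, 11]
time2 = [11, 12, 9, 14, 10]
print(round(pearson_r(time1, time2), 2))
```

A coefficient near 1 indicates that respondents keep their relative positions across administrations; instability shows up as a markedly lower value.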

2. Reproducibility
Consistency between observers: equivalent application of the measuring device
Do observers reach the same conclusion? If we don't get the same results, what are we measuring?
Lack of reliability can compromise validity.
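Agreement between observers is commonly estimated with a chance-corrected coefficient such as Cohen's kappa. A minimal sketch (the choice of kappa and the example labels are my assumptions; the slides do not name a specific statistic):

```python
from collections import Counter

def cohens_kappa(coder_a, coder_b):
    """Agreement between two coders on the same items, corrected for chance."""
    n = len(coder_a)
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    counts_a, counts_b = Counter(coder_a), Counter(coder_b)
    # chance agreement expected from each coder's marginal label frequencies
    expected = sum(counts_a[label] * counts_b[label]
                   for label in set(counts_a) | set(counts_b)) / n ** 2
    return (observed - expected) / (1 - expected)

# hypothetical codings of six news items by two coders
coder_a = ["pos", "pos", "neg", "neg", "pos", "neg"]
coder_b = ["pos", "neg", "neg", "neg", "pos", "neg"]
print(round(cohens_kappa(coder_a, coder_b), 3))
```

Kappa of 1 means perfect agreement, 0 means agreement no better than chance; the correction matters because raw percent agreement overstates reliability when one category dominates.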

3. Homogeneity
Consistency between different measures of the same concept
Different items used to tap a given concept show similar results (e.g., open-ended and closed-ended questions)

4. Accuracy
Lack of mistakes in measurement
Increased by clear, well-defined procedures that reduce the complications leading to errors
Observers must have sufficient training, motivation, and concentration.

Increasing Reliability
General:
Train coders/interviewers/lab personnel
More careful concept explication (definitions)
Specify procedures/rules
Reduce subjectivity (room for interpretation)
Survey measurement:
Increase the number of items in a scale
Weed out bad items from the "item pool"
Content analysis coding:
Improve the definition of content categories
Eliminate bad coders

Indicators of Reliability
Test-retest: make measurements more than once and see if they yield the same result
Split-half: if you have multiple measures of a concept, split the items into two scales, which should then be correlated
Cronbach's alpha or mean item-total correlation
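The slide's Cronbach's alpha can be computed from the item variances and the variance of the total score. A minimal sketch with made-up scale data (the data and helper names are hypothetical):

```python
def variance(xs):
    """Sample variance (n - 1 denominator)."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def cronbach_alpha(items):
    """items: one list of respondent scores per scale item."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # each respondent's scale total
    sum_item_var = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - sum_item_var / variance(totals))

# hypothetical 4-item scale answered by five respondents
items = [
    [4, 5, 3, 4, 2],
    [4, 4, 3, 5, 2],
    [3, 5, 4, 4, 1],
    [5, 4, 3, 4, 2],
]
print(round(cronbach_alpha(items), 2))
```

Values around 0.7 or higher are conventionally read as acceptable internal consistency; weeding bad items out of the pool typically raises alpha.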

Reliability and Validity
Reliability is a necessary condition for validity: if a measure is not reliable, it cannot be valid.
Reliability is NOT a sufficient condition for validity: if a measure is reliable, it may not necessarily be valid.
Example: a bathroom scale with old springs

Not Reliable or Valid

Reliable but not Valid

Reliable and Valid

Types of Validity
1. Face validity
2. Content validity
3. Pragmatic (criterion) validity
   A. Concurrent validity
   B. Predictive validity
4. Construct validity
   A. Testing of hypotheses
   B. Convergent validity
   C. Discriminant validity

Face Validity
Subjective judgment of experts about "what's there"
Do the measures make sense?
Compare each item to the conceptual definition: does it represent the concept in question? If not, it should be dropped.
Is the measure valid "on its face"?

Content Validity
Subjective judgment of experts about "what is not there"
Start with the conceptual definition of each dimension:
Is it represented by indicators at the operational level?
Are some dimensions over- or underrepresented?
If the current indicators are insufficient, develop and add more.
Example – civic participation questions:
Did you vote in the last election?
Do you belong to any civic groups?
Have you ever attended a city council meeting?
What about "protest participation" or "online organizing"?

Pragmatic Validity
Empirical evidence is used to test validity: compare the measure to other indicators.
1. Concurrent validity
Does a measure predict a simultaneous criterion?
Validate a new measure by comparing it to an existing measure.
E.g., does a new intelligence test correlate with an established test?
2. Predictive validity
Does a measure predict a future criterion?
E.g., do SAT scores predict college GPA?

Construct Validity
Encompasses other elements of validity.
Do measurements:
A. Represent all dimensions of the concept?
B. Distinguish the concept from other similar concepts?
Tied to the meaning analysis of the concept, which specifies the dimensions and indicators to be tested.
Assessing construct validity:
A. Testing hypotheses
B. Convergent validity
C. Discriminant validity

A. Testing Hypotheses
When measurements are put into practice, are theoretically derived hypotheses supported by observations?
If not, there is a problem with:
A. The theory
B. The research design (internal validity)
C. The measurement (construct validity?)
In seeking to examine construct validity, examine the theoretical linkages of the concept to others.
Identify antecedents and consequences:
What leads to the concept?
What are the effects of the concept?

B. Convergent Validity
Measuring a concept with different methods.
If different methods yield the same results, then convergent validity is supported.
E.g., survey items measuring participation:
Voting
Donating money to candidates
Signing petitions
Writing letters to the editor
Civic group memberships
Volunteer activities

C. Discriminant (Divergent) Validity
Measuring a concept so as to discriminate it from other closely related concepts
E.g., measuring maternalism and paternalism as distinct concepts
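One simple way to probe this is to compare correlations: items meant to tap the same concept should correlate more strongly with each other than with items tapping the related concept. A sketch with invented data (the item scores, variable names, and the informal comparison are all illustrative assumptions):

```python
def pearson_r(x, y):
    """Pearson correlation between two score lists of equal length."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

# hypothetical item scores from five respondents
maternalism_items = [[5, 4, 2, 5, 3], [5, 5, 2, 4, 3]]
paternalism_item = [4, 2, 3, 2, 4]

within = pearson_r(maternalism_items[0], maternalism_items[1])  # same concept
across = pearson_r(maternalism_items[0], paternalism_item)      # different concepts
print(round(within, 2), round(across, 2))
```

A strong within-concept correlation alongside a weak cross-concept correlation is evidence that the two concepts are empirically distinct.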

Dimensions of Validity for Research Design
Internal: validity of the research design – validity of sampling, measurement, and procedures
External: given the research design, how valid are the inferences made from the conclusions? What are the implications for the real world?

Internal and External Validity in Experimental Design
Internal validity: Did the experimental treatment make a difference? Or is there an internal design flaw that invalidates the results?
External validity: Are the results generalizable? To what populations? To what situations?
Without internal validity, there is no external validity.