The Research Consumer Evaluates Measurement Reliability and Validity

Chapter 6. The Research Consumer Evaluates Measurement Reliability and Validity

Evidence that Matters: Reliable Measurements Evidence that matters is collected from reliable, valid, responsive, and interpretable measures (methods of collecting information) of participant characteristics and of program process, outcomes, impact, and costs. For research findings to count, they must come from measures that can consistently and accurately detect changes in program participants' knowledge, attitudes, and behavior. A reliable measure is a consistent one. A measure of quality of life, for example, is reliable if, on average, it produces the same information from the same people today and two weeks from now.

Reliability, Reproducibility, and Precision A reliable measure is reproducible and precise: each time it is used, it produces the same value. A beam scale can measure body weight precisely, but a questionnaire about good citizenship is likely to produce values that vary from person to person and even from time to time. A measure (e.g., of good citizenship) cannot be perfectly precise if its underlying concept is imprecise (e.g., because of differing definitions of good citizenship). This imprecision is the gateway to random (chance) error. Error comes from three sources: variability in the measure itself, variability in the respondents, and variability in the observer.
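
One common way to make this precise is the classical test theory decomposition, sketched below as an assumption (the slide describes random error but gives no formula): every observed score X is a true score T plus random error E, and reliability is the fraction of observed variance attributable to true scores.

    \[
    X = T + E, \qquad \operatorname{Cov}(T, E) = 0, \qquad
    \text{reliability} = \frac{\operatorname{Var}(T)}{\operatorname{Var}(T) + \operatorname{Var}(E)}
    \]

A perfectly reliable measure has Var(E) = 0; as random error from the measure, the respondents, or the observer grows, reliability falls toward zero.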

Reliability Types Test-retest reliability A measure has test-retest reliability if the correlation (reliability coefficient) between scores obtained at different times is high. Internal Consistency Reliability Internal consistency is an indicator of the cohesion of the items in a single measure: all items in an internally consistent measure assess the same idea or concept. One example might be a test of two items. The first says, "You almost always feel like smoking." The second says, "You almost never feel like smoking." If a person agrees with the first and disagrees with the second, the two responses are consistent with each other.
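
A minimal Python sketch of how these two coefficients can be computed. The scores are invented, and the use of Cronbach's alpha as the internal-consistency statistic is an assumption; the slide names the concept but not a particular formula.

    import statistics

    # Hypothetical quality-of-life scores for five people, measured
    # today and again two weeks later.
    time1 = [12, 15, 9, 20, 14]
    time2 = [13, 14, 10, 19, 15]

    # Test-retest reliability: correlation between the two administrations.
    print(f"test-retest r = {statistics.correlation(time1, time2):.2f}")

    def cronbach_alpha(items):
        """Internal consistency (Cronbach's alpha).

        items: one list of scores per question, all over the same respondents.
        """
        k = len(items)
        item_variance = sum(statistics.variance(item) for item in items)
        totals = [sum(scores) for scores in zip(*items)]  # per-person totals
        return k / (k - 1) * (1 - item_variance / statistics.variance(totals))

    # Three hypothetical smoking-attitude items answered by five people,
    # with the "almost never" item reverse-scored so all items point the same way.
    items = [[4, 5, 2, 1, 3],
             [5, 5, 1, 2, 3],
             [4, 4, 2, 1, 2]]
    print(f"alpha = {cronbach_alpha(items):.2f}")

(statistics.correlation requires Python 3.10 or later.)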

Reliability Types (Continued) Split-half Reliability To estimate split-half reliability, the researcher divides a measure into two equal halves (say, by putting all odd-numbered questions in the first half and all even-numbered questions in the second). The researcher then calculates the correlation between the two halves. Alternate-form Reliability Refers to the extent to which two instruments measure the same concepts at the same level of difficulty.
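
A sketch of the split-half computation in Python. The odd/even split follows the slide; the Spearman-Brown step-up at the end is a standard correction for the shortened test length that the slide does not name, added here as an assumption.

    import statistics

    # Hypothetical answers of five respondents to a six-item measure.
    respondents = [[3, 4, 2, 5, 3, 4],
                   [1, 2, 1, 2, 2, 1],
                   [5, 4, 5, 5, 4, 5],
                   [2, 3, 2, 2, 3, 3],
                   [4, 4, 3, 4, 4, 4]]

    # Per-person totals for the odd-numbered and even-numbered items.
    odd_half = [sum(scores[0::2]) for scores in respondents]
    even_half = [sum(scores[1::2]) for scores in respondents]

    r_half = statistics.correlation(odd_half, even_half)

    # Spearman-Brown: estimate the full-length measure's reliability
    # from the correlation between its two halves.
    r_full = 2 * r_half / (1 + r_half)
    print(f"half-test r = {r_half:.2f}, full-test estimate = {r_full:.2f}")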

Reliability Types (Continued) Intra-rater reliability Refers to the extent to which an individual’s observations are consistent over time. If you score the quality of 10 evaluation reports at time 1, for example, and then re-score them 2 weeks later, your intra-rater reliability will be perfect if the two sets of scores are in perfect agreement.

Reliability Types (Continued) Inter-rater reliability Refers to the extent to which two or more observers or measurements agree with one another. Suppose you and a co-worker score the quality of 10 evaluation reports. If you and your co-worker give identical scores to each of the 10 reports, your inter-rater reliability will be perfect. A commonly used method for quantifying agreement between observations or observers results in a statistic called kappa.
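
The kappa statistic mentioned above can be computed as in the following sketch, which uses Cohen's kappa for two raters and invented "good"/"poor" ratings (both assumptions; the slide names kappa but not a specific variant). The same function works for intra-rater reliability if the two lists are one rater's scores at time 1 and time 2.

    from collections import Counter

    def cohen_kappa(ratings_a, ratings_b):
        """Chance-corrected agreement between two parallel lists of labels."""
        n = len(ratings_a)
        observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
        # Expected chance agreement from each rater's label frequencies.
        counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
        expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2
        return (observed - expected) / (1 - expected)

    # You and a co-worker each score the quality of 10 evaluation reports.
    you = ["good", "good", "poor", "good", "poor",
           "good", "good", "poor", "good", "good"]
    coworker = ["good", "poor", "poor", "good", "poor",
                "good", "good", "good", "good", "good"]
    print(f"kappa = {cohen_kappa(you, coworker):.2f}")  # 0.47 for these data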

Measurement Validity Validity refers to the degree to which a measure assesses what it is supposed to measure. Measurement validity is not the same thing as the internal and external validity we discussed in connection with research design. Measurement validity refers to the extent to which a measure or instrument provides data that accurately represent the concepts of interest.

Validity Types Content validity Refers to the extent to which a measure thoroughly and appropriately assesses the skills or characteristics it is intended to measure. Face validity Refers to how a measure appears on the surface: Does it seem to cover all the important domains and ask all the needed questions? Face validity is established by experts in the field who are asked to review a measure and comment on its coverage. Face validity is the weakest type because it lacks theoretical or research support.

Validity Types (Continued) Predictive validity Predictive validity refers to the extent to which a measure forecasts future performance. A graduate school entrance examination that predicts who will do well in graduate school (as measured, for example, by grades) has predictive validity. Concurrent validity Concurrent validity is demonstrated when two measures agree with one another, or when a new measure compares favorably with one that is already considered valid. Construct validity Construct validity is established experimentally to demonstrate that a measure distinguishes between people who do and do not have certain characteristics. To demonstrate construct validity for a measure of competent teaching, you need evidence that teachers who do well on the measure are competent, whereas teachers who do poorly are not.
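
As a sketch of how predictive validity is checked in practice, one can correlate exam scores with a later outcome; the numbers below are invented for illustration.

    import statistics

    # Hypothetical entrance-exam scores and later graduate-school grade
    # averages for the same six students.
    exam_scores = [152, 160, 148, 170, 158, 165]
    later_gpa = [3.0, 3.4, 2.9, 3.9, 3.3, 3.5]

    # A strong positive correlation is evidence of predictive validity.
    # Correlating a new measure against an established, already-validated
    # instrument in the same way would speak to concurrent validity.
    print(f"predictive r = {statistics.correlation(exam_scores, later_gpa):.2f}")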

Sensitivity and Specificity Sensitivity and specificity are two terms used in connection with screening and diagnostic tests and measures to detect “disease.” Sensitivity refers to the proportion of people with the disease who have a positive test result. A sensitive measure will correctly detect the disease among people who have it. What happens when a measure misses the disease in someone who actually has it, as sometimes happens? That is called a false negative. Insensitive measures lead to false negatives, and sensitivity is therefore one aspect of a measure's validity.

Sensitivity and Specificity Specificity refers to the proportion of people without the disease who have a negative test result. What happens when people without the disease get a positive test anyway? That is called a false positive. Measures with poor specificity lead to false positives: they invalidly classify people as having a disease when in fact they do not.
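
Both proportions fall directly out of a comparison of test results against true disease status. A minimal sketch with invented data:

    def sensitivity_specificity(test_positive, has_disease):
        """Both arguments are parallel lists of booleans, one entry per person."""
        pairs = list(zip(test_positive, has_disease))
        tp = sum(t and d for t, d in pairs)          # diseased, test positive
        fn = sum(not t and d for t, d in pairs)      # diseased, test negative (false negative)
        tn = sum(not t and not d for t, d in pairs)  # healthy, test negative
        fp = sum(t and not d for t, d in pairs)      # healthy, test positive (false positive)
        return tp / (tp + fn), tn / (tn + fp)

    # Ten hypothetical people; four truly have the disease.
    disease = [True, True, True, True, False, False, False, False, False, False]
    test = [True, True, True, False, False, False, False, True, False, False]

    sens, spec = sensitivity_specificity(test, disease)
    print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")  # 0.75, 0.83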