Cal State Northridge Psy 427 Andrew Ainsworth PhD


Validity

Validity - Definitions The extent to which a test measures what it was designed to measure. Agreement between a test score or measure and the quality it is believed to measure. Proliferation of definitions led to a dilution of the meaning of the word into all kinds of “validities”

Validity - Definitions Internal validity – Cause and effect in experimentation; high levels of control; elimination of confounding variables External validity - to what extent one may safely generalize the (internally valid) causal inference (a) from the sample studied to the defined target population and (b) to other populations (i.e. across time and space). Generalize to other people Population validity – can the sample results be generalized to the target population Ecological validity - whether the results can be applied to real life situations. Generalize to other (real) situations

Validity - Definitions Content validity – when trying to measure a domain, are all sub-domains represented? When measuring depression, are all 16 clinical criteria represented in the items? Very complementary to domain sampling theory and reliability. However, high levels of content validity will often lead to lower internal consistency reliability.

Validity - Definitions Construct validity – overall, are you measuring what you are intending to measure? Intentional validity – are you measuring what you intend and not something else? Requires that constructs be specific enough to differentiate. Representation validity or translation validity – how well have the constructs been translated into measurable outcomes? Validity of the operational definitions. Face validity – does a test "appear" to be measuring the content of interest? Do questions about depression have the words "sad" or "depressed" in them?

Validity - Definitions Construct Validity Observation validity – how good are the measures themselves? Akin to reliability. Convergent validity – the degree to which a measure correlates with other measures that it is theoretically predicted to correlate with. Discriminant validity – the degree to which the operationalization does not correlate with other operationalizations that it theoretically should not correlate with.

Validity - Definitions Criterion-Related Validity - the success of measures used for prediction or estimation. There are two types: Concurrent validity - the degree to which a test correlates with an external criterion that is measured at the same time (e.g. does a depression inventory correlate with clinical diagnoses). Predictive validity - the degree to which a test predicts (correlates with) an external criterion that is measured some time in the future (e.g. does a depression inventory score predict later clinical diagnosis). Social validity – refers to the social importance and acceptability of a measure.

Validity - Definitions There is a total mess of "validities" and their definitions; what to do? In 1985 a Joint Committee of AERA (American Educational Research Association), APA (American Psychological Association), and NCME (National Council on Measurement in Education) developed the Standards for Educational and Psychological Testing (revised in 1999).

Validity - Definitions According to the Joint Committee: Validity is the evidence for inferences made about a test score. Three types of evidence: Content-related Criterion-related Construct-related Different from the notion of “different types of validity”

Validity - Definitions Content-related evidence (Content Validity) Based upon an analysis of the body of knowledge surveyed. Criterion-related evidence (Criterion Validity) Based upon the relationship between scores on a particular test and performance or abilities on a second measure (or in real life). Construct-related evidence (Construct Validity) Based upon an investigation of the psychological constructs or characteristics of the test.

Validity That Isn’t Face Validity The mere appearance that a test has validity. Does the test look like it measures what it is supposed to measure? Do the items seem reasonably related to the perceived purpose of the test? Does a depression inventory ask questions about being sad? Not a "real" measure of validity, but one that is commonly seen in the literature. Not considered a legitimate form of validity by the Joint Committee.

Content-Related Evidence Does the test adequately sample the content or behavior domain that it is designed to measure? If items are not a good sample, results of testing will be misleading. Usually developed during test development. Not generally empirically evaluated. Judgment of subject matter experts.

Content-Related Evidence To develop a test with high content-related evidence of validity, you need: good logic, intuitive skills, and perseverance. Must consider: wording and reading level.

Content-Related Evidence Other content-related evidence terms Construct underrepresentation: failure to capture important components of a construct. The test is designed for chapters 1-10 but only chapters 1-8 show up on the test. Construct-irrelevant variance: occurs when scores are influenced by factors irrelevant to the construct. The test is well-intentioned, but problems secondary to the test negatively influence the results (e.g., reading level, vocabulary, unmeasured secondary domains).

Criterion-Related Evidence Tells us how well a test corresponds with a particular criterion criterion: behavioral or measurable outcome SAT predicting GPA (GPA is criterion) BDI scores predicting suicidality (suicide is criterion). Used to “predict the future” or “predict the present.”

Criterion-Related Evidence Predictive Validity Evidence forecasting the future how well does a test predict future outcomes SAT predicting 1st yr GPA most tests don't have great predictive validity; estimates decrease due to time & method variance

Criterion-Related Evidence Concurrent Validity Evidence forecasting the present how well does a test predict current similar outcomes job samples, alternative tests used to demonstrate concurrent validity evidence generally higher than predictive validity estimates

Quantifying Criterion-Related Evidence Validity Coefficient correlation between the test and the criterion usually between .30 and .60 in real life. In general, as long as the coefficients are statistically significant, the evidence is considered acceptable. However, recall that r2 indicates explained variance. So, in reality, we are only looking at explained criterion variance in the range of 9 to 36%. Sound problematic??
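The 9-to-36% figure above follows directly from squaring the typical validity coefficients. A minimal sketch (the .30-.60 range is from the slide; the middle value is illustrative):

```python
# Typical real-world validity coefficients (the .30-.60 range cited above)
for r in (0.30, 0.45, 0.60):
    explained = r ** 2  # r-squared: proportion of criterion variance explained
    print(f"r = {r:.2f} -> explains {explained:.0%} of criterion variance")
```

Even the best-case r = .60 leaves almost two-thirds of the criterion variance unexplained, which is why a "statistically significant" validity coefficient can still be practically weak.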

Evaluating Criterion-Related Evidence Look for changes in the cause of relationships (third variable effect). E.g. situational factors present during validation that may not be replicated in later uses of the scale. Examine what the criterion really means. Optimally the criterion should be something the test is trying to measure. If the criterion is not valid and reliable, you have no evidence of criterion-related validity! Review the subject population in the validity study. If the normative sample is not representative, you have little evidence of criterion-related validity.

Evaluating Criterion-Related Evidence Ensure the sample size in the validity study was adequate. Never confuse the criterion with the predictor. GREs are used to predict success in grad school. Some grad programs may admit low-GRE students but then require a certain GRE score before they can graduate. So students with low GRE scores succeed, which makes the GRE appear to have poor predictive validity! But the process was dumb to begin with… Watch for restricted ranges.

Evaluating Criterion-Related Evidence Review evidence for validity generalization. Tests only given in laboratory settings, then expected to demonstrate validity in classrooms? Ecological validity? Consider differential prediction. Just because a test has good predictive validity for the normative sample may not ensure good predictive validity for people outside the normative sample. External validity?

Construct-Related Evidence Construct: something constructed by mental synthesis What is Intelligence? Love? Depression? Construct Validity Evidence assembling evidence about what a test means (and what it doesn’t) sequential process; generally takes several studies

Construct-Related Evidence Convergent Evidence obtained when a measure correlates well with other tests believed to measure the same construct. Self-report, collateral-report measures Discriminant Evidence obtained when a measure correlates less strongly with other tests believed to measure something slightly different. This does not mean any old test that you know won't correlate; it should be something that could be related but that you want to show is separate. Example: IQ and Achievement Tests

Validity & Estimates Standard Error of Estimate: SE_est = s_y * sqrt(1 − r_xy^2), where s_y is the standard deviation of the criterion and r_xy is the validity of the test. Essentially, this is regression all over again.

Validity and Reliability Maximum validity depends on reliability: r_12(max) = sqrt(r_11 * r_22), where r_11 is the reliability of test 1 and r_22 is the reliability of test 2.
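The ceiling formula above can be sketched in a few lines (the reliability values used here are illustrative, not from the lecture):

```python
import math

def max_validity(r11: float, r22: float) -> float:
    """Theoretical ceiling on the validity coefficient between two measures:
    sqrt(r11 * r22), where r11 and r22 are the tests' reliabilities."""
    return math.sqrt(r11 * r22)

# A test with reliability .81 validated against a perfectly reliable
# criterion (reliability 1.0) can correlate at most .90 with it.
print(max_validity(0.81, 1.0))
```

This is why an unreliable test can never be highly valid: unreliability in either the test or the criterion caps the validity coefficient below 1.0.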

Validity and Reliability