Validity Psych 395 - DeShon

Example: Validity of a Measure “The use of the polygraph (lie detector test) is not nearly as valid as some say and can easily be beaten and should never be admitted into evidence in courts of law, say psychologists from two scientific communities who were surveyed on the validity of polygraphs.” – APA News Release

Measures and Constructs, Again! [Figure: path diagram with unobservable constructs X and Y, each causing its corresponding measured variable X and Y.]

Basic Insights… Differences in Observed Measures are Caused by Variations in the Unobserved Construct. One Way to Think About Validity: How well does the observed variable capture the unobserved variable?
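A minimal sketch of this idea (my own simulation, not from the slides): generate an unobserved construct, add measurement error to get an observed score, and check how well the observed variable captures the unobserved one.

import numpy as np

rng = np.random.default_rng(0)
n = 1000

construct = rng.normal(0, 1, n)    # unobserved true standing on the construct
error = rng.normal(0, 0.8, n)      # measurement error
observed = construct + error       # the score we actually record

# Correlation between the unobserved construct and the observed measure (~.78 here)
print(np.corrcoef(construct, observed)[0, 1])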

Apply this Idea to the Polygraph…

Example: Validity of a Measure “The use of the polygraph (lie detector test) is not nearly as valid as some say and can easily be beaten and should never be admitted into evidence in courts of law, say psychologists from two scientific communities who were surveyed on the validity of polygraphs.” – APA News Release

Issues of Validity Does the test actually measure what it is purported to measure? Do differences in test scores reflect true differences in the underlying construct? Are inferences based on the test scores justified?

It’s All About Inferences…. Cronbach (1971): Validation is the process of collecting evidence to support the types of inferences that are drawn from test scores. There is no such thing as “the” validity of a test. Why? Many different kinds of inferences can be made from the same test.

“Validity for what?” Inferences and decisions based on test scores. A person with this score is likely to: – Be a better parent – Do well in law school – Be most satisfied as an engineer – Steal from his/her employer

Types of validity: construct (general evidence-gathering), content (more theory-based), and criterion-related (more data-based)

Content Validity of a Measure Collectively, do the items adequately represent all of the domains of the construct of interest? Starting point: a well-defined construct. Often a panel of experts judges whether the items adequately sample the domain of interest.

Example: 1st Grade Math Objectives What 1st graders in School District X should be able to do: A. Add any two positive numbers whose sum is 20 or less. B. Subtract any two numbers (each less than 15) whose difference is a positive number.

Item Pool – Which are Content Valid? [Several addition and subtraction items whose numbers were lost in transcription, e.g., ___ + ___ = ___ and ___ – 7 = ___] Sammy has 10 pennies. He lost 2. How many pennies does Sammy have? A. 2 pennies B. 8 pennies C. 10 pennies D. 12 pennies

Example: Depression (Modified from the DSM-IV) A complex of symptoms marked by: – Disruptions in appetite and weight – Insomnia or hypersomnia – Loss of interest or pleasure in activities – Loss of energy – Feelings of worthlessness – Feeling sad or empty nearly every day – Frequent death-related thoughts

Item Pool – Which are Content Valid? I feel blue or sad. I feel nervous when speaking to someone in authority. I have crying spells. I’m always willing to admit it when I make a mistake. I felt that everything I did was an effort. I never resent being asked to return a favor. I experience spells of terror or panic.

Contamination & Deficiency [Figure: overlapping circles for Construct and Measure. The overlap is relevance, or content validity; the part of the measure outside the construct is contamination; the part of the construct the measure misses is deficiency.]

What do we want? A measure that samples from all important domains or aspects (Low Deficiency) A measure that does not include anything irrelevant (Low Contamination) That is, a measure that adequately captures all of the domains of the construct that it is intended to measure. (High Content Validity)

What Else Do We Want: A Measure that Predicts Something It Should!

Criterion-related Evidence for a Measure What should this test predict? What inferences are we going to use this test to make? Criterion-related validation is data based. Does the test actually predict behavior that it is supposed to predict? – Correlate an honesty test with employee theft – Correlate a pencil and paper measure of delinquency with arrest records – Correlate a measure of study habits with actual grades

Two types of criterion-related validity Predictive validity – future criteria Concurrent validity – current criteria The distinction makes no procedural difference (both are computed as correlations)

Think of a Relevant Criterion SAT or ACT Scores A Measure of Conscientiousness A Measure of Political Liberalism A Measure of Relationship Satisfaction

Criterion-related validity: Concurrent validity Students who have been admitted to MSU take the SAT. Their GPA is recorded at the same time. The correlation between the test scores and performance is computed. This correlation is sometimes called a validity coefficient.

Criterion-related validity: Predictive validity Students take the SAT (or ACT) during High School and then some are selected into MSU. Later, their SAT scores are correlated with their college GPA. This correlation is also sometimes called a validity coefficient. If SAT scores and college GPA are correlated, then the SAT has some degree of predictive validity for predicting college GPA.

In both cases the degree of criterion- related validity is inferred from the size of the correlation….
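As a minimal illustration (invented numbers, not slide data), the validity coefficient is just the Pearson correlation between predictor scores and the criterion:

import numpy as np

# Hypothetical SAT scores and later college GPAs for eight students
sat = np.array([1050, 1190, 1280, 1350, 1400, 1460, 1520, 1580])
gpa = np.array([2.4, 2.9, 2.7, 3.1, 3.3, 3.0, 3.6, 3.8])

validity_coefficient = np.corrcoef(sat, gpa)[0, 1]
print(round(validity_coefficient, 2))

The same computation serves for predictive and concurrent designs; only the timing of the criterion measurement differs.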

Issues What is our Criterion? How do we measure it? – Reliability of Predictor and Criterion – Recall: What does measurement error do? What sample will we use? – Small Samples – More Imprecision – Issues of Generalization Restriction of Range – Want Variability on both Predictor and Criterion variables Predictor-Criterion Overlap – Same “items” on both measures … bad!

Measurement Error Reliability – Index of the presence of measurement error (1.0 reliability = No error) Unreliability in the predictor and criterion increases the error variance and therefore serves to reduce (attenuate) the observed correlation between them

When/where might we find unreliability? … Everywhere! Tests used as predictors (e.g., measures of depression) Criterion measures (e.g., ratings of client well-being) Unreliability is a concern for both predictors and criteria – unreliability in both can reduce correlations

Correcting Correlations for Attenuation: corrected r = r_xy / sqrt(r_xx * r_yy), where r_xy = observed correlation between x and y, and r_xx and r_yy = reliability coefficients of x and y
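A small sketch of the correction (the function name and values are my own, chosen for illustration):

import math

def correct_for_attenuation(r_xy, r_xx, r_yy):
    # Disattenuate an observed correlation given the reliabilities
    # of the predictor (r_xx) and the criterion (r_yy).
    return r_xy / math.sqrt(r_xx * r_yy)

# Observed validity of .35 with reliabilities of .80 and .70
print(round(correct_for_attenuation(0.35, 0.80, 0.70), 2))  # 0.47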

Construct Validity – How Well Does a Measure Actually Assess the Underlying Conceptual Variable? Often the focus is on the Construct (i.e., the idea) and NOT just the properties of a single measure. How does this construct fit into a nomological network (a lawful network of expected relations)? Can we get convergence across different measures of the SAME construct? Can we get divergence? Are measures of different constructs unrelated?

Key Terms (Campbell & Fiske, 1959) Convergent Validity: associations between different methods of assessing the same construct; confirmation of the measurement of the construct using multiple methods. Discriminant Validity: distinctiveness of constructs, indicated by a lack of association between measures of different constructs.

Jingle Fallacy (Kelley, 1927) Jingle fallacy: Belief that because the same name is applied to measures of different constructs, these measures are really assessing the same thing. – Smith’s Measure of Extraversion and Robert’s Measure of Extraversion might not actually measure the same thing.

Jangle fallacy (Kelley, 1927) Jangle fallacy: Belief that because measures are called by different names they are measuring different constructs. – Smith’s Measure of Sociability and Robert’s Measure of Surgency might both actually measure Extraversion.

Q: How do we examine all of these ideas? A: Use correlation matrices!

Multitrait-Multimethod Matrices (MTMM) Suppose we measure three different personality traits – Extraversion – Conscientiousness – Neuroticism Suppose we measure each of these traits in three different ways – Self-report – Informant report – Behavioral test (not shown in the charts)
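A minimal sketch (my own simulation, not the slide data) of how an MTMM matrix is built: measure each trait with each method, then correlate every trait-method combination with every other.

import numpy as np

rng = np.random.default_rng(1)
n = 500

# Latent trait scores for Extraversion, Conscientiousness, Neuroticism
traits = {t: rng.normal(0, 1, n) for t in ["E", "C", "N"]}

# Each method adds its own measurement error (shared method variance omitted for brevity)
measures = {}
for method, noise_sd in [("self", 0.7), ("informant", 1.0)]:
    for t, score in traits.items():
        measures[f"{t}_{method}"] = score + rng.normal(0, noise_sd, n)

names = list(measures)
mtmm = np.corrcoef(np.column_stack([measures[k] for k in names]), rowvar=False)

# Monotrait-heteromethod ("validity diagonal"): same trait, different method
for t in ["E", "C", "N"]:
    i, j = names.index(f"{t}_self"), names.index(f"{t}_informant")
    print(t, round(mtmm[i, j], 2))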

Where is the convergent validity? Where is the discriminant validity? [MTMM correlation matrix shown on the slide.] Note: E = Extraversion, C = Conscientiousness, N = Neuroticism

Convergent Evidence Same construct assessed using different methods (self versus informant) Convergent validity diagonal (highlighted on the slide) – E: .57 – C: .45 – N: .39 Technical label: monotrait-heteromethod correlation (“trait correlation”) – same trait, different method

Divergent evidence Different traits assessed using the same method (want low correlations) – Technical: heterotrait-monomethod (“method correlation”) – glop, or method variance Different traits assessed using different methods (want low correlations) – Technical: heterotrait-heteromethod (“neither”) – should be the lowest correlations in the MTMM matrix

Differentiation Between Groups Examine differences in test scores between groups known to differ on the construct – Kids with ADHD versus kids without ADHD – Depressed versus non-depressed people – Criminals versus non-criminals – Masculine versus feminine groups Sometimes called discriminant-group (known-groups) validity
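A minimal known-groups sketch (simulated scores, invented group parameters): a valid measure should separate groups known to differ on the construct.

import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# Simulated depression-scale scores for two known groups
depressed = rng.normal(24, 6, 40)      # clinically diagnosed group
non_depressed = rng.normal(15, 6, 40)  # comparison group

t, p = stats.ttest_ind(depressed, non_depressed)
pooled_sd = np.sqrt((depressed.var(ddof=1) + non_depressed.var(ddof=1)) / 2)
d = (depressed.mean() - non_depressed.mean()) / pooled_sd
print(f"t = {t:.2f}, p = {p:.4f}, Cohen's d = {d:.2f}")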

Factor Analysis

Basic Ideas Figure out what is related and what is not: a construct-validity question (convergence and divergence again). We do factor analysis in our heads all the time in real life! Statistical procedure that reduces a large number of intercorrelations to a smaller number of factors summarizing the pattern of observed correlations between variables

What is a Factor? An unobserved variable that gives rise to correlations between items on a questionnaire. The existence of factors is inferred from patterns of association between observed variables. Factors are sometimes called source traits or latent traits. The goal of factor analysis is to identify these latent (unobserved) variables.
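As a closing sketch (my own simulation, using scikit-learn, not from the slides): generate six items driven by two latent traits and let an exploratory factor analysis recover the two-factor structure.

import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(3)
n = 1000

# Two latent traits; items 1-3 load on the first, items 4-6 on the second
f1, f2 = rng.normal(0, 1, n), rng.normal(0, 1, n)
items = np.column_stack(
    [f1 + rng.normal(0, 0.6, n) for _ in range(3)]
    + [f2 + rng.normal(0, 0.6, n) for _ in range(3)]
)

fa = FactorAnalysis(n_components=2, rotation="varimax").fit(items)
print(np.round(fa.components_, 2))  # loadings show two clear item clusters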