Psychometrics & Validation Psychometrics & Measurement Validity Properties of a “good measure” –Standardization –Reliability –Validity A Taxonomy of Item.

Slides:



Advertisements
Similar presentations
Test Development.
Advertisements

Conceptualization and Measurement
The Research Consumer Evaluates Measurement Reliability and Validity
1 COMM 301: Empirical Research in Communication Kwan M Lee Lect4_1.
Psychlotron.org.uk Why is it important that psychological research be valid and reliable?
Increasing your confidence that you really found what you think you found. Reliability and Validity.
Psychometrics William P. Wattles, Ph.D. Francis Marion University.
VALIDITY AND RELIABILITY
Psychological Testing Principle Types of Psychological Tests  Mental ability tests Intelligence – general Aptitude – specific  Personality scales Measure.
Part II Sigma Freud & Descriptive Statistics
Measurement Reliability and Validity
Face, Content & Construct Validity
CH. 9 MEASUREMENT: SCALING, RELIABILITY, VALIDITY
LECTURE 9.
Introduction to Psychological Measurement
Reliability and Validity of Research Instruments
Experiment Basics: Variables Psych 231: Research Methods in Psychology.
RESEARCH METHODS Lecture 18
Validity, Sampling & Experimental Control Psych 231: Research Methods in Psychology.
Introduction to Psychometrics Psychometrics & Measurement Validity Some important language Properties of a “good measure” –Standardization –Reliability.
Reliability & Validity
Beginning the Research Design
Psychometrics, Measurement Validity & Data Collection
Introduction to Psychometrics
Introduction to Psychometrics Psychometrics & Measurement Validity Constructs & Measurement “Kinds” of Items Properties of a “good measure” –Standardization.
Psych 231: Research Methods in Psychology
Manipulation and Measurement of Variables
Variables cont. Psych 231: Research Methods in Psychology.
Validity, Reliability, & Sampling
Research Methods in MIS
Practical Psychometrics Preliminary Decisions Components of an item # Items & Response Approach to the Validation Process.
Chapter 9 Flashcards. measurement method that uses uniform procedures to collect, score, interpret, and report numerical results; usually has norms and.
Classroom Assessment A Practical Guide for Educators by Craig A
Test Validity S-005. Validity of measurement Reliability refers to consistency –Are we getting something stable over time? –Internally consistent? Validity.
Measurement Concepts & Interpretation. Scores on tests can be interpreted: By comparing a client to a peer in the norm group to determine how different.
Measurement and Data Quality
Reliability, Validity, & Scaling
MEASUREMENT OF VARIABLES: OPERATIONAL DEFINITION AND SCALES
Measurement in Exercise and Sport Psychology Research EPHE 348.
Instrument Validity & Reliability. Why do we use instruments? Reliance upon our senses for empirical evidence Senses are unreliable Senses are imprecise.
Instrumentation.
MEASUREMENT CHARACTERISTICS Error & Confidence Reliability, Validity, & Usability.
Copyright © 2012 Wolters Kluwer Health | Lippincott Williams & Wilkins Chapter 14 Measurement and Data Quality.
More Validity And some reliability. Today’s class Check in Validity In class exercise Reliability.
LECTURE 06B BEGINS HERE THIS IS WHERE MATERIAL FOR EXAM 3 BEGINS.
Psychometrics William P. Wattles, Ph.D. Francis Marion University.
Reliability & Validity
Validity Is the Test Appropriate, Useful, and Meaningful?
Chapter 7 Measurement and Scaling Copyright © 2013 by The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill/Irwin.
Reliability vs. Validity.  Reliability  the consistency of your measurement, or the degree to which an instrument measures the same way each time it.
Measurement Validity.
Experiment Basics: Variables Psych 231: Research Methods in Psychology.
Evaluating Survey Items and Scales Bonnie L. Halpern-Felsher, Ph.D. Professor University of California, San Francisco.
Research Methodology and Methods of Social Inquiry Nov 8, 2011 Assessing Measurement Reliability & Validity.
Copyright © 2008 Wolters Kluwer Health | Lippincott Williams & Wilkins Chapter 17 Assessing Measurement Quality in Quantitative Studies.
Reliability: The degree to which a measurement can be successfully repeated.
Chapter 9 Correlation, Validity and Reliability. Nature of Correlation Association – an attempt to describe or understand Not causal –However, many people.
Measurement Reliability Objective & Subjective tests Standardization & Inter-rater reliability Properties of a “good item” Item Analysis Internal Reliability.
MEASUREMENT: PART 1. Overview  Background  Scales of Measurement  Reliability  Validity (next time)
Measurement Experiment - effect of IV on DV. Independent Variable (2 or more levels) MANIPULATED a) situational - features in the environment b) task.
Chapter 6 - Standardized Measurement and Assessment
Experiment Basics: Variables Psych 231: Research Methods in Psychology.
Copyright © 2014 Wolters Kluwer Health | Lippincott Williams & Wilkins Chapter 11 Measurement and Data Quality.
Measurement and Scaling Concepts
Consistency and Meaningfulness Ensuring all efforts have been made to establish the internal validity of an experiment is an important task, but it is.
1 Measurement Error All systematic effects acting to bias recorded results: -- Unclear Questions -- Ambiguous Questions -- Unclear Instructions -- Socially-acceptable.
Reliability & Validity
5. Reliability and Validity
RESEARCH METHODS Lecture 18
Measurement Concepts and scale evaluation
Presentation transcript:

Psychometrics & Validation Psychometrics & Measurement Validity Properties of a “good measure” –Standardization –Reliability –Validity A Taxonomy of Item types

Psychometrics (Psychological measurement) The process of assigning values to represent the amounts and kinds of specified attributes, to describe (usually) persons. We do not “measure people” We measure specific attributes of a person Psychometrics is the “centerpiece” of empirical psychological research and practice. All data result from some form of “measurement” What we’ve meant by “Measurement Validity” all along The better the measurement, the better the data, the better the conclusions of the psychological research or application

Desirable Properties of Psychological Measures Interpretability of Individual’s and Group’s Scores Population Norms (Typical Scores) Validity (Consistent Accuracy) Reliability (Consistency) Standardization (Administration & Scoring)

Standardization Administration -- “given” the same way every time who administers the instrument specific instructions, order of items, timing, etc. Varies greatly -- multiple-choice classroom test (hand it out) -- Intelligence test pages of “how to” in manual -- about 1/2 semester in Psych 955 Scoring -- “graded” the same way every time who scores the instrument correct, “partial” and incorrect answers, points awarded, etc. Varies greatly -- multiple choice test (fill in the sheet) -- Exner System for the Rorschach -- 2 weeks of in depth training

Reliability (Consistency) Inter-rater or Inter-scorer reliability can two raters produce the same score for a given test (assumes standardization) Internal reliability -- agreement among test items split-half reliability -- randomly split into two tests & correlate Chronbach’s  -- tests “extent to which items measure a central theme” External Reliability -- consistency of scores from whole test test-retest reliability -- give same test 3-12 weeks apart ( r ) alternate forms reliability -- two “versions” of the test ( r )

Validity (Consistent Accuracy) Criterion-related Validity -- does test correlate with “criterion”? statistical -- requires a criterion that you “believe in” predictive, concurrent, postdictive validity Content Validity -- do the items come from “domain of interest”? non-statistical -- decision of “expert in the field” Face Validity -- do the items come from “domain of interest” non-statistical -- decision of “target population” Construct Validity -- does test relate to other measures it should? Statistical -- Discriminant validity convergent validity -- correlates with selected tests divergent validity -- doesn’t correlate with others

Kinds of Items Survey Items individual items expected to “capture” the attribute of interest e.g., age, height, political registry Scale Items items that are expected to “capture” the attribute of interest only when aggregated together to form a scale e.g., emotional maturity, body image, liberalism-conservatism Psychometrics emphasizes the measurement of relatively complex attributes, and so, emphasizes the use of multi-item scales (made up of of absolute, similarity, sentiment items)

More about “kinds of items” Any item can be defined by specifying three central attributes… Judgment item vs. Sentiment item judgments -- have “correct answers” sentiments -- have no “correct answers” Comparative item vs. Absolute item responding to two or more “stimuli” responding to a single “stimulus” (relative to “internal scale”) Preference item vs. Similarity item ranking or ordering items giving a value to define similarity can be comparative or absolute (vs. internal stim)

Most Common Types of Items ??? Personality, Attitude, Opinion (“Psychology”) Items 1. How do you feel today ? Unhappy happy 2. How interested are you in campus politics ? Interested Uninterested Absolute / Similarity / Sentiment -- most common Pick a number from the scale (stimulus) that best depicts how you feel (no “correct answer)

Most Common Types of Items ??? Personality, Attitude, Opinion (“Psychology”) Items 1. Which of these best describes you ? a. I am mostly interested in the “social side” of college. b. I am mostly interested in the “intellectual side” of college. 2. Would you rather spend time with a friend... at your favorite restaurant watching a sporting event Comparative / Preference / Sentiment -- less common Pick from the two responses (stimuli) that which best depicts how you feel (no “correct answer)

Most Common Types of Items ??? “Test” Items 1. Which of these is one of the 7 dwarves ? a. Grungy b. Sleazy c. Kinky d. Doc e. Dorky 2. What should you do if the traffic light turns yellow as you approach an intersection ? a. Stop b. Speed up c. Check for Police and then choose “a” vs. “b” Comparative / Preference / Judgment Pick from the responses (stimuli) that which best depicts the correct answer

A bit more about judgment vs. sentiment items… Most of the items used on psychological measures are “somewhere between” judgments and sentiments -- called “keyed sentiments”. Consider these items from a depression measure… 1. It is tough to get out of bed some mornings. disagree agree 2. I sometimes just want to sit and cry I’m generally happy about my life If the person is “depressed”, we would expect then to give a fairly high rating for questions 1 & 2, but a low rating on 3. Before aggregating these items into a score, we would “reverse key” item #3 (1=5, 2=4, 4=2, 5=1) Summary: for keyed items we don’t score answers as right or wrong, but “key” them, so they can be aggregated sensibly