Unit 5: Improving and Assessing the Quality of Behavioral Measurement


Unit 5: Improving and Assessing the Quality of Behavioral Measurement PS 522: Behavioral Measures and Interpretation of Data Lisa R. Jackson, Ph.D.

Indicators of Trustworthy Measurement: Validity
- Directly measures a socially significant behavior
- Measures a dimension of the behavior relevant to the question
- Helps you determine that you are testing what you think you are testing

Indicators of Trustworthy Measurement: Accuracy and Reliability
Accuracy
- Observed values match the true values of an event
Reliability
- Measurement yields the same values across repeated measurement of the same event
- A measure of consistency and stability: does it reliably produce the same results over time?
- Is your assessment reliable within itself? If you use different versions, are they all equally reliable?

Types of Validity: Face Validity
- Face validity is about how the test looks, not what it actually measures: does the test appear to measure what its items suggest it measures?
- High face validity can increase test-takers' confidence, interest, and motivation.
- Low face validity can disguise the real purpose of the test and reduce self-report bias, which is useful for measuring things people are reluctant to admit.
- It is the least scientifically rigorous form of validity.

Types of Validity: Content Validity
- How well the test samples behavior representative of the whole universe of behavior it is designed to sample.
- Ex: A test of depression needs to measure thoughts, emotions, motivation, and behavior, not just mood.
- Ex: An assessment for autism cannot assess only language skills.
- Affected by who writes the test.

Types of Validity: Criterion-Related Validity
- An external measure: how well does the test measure up against another standard? How well can a score on this test be used to infer an individual's likely standing on the criterion?
- Concurrent: How well does a score on this test correlate with an individual's standing on a criterion of interest in the present? Ex: If you score low on one depression test, do you also score low on the Beck Depression Inventory?
- Predictive: How well does a score on this test correlate with an individual's standing on a criterion of interest in the future? Ex: Does the SAT predict college achievement?

Types of Validity: Construct Validity
- How appropriate are conclusions drawn from test scores regarding an individual's standing on a construct? Does the test really assess the construct?
- This takes time to determine. Ex: Over time, the Beck Depression Inventory has acquired construct validity; most experts agree that it tests for depression.

Threats to Measurement Validity
- Indirect measurement: measuring a behavior other than the behavior of interest. Example: using children's responses to a questionnaire as a measure of how often and how well they get along with their classmates.
- Measuring a dimension that is irrelevant or ill suited to the reason for measuring behavior. Example: using a ruler in a pot of water to measure temperature, or trying to measure reading endurance in oral reading by counting the number of correct and incorrect words read without recording how long the student read.

Types of Reliability: Internal Consistency
- This is about the questions, not the test as a whole: the goal is to see whether the items are consistent with one another.
- Do the test items reliably produce the same results? Do all of the questions measure the same idea?

Types of Reliability: Split Half
- Divide the items of a single test into two equal halves and score each half separately for the same group of test-takers.
- The goal is to judge the internal consistency of the test: are the two halves strongly correlated with each other? If they are, the test is reliable.
- One common split is odd/even: items 1, 3, 5, 7 in one half and items 2, 4, 6, 8 in the other.
- Use when two forms or two administrations are impractical; saves time and expense.
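To make the split-half procedure concrete, here is a minimal Python sketch (the function names and data are illustrative, not from the slides): score the odd and even halves for each test-taker, correlate the half scores, and optionally apply the standard Spearman-Brown correction to estimate full-length reliability.

```python
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation between two score lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

def split_half_reliability(item_scores):
    """Correlate odd-item and even-item half scores across test-takers."""
    odd_half = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even_half = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    return pearson(odd_half, even_half)

def spearman_brown(r_half):
    """Estimate full-length reliability from the half-test correlation."""
    return 2 * r_half / (1 + r_half)

# Rows are test-takers, columns are item scores (illustrative data)
scores = [
    [1, 1, 1, 1, 1, 1, 1, 1],
    [1, 0, 1, 1, 0, 1, 1, 0],
    [0, 0, 1, 0, 0, 1, 0, 0],
    [1, 1, 0, 1, 1, 0, 1, 1],
]
r = split_half_reliability(scores)
print(round(r, 2), round(spearman_brown(r), 2))  # -> 0.92 0.96
```

The Spearman-Brown step matters because each half is only half as long as the real test, and shorter tests are less reliable; the correction projects the half-test correlation up to the full test length.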

Types of Reliability: Test/Retest
- Use the same instrument to measure the same thing at two different points in time.
- The goal is to determine the consistency of the test over time: does it produce similar results each time? If so, the test is reliable.
- Good for a stable construct such as personality.
- Won't work to assess a reading test if subjects improve in reading between administrations.

Types of Reliability: Parallel and Alternate Forms
- A make-up test is an example: you might not want to give the same test twice.
- Minimizes memory effects when administering the test to the same person repeatedly; the person can't rely on memorized answers from one testing to the next.
- Each form must be equivalent to the original (comparable means and difficulty).
- Can be time consuming and expensive.

Assessing the Reliability of Measurement
- Measurement is reliable when it yields the same values across repeated measures of the same event.
- Reliability is not the same as accuracy.
- Reliable application of the measurement system is important.
- Re-measurement requires permanent products.
- Low reliability signals suspect data.

Using Interobserver Agreement (IOA) to Assess Behavioral Measurement
IOA is the degree to which two or more independent observers report the same values for the same events. It is used to:
- Determine the competence of new observers
- Detect observer drift
- Judge the clarity of definitions and the measurement system
- Increase the believability of data

Requisites for IOA
Observers must:
- Use the same observation code and measurement system
- Observe and measure the same participants and events
- Observe and record independently of one another

Methods for Calculating IOA
Percentage of agreement is the most common way to calculate IOA. Event recording methods compare:
- Total count recorded by each observer
- Mean count-per-interval
- Exact count-per-interval
- Trial-by-trial
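The event-recording comparisons above can be sketched in Python (a hypothetical illustration; the function names and data are not from the slides):

```python
def total_count_ioa(total_a, total_b):
    """Total count IOA: smaller total divided by larger total, times 100."""
    smaller, larger = sorted((total_a, total_b))
    return 100.0 * smaller / larger

def mean_count_per_interval_ioa(counts_a, counts_b):
    """Average the per-interval smaller/larger ratios (100% when both are zero)."""
    ratios = [100.0 if max(a, b) == 0 else 100.0 * min(a, b) / max(a, b)
              for a, b in zip(counts_a, counts_b)]
    return sum(ratios) / len(ratios)

def exact_count_per_interval_ioa(counts_a, counts_b):
    """Percentage of intervals in which both observers recorded the same count."""
    agreements = sum(1 for a, b in zip(counts_a, counts_b) if a == b)
    return 100.0 * agreements / len(counts_a)

# Two observers tally the same session across 10 intervals (illustrative data)
obs_a = [2, 0, 1, 3, 0, 1, 2, 0, 0, 1]
obs_b = [2, 1, 1, 3, 0, 0, 2, 0, 0, 1]
print(total_count_ioa(sum(obs_a), sum(obs_b)))     # 100.0 (both totals are 10)
print(exact_count_per_interval_ioa(obs_a, obs_b))  # 80.0 (8 of 10 intervals match)
```

Note how the same data yield 100% total count IOA but only 80% exact count-per-interval IOA: breaking the session into intervals makes the index more stringent, because matching totals can hide offsetting disagreements.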

Methods for Calculating IOA (continued)
Timing recording methods:
- Total duration IOA
- Mean duration-per-occurrence IOA
- Latency-per-response IOA
- Mean IRT-per-response IOA
Interval recording and time sampling:
- Interval-by-interval (point-by-point) IOA
- Scored-interval IOA
- Unscored-interval IOA
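For the interval-based indices, a minimal Python sketch (illustrative names and data; each record holds 1 for a scored interval and 0 for an unscored one):

```python
def interval_by_interval_ioa(rec_a, rec_b):
    """Agreement counted over every interval, scored or not."""
    agreements = sum(1 for a, b in zip(rec_a, rec_b) if a == b)
    return 100.0 * agreements / len(rec_a)

def scored_interval_ioa(rec_a, rec_b):
    """Only intervals in which at least one observer scored the behavior."""
    pairs = [(a, b) for a, b in zip(rec_a, rec_b) if a or b]
    return 100.0 * sum(1 for a, b in pairs if a and b) / len(pairs)

def unscored_interval_ioa(rec_a, rec_b):
    """Only intervals in which at least one observer recorded nonoccurrence."""
    pairs = [(a, b) for a, b in zip(rec_a, rec_b) if not (a and b)]
    return 100.0 * sum(1 for a, b in pairs if not a and not b) / len(pairs)

rec_a = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
rec_b = [1, 0, 0, 0, 1, 0, 0, 1, 1, 0]
print(interval_by_interval_ioa(rec_a, rec_b))         # 80.0
print(scored_interval_ioa(rec_a, rec_b))              # 60.0
print(round(unscored_interval_ioa(rec_a, rec_b), 1))  # 71.4
```

The stricter variants exist because interval-by-interval IOA can be inflated by chance: when the behavior is rare, scored-interval IOA is the more conservative index, and when the behavior is very frequent, unscored-interval IOA is.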

Considerations in IOA
Obtain and report IOA at the same levels at which the results will be reported and discussed in the study:
- For each behavior
- For each participant
- In each phase of intervention or baseline

Considerations in IOA
- Believability of data increases as agreement approaches 100%.
- Historically, 80% agreement has been used as an acceptable benchmark.
- The acceptable level depends on the complexity of the measurement system.

Considerations in IOA: Reporting IOA
- Narrative form
- Table
- Graphs
In all formats, report how, when, and how often IOA was assessed.

Assessing the Accuracy and Reliability of Behavioral Measurement
- First, design a good measurement system.
- Second, train observers carefully.
- Third, evaluate the extent to which the data are accurate and reliable: measure the measurement system itself.
- Accuracy means the observed values match the true values of an event.
- You can't base research conclusions or treatment decisions on faulty data.

Assessing the Accuracy of Measurement
Four purposes of accuracy assessment:
- Determine whether the data are good enough to support decisions
- Discover and correct measurement errors
- Reveal consistent patterns of measurement error
- Assure consumers that the data are accurate

Accuracy Assessment Procedures
- Measurement is accurate when observed values match true values.
- Accuracy is determined by calculating the correspondence of each data point with its true value.
- The process for determining true values must differ from the measurement procedures themselves.
- Accuracy assessment should be reported in research.
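The correspondence calculation above can be sketched as a small Python check (the function name, tolerance parameter, and data are illustrative assumptions): true values come from re-scoring a permanent product, such as a video recording, using a procedure different from the original measurement.

```python
def accuracy_percentage(observed, true_values, tolerance=0):
    """Percentage of data points whose observed value matches its true value
    (within an optional tolerance)."""
    matches = sum(1 for o, t in zip(observed, true_values)
                  if abs(o - t) <= tolerance)
    return 100.0 * matches / len(true_values)

# Observed session counts vs. true values re-scored from a video recording
observed = [12, 9, 14, 7, 11]
true_vals = [12, 10, 14, 7, 11]
print(accuracy_percentage(observed, true_vals))  # 80.0 (one point misses its true value)
```

Unlike IOA, which only compares two fallible observers to each other, this comparison is against the true values, which is why it speaks to accuracy rather than just reliability.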

Threats to Measurement Accuracy and Reliability: Inadequate Observer Training
- Training should be explicit and systematic
- Select observers carefully
- Train to a competency standard
- Provide ongoing training to minimize observer drift

Threats to Measurement Accuracy and Reliability: Unintended Influences on Observers
- Observer expectations of what the data should look like
- Observer reactivity when he/she is aware that others are evaluating the data
- Measurement bias, e.g., feedback to observers about how their data relate to the goals of the intervention

Final Points
- The data you gather are used to help improve the lives of real people.
- Others may use your data as the basis of intervention.
- Measurements need to be accurate, consistent, and relevant.

Questions?? Thanks for participating! I am sure you have been asking questions here in seminar! Great job! But if you have more, email me: Ljackson2@kaplan.edu. These slides are posted in the Doc Sharing area for your review.