Standardized Scales.

Standardization
Use of identical procedures to collect, score, interpret, and report results of a measure
Assures that differences over time or among different people are due to the variable being measured and not to different measurement procedures

What Are Standardized Scales?
Set of uniform procedures to collect, score, interpret, and report numerical results
Usually have norms and empirical evidence of reliability and validity
Typically include multiple items aggregated into one or more composite scores
Frequently used to measure constructs

Construct
Complex concept (e.g., intelligence, well-being, depression)
Inferred or derived from a set of interrelated attributes (e.g., behaviors, experiences, subjective states, attitudes) of people, objects, or events
Typically embedded in a theory
Often not directly observable but measured using multiple indicators

Evaluating and Selecting Standardized Scales
Purpose
Reference populations and normative groups
Reliability
Validity
Practical considerations

Purpose
Identify whether or not a client has a significant problem
Measure and monitor your client’s outcomes to determine if your client is making satisfactory progress

Reference Population
Population of people for which a measure is intended and from which a normative group is sampled and norms are created

Normative Group
Representative sample of a reference population, used to estimate norms for that population and, more generally, used to develop and test standardized measures
Also known as a “standardization group” or “standardization sample”
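
One common way norms are used is to express a client's raw score relative to the normative group. The sketch below computes a standard (z) score. The function name and the normative sample are hypothetical; real norms would come from the scale's manual.

```python
from statistics import mean, stdev


def standard_score(raw_score, normative_scores):
    """Return the z-score of raw_score relative to a normative sample."""
    norm_mean = mean(normative_scores)
    norm_sd = stdev(normative_scores)  # sample standard deviation
    return (raw_score - norm_mean) / norm_sd


# Hypothetical normative sample, for illustration only.
normative_scores = [10, 12, 14, 16, 18]
print(standard_score(18, normative_scores))
```

A positive value means the client scored above the normative mean, a negative value below it.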

Reliability
Internal consistency reliability (coefficient alpha) (most important)
Interrater reliability (sometimes)
Test-retest reliability
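
Coefficient alpha can be computed from item-level responses with the standard formula alpha = k/(k-1) × (1 − sum of item variances / variance of total scores), where k is the number of items. The sketch below assumes complete data and items already scored in the same direction; the function name and the responses are hypothetical.

```python
from statistics import variance


def coefficient_alpha(responses):
    """Coefficient (Cronbach's) alpha.

    responses: list of respondents, each a list of item scores
    (same number of items for every respondent, no missing values).
    """
    k = len(responses[0])                      # number of items
    items = list(zip(*responses))              # one tuple of scores per item
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical responses from four people to a three-item scale.
responses = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 4, 3],
    [4, 4, 5],
]
print(round(coefficient_alpha(responses), 2))
```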

Validity
Face
Content
Criterion
Construct
Sensitivity to change (especially important)

Practical Considerations
Time
Effort
Training
Cost
Availability
Acceptability (e.g., to clients, practitioners, etc.)

Decisions, Decisions…
Who, where, when, and how often to collect outcome data

Who
Client
Practitioner
Relevant others
Independent evaluators

Where and When
Private, quiet, physically comfortable location
Complete at about the same time and under the same conditions on a regular basis

How Often
Regular, frequent, pre-designated intervals
Often enough to detect significant changes in the problem, but not so often that it becomes problematic
In general, about once per week

Engage and Prepare Clients
Be certain the client understands and accepts the value and purpose of monitoring progress
Discuss confidentiality
Present measures with confidence
Don’t ask for information the client can’t provide

Engage and Prepare Clients (cont’d)
Be sure the client is prepared
Be careful how you respond to information
Use the information that is collected

Administering, Scoring, and Interpreting Standardized Scales
Score, scoring formula, composite score
Unidimensional and multidimensional scales
Cut scores
Reverse-worded items
Reliable change, reliable improvement, reliable deterioration
Clinically significant improvement
Expected treatment response

Score
Generic term for a number derived from a measure that represents the quantity or amount of an attribute or observation (e.g., number of times a behavior is observed, value obtained from a standardized scale)
Interpret in context of all available quantitative and qualitative information

Scoring
Procedure by which data from a measure are used to produce a score (e.g., number of times a behavior occurs or value on a standardized scale) or category (e.g., diagnostic category)

Scoring Formula
A mathematical rule by which data from a measure are used to produce a score (e.g., sum or average of responses to items on a multi-item standardized scale)

Composite Score
Score that combines results from two or more related items or other measures using a specified formula (e.g., percentage of items answered correctly on a statistics test)
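
The three scoring formulas mentioned as examples (sum, average, and percentage of items answered correctly) can be sketched as follows. The function names and data are illustrative only; the formula to use for a given scale is whatever its manual specifies.

```python
def sum_score(item_responses):
    """Composite score as the sum of item responses."""
    return sum(item_responses)


def mean_score(item_responses):
    """Composite score as the average of item responses."""
    return sum(item_responses) / len(item_responses)


def percent_correct(answers, answer_key):
    """Composite score as the percentage of items answered correctly."""
    correct = sum(1 for given, expected in zip(answers, answer_key)
                  if given == expected)
    return 100 * correct / len(answer_key)


# Hypothetical responses to a three-item scale.
item_responses = [2, 3, 1]
print(sum_score(item_responses), mean_score(item_responses))
```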

Unidimensional Scale
Scale that measures a single attribute or construct (e.g., depression). (Contrast with multidimensional scale.)

Multidimensional Scale
Scale that measures two or more distinct but related attributes or constructs; the measures of the different attributes or constructs are referred to as “subscales”
Example: Global Distress, Subjective Well-Being, Problems & Symptoms, Social Functioning
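
Scoring a multidimensional scale usually means scoring each subscale from its own items. The sketch below assumes a hypothetical seven-item scale. The subscale labels echo the slide, but the item-to-subscale assignment and the summed total are invented for illustration and do not describe any published instrument.

```python
# Hypothetical scoring key: subscale name -> positions of its items.
SUBSCALE_ITEMS = {
    "subjective_well_being": [0, 1],
    "problems_and_symptoms": [2, 3, 4],
    "social_functioning": [5, 6],
}


def subscale_scores(item_responses, scoring_key=SUBSCALE_ITEMS):
    """Return a sum score per subscale plus an overall total."""
    scores = {
        name: sum(item_responses[i] for i in positions)
        for name, positions in scoring_key.items()
    }
    scores["total"] = sum(scores.values())
    return scores


# Hypothetical responses to the seven items.
print(subscale_scores([1, 2, 3, 0, 1, 2, 2]))
```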

Cut Scores
Specific predetermined numerical values along a continuum of scores
Used to separate people into categories with distinct substantive interpretations (e.g., clinically depressed or not)
Used to make decisions (e.g., provide treatment for depression or not)
Only as good as the normative sample(s) from which they are derived
Interpret in context of all available quantitative and qualitative information
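
Applying a cut score is a simple comparison. The sketch below assumes higher scores indicate a more severe problem and that the cut score itself belongs to the lower category; both are assumptions that must be checked against the scale's manual. The function name and category labels are illustrative.

```python
def classify_by_cut_score(score, cut_score):
    """Place a score into one of two categories relative to a cut score.

    Assumes higher scores mean a more severe problem.
    """
    if score > cut_score:
        return "above cut score"
    return "at or below cut score"


# Hypothetical score of 9 against a hypothetical cut score of 5.
print(classify_by_cut_score(9, 5))
```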

Reverse-Worded Item
Item for which smaller numbers indicate a higher score on the measured variable because the item is worded to mean the opposite of the measured variable
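
Before items are combined, reverse-worded items are typically recoded so that all items point in the same direction. A minimal sketch, assuming an integer response scale with known minimum and maximum (the function name is illustrative):

```python
def reverse_score(response, scale_min, scale_max):
    """Recode a reverse-worded item so larger numbers mean more of the variable."""
    if not scale_min <= response <= scale_max:
        raise ValueError("response is outside the scale range")
    return scale_min + scale_max - response


# On a 1-5 response scale, 1 becomes 5, 2 becomes 4, and so on.
print([reverse_score(r, 1, 5) for r in [1, 2, 3, 4, 5]])
```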

Reliable Change
Change in a score from one time to another that is more than expected just from random measurement error

Reliable Improvement
Improvement in a score from one time to another that is more than expected just from random measurement error

Reliable Deterioration
Deterioration in a score from one time to another that is more than expected just from random measurement error
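
The slides do not say how reliable change is calculated. One widely used approach is the Jacobson and Truax reliable change index (RCI): the difference between two scores divided by the standard error of the difference, with |RCI| greater than 1.96 treated as reliable. The sketch below assumes that method; the function names, standard deviation, and reliability value are hypothetical.

```python
from math import sqrt


def reliable_change_index(pre, post, sd, reliability):
    """Jacobson-Truax RCI: (post - pre) / standard error of the difference."""
    sem = sd * sqrt(1 - reliability)   # standard error of measurement
    s_diff = sqrt(2) * sem             # standard error of the difference
    return (post - pre) / s_diff


def classify_change(pre, post, sd, reliability, higher_is_worse=True, z=1.96):
    """Label a change as reliable improvement, deterioration, or neither."""
    rci = reliable_change_index(pre, post, sd, reliability)
    if abs(rci) <= z:
        return "no reliable change"
    improved = rci < 0 if higher_is_worse else rci > 0
    return "reliable improvement" if improved else "reliable deterioration"


# Hypothetical values: normative SD of 5 and reliability of .90.
print(classify_change(pre=20, post=10, sd=5, reliability=0.90))
```

Set `higher_is_worse=False` for scales on which higher scores indicate better functioning.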

Clinically Significant Improvement
Change that occurs when a client’s measured functioning on a standardized scale is:
In the dysfunctional range before intervention (e.g., greater than 5 on the QIDS-SR)
In the functional range after intervention (e.g., 5 or below on the QIDS-SR)
And the change is reliable

Clinically Significant Improvement (cont’d)
Interpret in context of all available quantitative and qualitative information
Does not guarantee a meaningful change in a client’s real-world functioning or quality of life
Only as good as the normative sample(s) from which it is derived
Does not speak to the question of whether it was your intervention or something else that caused the change
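
The three conditions for clinically significant improvement can be checked together. The sketch below assumes higher scores indicate more severe problems (as in the QIDS-SR example, where scores above 5 fall in the dysfunctional range) and assumes the Jacobson and Truax reliable change index as the test of reliable change, which the slides do not specify. The standard deviation and reliability values are hypothetical, not QIDS-SR norms.

```python
from math import sqrt


def clinically_significant_improvement(pre, post, cut_score, sd, reliability,
                                       z=1.96):
    """True if the client moved from the dysfunctional to the functional
    range and the improvement exceeds what measurement error would explain.

    Assumes higher scores mean more severe problems.
    """
    s_diff = sd * sqrt(2 * (1 - reliability))   # standard error of difference
    reliable = (pre - post) / s_diff > z
    return pre > cut_score and post <= cut_score and reliable


# Hypothetical client: 14 before intervention, 4 after, cut score of 5.
print(clinically_significant_improvement(14, 4, cut_score=5, sd=5,
                                         reliability=0.90))
```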

Expected Treatment Response
Session-by-session progress is determined in comparison to normative data from ongoing responses to treatment of thousands of clients
Feedback used in real time to monitor client progress and modify services as needed to reduce treatment failures and increase overall effectiveness

Global Rating
Single rating based on a rater’s integration of information about numerous factors (e.g., global rating of change, improvement, or social functioning)

Single-Item Global Standardized Scales
Global Assessment of Functioning (GAF)
Children’s Global Assessment Schedule (CGAS)
Social and Occupational Functioning Assessment Scale (SOFAS)
Global Assessment of Relational Functioning (GARF)

Potential Advantages of Standardized Scales
Pretested for reliability and validity
Structured, so information is less likely to be missed
Can be used to compare individual functioning to normative group functioning
Can be efficient and simple to use

Cautions in the Use of Standardized Scales
May not measure the concept suggested by the scale name
Different measures of the same concept may not be equivalent
Sometimes limited information about reliability and validity
Concepts as measured may not be completely relevant to individual clients

Resources
Compendiums of measures
Web measurement resources (see Appendix B)