Making Social Work Count Lecture 4 An ESRC Curriculum Innovation and Researcher Development Initiative

What is being studied? Approaches to measuring variables

Assessment and judgment
Social workers have to assess all the time:
– Is there a problem or need here?
– What is the risk of things getting worse?
– Have I made a difference?
Researchers carry out similar tasks. This lecture considers the key issue of developing meaningful measurements for use in quantitative research. Many of the issues are also relevant to the more general task of "assessment".

Quantitative and qualitative
All research involves simplification – the question is whether we know what is gained and lost by simplification.
Qualitative studies tend to focus on meaning – a common strategy is identifying themes of relevance.
Quantitative studies convert issues to numbers. This allows:
– certain types of important description (e.g. how many people have this problem?)
– and – crucially – comparison (e.g. are things getting better? Does one group have more problems?)

Quantitative and qualitative
Quantitative research: this session focuses on quantitative research. It identifies key considerations in thinking about the quality of a quantitative study:
– Reliability
– Validity
Qualitative research: some of these considerations can also be applied to qualitative research. However, qualitative studies also have their own criteria for assessing good research.

Learning outcomes
– Understand what a variable is
– Appreciate different types of variable that can be used in quantitative research
– Understand issues in relation to reliability and validity
– Know what a standardised instrument is
– Have had the opportunity to reflect on implications for practice

Example of children in care
Returning to the idea that care "fails" children: Lecture 3 suggested that comparing children who have left care with the general population does not give a valid comparison sample. Now let's look at outcome measures.

Forrester et al. (2009) review
The literature review focused on studies that looked at child welfare over time for children in care.
Strongest finding: a very poor research base – this is a difficult area to research.
Of 13 studies, almost all suggested:
– Most of the harm occurs before care
– Children tend to do better once in care
– Some harm occurs as children leave care
– Even in good placements children still tend to have problems

But… What “outcomes” were being measured? What outcomes do YOU think should be measured for children in care?

Key points
Deciding on "outcomes" or variables for a study is NOT some value-neutral, technocratic activity.
Key issues to consider:
– WHO is deciding what is to be measured? (e.g. experts? Government? Service users?)
– WHAT is being measured?
– HOW is it being measured? [the focus of this lecture]

Key points
What is measured? For instance, in the studies reviewed by Forrester:
– the most common issue "measured" was behaviour (particularly problem behaviour)
– education was the second most common
– others included physical growth, social relations, etc.
How is it measured? Studies in the review:
– obtained information from social work files and made a researcher "judgment"
– used school tests
– pooled interview and other data and made a researcher "judgment"
– used questionnaires to carers
What are the strengths and weaknesses of each?

Attributes and variables
An attribute is a characteristic of an individual, e.g. height, intelligence, beauty, serenity.
A variable is the operationalisation of an attribute, e.g. metres, IQ score, marks out of 10?, err…
Variables allow attributes to be compared and described.
The focus of this lecture is on how attributes are operationalised.

Variables need to be reliable and valid
Reliability: are the results consistent? E.g. can the results be replicated in different conditions and across different groups?
Validity: does the instrument measure what it claims to measure?

Measures should be both reliable and valid
[Target diagrams: one measure that is reliable but not valid; one that is neither reliable nor valid.]
A measure cannot be valid if it is not reliable…

Standardised Instruments (SIs)
Tools that measure a specific quality or characteristic, e.g. psychological distress.
They let us compare results across groups in different settings, e.g. social workers, families, teachers, police…
SIs need to be high in both reliability and validity.

Reliability – overview
Reliability is the consistency of a measure: a test is considered reliable if we get the same result repeatedly.
Reliability can be estimated in a number of different ways:
– Test-retest reliability: over time
– Inter-rater reliability: between different scorers
– Internal consistency reliability: across items on the same test

Test-retest reliability
Tests the extent to which the test is repeatable and stable over time. For example, the same social workers are given the same questions 2 to 3 weeks later. If the results differ substantially, and there has been no intervention, then we should question the reliability of those questions.
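In practice, test-retest reliability is usually reported as the correlation between the two sets of scores. A minimal Python sketch, using made-up totals for eight respondents:

```python
import numpy as np

# Hypothetical questionnaire totals for the same eight respondents,
# administered twice, three weeks apart
time1 = np.array([12, 18, 9, 22, 15, 11, 19, 14])
time2 = np.array([13, 17, 10, 21, 16, 10, 20, 15])

# Pearson correlation between the two administrations;
# values close to 1 indicate stable (reliable) scores
r = np.corrcoef(time1, time2)[0, 1]
print(f"Test-retest reliability (Pearson r): {r:.2f}")
```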

Inter-rater reliability
Where two or more people rate/score/judge the test. The scores of the judges are compared to find the degree of correlation/consistency between their judgements. If there is a high degree of correlation between the different judgements, the test can be said to be reliable.
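For categorical judgements, agreement is often summarised with Cohen's kappa, which corrects raw agreement for agreement expected by chance. A sketch with hypothetical binary ratings:

```python
import numpy as np

# Two raters independently judge the same ten cases (1 = "at risk")
rater_a = np.array([1, 0, 1, 1, 0, 0, 1, 0, 1, 1])
rater_b = np.array([1, 0, 1, 0, 0, 0, 1, 0, 1, 1])

observed = np.mean(rater_a == rater_b)  # raw agreement: 0.90 here

# Chance agreement: probability both say 1 plus probability both say 0
p_a, p_b = rater_a.mean(), rater_b.mean()
expected = p_a * p_b + (1 - p_a) * (1 - p_b)

kappa = (observed - expected) / (1 - expected)
print(f"Cohen's kappa: {kappa:.2f}")  # 0.80 here
```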

Internal consistency reliability
For example, where two questions within an SI seem to be asking the same thing: if the test is internally consistent, the respondent should give the same answer to both questions. More generally, questions should be linked to one another if they are measuring the same attribute.
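Internal consistency is most often summarised with Cronbach's alpha, which compares the variance of the individual items with the variance of the total score. A small numpy sketch with invented data:

```python
import numpy as np

# Hypothetical responses: five respondents (rows) x four items (columns)
# that are meant to measure the same attribute
items = np.array([
    [3, 2, 3, 3],
    [1, 1, 2, 1],
    [2, 2, 2, 3],
    [4, 3, 4, 4],
    [2, 1, 2, 2],
])

k = items.shape[1]                         # number of items
item_vars = items.var(axis=0, ddof=1)      # variance of each item
total_var = items.sum(axis=1).var(ddof=1)  # variance of the total score

# Cronbach's alpha: k/(k-1) * (1 - sum of item variances / total variance)
alpha = (k / (k - 1)) * (1 - item_vars.sum() / total_var)
print(f"Cronbach's alpha: {alpha:.2f}")  # about 0.96 for these data
```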

Validity
The extent to which a test measures what it claims to measure:
– Construct validity: the degree to which the test measures the construct it is intended to measure – the overarching type of validity
– Predictive validity: the degree of effectiveness with which performance on a test predicts performance in a real-life situation
– Content validity: the extent to which items on the test represent the entire range of possible items the test should cover

Construct validity
The degree to which the test measures what it is intended to measure. This is the overarching concept in validity – all other types of validity are ways of assessing it. As a result, construct validity has many elements:
– Predictive validity (can it predict things? e.g. IQ scores and later test results)
– Criterion validity (does it correctly differentiate? e.g. does a screening instrument identify people who are depressed?)
– Content validity (is the full range of the construct included?)
– And other types…

Predictive validity
Can structured risk assessment tools predict which children will be abused? Are the predictions more accurate than practitioners' decisions?

Predictive validity
Barlow et al. (2013) found that most attempts at prediction had low success, i.e. high numbers of false positives or false negatives. Further research is needed to develop reliable tools that predict abuse or re-abuse – though this is also true for practitioners…
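To see why false positives matter, consider how a screening tool's predictions are usually evaluated. The counts below are invented, purely to illustrate the arithmetic:

```python
# Hypothetical screening results for 1,000 families, compared with
# what actually happened later
true_pos, false_pos = 30, 120   # tool flagged the family
false_neg, true_neg = 10, 840   # tool did not flag the family

sensitivity = true_pos / (true_pos + false_neg)  # cases correctly flagged
specificity = true_neg / (true_neg + false_pos)  # non-cases correctly passed
ppv = true_pos / (true_pos + false_pos)          # flagged families truly at risk

print(f"Sensitivity: {sensitivity:.2f}")        # 0.75
print(f"Specificity: {specificity:.2f}")        # 0.88
print(f"Positive predictive value: {ppv:.2f}")  # 0.20
```

Even with reasonable sensitivity and specificity, only one flagged family in five is a true positive here, because abuse is rare in the population screened – one reason prediction is so hard in this field.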

Content validity
Refers to the extent to which a measure represents the elements of a social construct or trait. For example, a depression scale may lack content validity if it only assesses the affective dimension of depression but fails to take into account the behavioural dimension.
Or: how should "ethnicity" be defined? In practice it is not possible to capture the full range of possible ethnicities – but what level of simplification is "valid"?

General Health Questionnaire (GHQ)
A reliable and valid screening instrument identifying aspects of current mental health (anxiety/depression/social phobia). The self-administered questionnaire asks whether someone has experienced a particular symptom or behaviour recently. Each item is rated on a four-point scale. It is used in many countries and in different languages.

GHQ-12 questions
Questions include: Have you recently…
1. Been able to concentrate on whatever you are doing
2. Lost much sleep over worry
3. Felt that you are playing a useful part in things
4. Felt capable of making decisions about things
5. Felt constantly under strain
6. Felt you couldn't overcome your difficulties
7. Been able to enjoy your normal day to day activities
8. Been able to face up to your problems
9. Been feeling unhappy and depressed
10. Been losing confidence in yourself
11. Been thinking of yourself as a worthless person
12. Been feeling reasonably happy, all things considered

GHQ-12
There are different ways of measuring the risk of psychiatric problems using the data; all show a reasonable link with clinical diagnosis. A common way is to score each item 'yes' or 'no' (depending on the question), with a 'yes' on 4 or more questions indicating risk.
How do social workers do…?
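As a sketch of that common scoring method (the bimodal "0-0-1-1" scoring described by Goldberg & Williams, 1988), assuming responses are coded 0–3 with the two most symptomatic options counting as 'yes':

```python
# Hypothetical GHQ-12 responses, one per item, coded 0-3 so that
# higher values are more symptomatic (the questionnaire words the
# response options so this works for both positive and negative items)
responses = [1, 2, 0, 1, 3, 2, 1, 0, 2, 1, 0, 1]

# Bimodal scoring: the two most symptomatic options count as 1 ('yes')
score = sum(1 for r in responses if r >= 2)
print(f"GHQ-12 score: {score}, above threshold: {score >= 4}")  # 4, True
```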

Clinical scores for social workers and the general population using the GHQ
[Chart comparing GHQ clinical scores for social workers with the general population.]
(Carpenter et al., 2010; ONS, 2010)

How to measure children's emotional and behavioural welfare?
SDQ: a questionnaire designed for carers, children and teachers.
Reliability is tested by comparing ratings of emotional and behavioural welfare across informants and over time.
Validity is tested by:
– seeing whether scores predict "real world" outcomes such as children receiving specialist help, criminal behaviour and exclusion from school
– comparing scores with clinical assessment and with other instruments

Strengths and Difficulties Questionnaire (SDQ)
A brief behavioural screening questionnaire for parents/carers/teachers of 3-16 year olds. It asks about psychological attributes, some positive and others negative, e.g. emotional symptoms, conduct, hyperactivity, peer relationships, prosocial behaviour.

SDQ questions
25 questions composed of five scales, with five questions in each scale. E.g. the five questions in the Emotional Symptoms Scale:
1. I get a lot of headaches
2. I worry a lot
3. I am often unhappy
4. I am nervous in new situations
5. I have many fears
Responses: Not true / Somewhat true / Certainly true
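A sketch of how such a scale is scored, following the usual SDQ coding (Not true = 0, Somewhat true = 1, Certainly true = 2); note that the real instrument reverse-scores a few items, which this sketch ignores:

```python
# Hypothetical answers to the five Emotional Symptoms items
CODES = {"Not true": 0, "Somewhat true": 1, "Certainly true": 2}
answers = ["Somewhat true", "Certainly true", "Not true",
           "Somewhat true", "Somewhat true"]

# A scale score is the sum of its five items, so it runs from 0 to 10
emotional = sum(CODES[a] for a in answers)
print(f"Emotional symptoms scale: {emotional}/10")  # 5 here
```

The SDQ's "total difficulties" score is then the sum of four of the five scales, leaving out prosocial behaviour.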

Why does this matter?
It is worth considering common social work research methods such as coming to a "researcher judgment" – how reliable? How valid?
More importantly – what about your practice? What is a better way of judging whether a child has emotional or behavioural problems, or an adult is at risk of psychological problems – your judgment or a standardised instrument?
If you want to evaluate whether you are making a difference – what role might a standardised instrument have?

Learning outcomes – do you…?
– Understand what a variable is
– Appreciate different types of variable that can be used in quantitative research
– Understand issues in relation to reliability and validity
– Know what a standardised instrument is
– Have had the opportunity to reflect on implications for practice

References
Goldberg, D. & Williams, P. (1988) A User's Guide to the General Health Questionnaire. Slough: NFER-Nelson.
Goodman, R. (1997) The Strengths and Difficulties Questionnaire: A Research Note. Journal of Child Psychology and Psychiatry, 38, 581-586.
Barlow, J., Fisher, J.D. & Jones, D. (2013) Systematic Review of Models for Analysing Significant Harm. London: Department for Education (DFE-RR199).