Risk Evaluation: Maximizing Risk Accuracy Presentation to Special Commission to Reduce the Recidivism of Sex Offenders 10/8/2014.

1 Risk Evaluation: Maximizing Risk Accuracy Presentation to Special Commission to Reduce the Recidivism of Sex Offenders 10/8/2014

2 Overview of Presentation: Brief history of risk assessment and the different kinds of assessment that have been developed; Indication of where MA SORB Classification fits in this historical context, and in the context of current state strategies; Summary of the criteria for how one should evaluate risk instruments; Quick overview of the recent empirical evaluations of risk instruments; Suggestions of two strategies for improving classification in MA.

3 BRIEF HISTORY OF RISK ASSESSMENT

4 Brief History (Bonta, 1996)
First generation – Unstructured clinical judgment, including structured clinical guidelines (SCG).
Second generation – Actuarial risk scales comprising static, historical factors. Static factors: fixed or historical factors that cannot be changed (such as age at first offense).
Third generation – The assessment of “criminogenic needs” or dynamic risk factors. Dynamic factors: potentially changeable factors, both stable, but potentially changeable risk traits, and acute, rapidly changing factors.

5 Brief History: First Generation. Characteristics of Unstructured Clinical Judgments – No items are specified for considering risk level; The method for combining items is not specified. (Hanson & Morton-Bourgon, 2009)

6 Brief History: First Generation. Characteristics of SCGs – They identify items to use in the decision and typically provide numerical values for each item; Although they also usually provide a method for combining the items into a total score, they do not specify a priori how the clinician should integrate the items; No tables linking the summary scores to recidivism rates. (Hanson & Morton-Bourgon, 2009)

7 Brief History: Second Generation. Requirements of Empirical Actuarials – Provide specific items to make the decision with quantitative anchors, which are derived from empirical investigation; Method for combining the items into an overall score is specified; Tables linking the summary scores to recidivism rates are provided. (Hanson & Morton-Bourgon, 2009)

8 Brief History: Second Generation. Requirements of Mechanical Actuarials – They provide specific items for the decision with numeric values for each item, which are derived from a review of literature and theory; Method for combining the items into an overall score is specified; Tables linking the summary scores to recidivism rates are not provided. (Hanson & Morton-Bourgon, 2009)

9 Brief History: Second Generation. Additional condition: Adjusted Actuarials – Use appropriate actuarials (empirical or mechanical); The clinician adjusts the score (and the recommendation) using factors external to the actuarial. (Hanson & Morton-Bourgon, 2009)

10 MA SORB CLASSIFICATION FACTORS Where Does It Fit?

11 MA SORB Classification Factors: Where Does It Fit? Somewhere between an unstructured judgment and an SCG. [Figure: Predictive Validity continuum, labeled in order: AWA crime; Clinical Judgment; SCG; Empirical Actuarial; Empirical Actuarial + Dynamic; with MA SORB marked on the continuum.]

12 MA SORB Classification Factors: Why Does It Fit Here? Somewhere between an unstructured judgment and an SCG – It specifies a set of factors to be considered, but it does not provide any quantification of these factors (i.e., numeric item scores). In many items it does not provide clear specification of where the cutoff for “presence” or “absence” of a factor would be. Thus, it provides limited guidance both on the presence of items and on the combining of items.

13 MA SORB Classification Factors: Example of SCG (SVR-20). Item 3. Psychopathy: Code this by reference to the PCL-R. Code PCL-R scores of 30 or above as “Y,” scores of 21-29 as “?,” and scores of 20 or lower as “N.” Y = 2; ? = 1; N = 0.
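The coding rule on this slide is explicit enough to write down as a function. The sketch below only illustrates that rule: the function name `code_psychopathy_item` is hypothetical, and it assumes whole-number PCL-R totals in the instrument's 0 to 40 range.

```python
def code_psychopathy_item(pcl_r_score: int) -> tuple[str, int]:
    """Return the (code, numeric value) pair for the psychopathy item.

    Cutoffs follow the slide: 30 or above -> "Y" (2), 21-29 -> "?" (1),
    20 or lower -> "N" (0). The 0-40 range check is an added assumption.
    """
    if not 0 <= pcl_r_score <= 40:
        raise ValueError("PCL-R total expected in the range 0-40")
    if pcl_r_score >= 30:
        return ("Y", 2)
    if pcl_r_score >= 21:
        return ("?", 1)
    return ("N", 0)
```

Because the cutoffs are written out, two raters given the same PCL-R total must arrive at the same item score, which is the property the SORB factors are said to lack.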

14 MA SORB Classification Factors: Example of SORB Factors. Item 2. Repetitive and Compulsive Behavior. Unresolved questions: Charges, convictions, self-report? Includes both impulsive and compulsive behavior? Could be either. No score; vague criteria and no cutoff.

15 MA SORB Classification Factors. So the MA SORB criteria neither provide a metric for each item, so it is not known which items an expert is depending on and no item improvement can be attempted, nor specify the cutoff criteria necessary for items to be judged present or absent by two raters, so no determination of agreement or reliability can be made. Moreover, there are no rules on how to combine or weight items in reaching a decision.

16 Where Does It Fit? Relative to other states?

17 Identified “Tiering”

18 De Facto “Tiering”

19 Criteria for De Facto “Tiering” 6% State Actuarial

20 Criteria for De Facto “Tiering” 6% State Actuarial

21 MN Leveling Criteria. Actuarial Leveling Criteria. Clinical Judgment Trumps 6%: Hx of gratuitous violence; Unsuccessful treatment; Predatory offense behavior; Supervision failures.

22 HOW DO WE EVALUATE RISK TOOLS? Evaluating Reliability and Validity

23 Reliability HOW DO WE EVALUATE RISK TOOLS?

24 Reliability is: Accuracy; Freedom from variable error; Consistency (across raters; across items; across different measures of the same construct; across time).

25 Reliability Interrater

26 Interrater Reliability: Agreement between Rater 1 (R1) and Rater 2 (R2) = High Reliability; Disagreement = Low Reliability.
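One common way to put a number on agreement between two raters is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The slides do not name a statistic, so kappa is an assumed choice here, and the function and variable names are illustrative only.

```python
from collections import Counter


def cohens_kappa(rater1, rater2):
    """Cohen's kappa for two raters coding the same cases (e.g. "Y"/"?"/"N")."""
    if len(rater1) != len(rater2) or len(rater1) == 0:
        raise ValueError("both raters must code the same non-empty set of cases")
    n = len(rater1)
    # Proportion of cases on which the two raters gave the same code.
    observed = sum(a == b for a, b in zip(rater1, rater2)) / n
    # Agreement expected by chance, from each rater's own category frequencies.
    counts1, counts2 = Counter(rater1), Counter(rater2)
    expected = sum(counts1[c] * counts2[c] for c in counts1.keys() | counts2.keys()) / n**2
    if expected == 1.0:
        # Both raters used one identical category throughout; kappa is formally
        # undefined, and returning 1.0 is a convention chosen for this sketch.
        return 1.0
    return (observed - expected) / (1.0 - expected)
```

A kappa near 1 corresponds to the "high reliability" case on the slide, and a kappa near 0 means the raters agree no more often than chance would predict.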

27 Reliability Interrater Internal Consistency

28 Agreement or Correlation Among Items = High Reliability

29 Advantages of Quantification: Allows Reliability Checks. Allows one to calculate various forms of reliability – Item reliability; Reliability of subscales (e.g., sexual deviance, criminality, etc.); Internal consistency of items in the instrument. Thus, quantification allows us to restructure items and their anchors to improve reliability. Gives us the power of being on the same page.
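Once every item has a numeric score, internal consistency can be computed directly. A minimal sketch using Cronbach's alpha follows; the slides do not name a coefficient, so alpha is an assumption, and `cronbach_alpha` and its input layout are hypothetical.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of rows, one row of item scores per person."""
    n_items = len(scores[0])
    if n_items < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two persons")
    if any(len(row) != n_items for row in scores):
        raise ValueError("every row must contain a score for every item")
    # Sample variance of each item across persons.
    item_variances = [variance([row[i] for row in scores]) for i in range(n_items)]
    # Sample variance of the total (summed) score across persons.
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)
```

The same routine can be run on a subscale by passing only that subscale's item columns.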

30 Reliability Results: SCGs and Actuarials. Most popular SCGs and actuarials assessed in the comparative literature have acceptable reliability. Unstructured judgments have poor reliability. The reliability of the MA SORB Classification Factors has not been and cannot be assessed.

31 HOW DO WE EVALUATE RISK TOOLS? Evaluating Reliability and Validity

32 HOW DO WE EVALUATE RISK TOOLS? Validity

33

34 Validity Answers the Question: Does a test measure what it is supposed to measure? What does a test measure? What can one do with the test? What does a test score predict?

35 Validity Answers the Question: Does a test measure what it is supposed to measure? What does a test measure? What can one do with the test? What does a test score predict?

36 Predicting Sexual Recidivism (Hanson & Morton-Bourgon, 2009)
Instrument Type: d (95% CI)
Empirical Actuarial: .67 (.63 - .72)
Mechanical Actuarial: .66 (.58 - .74)
SCG: .46 (.29 - .62)
Unstructured Judgment: .42 (.32 - .51)
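This slide reports effect sizes as d, while a later slide reports AUCs. Under the textbook assumption that scores are normally distributed with equal variance among recidivists and non-recidivists, the two are linked by AUC = Φ(d/√2). The sketch below applies that conversion to the d values above. It is an illustration under that assumption, not a calculation taken from the cited meta-analysis, and `d_to_auc` is a hypothetical name.

```python
import math


def d_to_auc(d: float) -> float:
    """Convert Cohen's d to AUC, assuming equal-variance normal score distributions."""
    # Phi(x) = 0.5 * (1 + erf(x / sqrt(2))), so Phi(d / sqrt(2)) = 0.5 * (1 + erf(d / 2)).
    return 0.5 * (1.0 + math.erf(d / 2.0))


# d values as listed on the slide.
effect_sizes = {
    "Empirical Actuarial": 0.67,
    "Mechanical Actuarial": 0.66,
    "SCG": 0.46,
    "Unstructured Judgment": 0.42,
}
approx_auc = {name: d_to_auc(d) for name, d in effect_sizes.items()}
```

A d of zero maps to an AUC of .50, which is chance-level discrimination.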

37 Predicting Sexual Recidivism (Hanson & Morton-Bourgon, 2009). Overall, controlling for a large number of study variables, Empirical and Mechanical actuarials were significantly better predictors of recidivism; SCGs using clinical judgment and SCGs that calculate total scores did not differ. In all studies examined, clinicians’ adjustment of actuarial scores consistently lowered predictive accuracy.

38 Why Is Clinical Judgment Inferior? Across multiple areas of prediction, mechanical actuarial prediction (statistical prediction rules [SPRs]) has been shown to be superior to clinical judgment. A recent meta-analysis summarizes the results of years of research (Grove et al., 2000).

39 All studies published in English from the 1920s to the mid-1990s. 136 studies on the prediction of health-related phenomena or human behavior. (Grove et al., 2000)

40

41 Why Is Clinical Judgment Inferior? A large body of research has documented the reasons for the cognitive errors that clinicians make. For instance, clinicians are great at making observations and rating items, but they, like all humans, are worse than a formula at adding the items together and combining them.

42 Advantages of Quantification: Allows Validity Checks. Allows one to use various strategies for improving validity of a measure – Assess item correlation with outcome; Adjust item cutoffs to maximize prediction; Assess the validity of subscales (e.g., sexual deviance, criminality, etc.); Optimize item weights for decision-making and predicting. Thus, one can restructure items, their anchors, cutoffs, and combinations to improve validity.

43 STRATEGIES FOR IMPROVING MA SORB CLASSIFICATION Examples from Two States New Jersey Oregon

44 New Jersey: State Generated Actuarial

45 RRAS Items. Scoring: Highest possible total score = 111. Low Range: 0 – 36; Moderate Range: 37 – 73; High Range: 74 – 111.
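Because levels are tied to specific total scores, the mapping can be stated as a simple lookup. The sketch below encodes only the three ranges on this slide; the function name `rras_range` is hypothetical, and item-level scoring is not shown because the slide does not list the items.

```python
def rras_range(total_score: int) -> str:
    """Map an RRAS total score to its range, using the cutoffs on the slide."""
    if not 0 <= total_score <= 111:
        raise ValueError("RRAS total score expected in the range 0-111")
    if total_score <= 36:
        return "Low"
    if total_score <= 73:
        return "Moderate"
    return "High"
```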

46 New Jersey: State Generated Actuarial. Advantages: Focuses on the current empirical literature to generate items and a scale. Each item is quantified and anchored cutoffs are provided. Method of combining items to generate a score is specified. Levels are tied to specific scores.

47 New Jersey: State Generated Actuarial. Disadvantages: Reliability is an iterative process that takes time to develop. Base rates of scores not initially available. No follow-up data are available. No reoffense probabilities available until prospective study completed.

48 Re-offense Rates by State Risk Levels. MN & NJ: 3 Level System; FL & SC: Offender / Predator. (χ²(1) = 3.37, p = .066) (AUCs = .493 - .569, ns) (Zgoba et al., 2014)
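For readers unfamiliar with the AUC values quoted here: the AUC is the probability that a randomly chosen recidivist received a higher score (or level) than a randomly chosen non-recidivist, with ties counted as half. A minimal sketch of that definition follows; the function name is hypothetical, and it is not the procedure used in the cited study.

```python
def auc_from_scores(recidivist_scores, non_recidivist_scores):
    """AUC as the share of recidivist/non-recidivist pairs ranked correctly."""
    if not recidivist_scores or not non_recidivist_scores:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for r in recidivist_scores:
        for n in non_recidivist_scores:
            if r > n:
                wins += 1.0   # recidivist ranked higher: correct ordering
            elif r == n:
                wins += 0.5   # tie: counted as half
    return wins / (len(recidivist_scores) * len(non_recidivist_scores))
```

An AUC of .50 means the levels order recidivists and non-recidivists no better than chance, which is why values in the .49 to .57 range are reported as non-significant.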

49 STRATEGIES FOR IMPROVING MA SORB CLASSIFICATION Examples from Two States New Jersey Oregon

50

51 Oregon: Standard Actuarial

52 The Static-99R is the chosen risk assessment scale for Oregon, with the following level cutoffs recommended:
Level I: Score -3 to 3 (Low)
Level II: Score 4 to 5 (Moderate)
Level III: Score of 6+
Override and downward departure factors are taken into consideration.
Aggravating factors that result in override to a higher level:
1. Deviant Sexual Preference (by STABLE-2007 definition);
2. Emotional Identification with Children (STABLE-2007 definition);
3. High level of psychopathic traits as identified by validated assessment;
4. Individual articulates to officials/treatment professional an unwillingness to control future sexually assaultive behaviors and/or plans to reoffend violently or sexually.
Mitigating factors that result in downward departure to lower level:
1. Debilitating illness and/or permanent incapacitation;
2. 10+ years clean record within the community.
Assessments for aggravating and mitigating factors must be completed by a trained professional.
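The leveling rule on this slide can be sketched as a function of the Static-99R total plus the two kinds of departure factors. The slide does not say how many levels an override or departure moves, or what happens when both apply, so the sketch assumes one level each way, with the two cancelling out. Those assumptions, and the name `oregon_level`, are illustrative and not part of the Oregon scheme as presented.

```python
def oregon_level(static99r_score: int, aggravating: bool = False,
                 mitigating: bool = False) -> int:
    """Return level 1, 2, or 3 from a Static-99R total and departure flags."""
    # Static-99R totals range from -3 to 12.
    if not -3 <= static99r_score <= 12:
        raise ValueError("Static-99R total expected in the range -3 to 12")
    # Cutoffs from the slide: -3 to 3 -> I, 4 to 5 -> II, 6+ -> III.
    if static99r_score <= 3:
        level = 1
    elif static99r_score <= 5:
        level = 2
    else:
        level = 3
    # ASSUMPTION: each kind of factor shifts the level by exactly one step.
    if aggravating:
        level += 1
    if mitigating:
        level -= 1
    # Keep the result within the three defined levels.
    return max(1, min(3, level))
```

Writing the adjustment as an explicit step makes its contribution quantitatively specified, which is what the Strategy 2 slide later asks of any aggravating or mitigating criterion.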

53 Static-99R Items

54 Oregon: Standard Actuarial. Advantages: Focuses on the current empirical literature to generate items and a scale. Each item is quantified and anchored cutoffs are provided. Method of combining items to generate a score is specified. Levels are tied to specific scores.

55 Oregon: Standard Actuarial. Advantages: Extensive follow-up data have already been gathered. There are existing estimates of the probabilities of recidivism for score levels.

56 Oregon: Standard Actuarial. Disadvantages: Actuarial not made specifically for the local state environment. Tied to a standardized instrument that you are less likely to assess for continuous improvement.

57 APPLYING THE TWO STRATEGIES TO THE MA SORB CRITERIA

58 Improving the Current MA SORB Criteria. General Issues: Creation of separate adult and juvenile actuarials; Creation of separate male and female actuarials; Dealing with the issues of Mental Illness and Intellectual Disabilities.

59 Strategy 1: NJ Solution. Fix the Current MA SORB Criteria for Adult Males: Divide instrument into static and dynamic item subsets; Use recent meta-analytic literature to purge items that are not likely to have predictive validity;

60 Examples of Poor Predictors: Released from civil commitment vs. not committed (Knight & Thornton, 2007); Maximum term of incarceration; Current home situation (?vague and unspecified?); Physical condition; Documentation from a licensed mental health professional specifically indicating that offender poses no risk to reoffend;

61 Examples of Poor Predictors: Recent Threats; Supplemental material; Victim impact statement.

62 Strategy 1: NJ Solution. Fix the Current MA SORB Criteria for Adult Males: Divide instrument into static and dynamic item subsets; Use recent meta-analytic literature to purge items that are not likely to have predictive validity; Transform remaining items into a quantifiable format with clear cutoffs; Do a small study on a subset of offenders to establish reliability. Add items to capture predictive domains not adequately sampled?

63 Strategy 1: NJ Solution. Fix the Current MA SORB Criteria for Adult Males: Adjust items with the reliability data; Do a preliminary check on the predictive validity of revised items using existing databases; Revise items as a function of predictive study and establish preliminary leveling cutoffs; Use the revised instrument, requiring item and total scores of raters for future validation studies.

64 Strategy 1: NJ Solution. Fix the Current MA SORB Criteria for Adult Males: Follow all offenders and prospectively assess how well the instrument predicts recidivism; Continually adjust instrument to improve predictive accuracy.

65 Strategy 2: OR Solution. Use the Static-99R to determine leveling; Any “aggravating” or “mitigating” criteria should be operationally defined (e.g., STABLE-2007; PCL-R), and its adjustment contribution should be quantitatively specified. SORB has been doing Static-99Rs for a while, so use the ones that they have done. Have a team of trained graduate student raters (cheap and accurate) do Static-99Rs on remaining offenders.

66 ESTIMATING LEVEL 3 FREQUENCY

67 MTC Committed

68 MTC Not Committed

69 STATIC-99R Scores (n = 1312): 21.2% (Zgoba et al., 2014)

70 MA % RSO Level 3 (2010) (as cited in Harris, Levenson, & Ackerman, 2012)

71 Strategy 2: OR Solution. Moving forward, use existing dynamic instruments to create profiles for treatment and management of offenders and for future adjustments.

