Risk scoring
Allan Wardhaugh

Why bother?
Comparison of performance between units
Used in RCTs to adjust for case-mix

Standardised Mortality Ratio
Measured mortality
Predicted mortality – risk adjustment tool
SMR = Measured / Predicted
– SMR > 1: performing poorly
– SMR < 1: performing well
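
As a minimal sketch of the ratio above, the expected number of deaths can be taken as the sum of each patient's predicted probability of death from the risk adjustment tool. The function name and the example unit below are hypothetical and only illustrate the arithmetic.

```python
def standardised_mortality_ratio(observed_deaths, predicted_risks):
    """SMR = observed deaths / expected deaths.

    Expected deaths are assumed to be the sum of the per-patient
    predicted probabilities of death (each between 0 and 1).
    """
    expected_deaths = sum(predicted_risks)
    if expected_deaths <= 0:
        raise ValueError("expected deaths must be positive")
    return observed_deaths / expected_deaths


# Hypothetical unit: 200 admissions, each with a predicted risk of 5%,
# and 8 observed deaths (so 10 deaths were expected).
risks = [0.05] * 200
smr = standardised_mortality_ratio(8, risks)
print(f"SMR = {smr:.2f}")
```

An SMR below 1 here means fewer deaths were observed than the tool predicted for this case-mix.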

Making a risk adjustment tool

Regression statistics
Target variable – ‘dependent variable’ (DV)
Predictors – ‘independent variables’ (IV)
Regression statistics use the association between variables to predict one (DV) from another (IV).
Simplest form: $y = b_0 + b_1 x$
where $y$ = predicted value, $b_0$ = regression constant, $b_1$ = regression coefficient
Multiple regression: $y = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_n x_n$
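
The regression equation can be sketched in a few lines: the prediction is the constant plus each coefficient multiplied by its predictor. The function name and the coefficient values below are hypothetical, chosen only to show the calculation.

```python
def predict(b0, coefficients, predictors):
    """Multiple linear regression prediction: y = b0 + b1*x1 + ... + bn*xn."""
    if len(coefficients) != len(predictors):
        raise ValueError("need exactly one coefficient per predictor")
    return b0 + sum(b * x for b, x in zip(coefficients, predictors))


# Hypothetical coefficients, for illustration only.
y_simple = predict(1.0, [2.0], [3.0])             # 1 + 2*3
y_multi = predict(1.0, [2.0, -0.5], [3.0, 4.0])   # 1 + 2*3 - 0.5*4
print(y_simple, y_multi)
```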

Regression statistics - logistic
Linear multiple regression
– DV and IV quantitative
For non-quantitative DV (e.g. dead/alive), logistic regression is used
– Relationship with IV may be non-linear
For each IV, odds are calculated for likelihood of having DV
Odds very asymmetrical
– very small number (0 – 1) if event unlikely
– very large if event likely (>1 – ∞)
Rectified by using natural log of odds – called logit – makes it a linear function
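
The asymmetry of odds, and how the log transform removes it, can be seen numerically. This is an illustrative sketch; the helper names are hypothetical.

```python
import math


def odds(p):
    """Odds of an event with probability p (0 < p < 1)."""
    return p / (1.0 - p)


def logit(p):
    """Natural log of the odds."""
    return math.log(odds(p))


# Unlikely events squeeze into odds between 0 and 1; likely events
# spread over 1 to infinity. The logit is symmetric around p = 0.5.
for p in (0.01, 0.1, 0.5, 0.9, 0.99):
    print(f"p={p:<5} odds={odds(p):8.3f} logit={logit(p):+.3f}")
```

Complementary probabilities such as 0.1 and 0.9 give odds of about 0.11 and 9, but logits of equal size and opposite sign.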

Regression statistics - logit
$\text{logit} = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_n x_n$
Probability = odds / (1 + odds)
$\text{logit} = \ln(\text{odds})$
$p = e^{\text{logit}} / (1 + e^{\text{logit}})$
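
Converting a logit back to a probability is the step every score in this talk relies on. A minimal sketch follows; the function name is hypothetical, and the two branches are just a common way to avoid overflow for large positive or negative logits.

```python
import math


def probability_from_logit(logit_value):
    """p = e^logit / (1 + e^logit), written to avoid overflow."""
    if logit_value >= 0:
        # Algebraically identical form: 1 / (1 + e^-logit)
        return 1.0 / (1.0 + math.exp(-logit_value))
    e = math.exp(logit_value)
    return e / (1.0 + e)


# A logit of 0 means odds of 1, i.e. a probability of 0.5.
print(probability_from_logit(0.0))
# Odds of 3 (logit = ln 3) correspond to a probability of 3 / (1 + 3).
print(probability_from_logit(math.log(3)))
```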

PRISM
Pediatric Risk of Mortality Score
14 physiological variables
– Worst measurement in first 24 hours
– Now on PRISM III – relies on scores in first 12 or 24 hours
Probability of PICU death $= e^R / (1 + e^R)$
where $R = 0.207 \times \text{PRISM} - 0.005 \times \text{age (months)} - 0.433 \times \text{operative status} - 4.782$
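
The PRISM equation can be sketched directly from the coefficients on the slide. The function name is hypothetical, and the coding of operative status (1 for a postoperative admission, 0 otherwise) is an assumption, since the slide does not spell it out.

```python
import math


def prism_mortality_risk(prism_score, age_months, postoperative):
    """Probability of PICU death from the PRISM equation quoted above.

    postoperative: assumed coding, True/1 if postoperative, False/0 if not.
    """
    operative_status = 1 if postoperative else 0
    r = (0.207 * prism_score
         - 0.005 * age_months
         - 0.433 * operative_status
         - 4.782)
    return math.exp(r) / (1.0 + math.exp(r))


# Hypothetical patient: PRISM score 20, 24 months old, non-operative.
risk = prism_mortality_risk(20, 24, postoperative=False)
print(f"Predicted risk of death: {100 * risk:.1f}%")
```

With these coefficients a higher PRISM score raises the predicted risk, while older age and postoperative status lower it slightly.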

PRISM – example
60 month old non-surgical patient

PRISM Score    Mortality Risk (%)
3              1.6
6              2.7
9              5
12             8.9
15             15.3
18             25.2
21             38.6
24             46.1
27             68.5
30             80.2

PRISM - disadvantages
Data collection cumbersome (14 variables over a 24 hour period)
May diagnose death rather than predict it (40% of deaths occur in first 24 hours)
Score may not allow comparison between units – patients poorly managed in first 24 hours will develop high PRISM score, so disease severity will appear to be greater

PRISM III

PIM – Paediatric Index of Mortality – initial cohorts
678 consecutive admissions, PICU RCHM, 1988
814 consecutive admissions, RCHM, 1990
1412 consecutive admissions, RCHM, 1994–5

PIM – identifying variables
Data collected for admission (for most) and first 24 hours
34 Physiological Stability Index measurements
MAP, PIP, PEEP, and others
Worst value in first 24 hours for all

PIM – derivation of model
All PRISM data collected plus additional information
Univariate analysis carried out on all factors to test for association with mortality (chi-squared for dichotomous variables, Copas p by x plots for continuous variables)
Factors not associated (p > 0.1) excluded from further analysis
Logistic regression analysis used to derive preliminary model

PIM – testing the model
Learning and test cohorts
– 1994–96: 5695 patients in 8 PICUs (Australia, Birmingham)
Enough patients in each unit to include 20 deaths
Learning sample data analysed to calculate regression coefficients
Model then tested on test sample, and examined for goodness of fit
Regression coefficients re-estimated using all 8 units for final model
Risk of death assigned to 5 groups: <1%, 1–4%, 5–14%, 15–29% and ≥30%
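
One way such risk groups might be used to look at goodness of fit is to compare observed and expected deaths within each band. The sketch below is an assumption about how that comparison could be tabulated, not the authors' procedure. The band edges are read as 1%, 5%, 15% and 30%, and all names and the sample data are hypothetical.

```python
from bisect import bisect_right

# Assumed band edges: <1%, 1-4%, 5-14%, 15-29%, >=30%.
BOUNDS = [0.01, 0.05, 0.15, 0.30]
LABELS = ["<1%", "1-4%", "5-14%", "15-29%", ">=30%"]


def risk_group(predicted_risk):
    """Return the label of the band a predicted risk falls into."""
    return LABELS[bisect_right(BOUNDS, predicted_risk)]


def observed_vs_expected(patients):
    """patients: iterable of (predicted_risk, died) pairs.

    Returns {label: (observed_deaths, expected_deaths)} per band, where
    expected deaths are the sum of predicted risks in that band.
    """
    table = {label: [0, 0.0] for label in LABELS}
    for predicted_risk, died in patients:
        row = table[risk_group(predicted_risk)]
        row[0] += 1 if died else 0
        row[1] += predicted_risk
    return {label: tuple(row) for label, row in table.items()}


# Hypothetical test sample.
sample = [(0.005, False), (0.02, False), (0.10, True), (0.20, False), (0.50, True)]
for label, (obs, exp) in observed_vs_expected(sample).items():
    print(f"{label:>7}: observed {obs}, expected {exp:.2f}")
```

A well calibrated model shows observed and expected deaths that are close in every band, not just overall.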

PIM - results

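PIM – final equations
Probability of death $= e^{\text{logit}} / (1 + e^{\text{logit}})$
$\text{logit} = (2.357 \times \text{pupils}) + (1.826 \times \text{specified diagnosis}) + (-1.552 \times \text{elective admission}) + (1.342 \times \text{mechanical ventilation}) + (0.021 \times (\text{SBP} - 120)) + (0.071 \times \text{base excess}) + (0.415 \times (100 \times \text{FiO}_2 / \text{PaO}_2)) - 4.873$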
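
The PIM equation can be sketched as below, using the coefficients on the slide. Several points are assumptions and should be checked against the published model before any real use: the yes/no variables are coded 1/0, FiO2 is a fraction, PaO2 and SBP are in mmHg, and the SBP and base excess terms are taken as absolute values (the slide text shows them without absolute-value bars). Function and parameter names are hypothetical.

```python
import math


def pim_logit(pupils, specified_diagnosis, elective, ventilated,
              sbp_mmhg, base_excess, fio2, pao2_mmhg):
    """PIM logit from the slide's coefficients.

    Binary inputs are assumed to be coded 1 (yes) or 0 (no).
    ASSUMPTION: SBP deviation from 120 and base excess are used as
    absolute values; the slide does not show this explicitly.
    """
    return (2.357 * pupils
            + 1.826 * specified_diagnosis
            - 1.552 * elective
            + 1.342 * ventilated
            + 0.021 * abs(sbp_mmhg - 120)
            + 0.071 * abs(base_excess)
            + 0.415 * (100 * fio2 / pao2_mmhg)
            - 4.873)


def pim_probability(logit_value):
    """Probability of death = e^logit / (1 + e^logit)."""
    return math.exp(logit_value) / (1.0 + math.exp(logit_value))


# Hypothetical low-risk patient: no risk flags, SBP 120, base excess 0,
# breathing air (FiO2 0.21) with a PaO2 of 100 mmHg.
low_risk_logit = pim_logit(0, 0, 0, 0, 120, 0, 0.21, 100)
print(f"Predicted risk: {100 * pim_probability(low_risk_logit):.2f}%")
```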

UHW PICU PIM

PIM and PRISM compared
Variables used by PIM that are not used by PRISM:
– presence of a specified diagnosis
– use of mechanical ventilation
– plasma base excess
Variables used by PRISM that are not used by PIM:
– diastolic blood pressure, heart rate
– respiratory rate, pCO2
– the Glasgow Coma Score (three separate variables)
– prothrombin time, serum bilirubin, serum potassium, serum calcium, blood glucose and plasma bicarbonate

PRISM vs PIM
PRISM predicted 66% more deaths in this sample
Score altered by treatment in the first 24 hours
May diagnose rather than predict death
PRISM III data requires 96 measured variables
License required
Note that neither is adequate for individual case prediction – apply to populations only

PIM - recalibration
PICU outcomes change with time
Referral patterns change with time
Attitudes to withdrawing and limiting care may change with time

PIM 2
14 PICUs
– 8 Australia
– 4 UK
– 2 NZ
20 787 patients, 1997–1998
Units randomly assigned to be learning sample or testing sample for new model

PIM 2
PIM applied to new population (all units)
– Observed to expected deaths
Poorly performing variables altered to make prediction better
Re-tested by forward and backward logistic regression to produce new model
New model applied to learning sample – coefficients adjusted and applied to testing sample

Calibration findings
Specific diagnosis
– Respiratory illness O:E 160:212
– Non-cardiac post-op O:E 48:82
293 coded diagnostic categories examined
– In-hospital cardiac arrest associated with increased risk of death
– Asthma, bronchiolitis, croup, obstructive sleep apnoea, DKA associated with reduced risk
New ‘high risk’ and ‘low risk’ categories introduced
Post-op subdivided into with or without CBP
IQ <35 omitted (difficult to code reliably)

SMR
Australia and New Zealand: SMR 0.84 (0.76–0.92)
UK: SMR 0.89 (0.77–1.00)

New coefficients

United Kingdom Paediatric Intensive Care Outcome Study UK PICOS (phase I)

PIM mortality ratio (observed/expected unit deaths) by unit, calculated using the UK PICOS recalibration of PIM. Upper and lower control limits represent a 99.9% confidence interval around a mortality ratio of 1, based on the UK PICOS overall mortality of 6.2%.
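
The caption does not say how the control limits were calculated. Purely as an illustration of why such limits narrow as unit size grows, the sketch below uses a normal approximation to the binomial distribution; this method, the function name and the unit sizes are all assumptions rather than the study's own procedure.

```python
import math
from statistics import NormalDist


def control_limits(n_admissions, overall_mortality=0.062, confidence=0.999):
    """Approximate control limits around a mortality ratio of 1.

    ASSUMPTION: deaths in a unit follow a binomial distribution with the
    overall mortality rate, approximated by a normal distribution.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    expected_deaths = n_admissions * overall_mortality
    sd_deaths = math.sqrt(n_admissions * overall_mortality * (1 - overall_mortality))
    half_width = z * sd_deaths / expected_deaths
    return max(0.0, 1 - half_width), 1 + half_width


# Hypothetical unit sizes: small units get much wider limits.
for n in (200, 1000, 5000):
    lower, upper = control_limits(n)
    print(f"{n:>5} admissions: {lower:.2f} to {upper:.2f}")
```

A unit whose mortality ratio falls inside the limits for its size is not distinguishable from the overall average at this confidence level.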

Phase I outcome
PRISM III 24 hour score re-calibrated for UK
Performance of PIM 2 and PRISM III very similar
PIM 2 recommended as model of choice as data easier to collect

DoH/WAG funded
Run from Universities of Sheffield, Leicester and Leeds
First annual report March 2003 – February 2004

PELOD
Death is a relatively infrequent outcome (6%) in PICU
– Sample sizes for trials need to be large to detect differences in outcome
MODS more prevalent (11–27%)
– Correlates well with risk of death
– Good proxy outcome measure for risk of death

PELOD
Prospective study – 7 PICUs (France, Canada, Switzerland)
18 months, 1998–2000
1806 patients (<18 yrs)

Probability of death $= 1 / (1 + \exp(7.64 - 0.30 \times \text{PELOD score}))$
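
The PELOD equation has the same logistic shape as the PRISM and PIM equations, just written with the sign of the exponent reversed. A minimal sketch, with a hypothetical function name and illustrative scores:

```python
import math


def pelod_mortality_probability(pelod_score):
    """Probability of death = 1 / (1 + exp(7.64 - 0.30 * PELOD score))."""
    return 1.0 / (1.0 + math.exp(7.64 - 0.30 * pelod_score))


# Illustrative scores only.
for score in (0, 10, 20, 30):
    print(f"PELOD {score:>2}: {100 * pelod_mortality_probability(score):.1f}%")
```

The predicted probability crosses 50% where the exponent is zero, at a score of 7.64 / 0.30, roughly 25.5.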

League tables
Governments like them
Journalists like them
Local politicians like them
Patient groups like them
Do any of the above understand them?

9 NICUs over 6 years
Crude and risk adjusted (CRIB score) mortality
Hospitals ranked in league tables each year according to W score
– W = 100 × (observed − expected deaths) / number of admissions
– Mortality lower than expected if W < 0
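
The W score is simple arithmetic: excess deaths per 100 admissions. A minimal sketch, with a hypothetical function name and made-up figures:

```python
def w_score(observed_deaths, expected_deaths, admissions):
    """W = 100 * (observed - expected deaths) / number of admissions."""
    if admissions <= 0:
        raise ValueError("number of admissions must be positive")
    return 100.0 * (observed_deaths - expected_deaths) / admissions


# Hypothetical hospital-year: 12 deaths observed, 15 expected, 300 admissions.
w = w_score(12, 15.0, 300)
print(f"W = {w:+.1f}")
```

A negative W, as in this made-up example, means one fewer death than expected per 100 admissions.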

Results

Conclusions
Hospitals varied annually in their league position
Confidence intervals for W scores overlapped for all hospitals every year except year 3
‘Overall, hospital 1 did perform significantly better than expected but it is debatable whether this makes it a model hospital since its performance was inconsistent’.

Summary
PIM/PIM 2 data easy to collect
Useful in comparing unit performance
Interpret with care if number of deaths low (especially <20)
Not for use as an individual prediction test
Important to complete as accurately as possible
PICANet randomly checks to ensure data quality
League tables are unreliable
