BENEFIT-OF-THE-DOUBT APPROACHES FOR CALCULATING A COMPOSITE MEASURE OF QUALITY By Michael Shwartz, James F. Burgess, Jr. (Presenting), and Dan Berlowitz.

Presentation transcript:

BENEFIT-OF-THE-DOUBT APPROACHES FOR CALCULATING A COMPOSITE MEASURE OF QUALITY
By Michael Shwartz, James F. Burgess, Jr. (Presenting), and Dan Berlowitz
Funded by VA Health Services Research and Development grant IIR

Context and Background I
Standard Approaches for Creation of Composite Measures of Quality (Quality Indicators -- QIs)
– Equal Weighting
– Prevalence Based Weights
– Judgment Based Weights
Concept of Benefit-of-the-Doubt Approaches
– Relative Performance represents a Measure of Revealed Preferences by the Organizational Unit on Relative Importance

Context and Background II
Distinguish two types of Composite Measures
– Reflective Measures (manifestations of construct)
– Formative Measures (defined by individual QIs)
Illustrate some approaches to create Formative Measures from QIs
QIs are not Highly Correlated and Explicitly are Added to Include More QI Dimensions

Benefit-of-the-Doubt Measures
Nardo et al. (OECD-2005) Review of Methods
Benefit-of-the-Doubt Approaches Recognize Revealed Preferences w/Higher Weights
Cherchye et al. (JORS-2007) and Semple (EJOR-1996) note this is the Natural Outcome of Nash evaluation game: Regulator v. Org.
Mostly used to date to Compare Countries (e.g. Lovell (IJPE-1995), Despotis (JORS-2005))

Criticism and Intuition
If Weights are Organization-Specific, are Comparisons Across Units Possible?
– Dropping Lowest Grade Example
– Data Envelopment Analysis (DEA) does this
If Final Comparisons are made on a Relative Basis, then what happens in practice?
– No one knows in advance who benefits most
– Actual rankings may not change much
– Dropping or downweighting lower scores may buy good will from the organization/student at low cost

Purpose Statement
Imagine we have a fixed set of QIs with a reporting period just ended
Goals for the Regulator might be:
– Facilitating consumer choice with gestalt value
– Pay-for-Performance to reward high performers
– Quality improvement learning to spread value
Comparative Approaches
– DEA (here all QIs are reported on the same scale)
– Simple LP Optimizing subject to constraint that weights sum to 1 (needs QIs on the same scale)
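One way to write the simple LP down, as a sketch rather than the authors' exact formulation: the symbols r_{ij} (score of facility j on QI i), w_{ij} (facility-specific weight), C_j (composite) and m (number of QIs) are introduced here for illustration, and the minimization assumes QIs are scored so that lower is better, as with ratios of adverse events.

```latex
\begin{aligned}
C_j \;=\; \min_{w_{1j},\dots,w_{mj}} \quad & \sum_{i=1}^{m} w_{ij}\, r_{ij} \\
\text{subject to} \quad & \sum_{i=1}^{m} w_{ij} = 1, \\
& w_{ij} \ge 0, \quad i = 1,\dots,m .
\end{aligned}
```

With only these constraints the optimum places all weight on the facility's single best QI, which is why bounds on the weights matter in practice.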

Example: VA Nursing Homes (1998)
35 Nursing Homes in VA (Berlowitz et al. 2003)
Five QIs Reflecting Patient Change Over Time
– Pressure Ulcer Development
– Functional Decline
– Behavioral Decline
– Mortality
– Preventable Hospitalization
All QIs are Risk Adjusted w/Published Models
32 Nursing Homes with no Missing Data used

Calculating the QIs
Many ways can be used to calculate a QI; the choice is not important in this example
Model generates Predicted Probability of 6-month adverse event given initial risk
Add up observed adverse events (O)
Add up predicted probabilities (E)
We create O/E Ratios, which are widely used
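The O/E calculation described on this slide can be sketched in a few lines. The function name and the toy resident data are hypothetical; the predicted probabilities would come from the published risk-adjustment models, which are not reproduced here.

```python
def oe_ratio(observed_events, predicted_probs):
    """Observed-to-expected ratio for one facility and one QI.

    observed_events: 0/1 flag per resident (1 = adverse event occurred)
    predicted_probs: model-predicted probability of the event per resident
    """
    observed = sum(observed_events)   # O: add up observed adverse events
    expected = sum(predicted_probs)   # E: add up predicted probabilities
    if expected == 0:
        raise ValueError("expected count is zero; O/E is undefined")
    return observed / expected


# Hypothetical facility with five residents (illustrative numbers only).
events = [1, 0, 0, 1, 0]
probs = [0.5, 0.25, 0.25, 0.5, 0.5]
ratio = oe_ratio(events, probs)
print(ratio)
```

A ratio above 1 means more adverse events than the risk model predicted, and a ratio below 1 means fewer.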

Comparisons of Composites
Equal Weights Model
Facility-Specific Prevalence Weights Model
Overall Prevalence-Based Weights Model
Simple LP Model (weights sum to 1)
Weight Constrained DEA Model
– Employ Rachel Allen/Thanassoulis Constrained Ratio of the Weights Measure
– This does not permit some QIs to drop weights to near zero (the student drop-the-lowest-grade model)
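To make the difference between the fixed-weight models concrete, here is a minimal sketch. It assumes that a prevalence-based weight is proportional to the number of events for each QI; the slides do not define the weights precisely, so treat that as an assumption. All names and numbers are illustrative.

```python
def composite(oe_ratios, weights):
    """Weighted average of a facility's O/E ratios (weights are normalized)."""
    total = sum(weights)
    return sum(w * r for w, r in zip(weights, oe_ratios)) / total


def prevalence_weights(event_counts):
    """Assumed definition: weight proportional to the event count of each QI."""
    total = sum(event_counts)
    return [c / total for c in event_counts]


# Hypothetical facility with three QIs (illustrative numbers only).
oe = [0.8, 1.2, 1.0]
equal = composite(oe, [1, 1, 1])
by_prevalence = composite(oe, prevalence_weights([10, 30, 60]))
print(equal, by_prevalence)
```

Passing facility-level counts gives facility-specific weights, while passing counts pooled over all facilities gives overall weights under the same assumption.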

Table 2: Composite scores and facility ranks for high and low ranked facilities
Column groups: Composite Score; Facility Ranks
Columns in each group: facility-specific prevalence-based weights; overall prevalence-based weights; equal weights; simple LP model*; DEA*
Row label: facility
(table values not preserved in this transcript)
*: results for the benefit-of-the-doubt approaches are for allowable weight adjustments of [range not preserved] of overall prevalence-based weights

Weight Constrained Models Tested
We Test Differences in Ranks/Correlation Levels
Allowable Weight Adjustments
1: *overall prevalence-based weights
2: *overall prevalence-based weights
3: *overall prevalence-based weights
4: *overall prevalence-based weights
5: no constraints
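A weight-constrained version of the simple LP can be sketched without a solver. With one sum-to-1 constraint and a box around each weight, filling the best QIs first gives the LP optimum. The multipliers 0.5 and 1.5 below are hypothetical, since the actual adjustment ranges did not survive in this transcript, and the code assumes lower O/E is better.

```python
def constrained_bod_weights(oe_ratios, base_weights, low_factor, high_factor):
    """Minimize sum(w * oe) subject to sum(w) = 1 and
    low_factor * base <= w <= high_factor * base (greedy LP solution)."""
    lower = [low_factor * b for b in base_weights]
    upper = [high_factor * b for b in base_weights]
    if sum(lower) > 1 + 1e-12 or sum(upper) < 1 - 1e-12:
        raise ValueError("bounds do not allow weights that sum to 1")
    weights = list(lower)                      # start every QI at its floor
    remaining = max(0.0, 1.0 - sum(lower))     # weight still to hand out
    # Give the leftover weight to the best (lowest O/E) QIs first.
    for i in sorted(range(len(oe_ratios)), key=lambda k: oe_ratios[k]):
        add = min(upper[i] - lower[i], remaining)
        weights[i] += add
        remaining -= add
    return weights


# Hypothetical facility: three QIs and overall prevalence-based weights.
oe = [0.5, 1.0, 1.5]
base = [0.2, 0.3, 0.5]
w = constrained_bod_weights(oe, base, low_factor=0.5, high_factor=1.5)
bod_score = sum(wi * r for wi, r in zip(w, oe))
base_score = sum(bi * r for bi, r in zip(base, oe))
print(w, bod_score, base_score)
```

Widening the factors moves the result toward the unconstrained LP; setting both to 1 reproduces the overall prevalence-based composite.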

Figure 1: Comparison of ranks using overall prevalence-based weights to ranks using each of the benefit-of-the-doubt approaches with different amounts of allowable weight adjustments (previous slide) Part A: Average difference in ranks

Part B: Correlation
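The two summaries in Figure 1 can be computed as in the sketch below. It assumes a rank correlation (Spearman) and no tied scores; the slides do not say which correlation was used, so that choice is an assumption. The composite values are made up for illustration.

```python
def ranks(scores):
    """Rank 1 = lowest (best) composite score; assumes no ties."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    result = [0] * len(scores)
    for position, idx in enumerate(order, start=1):
        result[idx] = position
    return result


def mean_abs_rank_difference(ranks_a, ranks_b):
    """Average absolute change in rank between two weighting schemes."""
    return sum(abs(a - b) for a, b in zip(ranks_a, ranks_b)) / len(ranks_a)


def spearman(ranks_a, ranks_b):
    """Spearman rank correlation (formula valid when there are no ties)."""
    n = len(ranks_a)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * d2 / (n * (n * n - 1))


# Hypothetical composites for four facilities under two schemes.
base_ranks = ranks([0.9, 1.1, 1.0, 1.3])    # overall prevalence-based weights
bod_ranks = ranks([0.85, 0.95, 1.0, 1.2])   # a benefit-of-the-doubt model
avg_diff = mean_abs_rank_difference(base_ranks, bod_ranks)
rho = spearman(base_ranks, bod_ranks)
print(base_ranks, bod_ranks, avg_diff, rho)
```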

Outcomes of Benefit-of-the-Doubt
There is no gold standard for weighting
But “equal weighting” is a choice and may generate: “these weights do not reflect what is important to our patients”
Face validity? A moving concept?
Post Hoc Discussion of Weights can only be Self-Serving
But if true preferences are reflected in performance this approach should lessen tensions and improve trust and engagement
No Need to Blame the Messenger!

Other Outcomes and Benefits
We know Risk Adjustment is imperfect, so some adjustment is made
Using Weight Constraints in DEA Allows Policymakers to Choose how far to go
– We used simple constraints but others possible
DEA has been used before and has favorable properties (Nash outcome, flexible to scores)
DEA also has negatives (best with large amounts of data to set benchmarks)
Simple LP must be normalized but may be more transparent than DEA
– Simplicity a Virtue

Incentive Effects and Gaming
If an organization performs similarly on all QIs it gains no value from the approach
– Unless scoring “high” relative performance suffers
Managers will focus on QIs where they can improve & which are most important to them
P4P Programs are now leaning toward rewards for attainment and improvement (to balance incentives); this method can combine or use regulator weights between them for totals

Limitations and Improvements
Simple O/E ratios can be improved upon
– (O-E)/variance (O) or z-scores
– Hierarchical modeling results (Bayesian or not)
More data, more measures, more recent data, more data over time all can be incorporated
CMS Nursing Home Compare has a relatively complex algorithm while CMS Hospital Compare currently uses simpler methods
– Concept of “Five Star” systems
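As one reading of the z-score alternative, the sketch below standardizes O minus E by the standard deviation of O. It assumes residents are independent Bernoulli trials, so that the variance of O is the sum of p(1-p); the slides do not specify the variance model, and the function name and data are hypothetical.

```python
import math


def oe_z_score(observed_events, predicted_probs):
    """Standardized difference between observed and expected events.

    Assumes independent Bernoulli outcomes, so Var(O) = sum of p * (1 - p).
    """
    observed = sum(observed_events)
    expected = sum(predicted_probs)
    variance = sum(p * (1 - p) for p in predicted_probs)
    if variance == 0:
        raise ValueError("variance is zero; z-score is undefined")
    return (observed - expected) / math.sqrt(variance)


# Hypothetical facility: 3 events observed where 2 were expected.
z = oe_z_score([1, 1, 1, 0], [0.5, 0.5, 0.5, 0.5])
print(z)
```

Unlike a raw O/E ratio, this measure accounts for how much sampling variation a facility of that size would be expected to show.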

Final Thoughts
Explosion of Quality Measures (QIs) in recent years
Measurement of Composites is going to continue to be debated
Inherent limitations (safety net facilities, incomplete risk adjustment) support flexibility to generate trust and buy-in
Benefit-of-the-Doubt Measures should be part of the discussion