Research Across Multiple Systems: Probabilistic Population Estimation (PPE) Diane Haynes, University of South Florida Rebecca Larsen, University of South.


1 Research Across Multiple Systems: Probabilistic Population Estimation (PPE). Diane Haynes, University of South Florida; Rebecca Larsen, University of South Florida; Shabnam Mehra, University of South Florida. Louis de la Parte Florida Mental Health Institute, Mental Health Law & Policy, Policy & Services Research Data Center, Tampa, FL

2 Overview: The Policy & Services Research Data Center (PSRDC) performs analyses on multiple administrative databases from a variety of agencies. The intent of this presentation is to discuss the ability to perform Probabilistic Population Estimation (PPE) for cross-system analyses using SAS®.

3 Introduction: No single agency can meet all the needs of all individuals. Community administrators are using cross-system analysis to better understand the needs of the community, barriers to accessing services, and patterns of access.

4 Introduction (cont.): Cross-system analysis can be done using administrative data from multiple systems that is already available to administrators. Administrative data is everywhere. Data sources include criminal justice, social services, education, community mental health, child protection, emergency medical services, etc.

5 Problem: Agencies' data systems do not always share a common identifier. Criminal Justice System: System Person ID. Emergency Medical Services: SSN. Mental Health/Substance Abuse Service System: SSN.

6 Statistical Solution: Probabilistic Population Estimation (PPE) and the Caseload Segregation/Integration Ratio (C/SIR). This process relies on administrative data, and agency systems do not have to share unique person identifiers. It also avoids the expense of case-by-case matching and the sensitive issues of client-patient confidentiality.

7 Probabilistic Population Estimation (PPE): A statistical method for determining the number of people represented in a data set that does not contain a unique identifier. The estimation is based on a comparison of the distribution of Date of Birth and Gender in a general population with the distribution of Date of Birth and Gender observed in a data set. The number of distinct birthday/gender combinations that occur in each data subset is counted. The number of people necessary to produce the observed number of birthday/gender combinations is then calculated.
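The counting-and-inverting idea on this slide can be sketched in a few lines of Python. This is an illustrative sketch only, not the presenters' SAS code. It assumes that birthdays are uniformly distributed over a 365-day year within each birth-year/gender subset and that leap days are ignored. Under that assumption n people are expected to occupy 365 × (1 − (1 − 1/365)^n) distinct birthdays, and the sketch inverts that expression. The record layout and the function names `people_needed` and `ppe_estimate` are hypothetical.

```python
import math
from collections import defaultdict

DAYS_PER_YEAR = 365  # assumption: leap days ignored, birthdays uniform over the year


def people_needed(distinct_birthdays, days=DAYS_PER_YEAR):
    """Invert E[distinct] = days * (1 - (1 - 1/days)**n) to solve for n."""
    if not 0 <= distinct_birthdays < days:
        raise ValueError("estimate is undefined once every birthday is occupied")
    return math.log(1 - distinct_birthdays / days) / math.log(1 - 1 / days)


def ppe_estimate(records):
    """Estimate how many people produced the records.

    records: iterable of (birth_year, month, day, gender) tuples
    (a hypothetical layout chosen for this sketch).
    """
    cells = defaultdict(set)
    for birth_year, month, day, gender in records:
        # One subset per birth year and gender; keep only distinct birthdays.
        cells[(birth_year, gender)].add((month, day))
    return sum(people_needed(len(birthdays)) for birthdays in cells.values())


sample = [
    (1970, 1, 1, "F"),
    (1970, 1, 1, "F"),  # same birthday/gender combination, counted once
    (1970, 2, 1, "F"),
    (1980, 3, 3, "M"),
]
print(round(ppe_estimate(sample), 3))
```

With very sparse data the estimate is close to the number of distinct combinations; the correction grows as more of the 365 possible birthdays in a subset are occupied.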

8 Caseload Segregation/Integration Ratio (C/SIR): C/SIR = (Duplicated Count / Unduplicated Count) × 100. C/SIR is a rating between 0 and 100 that indicates the amount of overlap of clients between agencies, with 0 meaning no overlap at all and 100 meaning total overlap.

9 Example #1 Overlap between MH/SA & CJIS MH/SA 9,609 Individuals

10 Example #1 Overlap between MH/SA & CJIS MH/SA 9,609 Individuals CJIS 34,169 Individuals

11 Example #1 – FINDINGS: Overlap between MH/SA & CJIS, C/SIR rating of 13.9. MH/SA System: 9,609 individuals. CJIS System: 34,169 individuals. Overlap: 1,753 individuals.

12 Example #2 Overlap between MH/SA & EMS MH/SA 9,609 Individuals

13 Example #2 Overlap between MH/SA & EMS MH/SA 9,609 Individuals EMS 33,207 Individuals

14 Example #2 – FINDINGS: Overlap between MH/SA & EMS, C/SIR rating of 6.7. MH/SA System: 9,609 individuals. EMS System: 33,207 individuals. Overlap: 937 individuals.

15 The SAS® code accomplishes the following: Computes the actual number of individuals in the first data set. Computes the frequency distribution of the number of DOB and gender combinations in the data set. Computes the expected number of individuals needed to fill the number of DOB and gender combinations found in the file being used.

16 SAS® code (cont.): Computes the lower and upper bounds for the 95% confidence intervals and the z-score. Repeats the first four steps above for the second file. Combines both data sets and repeats the first four steps.

17 SAS® code (cont.): Computes the overlap of individuals between the two files. Computes the C/SIR and prints results. Repeats the first four steps for the second data set.
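The estimate-each-file, estimate-the-combined-file, then compute-the-overlap sequence described on these slides can be sketched as follows. This is a Python illustration under stated assumptions, not the SAS program itself: birthdays are assumed uniform over a 365-day year, the overlap is taken by inclusion-exclusion (size of A plus size of B minus size of the combined file), and the confidence-interval, z-score, and C/SIR steps are left out because the slides do not give their formulas. The record layout and function names are hypothetical.

```python
import math
from collections import defaultdict

DAYS_PER_YEAR = 365  # assumption: leap days ignored, birthdays uniform over the year


def ppe_estimate(records):
    """Estimated number of people behind (birth_year, month, day, gender) records."""
    cells = defaultdict(set)
    for birth_year, month, day, gender in records:
        cells[(birth_year, gender)].add((month, day))
    total = 0.0
    for birthdays in cells.values():
        occupied = len(birthdays)
        if occupied >= DAYS_PER_YEAR:
            raise ValueError("a subset has every birthday occupied; no estimate")
        # People needed, on average, to occupy this many distinct birthdays.
        total += math.log(1 - occupied / DAYS_PER_YEAR) / math.log(1 - 1 / DAYS_PER_YEAR)
    return total


def estimated_overlap(file_a, file_b):
    """Inclusion-exclusion on the three PPE estimates: |A| + |B| - |A and B combined|."""
    size_a = ppe_estimate(file_a)
    size_b = ppe_estimate(file_b)
    size_combined = ppe_estimate(list(file_a) + list(file_b))
    return size_a + size_b - size_combined


agency_a = [(1970, 1, 1, "F"), (1970, 5, 9, "F")]
agency_b = [(1970, 1, 1, "F"), (1971, 7, 4, "M")]
print(round(estimated_overlap(agency_a, agency_b), 3))
```

No record in one file is ever matched to a record in the other; only the three aggregate estimates are compared, which is why no shared person identifier is needed.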

18 Issues of PPE and C/SIR: 95% confidence interval; 1:20 ratio; large data sets; potential to fill up all possible birth/gender combinations.
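The last issue, filling up all possible birth/gender combinations, can be illustrated with a short Python sketch. It assumes birthdays are uniform over a 365-day year within a single birth-year/gender subset, which is an assumption of this sketch rather than something stated on the slide, and the function names are hypothetical. It shows how quickly a subset saturates and how much more a single additional observed birthday moves the estimate once the subset is nearly full.

```python
import math

DAYS_PER_YEAR = 365  # assumption: leap days ignored, birthdays uniform over the year


def expected_distinct(people, days=DAYS_PER_YEAR):
    """Expected number of distinct birthdays occupied by `people` individuals."""
    return days * (1 - (1 - 1 / days) ** people)


def people_needed(distinct_birthdays, days=DAYS_PER_YEAR):
    """Inverse of expected_distinct; undefined when every birthday is occupied."""
    if not 0 <= distinct_birthdays < days:
        raise ValueError("no estimate once all birthdays are occupied")
    return math.log(1 - distinct_birthdays / days) / math.log(1 - 1 / days)


# How full does one birth-year/gender subset get as the cohort grows?
for cohort_size in (100, 500, 1000, 2000):
    filled = expected_distinct(cohort_size) / DAYS_PER_YEAR
    print(f"{cohort_size:>5} people -> {filled:.1%} of birthdays occupied")

# Change in the estimate caused by one more observed birthday.
sparse_step = people_needed(101) - people_needed(100)
dense_step = people_needed(364) - people_needed(363)
print(f"sparse subset: +{sparse_step:.1f} people, nearly full subset: +{dense_step:.1f} people")
```

In a nearly full subset one extra distinct birthday shifts the estimate by hundreds of people, and a completely full subset yields no estimate at all, which is one way to read the slide's warning about large data sets.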

19 Conclusion: PPE and C/SIR are two useful tools for conducting cross-system analysis, especially at a time when government and other funding sources are increasing their demands for accountability across multiple systems while the public demands confidentiality of data.

20 Acknowledgements: Steve M. Banks & John A. Pandiani (www.thebristolobservatory.com); The Pinellas County Mental Health & Substance Abuse Data Collaborative; Paul Stiles; Martha Lenderman. References: Banks, S. M., & Pandiani, J. A. (2001). Probabilistic population estimation of the size and overlap of data sets based on date of birth. Statistics in Medicine, 20, pp. 1421-1430. Pandiani, J., Banks, S., & Schacht, L. (1998). Personal privacy versus public accountability: A technological solution to an ethical dilemma. The Journal of Behavioral Health Services & Research, 25, pp. 456-463.

21 About the Speaker: Diane Haynes, M.A. Policy Analysis. Location of company: University of South Florida, Mental Health Law & Policy, 13301 Bruce B. Downs Blvd, MHC2621, Tampa, Florida 33612. Telephone: (813) 974-9244. Fax: (813) 974-6411. E-mail: Haynes@fmhi.usf.edu. http://psrdc.fmhi.usf.edu/PPE_Savannah_2002.ppt

22 System Integration/Segregation: Cumulative of All Four Systems. C/SIR rating of 16.
Diagram labels: CJIS 34,078; IDS 11,351; MMH 7,035; DSS 16,101.
System | Unique ID Count | PPE Count
CJIS | 35,351 | 34,170
DSS | 16,176 | 16,193
IDS | 11,640 | 11,443
MMH | 7,104 | 7,127
* Overlap between all systems is estimated at 92 people.

