Published by Mark Eaton. Modified over 9 years ago.
1
A Centralized De-Duplication Service
2003 Immunization Registry Conference
Paul Schaeffer, MPA, NYC DOHMH, pschaeff@health.nyc.gov
Daryl Chertcoff, HLN Consulting, daryl@hln.com
Co-Authors: Alexandra Ternier, Angel Aponte (DOHMH)
2
Objectives
- To describe the NYC Department of Health and Mental Hygiene’s (DOHMH) centralized de-duplication service
3
Rationale - Centralized De-Duplication Service
- Duplication of records: a department-wide database problem
4
Duplication Rates - DOHMH Databases
Program    Current Estimated Duplication Rate
CIR        30%
LQ         7%
CDSS       30%
5
Key Terms
- Master Client Index (MCI): database that stores information from different programs for matching
- Core Services: implementation of the Business Rules governing the MCI
- De-Duplication Service: matches duplicate records
6
Background - MCI
The MCI integrates data from and provides a centralized de-duplication service to:
- Citywide Immunization Registry (CIR)
- Lead Quest Registry (LQ) from the Lead Poisoning and Prevention Program
- Vital birth records
- Communicable Disease (Spring 2004)
- Additional health databases (in the future)
7
Development of MCI
- Developing Requirements & Specs
- Selecting middleware technology
- Building MCI Core Services
- Configuring servers and platforms
- Building MCI Administration Tools
8
Development of MCI (Continued)
- Modifying CIR and LQ (first clients)
- Training artificial intelligence de-duplication software
- Data loads into MCI
- Deployment
9
[Architecture diagram; components as labeled]
- Master Client Index: De-Duplication Service and MCI Core Services (Win 2000 Servers)
- MCI Administration Tools (VB Application)
- MCI Database (Oracle, Unix Server)
- CIR Client: CIR Database (Oracle, Unix Server); CIR Front End (PowerBuilder Application)
- LQ Client: LQ Database (Microsoft SQL, Win 2000 Server); LQ Front End (PowerBuilder Application)
- CDSS Client: CDSS Database (Microsoft SQL, Win 2000 Server); CDSS Front End (JSP Web Application)
10
MCI - Core Services
- MCI’s main function: to facilitate matching and be extensible to all DOHMH databases
- Data model: designed with attributes common to all systems
- Information specific to a particular system may also be stored in the MCI to improve matching
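The data model described above (common attributes plus optional program-specific information) could be sketched roughly as follows. This is only an illustration: the class name `MciPerson`, its fields, and the sample identifiers are assumptions, not the actual MCI schema.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, Optional


@dataclass
class MciPerson:
    """One person in the index. All field names here are hypothetical."""
    mci_id: int
    first_name: str
    last_name: str
    birth_date: Optional[date] = None
    # Program-specific attributes, keyed by program code (e.g. "CIR", "LQ").
    program_data: Dict[str, Dict[str, str]] = field(default_factory=dict)
    # Identifier of this person's record in each client system.
    source_ids: Dict[str, str] = field(default_factory=dict)

    def link(self, program: str, source_id: str, **extra: str) -> None:
        """Attach a client-system record and any program-specific attributes."""
        self.source_ids[program] = source_id
        if extra:
            self.program_data.setdefault(program, {}).update(extra)


# Made-up example values.
person = MciPerson(1, "Ana", "Rivera", date(2001, 5, 17))
person.link("CIR", "cir-1001", provider_code="P42")
person.link("LQ", "lq-77")
```

Keeping the program-specific values in a separate mapping lets new client systems be added without changing the common attributes.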
11
MCI - Core Services (Continued)
- “Person-centric” model
- Artificial intelligence is “trained” by program-specific data
- Matching based on probabilistic algorithm
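The slides do not say which probabilistic algorithm the service uses. As one common illustration, a Fellegi-Sunter style score adds a log-likelihood weight per compared field. The field names and the m/u probabilities below are invented for the sketch; in a trained system they would be estimated from reviewed record pairs.

```python
import math
from typing import Dict, Tuple

# Hypothetical (m, u) probabilities per field:
#   m = P(field agrees | same person), u = P(field agrees | different people).
FIELD_PROBS: Dict[str, Tuple[float, float]] = {
    "last_name": (0.95, 0.01),
    "first_name": (0.90, 0.02),
    "birth_date": (0.98, 0.003),
}


def match_weight(a: Dict[str, str], b: Dict[str, str]) -> float:
    """Sum log2 likelihood ratios over fields present in both records."""
    total = 0.0
    for name, (m, u) in FIELD_PROBS.items():
        if not a.get(name) or not b.get(name):
            continue  # a missing value contributes no evidence either way
        if a[name].strip().lower() == b[name].strip().lower():
            total += math.log2(m / u)  # agreement raises the score
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement lowers it
    return total
```

A higher total means the two records are more likely to describe the same person; agreement on a rarely matching field such as birth date counts for more than agreement on a common one.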
12
De-Duplication: Features
- Potential duplicate pairs are reviewed by humans to train the model
- “Artificial Intelligence” model created
- Match thresholds are determined
13
De-Duplication: Process
- Incoming records go to the MCI (not to the client systems)
- De-duplication happens in the MCI and trickles down to the client systems
- Clients have access to each other’s data for the human review process
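One way to picture the "trickle down" step is a hub that merges two index entries and then notifies each client system that now holds two records for the same person. The slides do not describe the real mechanism; the class `MciHub`, its methods, and the callback shape below are all assumptions made for this sketch.

```python
from typing import Callable, Dict, List

# A client callback receives (surviving_source_id, retired_source_id).
MergeHandler = Callable[[str, str], None]


class MciHub:
    """Hypothetical central index that propagates merges to client systems."""

    def __init__(self) -> None:
        # mci_id -> program code -> source record ids in that client system
        self.links: Dict[int, Dict[str, List[str]]] = {}
        self.handlers: Dict[str, MergeHandler] = {}

    def register_client(self, program: str, handler: MergeHandler) -> None:
        self.handlers[program] = handler

    def add_link(self, mci_id: int, program: str, source_id: str) -> None:
        self.links.setdefault(mci_id, {}).setdefault(program, []).append(source_id)

    def merge(self, keep_id: int, drop_id: int) -> None:
        """Fold drop_id into keep_id and notify clients holding both records."""
        dropped = self.links.pop(drop_id, {})
        kept = self.links.setdefault(keep_id, {})
        for program, ids in dropped.items():
            existing = kept.setdefault(program, [])
            for sid in ids:
                if existing:
                    # The client already has a record for the kept person:
                    # ask it (if registered) to merge; the retired id is dropped.
                    handler = self.handlers.get(program)
                    if handler is not None:
                        handler(existing[0], sid)
                else:
                    # The kept person had no record in this program: adopt it.
                    existing.append(sid)
```

With this shape the matching decision is made once, centrally, and each client only applies the outcome to its own records.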
14
De-Duplication Service - Some Numbers
- An estimated 94% of new reports will be either merged or inserted
- The remaining 6% are sent to a hold queue for human review
- 99.7% accuracy of the De-Duplication Service
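The three outcomes above (merge, insert, hold for human review) are what a two-threshold decision rule produces. The sketch below assumes a numeric match score and two cutoffs; the threshold values and the names `route` and `hold_share` are illustrative, not taken from the MCI.

```python
from enum import Enum
from typing import Iterable


class Action(Enum):
    MERGE = "merge"    # confident match: fold into the existing record
    INSERT = "insert"  # confident non-match: create a new person
    HOLD = "hold"      # uncertain: send to the hold queue for human review


def route(score: float, lower: float = 0.0, upper: float = 8.0) -> Action:
    """Classify a match score using two hypothetical thresholds."""
    if score >= upper:
        return Action.MERGE
    if score <= lower:
        return Action.INSERT
    return Action.HOLD


def hold_share(scores: Iterable[float]) -> float:
    """Fraction of scores that land in the hold queue."""
    actions = [route(s) for s in scores]
    return actions.count(Action.HOLD) / len(actions) if actions else 0.0
```

Moving the two thresholds closer together shrinks the hold queue at the cost of more automatic decisions, which is the trade-off behind a split such as 94% automatic and 6% held.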
15
Benefits - Centralized De-Duplication Service
- Cross-program leveraging of resources
- Programs have access to other programs’ data
- Fewer FTEs needed for human review, so staff can be re-deployed
16
Challenges
- Who will be responsible for cross-program record review: individual programs, or an MCI team?
- Ownership of data: CIR will now disseminate LQ data
- Confidentiality issues
  - All clients have access to VR information
  - CIR has access to LQ data
17
Challenges (Continued)
- Fiscal issues
- Joint Project Activities: data dissemination
- MCI System Operations & Maintenance: need to divide responsibilities between MIS, MCI, CIR and LQ staff
18
Future Plans
- Environmental Health: Adult Heavy Metal Poisoning Database
- Expanding the MCI to the rest of DOHMH