Finding a PersonBOS Finding a Person! Building an algorithm to search for existing people in a system Rahn Lieberman Manager Emdeon Corp (Emdeon.com)


Understanding the problem
- We work with hospitals around the country
- These hospitals (our customers) send us sets of people to work with
- We put these people into our internal system for our workflow

Understanding the problem
A little bit of complication:
- A person can visit a hospital more than one time
- When this happens, we need to recognize the person so we don’t load them repeatedly
And more complication:
- The person can then visit another hospital, and we still need to recognize them

Our Environment
- We get data from customers in batch files and through real-time (HL7) interfaces
- Data can be entered into our system via our interfaces or manually
- We process upwards of a quarter million records a day, importing approximately 10K new records
- Electronic entry accounts for approximately 85% of the data entry

Bragging
- We get reports of duplicate data being entered approximately once every 2-3 months
- On investigation, these are almost always user errors
- These are so rare, I don’t have statistics on the actual number of errors

How’s it Done?
What to know:
- Know your data!
- Know that it will take some time to get it right
- And of course, know your users
The data is the actual elements of what you’re working with: names, birthdates, etc.

How’s it Done? (cont.)
- Start with your basic object first. In my case, it’s a person
- Take stock of the common elements of your object that should always be known and are commonly part of your data:
  - First Name, Last Name, Middle Initial
  - Date of Birth
  - Gender
- Is that enough to uniquely ID someone? If not, keep adding, but realize you may not get everything you want

How’s it Done? (cont.)
- How about adding an address and a social security number? That gets close to giving enough information
- What? SSNs are unique!
  - Not everyone has one, or knows what it is
  - People may share them or make them up

How’s it Done? (cont.)
- Do the same thing for other items in your domain. In my case, hospitals have standard data elements:
  - Account Number, Medical Record Number
  - Unique identifier in our system to distinguish the hospitals
- Once you have these core data elements, set up a query in your database for them, allowing elements to be null
- You’ll need to know how data is stored (null versus empty strings, default values, etc.)
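One way to read "allowing elements to be null" is that a search should filter only on the elements that were actually supplied. The sketch below shows that reading with Python's built-in sqlite3 module. The table layout, the column names and the find_person_ids helper are all assumptions made for illustration; the slides do not show the real schema or query.

```python
import sqlite3

# Hypothetical column names; the real schema is not shown in the slides.
SEARCHABLE = ("first_name", "last_name", "dob", "ssn", "account_number")


def is_missing(value):
    """Treat NULL (None) and blank strings as the same 'not supplied' case."""
    return value is None or str(value).strip() == ""


def find_person_ids(conn, **criteria):
    """Return PersonIDs matching every supplied element, skipping missing ones."""
    clauses, params = [], []
    for column, value in criteria.items():
        if column not in SEARCHABLE:
            raise ValueError(f"unknown column: {column}")
        if is_missing(value):
            continue  # element not supplied, so do not filter on it
        clauses.append(f"{column} = ?")
        params.append(value)
    if not clauses:
        return []  # nothing usable supplied: do not match the whole table
    sql = "SELECT person_id FROM person WHERE " + " AND ".join(clauses)
    return [row[0] for row in conn.execute(sql, params)]


# Tiny in-memory table with fake data: one SSN stored as NULL, one as ''.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person (person_id INTEGER PRIMARY KEY, first_name TEXT, "
    "last_name TEXT, dob TEXT, ssn TEXT, account_number TEXT)"
)
conn.execute("INSERT INTO person VALUES (20, 'Ann', 'Smith', '1980-01-02', NULL, 'A100')")
conn.execute("INSERT INTO person VALUES (21, 'Bob', 'Smith', '1975-05-06', '', 'A200')")

print(find_person_ids(conn, last_name="Smith", dob="1980-01-02", ssn=None))
```

Returning an empty list when no usable element is supplied is a deliberate choice here: an unfiltered query would otherwise "match" every person in the table.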

How’s it Done? (cont.)
Search and weigh the results
- Go through all reasonable variations of your data, and collect the PersonID found in your system if one is found
- Create a dictionary of all values found
- After all values are collected, sort them and count up the number of like PersonIDs
- The PersonID with the most hits is most likely the correct person
- Note that if none are found, then you end up with a “0” as the most common PersonID. Be sure to account for this
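The counting step can be sketched in a few lines of Python. This is only an illustration of the idea on the slide, not the production code: the function name and the convention that each search returns one PersonID (or 0 for no match) are assumptions.

```python
from collections import Counter

NO_MATCH = 0  # the slide's marker for "this search found nobody"


def most_likely_person(search_results):
    """Pick the PersonID returned by the most searches.

    search_results holds one PersonID per search, with 0 meaning no match.
    Returns None when no search found anyone.
    """
    # Drop the 0 entries first so "not found" can never win the vote.
    hits = Counter(pid for pid in search_results if pid != NO_MATCH)
    if not hits:
        return None
    person_id, _count = hits.most_common(1)[0]
    return person_id


print(most_likely_person([0, 0, 7, 0]))  # one real hit still beats three misses
```

Ties are not handled specially in this sketch; a real system would need a rule for two PersonIDs with the same number of hits.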

Example of weighing
- First + Last + DOB + Account Number + CustomerID: 20
- Last + SSN: 20
- Last + Account Number + DOB: 20
- First + Medical Record: 19
- First + Address + CustomerID: 19
Total count of 20: 3
Total count of 19: 2
Our person is PersonID 20
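The same tally can be reproduced with the numbers from this slide. The dictionary below simply maps each search to the PersonID it returned; the variable names are illustrative.

```python
from collections import Counter

# Search -> PersonID returned, copied from the slide.
results = {
    "First + Last + DOB + Account Number + CustomerID": 20,
    "Last + SSN": 20,
    "Last + Account Number + DOB": 20,
    "First + Medical Record": 19,
    "First + Address + CustomerID": 19,
}

tally = Counter(results.values())
best_id, best_count = tally.most_common(1)[0]
print(tally[20], tally[19], best_id)  # 3 2 20
```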

More details
- The key is the number of searches, and making sure the searches provide unique results
- In my production system, we do 30+ searches:
  - F+M+L names + DOB + Address information
  - SSN searches with first and last names (need to account for last name changes)
  - Customer searches based on our entire customer hierarchy
- I even go as far as cleaning the data to remove dashes, hyphens and other characters that may not be entered consistently
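A minimal sketch of that kind of cleaning, assuming it is applied to both the incoming record and the stored data before comparing. The two helper names and the exact rules (which characters to strip, upper-casing) are assumptions; the slides do not list the actual rules.

```python
import re


def normalize_identifier(value):
    """Reduce SSNs, account numbers and similar values to letters and digits."""
    if value is None:
        return ""
    # Remove dashes, spaces and any other punctuation, then unify case.
    return re.sub(r"[^0-9A-Za-z]", "", value).upper()


def normalize_name(value):
    """Make 'Smith-Jones', 'smith jones' and ' Smith  Jones ' compare equal."""
    if value is None:
        return ""
    cleaned = re.sub(r"[-'.]", " ", value)  # hyphens, apostrophes, periods
    return " ".join(cleaned.split()).upper()  # collapse runs of whitespace


print(normalize_identifier("123-45-6789"))
print(normalize_name("Smith-Jones"))
```

Replacing a hyphen with a space rather than deleting it is a judgment call: it matches "Smith-Jones" to "Smith Jones" but not to "SmithJones".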

Testing
- Search for a known person in your database (e.g. PersonID 1, where you know all the data about them)
- More complicated tests: pull X number of records, remove some of the data from them, and try to find them again
- This works well because you always know what you’re expected to find
- Sorry, SQL Compact doesn’t support TOP, so I can’t demonstrate
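The "remove some data and search again" test can be sketched as follows. Everything here is a stand-in: the records are fake, find_person is a toy exact-match search rather than the real 30+ query algorithm, and the field names are invented for the example.

```python
import random

# Fake records; two people share a last name on purpose.
people = [
    {"person_id": 1, "first": "Ann", "last": "Smith", "dob": "1980-01-02", "ssn": "5550101"},
    {"person_id": 2, "first": "Bob", "last": "Smith", "dob": "1975-05-06", "ssn": "5550102"},
    {"person_id": 3, "first": "Cara", "last": "Lee", "dob": "1990-09-10", "ssn": "5550103"},
]


def blank_some_fields(record, keep, rng):
    """Copy a record, keeping `keep` randomly chosen fields and blanking the rest."""
    fields = [name for name in record if name != "person_id"]
    kept = set(rng.sample(fields, keep))
    return {
        name: (value if name == "person_id" or name in kept else None)
        for name, value in record.items()
    }


def find_person(candidates, probe):
    """Toy search: return a PersonID only if exactly one candidate matches."""
    supplied = {k: v for k, v in probe.items() if k != "person_id" and v is not None}
    matches = [
        p["person_id"] for p in candidates
        if all(p[k] == v for k, v in supplied.items())
    ]
    return matches[0] if len(matches) == 1 else None


def recall(candidates, keep, seed=0):
    """Fraction of records found again after blanking all but `keep` fields."""
    rng = random.Random(seed)  # fixed seed keeps the test repeatable
    found = sum(
        find_person(candidates, blank_some_fields(p, keep, rng)) == p["person_id"]
        for p in candidates
    )
    return found / len(candidates)


print(recall(people, keep=4))  # nothing removed, so every record is found
```

Because each degraded record keeps its original person_id, the expected answer is always known, which is the point made on the slide.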

Known issues
- Babies are hard to match because they don’t have SSNs and they don’t have names
  - Most hospitals will enter a baby with a generic name, Baby Boy Smith, then change it in their system at a later date (after paperwork has been filed)
- If a Senior and a Junior live at the same address, and we don’t have both of them in our system already, we may get a false result
- Speed! 30 queries take a long time to run. I’m looking into ways to speed this up
- Running multiple instances of the import/search at the same time will fail. We have safeguards against this in place

Demo
- This sample data is ALL FAKE! Do not think these are real people. If they are, it’s purely a coincidence
- Social Security numbers are the phone numbers minus 1 digit
- This is not my full production system. It’s been simplified to show proof of concept, and it uses SQL Compact for ease of demonstrating offline

Questions
Slides and code available at