Download presentation
Presentation is loading. Please wait.
Published byJudith Boone Modified over 9 years ago
1
Finding a PersonBOS Finding a Person! Building an algorithm to search for existing people in a system Rahn Lieberman Manager Emdeon Corp (Emdeon.com) rahn@theliebermans.com http://rahnsworld.com
2
Finding a PersonBOS Understanding the problem - We work with hospitals around the county - These hospitals (our customers) send us sets of people to work with - We put these people into our internal system for our workflow rahn@theliebermans.com http://rahnsworld.com
3
Finding a PersonBOS Understanding the problem A little bit of complication: - A person can visit a hospital more than one time. - When this happens, we need to recognize the person so we don’t load them repeatedly And more complication: - The person can then visit another hospital, and we still need to recognize them rahn@theliebermans.com http://rahnsworld.com
4
Finding a PersonBOS Our Environment We get data from customers in batch files and through real-time (HL7) interfaces Data base be entered into our system via our interfaces or manually We process upwards of a quarter million records a day, importing approximately 10K new records. Electronic entry accounts for approximately 85% of the data entry rahn@theliebermans.com http://rahnsworld.com
5
Finding a PersonBOS Bragging We get reports of duplicate data being entered approximate once every 2-3 months. On investigation, these are almost always user errors. These are so rare, I don’t have statistics on actual number of errors. rahn@theliebermans.com http://rahnsworld.com
6
Finding a PersonBOS How’s it Done? What to Know: Know your data! Know that it will take some time to get it right And of course, know your users The data is the actual elements of what you’re working with. Names, birthdates, etc. rahn@theliebermans.com http://rahnsworld.com
7
Finding a PersonBOS How’s it Done? (cont.) Start with your basic object first. In my case, it’s a person Take stock of common elements your object that should always be know, and are commonly part of your data: First Name, Last Name, Middle Initial Date of Birth Gender Is that enough to uniquely ID someone? If not, keep adding, but realize you may not get everything you want rahn@theliebermans.com http://rahnsworld.com
8
Finding a PersonBOS How’s it Done? (cont.) How about adding an address and a social security number? That gets close to giving enough information. What? SSN’s are unique! Not everyone has one, or know what it is People may share them or make them up rahn@theliebermans.com http://rahnsworld.com
9
Finding a PersonBOS How’s it Done? (cont.) Do the same thing for other items in your domain. In my case, hospitals have standard data elements: Account Number, Medical Record Number Unique identifier in our system to distinguish the hospitals Once you have these core data elements, setup a query in your database for them, allowing elements to by null You’ll need to know how data is stored (null versus empty strings, default values, etc.) rahn@theliebermans.com http://rahnsworld.com
10
Finding a PersonBOS How’s it Done? (cont.) Search and weigh the results Go through all reasonable variations of your data, and collect the PersonID found in your system if one is found. Create a dictionary of all values found After all values are collected, sort them and count up the number of like PersonID’s The PersonID with the most hits is most likely the correct person Note that if none are found, then you end up with a “0” as the most common PersonID. Be sure to account for this. rahn@theliebermans.com http://rahnsworld.com
11
Finding a PersonBOS Example of weighing First + Last + DOB + Account Number + CustomerID: 20 Last + SSN: 20 Last + Account Number + DOB: 20 First + Medical Record: 19 First + Address + CustomerID: 19 Total count of 20: 3 Total count of 19: 2 Our person is PersonID 20 rahn@theliebermans.com http://rahnsworld.com
12
Finding a PersonBOS More details The key is the number of searches, and making sure the searches provide unique results. In my production system, we do 30+ searches F+M+L names + DOB + Address information SSN searches with first and last names (need to account for last name changes) Customer searches based on our entire customer hierarchy I even go as far as cleaning the data to remove dashes, hyphens and other possible data that may not be entered consistently rahn@theliebermans.com http://rahnsworld.com
13
Finding a PersonBOS Testing Search for a know person in your database (i.e. PersonID 1, where you know all the data about them) More complicated tests: Pull X number of records, remove some of the data from them, and try to find them again. This works well because you always know what you’re expected to find Sorry, SQL Compact doesn’t support TOP, so I can’t demonstrate. rahn@theliebermans.com http://rahnsworld.com
14
Finding a PersonBOS Known issues Babies are hard to match because they don’t have SSN’s and they don’t have names Most hospitals will enter a baby with a generic name, Baby Boy Smith, then change it in their system at a later date (after paperwork has been filed) If a Senior and a Junior live at the same address, and we don’t have both of them in our system already, we may get a false result Speed! 30 queries take a long time to run. I’m looking into ways to speed this up Running multiple instances of the import/search at the same time will fail. We have safeguards against this in place rahn@theliebermans.com http://rahnsworld.com
15
Finding a PersonBOS Demo This sample data is ALL FAKE! Do not think these are real people. If they are, it’s purely a coincidence Social Security numbers are the phone numbers minus 1 digit This is not my full production system. It’s been simplified to show proof of concept, and it uses SQL Compact for ease of demonstrating offline rahn@theliebermans.com http://rahnsworld.com
16
Finding a PersonBOS Questions Slides and code available at http://rahnsworld.comhttp://rahnsworld.com rahn@theliebermans.com http://rahnsworld.com
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.