MEDICAL RECORD BROKER
Lavanya Gundamaraju
Introduction
• Databases and database systems have become an essential part of everyday life.
• Information often has to be gathered from multiple sources, especially in distributed architectures.
• It is difficult to merge large data repositories into a single consistent database.
• Data mining techniques can be used to extract useful information.
PROBLEM UNDER CONSIDERATION
• The MAHI database.
• Kelly's code to access data from the MAHI database.
Database:
• Large and extensive.
• Null values in the SSN and Name fields.
• Duplicate records.
• Inconsistent data.
• Invalid entries.
• Wife and husband sharing the same SSN.
• Mother and child sharing name and SSN.
PROBLEMS WITH KELLY'S CODE
• The key used to identify a record is the SSN.
• If the SSN lookup fails, the combination of Name and DOB is used as the key.
• The probability check became an overhead.
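The lookup order described above can be sketched as follows. This is only an illustration of the behaviour listed on the slide, not Kelly's actual code: the function name and the field names (`ssn`, `name`, `dob`) are assumptions, since the MAHI schema is not shown here.

```python
def find_by_ssn_then_name_dob(records, ssn, name, dob):
    """Look up by SSN first; if that finds nothing, fall back to Name + DOB.

    `records` is a list of dicts with hypothetical keys "ssn", "name", "dob".
    """
    if ssn:
        hits = [r for r in records if r.get("ssn") == ssn]
        if hits:
            return hits
    # SSN was missing or matched nothing: use Name and DOB together as the key.
    return [r for r in records
            if r.get("name") == name and r.get("dob") == dob]


# A husband and wife sharing one SSN, as in the database problems listed earlier.
husband = {"ssn": "111", "name": "A SMITH", "dob": "1960-01-01"}
wife = {"ssn": "111", "name": "B SMITH", "dob": "1962-05-05"}
hits = find_by_ssn_then_name_dob([husband, wife], "111", "B SMITH", "1962-05-05")
```

Here `hits` contains both records, because the SSN alone cannot tell the two people apart. A second check is then needed to choose between them.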
Suggested solution
• Data mining.
• The selection criteria can be improved by matching on SSN, date of birth and sex at the first level.
• A name match is checked only as supporting evidence.
• The weight of the record also forms a key.
• If the first-level search fails, name is used as the criterion, with DOB as an additional criterion.
• Once the record has been found, update the database using the details in hand together with the data already in the database.
• Use a partitioning technique to reduce the number of database scans.
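A minimal sketch of the two-level search with partitioning is given below. It rests on several assumptions that the slides do not state: records are dicts with hypothetical keys `ssn`, `name`, `dob` and `sex`; the partitioning is done by DOB (chosen here only because both levels use DOB); and DOB is always present.

```python
from collections import defaultdict


def partition_by_dob(records):
    """Group records by DOB so a lookup scans one partition, not every record."""
    partitions = defaultdict(list)
    for rec in records:
        partitions[rec.get("dob")].append(rec)
    return partitions


def two_level_search(partitions, query):
    """Return (level, matches), where level is 1 or 2."""
    candidates = partitions.get(query["dob"], [])

    # Level 1: SSN, DOB and sex must agree (DOB is fixed by the partition).
    level1 = [r for r in candidates
              if query.get("ssn")
              and r.get("ssn") == query["ssn"]
              and r.get("sex") == query.get("sex")]
    if level1:
        # Name is supporting evidence only: name matches sort first,
        # but a differing name does not reject the record.
        level1.sort(key=lambda r: r.get("name") == query.get("name"),
                    reverse=True)
        return 1, level1

    # Level 2: fall back to name, with DOB as the added criterion.
    level2 = [r for r in candidates
              if query.get("name") and r.get("name") == query["name"]]
    return 2, level2


records = [
    {"ssn": "111", "name": "A SMITH", "dob": "1960-01-01", "sex": "M"},
    {"ssn": "111", "name": "B SMITH", "dob": "1962-05-05", "sex": "F"},
    {"ssn": None, "name": "C JONES", "dob": "1990-03-03", "sex": "F"},
]
partitions = partition_by_dob(records)
level, matches = two_level_search(
    partitions,
    {"ssn": "111", "name": "B SMITH", "dob": "1962-05-05", "sex": "F"})
```

In this example the shared SSN no longer causes ambiguity: only the second record satisfies SSN, DOB and sex together, so the search succeeds at level 1. The third record has a null SSN and can only be reached through the level 2 fallback.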
• The transaction reduction rule, a data mining technique, is used to reduce the working set of transactions.
• The algorithm takes matches, mismatches and null values in the record being tested into consideration.
• The final record returned is a weighted record.
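One possible reading of the weighting and reduction steps is sketched below. The weights, the field names and the exact reduction rule are all assumptions made for illustration; the slides do not give the actual values or the precise rule. A match adds the field's weight, a mismatch subtracts it, and a null contributes nothing. The reduction step, in the spirit of transaction reduction, drops candidates that can no longer be the answer so that later passes work on a smaller set.

```python
# Illustrative weights only; the real values are not given in the slides.
WEIGHTS = {"ssn": 4.0, "dob": 3.0, "sex": 1.0, "name": 2.0}


def compare(query_value, record_value):
    """Classify a field comparison as "match", "mismatch" or "null"."""
    if query_value is None or record_value is None:
        return "null"
    return "match" if query_value == record_value else "mismatch"


def weight_of(query, record, weights=WEIGHTS):
    """Sum field weights: +w for a match, -w for a mismatch, 0 for a null."""
    total = 0.0
    for field, w in weights.items():
        outcome = compare(query.get(field), record.get(field))
        if outcome == "match":
            total += w
        elif outcome == "mismatch":
            total -= w
    return total


def best_weighted_record(query, records, weights=WEIGHTS):
    """Shrink the working set field by field, then return (weight, record)."""
    working_set = list(records)
    for field in weights:
        # Reduction: drop records that mismatch on this field, unless that
        # would leave nothing to choose from.
        reduced = [r for r in working_set
                   if compare(query.get(field), r.get(field)) != "mismatch"]
        if reduced:
            working_set = reduced
    if not working_set:
        return None
    best = max(working_set, key=lambda r: weight_of(query, r, weights))
    return weight_of(query, best, weights), best


query = {"ssn": "111", "name": "B SMITH", "dob": "1962-05-05", "sex": "F"}
candidates = [
    {"ssn": "111", "name": "A SMITH", "dob": "1960-01-01", "sex": "M"},
    {"ssn": "111", "name": "B SMITH", "dob": "1962-05-05", "sex": "F"},
    {"ssn": None, "name": "B SMITH", "dob": "1962-05-05", "sex": "F"},
]
result = best_weighted_record(query, candidates)
```

With these assumed weights, the first candidate is removed at the DOB pass, and the second candidate (weight 10.0) is returned ahead of the third (weight 6.0), whose null SSN neither helps nor hurts it.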
Conclusion
• The algorithm used will survive a distributed architecture.
• The algorithm uses data mining tools that support optimization.
• Data mining is a science that serves as a yardstick for various applications.
Questions?