1 Data Linkage Project Florida’s Newborn Screening Program Gary Sammet Bureau of Vital Statistics.


1 Data Linkage Project: Florida’s Newborn Screening Program
Gary Sammet, Bureau of Vital Statistics

2 Outline
- Data Linkage Approach
- Start with Probabilistic Linking
- Data Linkage Automated Process Flow
- Data Processing Design: Linking Variables, Weights, Bonuses, Use of Jaro-Winkler
- Data Processing Sample Results

3 Data Linkage Approach
- VS & LAB work closely together
- System can accommodate needs
- Reduce duplication of efforts
- Reconciliation
  - All births have a screening record
  - All screening records have a birth
- Most cost effective with best results

4 Start With Probabilistic Linking
- Identify linking variables; assign initial weights based on understanding of and experience with the data
- Run initial linking; sort by weight and display linkage flags to see data patterns/anomalies
- Adjust weights as needed without changing code
- Define deterministic rules to ensure consistent linking in the automated process
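One way to adjust weights without changing code is to keep them in a data table that the scoring routine reads. The sketch below is illustrative only: the weight values come from the slides, but the field names, the `score_pair` helper, and the rule of summing the weights of matching variables are assumptions, not the project's actual implementation.

```python
# Hypothetical weight table. Values are from the slides; the keys and the
# summation rule are assumptions made for illustration.
WEIGHTS = {
    "time_of_birth": 0.85,
    "mother_ssn": 0.25,
    "sex": 0.25,
    "mother_first_name": 0.20,
    "plurality": 0.10,
    "birth_order": 0.10,
}


def score_pair(lab, vs, weights=WEIGHTS):
    """Return (total weight, linkage flags) for one LAB/VS record pair.

    A variable contributes its weight only when both records carry the
    same non-missing value. The flags list names the variables that matched.
    """
    flags = [
        field
        for field in weights
        if lab.get(field) is not None and lab.get(field) == vs.get(field)
    ]
    return sum(weights[f] for f in flags), flags


# Made-up records for demonstration only.
lab = {"time_of_birth": "14:32", "mother_ssn": None, "sex": "F",
       "mother_first_name": "MARIA", "plurality": 1, "birth_order": 1}
vs = {"time_of_birth": "14:32", "mother_ssn": "000000000", "sex": "F",
      "mother_first_name": "MARY", "plurality": 1, "birth_order": 1}

total, flags = score_pair(lab, vs)
print(round(total, 2), flags)
```

Because the weights live in a dictionary, they could equally be loaded from a file or database table and tuned after each trial run, while the flags support the "sort by weight and display linkage flags" review step.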

5 Data Linkage Automated Process

6 Linking Variables & Weights (JW = Jaro-Winkler)
Weight  Linking variable
0.85    Time of birth
0.75    Facility Name and Zipcode
0.65    Facility Name w/ JW score .899+ and match Zipcode
0.55    Facility Name w/ JW score .800-.898 and match Zipcode
0.65    Facility Name w/ JW score .899+ and match Facility City
0.55    Facility Name w/ JW score .800-.898 and match Facility City
0.65    Facility Name w/ JW score .899+ and Facility Address w/ JW score .85+
0.55    Facility Name w/ JW score .800-.898 and Facility Address w/ JW score .85+
0.65    Facility Address w/ JW score .85+ and match Facility City
0.65    Infant Last Name w/ JW score .899+ match to Last Name/Mother Last Name
0.30    Infant Last Name w/ JW score .800-.898 match to Last Name/Mother Last Name
0.25    Mother Current Last Name
0.25    Mother SSN
0.25    Mother Address
0.25    Mother Full Name - Bonus

7 Linking Variables & Weights (continued)
Weight  Linking variable
0.25    Sex of Infant
0.25    Infant Full Name - Bonus
0.20    Infant First Name w/ JW score .76+ and Infant Last Name w/ JW score .85+
0.20    Mother First Name
0.20    Mother First Name w/ JW score .76+ and Mother Last Name w/ JW score .85+
0.20    Mother Address w/ JW score .85+
0.20    Current Address to Mother Address w/ JW score .899+ and match City
0.20    Current Address to Mother Address w/ JW score .85+ and match Mother First Name
0.15    Infant Full Name w/ JW score of .85+ - Bonus
0.15    Mother Full Name w/ JW score of .85+ - Bonus
0.10    Father Last Name
0.10    Plurality
0.10    Birth Order
0.05    Infant First Name
0.05    Infant Last Name
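The tiered facility weights can be read as a small lookup on the Jaro-Winkler score. The following sketch covers only the Facility Name + Zipcode rows. The function name and signature are hypothetical, and treating the tiers as "at least .899" and "at least .800" is an assumption about how the slide's ranges are applied.

```python
def facility_zip_weight(exact_name_match, jw_score, zip_match):
    """Weight for the Facility Name + Zipcode comparison (slide values).

    exact_name_match: True when the facility names are identical.
    jw_score: Jaro-Winkler similarity of the two facility names (0..1).
    zip_match: True when the facility zip codes are identical.
    """
    if not zip_match:
        return 0.0   # every row in this group requires a zip code match
    if exact_name_match:
        return 0.75  # Facility Name and Zipcode
    if jw_score >= 0.899:
        return 0.65  # JW score .899+
    if jw_score >= 0.800:
        return 0.55  # JW score .800-.898
    return 0.0


print(facility_zip_weight(False, 0.93, True))
```

The same pattern would apply to the Facility City and Facility Address rows, each with its own match condition in place of the zip code.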

8 Weight Bonuses
- DOB, Time of Birth, Sex, Facility + Zipcode (MFirst or MSSN): BONUS = .50
- DOB, Time of Birth, Sex, Facility-JW + Zipcode (MFirst or MSSN): BONUS = .40
- DOB, Time of Birth, Sex, Facility + Zipcode: BONUS = .20
- DOB, Time of Birth, Sex, Facility-JW + Zipcode: BONUS = .15
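The bonus rules above can be sketched as a single function over match flags. The flag names are hypothetical, and the sketch assumes that only the highest applicable bonus is awarded (the slides do not say whether bonuses stack).

```python
def weight_bonus(m):
    """Return the bonus for a record pair given a dict of boolean match flags.

    Assumed (hypothetical) keys: dob, time_of_birth, sex, zipcode,
    facility_exact, facility_jw, mother_first, mother_ssn.
    """
    core = m["dob"] and m["time_of_birth"] and m["sex"] and m["zipcode"]
    if not core:
        return 0.0
    mother = m["mother_first"] or m["mother_ssn"]  # "MFirst or MSSN"
    if m["facility_exact"]:
        return 0.50 if mother else 0.20
    if m["facility_jw"]:
        return 0.40 if mother else 0.15
    return 0.0


flags = {"dob": True, "time_of_birth": True, "sex": True, "zipcode": True,
         "facility_exact": False, "facility_jw": True,
         "mother_first": True, "mother_ssn": False}
print(weight_bonus(flags))
```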

9 Variables By % Linked
% Linked  Variable
99.79     DOB
97.65     SEX
97.39     BIRTH ORDER
94.40     PLURALITY
94.37     MOTHER FULL NAME + JW
93.22     TIME OF BIRTH
92.13     MOTHER FIRST NAME
88.57     MOTHER LAST NAME
82.54     MOTHER FULL NAME
82.06     MOTHER SSN

10 Variables By % Linked (continued)
% Linked  Variable
78.32     TOTAL FACILITY - JW
73.08     TOTAL MOTHER ADDRESS, CITY - JW
56.58     MOTHER ADDRESS, CITY - JW
43.59     LNAME
41.48     FACILITY
36.84     FACILITY - JW
35.75     FATHER LAST NAME
16.50     MOTHER ADDRESS, CITY
11.83     MOTHER FULL NAME - JW
11.70     INFANT FULL NAME
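The combined rows in these tables appear to be the exact-match percentage plus the additional Jaro-Winkler-only percentage for the same variable. A quick arithmetic check, using only figures from the slides (the dictionary keys are labels chosen here for illustration):

```python
# Percent linked, as reported on the slides.
exact = {"mother_full_name": 82.54, "facility": 41.48, "mother_address_city": 16.50}
jw_only = {"mother_full_name": 11.83, "facility": 36.84, "mother_address_city": 56.58}
reported_total = {"mother_full_name": 94.37, "facility": 78.32, "mother_address_city": 73.08}

# Exact matches plus JW-only matches, rounded to two decimals.
combined = {k: round(exact[k] + jw_only[k], 2) for k in exact}

for k in exact:
    print(k, combined[k], combined[k] == reported_total[k])
```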

11 Linking With Jaro-Winkler
- With exact Facility + Zipcode match: 41% (Facility & Zipcode must match)
- With Jaro-Winkler Facility + Zipcode match: additional 36.84%
- Total match = 77.84% vs. just 41%
Examples:
LAB FACILITY NAME             VS FACILITY NAME
FLORIDA HOSP ORLANDO – LAB    FLORIDA HOSP ORLANDO
SHANDS AT THE UNIV OF FLA     SHANDS AT UF
BROWARD MED CTR               BROWARD MEDICAL CENTER
SHANDS AT JACKSONVILLE        SHANDS JACKSONVILLE
HOLLYWOOD BIRTH CENTER, INC   HOLLYWOOD BIRTH CENTER
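For readers unfamiliar with the metric, here is a minimal pure-Python sketch of the standard Jaro-Winkler similarity (prefix scale 0.1, prefix capped at 4 characters). It is a textbook formulation, not necessarily the variant or the preprocessing used in this project, and the scores it gives for the facility names above are not taken from the slides.

```python
def jaro(s1, s2):
    """Standard Jaro similarity between two strings (0..1)."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters count as matching only within this sliding window.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order / 2  # transpositions
    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3


def jaro_winkler(s1, s2, prefix_scale=0.1, max_prefix=4):
    """Jaro similarity boosted for a shared prefix of up to max_prefix chars."""
    j = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return j + prefix * prefix_scale * (1 - j)


# Facility name pairs from the slide (LAB name, VS name).
pairs = [
    ("BROWARD MED CTR", "BROWARD MEDICAL CENTER"),
    ("SHANDS AT JACKSONVILLE", "SHANDS JACKSONVILLE"),
    ("HOLLYWOOD BIRTH CENTER, INC", "HOLLYWOOD BIRTH CENTER"),
]
for lab_name, vs_name in pairs:
    print(f"{jaro_winkler(lab_name, vs_name):.3f}  {lab_name} | {vs_name}")
```

Because the Winkler adjustment rewards a shared prefix, names that differ mainly in trailing abbreviations or suffixes tend to score high, which fits the kind of variation shown in the examples.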

12 Linking Mother Address & City
- Only 16% match on exact mother address & city
- Additional 56% match on mother address & city, using Jaro-Winkler
- Total match: 72% vs. just 16%
Examples:
LAB Mother Address  VS Mother Address   LAB City   VS City
2323 SAMSON ROAD    2323 SAMSON RD      ORLANDO    ORLANDO
5105 NE 75TH AVE    5105 NE 75 AVENUE   MIAMI      MIAMI
1001 MAIN ST APT A  1001 MAIN ST APT A  KEY WEST   KEY WEST
532 HORNET CT       532 HORNET COURT    PENSACOLA  PENSACOLA
101 MAGIC CIR       101 MAGIC CIRCLE    TAMPA      TAMPA

13 Data Processing Results
- LAB data with DOB 12/1-31/2010, unduplicated on OrigSpecID: 9,211 rows
- VS data with DOB 11/1 - 12/31/2010, unduplicated on State File Number: 37,741 rows
- 99% unduplicated & linked records with weighted score > 2.5
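The "weighted score > 2.5" criterion suggests a cutoff applied to each LAB record's candidate VS matches. A minimal sketch, assuming each LAB record is linked to its single highest-scoring VS candidate and only when that score exceeds the cutoff (the function name, the tuple layout, and the identifiers in the example are hypothetical):

```python
LINK_THRESHOLD = 2.5  # cutoff reported on the slide


def best_link(candidates, threshold=LINK_THRESHOLD):
    """Pick the VS record to link to one LAB record.

    candidates: list of (state_file_number, weighted_score) tuples.
    Returns the state file number of the top-scoring candidate if its
    score is strictly greater than the threshold, otherwise None.
    """
    best = max(candidates, key=lambda c: c[1], default=None)
    if best is None or best[1] <= threshold:
        return None
    return best[0]


# Made-up identifiers and scores for demonstration only.
print(best_link([("SFN-A", 1.90), ("SFN-B", 3.15), ("SFN-C", 2.60)]))
print(best_link([("SFN-D", 2.50)]))
```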

14 Overall Linkage Results
- 98 - 99% using back-end approach
- Still not good enough
- Follow Rhode Island front-end approach

15 Advantages of Front-end Linkage
- Provide real-time linkage at hospital with VS Birth Date & NBS demographic data
- Reduces data entry by hospital staff
- Provide daily report of unlinked/missing records
- Provide LAB with checklist of incoming blood specimens
- Reduce follow-up by state staff to hospitals
- Allow end-users (hospitals, MDs) to view electronic patient reports/results in real time

16 Acknowledgements
- Ken Jones, Bureau Chief/Deputy State Registrar, Bureau of Vital Statistics
- Sharon Dover, Operations Manager, Bureau of Vital Statistics
- Paula Stewart, Database Analyst, Health Statistics & Assessment

