Linking Large Datasets Why, How, and What Not To Do Bradley G Hammill Duke Clinical Research Institute.


1 Linking Large Datasets Why, How, and What Not To Do Bradley G Hammill Duke Clinical Research Institute

2 Presenter disclosure information Bradley G Hammill Linking Large Datasets: Why, How, and What Not To Do FINANCIAL DISCLOSURE: None UNLABELED/UNAPPROVED USES DISCLOSURE: None

3 Acknowledgements Thanks to: Lesley Curtis, Adrian Hernandez, Gregg Fonarow, Kevin Schulman. Work initially funded by a grant from GSK.

4 Why link Medicare data to registry data? Typical inpatient registry: Medications, Vitals, Lab results, Procedures, Clinical history, In-hospital events, etc. Long-term follow-up?

5 Why link Medicare data to registry data? Potential endpoints: Mortality, Readmission, Procedure, Adverse events (based on diagnoses). Inpatient; Mortality (or censoring)

6 Why not link Medicare data to registry data? Linking will not help us address the limitations of either data source. Medicare: no information on VA hospitals or managed care patients; selective coverage under age 65. Registries: voluntary participation; may over-represent certain regions or hospital types; data quality varies.

7 How to link Medicare data with registry data
Direct identifiers: name, address, SSN, date of birth, etc. Goal: identify each registry patient in the Medicare data.
Indirect identifiers: service dates, date of birth (or age), sex. Goal: identify each registry hospitalization in the Medicare data.

8 Linking registry data to Medicare claims
Step 1. Subset registry data
Step 2. Subset Medicare data
Step 3. Link hospital identifiers
Step 4. Link hospitalization records
Described in: Hammill BG, Hernandez AF, Peterson ED, Fonarow GC, Schulman KA, Curtis LH. Linking inpatient clinical registry data to Medicare claims data using indirect identifiers. Am Heart J 2009 June;157(6):995-1000.

9 You will have this conversation [Episode 1]
Me: You know, we can link these data to Medicare.
Adrian: How? We don’t know who the hospitals or the patients are?
Me: Turns out you don’t really need to know those things. [Brief explanation of how to link]
Adrian: (flustered) This feels like a giant leap of faith.

10 Percent of unique records within sites, 2007 Medicare HF records [chart]. Matching on: Admit

11 Percent of unique records within sites, 2007 Medicare HF records [chart]. Matching on: Admit, Discharge

12 Percent of unique records within sites, 2007 Medicare HF records [chart]. Matching on: Admit, Discharge, DOB

13 Percent of unique records within sites, 2007 Medicare HF records [chart]. Matching on: Admit, DOB

14 Percent of unique records within sites, 2007 Medicare HF records [chart]. Matching on: Admit, Discharge, 2/3 DOB

15 Percent of unique records within sites, 2007 Medicare HF records [chart]. Matching on: Admit, Discharge, Age

16 Percent of unique records within sites, 2007 Medicare HF records [chart]. Matching on: Admit, Discharge, Age, Sex

17 Percent of unique records within sites, 2007 Medicare HF records [chart]. Matching on: Admit, Age, Sex

18 Percent of unique records within sites, 2007 Medicare HF records [chart]. Matching on: Admit ±1d, Discharge, Age, Sex

19 Percent of unique records within sites, 2007 Medicare HF records [chart]. Matching on: Admit, Discharge, Age ±1y, Sex

20 Distinguishing records (DOB available)
Within sites, what percent of 2007 Medicare HF records are unique given…
Variables                          Unique
Admit, Discharge, DOB, Sex         >99.9%
Admit, DOB, Sex                    >99.9%
Discharge, DOB, Sex                >99.9%
Admit, Discharge, 2/3 DOB, Sex     99.9%
Admit, Discharge, DOB              >99.9%

21 Distinguishing records (Age available)
Within sites, what percent of 2007 Medicare HF records are unique given…
Variables                          Unique
Admit, Discharge, Age, Sex         99.4%
Admit, Discharge ±1d, Age, Sex     98.5%
Admit ±1d, Discharge, Age, Sex     98.4%
Admit, Discharge, Age ±1y, Sex     98.3%
Admit, Discharge, Age              98.9%

22 Distinguishing records, in general
Population                 2007 HF Records per Site, Median (Q1, Q3)
All records                456 (194, 1734)
Heart failure, any         89 (22, 391)
Heart failure, primary     64 (20, 168)
CABG procedure             71 (36, 124)
ICD / CRT procedure        19 (6, 50)
Fewer records per site = Higher % unique records
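The uniqueness percentages on the preceding slides can be illustrated with a short sketch. It is not the author's code: the record layout (dicts with `site`, `admit`, `discharge`, `age`, `sex`), the function name `percent_unique`, and the toy records are all assumptions made for illustration.

```python
from collections import Counter

def percent_unique(records, keys):
    """Percent of records whose (site, *keys) combination occurs exactly once."""
    if not records:
        return 0.0

    def combo(r):
        # Uniqueness is judged within a site, so the site is always part of the key.
        return (r["site"],) + tuple(r[k] for k in keys)

    counts = Counter(combo(r) for r in records)
    n_unique = sum(1 for r in records if counts[combo(r)] == 1)
    return 100.0 * n_unique / len(records)

# Hypothetical toy records (not real Medicare data).
records = [
    {"site": "A", "admit": "2007-01-03", "discharge": "2007-01-08", "age": 81, "sex": "F"},
    {"site": "A", "admit": "2007-01-03", "discharge": "2007-01-09", "age": 74, "sex": "M"},
    {"site": "A", "admit": "2007-01-03", "discharge": "2007-01-08", "age": 81, "sex": "M"},
    {"site": "B", "admit": "2007-01-03", "discharge": "2007-01-08", "age": 81, "sex": "F"},
]

pct_admit = percent_unique(records, ["admit"])
pct_no_sex = percent_unique(records, ["admit", "discharge", "age"])
pct_all = percent_unique(records, ["admit", "discharge", "age", "sex"])
```

In this toy data, adding variables to the key raises the share of unique records, which is the pattern the slides report for the real data.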

23 Linking registry data to Medicare claims
Step 1. Subset registry data
  Limit to records for patients 65 years or older
Step 2. Subset Medicare data
Step 3. Link hospital identifiers
Step 4. Link hospitalization records

24 Example registry data to be used for linking. OPTIMIZE-HF population: adults hospitalized for episodes of new or worsening heart failure, 2003–2004. 52,879 records from 255 sites overall; 39,178 records for patients 65+ (74% of total).

25 Linking registry data to Medicare claims
Step 1. Subset registry data
Step 2. Subset Medicare data
  Limit to records for patients 65 years or older
  Limit using similar entry criteria as registry, if possible
Step 3. Link hospital identifiers
Step 4. Link hospitalization records

26 Example Medicare data to be used for linking. Medicare inpatient population: hospitalizations with a diagnosis of HF in any position (ICD-9-CM Dx 428.x, 402.x1, 404.x1, 404.x3), 2003–2004, age 65+. 5.5 million records from >5000 sites overall.
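As a rough illustration of the entry criterion above, the following sketch flags a claim as a heart failure hospitalization when any diagnosis position carries one of the listed ICD-9-CM codes. The function names and the list-of-strings layout are assumptions, as is the choice to accept codes written with or without the decimal point and to treat "428.x" as any code beginning with 428.

```python
import re

# 428.x, 402.x1, 404.x1, 404.x3 with the decimal point removed.
# Assumption: "428.x" is read as any code starting with 428.
HF_PATTERN = re.compile(r"^(428\d{0,2}|402\d1|404\d[13])$")

def is_hf_code(code):
    """True if a single ICD-9-CM diagnosis code is one of the HF codes."""
    return bool(HF_PATTERN.match(code.replace(".", "").strip()))

def has_hf_diagnosis(dx_codes):
    """True if HF appears in any diagnosis position of a claim."""
    return any(is_hf_code(code) for code in dx_codes)

# Hypothetical claims, each a list of diagnosis codes in position order.
claim_primary_hf = ["428.0", "427.31"]
claim_secondary_hf = ["486", "404.93"]
claim_no_hf = ["410.71", "402.90"]
```

Restricting the Medicare file this way mirrors the registry's entry criteria, which keeps the number of candidate records per site small.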

27 Linking registry data to Medicare claims
Step 1. Subset registry data
Step 2. Subset Medicare data
Step 3. Link hospital identifiers
  Link records on exact values of all fields (service dates, date of birth, sex)
  Use resulting matches to inform links
Step 4. Link hospitalization records

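28 OPTIMIZE-HF sample site link results
Medicare sites with exact matches to each OPTIMIZE site, shown as Medicare site (exact matches)
OPTIMIZE site 1. Using DOB: A (105), E (1), F (1). Using age: A (114), K (7), L (6), 1217 others (≤5).
OPTIMIZE site 2. Using DOB: B (589), G (2), 40 others (1). Using age: B (631), M (28), N (28), 3420 others (≤26).
OPTIMIZE site 3. Using DOB: C (29), D (25), H (1), I (1). Using age: C (32), D (27), O (4), 938 others (≤3).
OPTIMIZE site 4. Using DOB: --. Using age: --; P (4), Q (4), 541 others (≤3).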

29 OPTIMIZE-HF site link results Of 255 registry sites… 247 (97%) identified in Medicare All non-VA sites with 25+ records identified
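Step 3 can be sketched as a tally of exact record matches between every registry site and every Medicare site, followed by a rule for accepting the dominant pairing. The slides only say that the matches "inform" the links, so the acceptance rule here (`min_matches`, `min_ratio`) is a hypothetical stand-in, as are the field names and toy records.

```python
from collections import Counter, defaultdict

KEY_FIELDS = ("admit", "discharge", "dob", "sex")

def count_site_matches(registry, medicare):
    """For each registry site, count exact record matches per Medicare site."""
    medicare_index = defaultdict(list)
    for m in medicare:
        medicare_index[tuple(m[f] for f in KEY_FIELDS)].append(m["site"])
    counts = defaultdict(Counter)
    for r in registry:
        for med_site in medicare_index.get(tuple(r[f] for f in KEY_FIELDS), []):
            counts[r["site"]][med_site] += 1
    return counts

def pick_site_links(counts, min_matches=2, min_ratio=2.0):
    """Accept the top Medicare site only if it clearly dominates (assumed rule)."""
    links = {}
    for reg_site, tally in counts.items():
        ranked = tally.most_common(2)
        best_site, best_n = ranked[0]
        runner_up = ranked[1][1] if len(ranked) > 1 else 0
        if best_n >= min_matches and best_n >= min_ratio * runner_up:
            links[reg_site] = best_site
    return links

def rec(site, admit, discharge, dob, sex):
    return {"site": site, "admit": admit, "discharge": discharge, "dob": dob, "sex": sex}

# Hypothetical toy data: registry site 1 corresponds to Medicare site "A".
registry = [
    rec(1, "2004-03-01", "2004-03-05", "1930-05-12", "F"),
    rec(1, "2004-04-10", "2004-04-15", "1928-11-02", "M"),
    rec(1, "2004-06-01", "2004-06-04", "1935-01-20", "F"),
    rec(4, "2004-07-01", "2004-07-03", "1940-02-02", "M"),
]
medicare = [
    rec("A", "2004-03-01", "2004-03-05", "1930-05-12", "F"),
    rec("A", "2004-04-10", "2004-04-15", "1928-11-02", "M"),
    rec("A", "2004-06-01", "2004-06-04", "1935-01-20", "F"),
    rec("E", "2004-03-01", "2004-03-05", "1930-05-12", "F"),  # chance match
    rec("P", "2004-07-01", "2004-07-03", "1940-02-02", "M"),  # too few to trust
]

site_counts = count_site_matches(registry, medicare)
site_links = pick_site_links(site_counts)
```

In the toy data, site 1 links to "A" despite a chance match at "E", while site 4 stays unlinked because a single match is too weak to trust.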

30 Linking registry data to Medicare claims
Step 1. Subset registry data
Step 2. Subset Medicare data
Step 3. Link hospital identifiers
Step 4. Link hospitalization records
  Determine rules to apply
  Decide if one-to-one correspondence needed
  Go!
  Get follow-up data from Medicare

31 OPTIMIZE-HF hospitalization link results
Of 39,178 eligible registry hospitalizations… 31,753 (81%) identified in Medicare; 25,964 unique patients
Combinations used                  Records identified
Admit, Discharge, DOB, Sex         24,750 (86%)
Admit, DOB, Sex                    1,171 (4%)
Discharge, DOB, Sex                590 (2%)
Admit, Discharge, 2/3 DOB, Sex     2,258 (7%)
Admit, Discharge, DOB              284 (1%)
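A minimal sketch of Step 4 as a rule cascade follows, trying the five combinations listed above in order within already-linked site pairs. Several details are assumptions rather than the published method: "2/3 DOB" is read as at least two of year, month, and day agreeing; a link is accepted only when exactly one unused Medicare record satisfies the rule (one way to enforce one-to-one correspondence); and the record layout and names are hypothetical.

```python
from datetime import date

def dob_parts_matching(a, b):
    """Number of DOB components (year, month, day) that agree."""
    return (a.year == b.year) + (a.month == b.month) + (a.day == b.day)

def same(*fields):
    """Build a rule requiring exact agreement on the given fields."""
    return lambda r, m: all(r[f] == m[f] for f in fields)

def two_of_three_dob(r, m):
    return (same("admit", "discharge", "sex")(r, m)
            and dob_parts_matching(r["dob"], m["dob"]) >= 2)

RULES = [
    ("admit+discharge+dob+sex", same("admit", "discharge", "dob", "sex")),
    ("admit+dob+sex", same("admit", "dob", "sex")),
    ("discharge+dob+sex", same("discharge", "dob", "sex")),
    ("admit+discharge+2/3 dob+sex", two_of_three_dob),
    ("admit+discharge+dob", same("admit", "discharge", "dob")),
]

def link_hospitalizations(registry, medicare, site_links):
    """Return {registry id: (medicare id, rule name)}, one-to-one."""
    links, used = {}, set()
    for name, rule in RULES:
        for r in registry:
            if r["id"] in links or r["site"] not in site_links:
                continue
            candidates = [m for m in medicare
                          if m["site"] == site_links[r["site"]]
                          and m["id"] not in used and rule(r, m)]
            if len(candidates) == 1:  # skip ambiguous matches
                links[r["id"]] = (candidates[0]["id"], name)
                used.add(candidates[0]["id"])
    return links

def rec(rid, site, admit, discharge, dob, sex):
    return {"id": rid, "site": site, "admit": admit,
            "discharge": discharge, "dob": dob, "sex": sex}

# Hypothetical toy data.
registry = [
    rec("r1", 1, date(2004, 3, 1), date(2004, 3, 5), date(1930, 5, 12), "F"),
    rec("r2", 1, date(2004, 4, 10), date(2004, 4, 15), date(1928, 11, 2), "M"),
    rec("r3", 1, date(2004, 6, 1), date(2004, 6, 4), date(1935, 1, 20), "F"),
    rec("r4", 1, date(2004, 7, 1), date(2004, 7, 3), date(1940, 2, 2), "M"),
]
medicare = [
    rec("m1", "A", date(2004, 3, 1), date(2004, 3, 5), date(1930, 5, 12), "F"),
    rec("m2", "A", date(2004, 4, 11), date(2004, 4, 15), date(1928, 11, 2), "M"),
    rec("m3", "A", date(2004, 6, 1), date(2004, 6, 4), date(1935, 1, 2), "F"),
    rec("m4", "B", date(2004, 7, 1), date(2004, 7, 3), date(1940, 2, 2), "M"),
]
hosp_links = link_hospitalizations(registry, medicare, {1: "A"})
```

The nested scan is quadratic and only suitable for a toy example; at the scale described in the slides, an index on the key fields would be needed.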

32 You will have this conversation [Episode 2]
Me: This is done using deterministic matching.
Adrian: No, that’s clearly probabilistic matching.
Me: Actually, it’s not.
Adrian: Sure it is. We didn’t have names or SSNs.

33 Deterministic v. Probabilistic Linking
Deterministic: Rule-based. The rule determines the result.
Probabilistic: Based on statistical theory. Characteristics assigned weights and potential links are scored. Data-based score threshold determines the result.
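The distinction can be made concrete with a small sketch. The deterministic function applies a fixed rule. The probabilistic function uses the common Fellegi-Sunter style of agreement and disagreement weights, which is one standard formulation and not necessarily what any particular registry used. The m and u probabilities and the threshold are invented for illustration; in practice they would be estimated from the data.

```python
import math

FIELDS = ("admit", "discharge", "dob", "sex")

def deterministic_link(r, m):
    """Rule-based: link if and only if every field agrees exactly."""
    return all(r[f] == m[f] for f in FIELDS)

# Hypothetical (m, u) values per field:
# m = P(field agrees | same person), u = P(field agrees | different people).
M_U = {
    "admit": (0.95, 0.003),
    "discharge": (0.95, 0.003),
    "dob": (0.97, 0.0001),
    "sex": (0.99, 0.5),
}

def probabilistic_score(r, m):
    """Sum of log2 agreement or disagreement weights over all fields."""
    score = 0.0
    for field, (m_prob, u_prob) in M_U.items():
        if r[field] == m[field]:
            score += math.log2(m_prob / u_prob)
        else:
            score += math.log2((1 - m_prob) / (1 - u_prob))
    return score

def probabilistic_link(r, m, threshold=10.0):
    """Score-based: link if the total weight reaches the (assumed) threshold."""
    return probabilistic_score(r, m) >= threshold

# Hypothetical pair that differs only in admit date.
reg = {"admit": "2004-04-10", "discharge": "2004-04-15", "dob": "1928-11-02", "sex": "M"}
med = {"admit": "2004-04-11", "discharge": "2004-04-15", "dob": "1928-11-02", "sex": "M"}
# Hypothetical pair that agrees only on sex.
other = {"admit": "2004-09-01", "discharge": "2004-09-09", "dob": "1931-06-30", "sex": "M"}
```

Here the near-match fails the exact rule but clears the score threshold, while the unrelated pair fails both. Matching on indirect identifiers with fixed rules, as in the preceding slides, remains deterministic even though no names or SSNs are used.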

34 You will have this conversation [Episode 3]
Me: (excited) We were able to link 75% of the eligible records!
Adrian: Golly, that seems low.
Me: It’s about what I expected.
Adrian: But [another registry] said they linked 98%.

35 Why might registry records not link to Medicare? [Diagram: Sample site. All HF patients; Linked to Medicare; Not linked to Medicare]

36 Why might registry records not link to Medicare?
In Medicare claims, but… inconsistent coding of procedures or diagnoses; inconsistent service dates or patient info.
Not in Medicare claims due to… Medicare as secondary payer; Medicare managed care enrollment; age; VA hospital (site-level).

37 Histogram of OPTIMIZE-HF site link rates

38 You will have this conversation [Episode 4]
Adrian: The registry didn’t capture [obesity, anemia, etc.]. Now we can use prior claims to get that information.
Me: We’re going to lose a bunch of patients if we try that.
Adrian: But it’s so worth it.
Me: Maybe not for that particular information, though.

39 Other uses of Medicare data. Utilizing claims prior to registry hospitalization. Requires prior enrollment in Medicare FFS. 8% of OPTIMIZE-HF patients did not have 12 months of prior claims. Inpatient data only can be limiting. Need to understand coding limitations, e.g. anemia is poorly coded.

40 You will have this conversation [Episode 5]
Adrian: I want to validate our registry with these links.
Me: You can’t easily do that with these data.
Adrian: Sure we can, because now we know which Medicare patients are in the registry.
Me: True, but that’s not the whole story.

41 Validation issues. If you start with the registry population… You usually do not know exactly who you should find in Medicare claims data. Cannot validate VA sites. Cannot validate managed care patients. Cannot validate younger patients. Assumes all “linkable” records were linked.

42 Validation issues. If you start with the Medicare population… You usually do not know exactly who you should find in registry data. Physician groups may be the registry participants, not hospitals. Assumes all “linkable” records were linked. Registry may have allowed sampling at larger sites.

43 Do you want to link data with Medicare? Important caveats: Acquisition requires major investment in claims data and infrastructure. Use of Medicare claims data is governed by strict data use agreements (DUA). Delays in data release are common [Currently available through 2008].

44 Why stop at inpatient Medicare data? Medicare data: Inpatient, Outpatient / Physician, Pharmacy, Mortality (or censoring)

45 Why stop with Medicare claims data? Other claims data sources exist: private insurer databases. But more difficult, as a smaller % of site hospitalizations is covered.
Payer                                Age 18-64    Age 65+
Medicare                             15%          89%
Medicaid                             20%          1%
Private                              48%          8%
Other (incl. self-pay, charity)      17%          2%
[Source: 2007 HCUP NIS, excluding maternal/neonate-related admissions]

46 Conclusion. You can link your registry to Medicare claims: get long-term follow-up for registry patients 65+ enrolled in fee-for-service Medicare. However… Manage expectations. Understand claims data limitations.

47 Contact Information Brad Hammill brad.hammill@duke.edu
