Published by Branden Woods. Modified over 8 years ago.
1 NCHS Record Linkage Activities
Kimberly A. Lochner, Christine S. Cox
NCHS Data Users Conference, July 11, 2006
U.S. DEPARTMENT OF HEALTH AND HUMAN SERVICES, Centers for Disease Control and Prevention, National Center for Health Statistics
2 Overview
- What is record linkage?
- Why do we do it?
- How does NCHS link data?
- What NCHS data has been linked?
- What are the limitations?
- How do you access the data?
3 What is record linkage?
[Diagram: NCHS Surveys + Administrative records -> Linked Data File]
4 Why do record linkage?
- Scientifically valuable and cost effective
  - Augments information
  - Re-contacting survey respondents is expensive
- Expands analytic potential, e.g.:
  - Provides a longitudinal component to the data
  - Allows for the evaluation of policies and programs
5 How do we link records?
- Obtain and standardize survey data
- Create user submission records
- Use a matching algorithm
  - Score and classify potential matches
  - Deterministic/probabilistic matching algorithms
- Determine matches/review some cases
- Create a final linked file
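The "score and classify" step can be sketched as follows. The slides do not say which algorithm or parameters NCHS uses, so this is only a minimal illustration of one common probabilistic approach (summing log agreement weights per field, in the Fellegi-Sunter style). The field list, the m/u probabilities, and the cutoffs are all made-up assumptions, not NCHS values.

```python
import math

# Hypothetical per-field probabilities (illustrative only, not NCHS parameters):
# m = P(field agrees | same person), u = P(field agrees | different people).
FIELD_PARAMS = {
    "ssn": (0.95, 0.0001),
    "last_name": (0.90, 0.01),
    "birth_year": (0.95, 0.02),
    "sex": (0.98, 0.50),
}


def match_score(survey_rec, admin_rec):
    """Sum log2 agreement/disagreement weights over the compared fields."""
    score = 0.0
    for field, (m, u) in FIELD_PARAMS.items():
        a, b = survey_rec.get(field), admin_rec.get(field)
        if a is None or b is None:
            continue  # a missing identifier contributes no evidence either way
        if a == b:
            score += math.log2(m / u)
        else:
            score += math.log2((1 - m) / (1 - u))
    return score


def classify(score, upper=10.0, lower=0.0):
    """Assumed cutoffs: accept as a match, send to review, or reject."""
    if score >= upper:
        return "match"
    if score >= lower:
        return "review"
    return "non-match"


survey = {"ssn": "123456789", "last_name": "SMITH", "birth_year": 1930, "sex": "F"}
same = {"ssn": "123456789", "last_name": "SMITH", "birth_year": 1930, "sex": "F"}
other = {"ssn": "987654321", "last_name": "JONES", "birth_year": 1945, "sex": "F"}
partial = {"ssn": None, "last_name": "SMYTHE", "birth_year": 1930, "sex": "F"}

for candidate in (same, other, partial):
    print(classify(match_score(survey, candidate)))
```

A deterministic rule (for example, exact agreement on SSN plus date of birth) would replace `match_score` with a simple boolean test; the middle "review" band corresponds to the cases the slide says are reviewed.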
6 Typical ID Data Used for Record Linkage
1. Social Security Number (SSN)
2. First Name
3. Middle Initial
4. Last Name (and Birth Surname)
5. Month, Day and Year of Birth
6. Sex
7. State of Birth
8. Race
9. State of Residence
10. Marital Status
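Before identifiers like these can be compared, they are usually standardized to a common form. The sketch below shows one plausible way to do that for SSN and name fields; the function names and the specific rules (digits only, uppercase ASCII letters) are assumptions for illustration and are not taken from the presentation.

```python
import re


def standardize_ssn(raw):
    """Keep digits only; return None unless exactly nine digits remain."""
    digits = re.sub(r"\D", "", raw or "")
    return digits if len(digits) == 9 else None


def standardize_name(raw):
    """Uppercase, drop punctuation and digits, collapse runs of whitespace.

    Simplification: only ASCII letters are kept, so accented characters
    would need separate handling in real data.
    """
    cleaned = re.sub(r"[^A-Za-z ]", "", raw or "")
    return " ".join(cleaned.upper().split()) or None


print(standardize_ssn("123-45-6789"))
print(standardize_name("  De   La Rosa "))
```

Returning `None` for unusable values keeps "missing" distinct from "disagrees" when records are later compared.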
7 Summary NCHS Data Linkage
Linkage sources (table columns): Mortality (NDI), Medicare (CMS), Retirement & Disability (SSA)
Linkages marked per survey (X = one linked source):
NHIS 1986-2000    X
NHIS 1994-1998    XXX
LSOA II           XXX
NHANES I          XXX
NHANES II         XX
NHANES III        XXX
NNHS 1985         XX
8 NCHS Linked Data: Mortality
National Death Index (NDI)
- NHIS 1986-2000
  - Mortality follow-up through 2002
- Longitudinal Study of Aging II (baseline NHIS 1994)
  - Mortality follow-up through 2002
- NHANES I (baseline 1971-74)
- NHANES II (baseline 1976-80)
- NHANES III (baseline 1988-94)
  - All NHANES mortality follow-up through 2000
9 Mortality: Data Elements
- Public ID
- Eligibility status
- Assigned vital status
- Date of death
- Age at death
- Underlying and multiple causes of death
- Sample weights
- Special request variables
10 NCHS Linked Data: Medicare
NCHS surveys linked to Medicare data
- NHIS 1994-1998
  - Includes disability supplements in 1994 and 1995
- LSOA II (baseline 1994 NHIS)
- NHANES I (baseline 1971-74)
- NHANES II (baseline 1976-80)
- NHANES III (baseline 1988-94)
11 NCHS Linked Data: Medicare
Medicare entitlement and health care utilization and payment data for 1991-2000
- Denominator file
- MEDPAR inpatient hospitalization
- MEDPAR skilled nursing facility
- Hospital outpatient
- Home health care
- Hospice
- Carrier (physician/supplier Part B file)
- Durable medical equipment
12 Medicare: Data Elements
- Public ID
- Eligibility status
- Match status
  - For each Medicare file
  - For each year 1991-2000
- Linkage age (i.e., assumed age of survey participant at time of linkage, July 2001)
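The linkage age variable can be illustrated with a short calculation. The slide gives only "July 2001" as the time of linkage, so the exact reference day used here (July 1) is an assumption, as is the function name.

```python
from datetime import date

# Assumption: the slide says only "July 2001"; July 1 is chosen for illustration.
LINKAGE_DATE = date(2001, 7, 1)


def linkage_age(birth_date, as_of=LINKAGE_DATE):
    """Completed years of age on the assumed linkage date."""
    years = as_of.year - birth_date.year
    # Subtract one year if the birthday has not yet occurred by the linkage date.
    if (as_of.month, as_of.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


print(linkage_age(date(1930, 3, 15)))   # birthday already passed in 2001
print(linkage_age(date(1930, 12, 1)))   # birthday still to come in 2001
```

Because the age is assumed rather than observed, it is computed for every eligible participant regardless of whether that person was still alive in July 2001.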
13 Medicare: Data Elements
- Denominator file
  - Entitlement status
  - Beneficiary demographic characteristics
  - Monthly enrollment status
  - HMO enrollment
- Claims files
  - Diagnosis codes
  - Service dates
  - Reimbursement amount
14 NCHS Linked Data: Retirement/Disability
NCHS surveys linked to Social Security data
- NHIS 1994-1998
- LSOA II (baseline 1994 NHIS)
- NHANES I (baseline 1971-74)
- NHANES III (baseline 1988-94)
- National Nursing Home Survey (1985)
15 NCHS Linked Data: Retirement/Disability
Social Security data from the Retirement, Survivors, and Disability Insurance (RSDI) and Supplemental Security Income (SSI) programs
- Master Beneficiary Record (MBR), 1962-2003
- Payment History Update System (PHUS), 1984-2003
- Supplemental Security Record (SSR), 1974-2003
16 Social Security: Data Elements
- Public ID
- Eligibility status
- Match status
  - For each SSA file
- Linkage age (i.e., assumed age of survey participant at time of linkage, July 2001)
17 Social Security: Data Elements
RSDI information
- Master Beneficiary Record (MBR), 1962-2003
  - RSDI program eligibility, benefit amount, payment status, dual entitlement
- Payment History Update System (PHUS), 1984-2003
  - RSDI benefit payment amounts, including withholding information for Medicare Part B premiums
  - Actual benefit payment in a given month (Form 1099)
18 Social Security: Data Elements
SSI information
- Supplemental Security Record (SSR), 1974-2003
  - SSI program eligibility
  - For those eligible for SSI:
    - basic demographic information (sex and race)
    - benefit information
    - actual payment amounts
    - sources and amounts of other income
19 What are the limitations?
- Quantity and quality of identification data
  - High refusal rates for key identifiers
  - Incomplete or inaccurate reporting/recording of identification data in the survey interview
- Legitimate changes in data over time
- Inability to match records leads to potential bias in linked files
20 Limitations: some examples
- Survey respondents ineligible for matching
  - Cannot attempt to match their records to other data sources. Why?
    - Refused to provide SSN
    - Under the age of 18 (for mortality only)
    - Lack key identifying information
- Ineligibles MUST BE DROPPED from all analysis
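In practice, dropping ineligibles means filtering on the eligibility status variable before computing anything from the linked file. The sketch below assumes hypothetical field names (`eligible`, `deceased`) and toy records; the actual variable names and codes in the NCHS files may differ.

```python
# Toy records with hypothetical field names; not the real NCHS layout.
respondents = [
    {"public_id": "A1", "eligible": True, "deceased": True},
    {"public_id": "A2", "eligible": False, "deceased": None},  # e.g. refused SSN
    {"public_id": "A3", "eligible": True, "deceased": False},
    {"public_id": "A4", "eligible": True, "deceased": False},
]


def analytic_sample(records):
    """Keep only respondents who were eligible for linkage."""
    return [r for r in records if r["eligible"]]


def percent_deceased(records):
    """Percent deceased among eligible respondents (None if no one is eligible)."""
    sample = analytic_sample(records)
    if not sample:
        return None
    deaths = sum(1 for r in sample if r["deceased"])
    return 100.0 * deaths / len(sample)


print(len(analytic_sample(respondents)))
print(percent_deceased(respondents))
```

Leaving ineligible respondents in the denominator would treat "never matched" as "alive" (or "not enrolled"), which biases any rate computed from the linked data. This unweighted toy example also ignores the sample weights that a real analysis would apply.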
21 [Figure] Percent of ineligible and deceased adult NHIS respondents by survey year: NHIS 1986-2000
22 Ineligible Population and Medicare Linkage Rate by Survey (65+ years)
Survey        % Ineligible    % Linked among eligible
NHIS 1994     17.9            92.8
NHIS 1995     19.3            92.8
NHIS 1996     22.2            92.1
NHIS 1997     30.7            93.7
NHIS 1998     40.3            92.4
LSOA II       20.4            96.2
NHEFS          7.1            84.9
NHANES II      0.0            81.0
NHANES III     1.9            95.9
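The two columns can be combined to estimate what share of all respondents aged 65+ ended up linked. This is a back-of-the-envelope sketch that assumes "% Ineligible" is measured against all respondents aged 65+ and "% Linked among eligible" against the eligible remainder; the slide does not state the denominators explicitly, and the function name is made up for illustration.

```python
# Values copied from the slide: survey -> (% ineligible, % linked among eligible).
TABLE = {
    "NHIS 1994": (17.9, 92.8),
    "NHIS 1995": (19.3, 92.8),
    "NHIS 1996": (22.2, 92.1),
    "NHIS 1997": (30.7, 93.7),
    "NHIS 1998": (40.3, 92.4),
    "LSOA II": (20.4, 96.2),
    "NHEFS": (7.1, 84.9),
    "NHANES II": (0.0, 81.0),
    "NHANES III": (1.9, 95.9),
}


def overall_linked_pct(pct_ineligible, pct_linked_among_eligible):
    """Percent of all respondents linked = eligible share x linkage rate."""
    eligible_share = 100.0 - pct_ineligible
    return eligible_share * pct_linked_among_eligible / 100.0


for survey, (ineligible, linked) in TABLE.items():
    print(f"{survey}: {overall_linked_pct(ineligible, linked):.1f}% of respondents linked")
```

For NHIS 1998, for example, 59.7% of respondents were eligible and 92.4% of those were linked, so under these assumptions only about 55% of all respondents aged 65+ have Medicare records attached, even though the linkage rate among eligibles looks high.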
23 How do you access the data?
NCHS Research Data Center (RDC) provides access to the restricted linked data files
Access Methods
- On-site access: NCHS, Hyattsville, MD
- Off-site access: user's remote location
- Staff assisted: on-site programming for remote researchers
Email: rdca@cdc.gov
Phone: (301) 458-4732
www.cdc.gov/nchs/r&d/rdc.htm
24 www.cdc.gov/nchs/r&d/nchs_datalinkage/data_linkage_activities.htm
25 User Fees: Mortality Data
Estimated RDC Fees for Linked Mortality Data, by type of access
- Guest Researcher (on site): Minimum $250 fee per day for analytic file creation and $200 per day on-site user fee (2-day minimum).
- Remote Access: Minimum $250 fee per day for analytic file creation and $250 per month remote access fee (50% reduction).
- RDC staff assisted research: TBD
26 User Fees: SSA & CMS Data
Estimated RDC Fees for Linked SSA/CMS Data, by type of access
- Guest Researcher (on site): Minimum $500 fee per day for analytic file creation and $200 per day on-site user fee (2-day minimum).
- Remote Access: Minimum $500 fee per day for analytic file creation and $500 per month remote access fee. An additional $500 per day is charged as needed for additional file creation and special handling, such as merging of additional data or creating custom file formats.
- RDC staff assisted research: TBD
27 Research Potential of Linked Medicare Data
- Examine risk factors for health conditions
- Compare survey-reported health conditions to claims records
- Examine reliability of survey data
- Compare survey-reported Medicare enrollment to Medicare claims records
- Compare survey reports of disability with program participation eligibility criteria
- Examine disparities in Medicare service utilization
28 Linked Mortality Files: Number of Deaths by Survey
NCHS Survey        Total Deaths
NHIS 1986-2000     121,138
LSOA II              3,958
NHEFS                6,656
NHANES II            4,143
NHANES III           3,384
NHIS and LSOA II have mortality follow-up through 12/31/2002. NHEFS, NHANES II and III have mortality follow-up through 12/31/2000.
29 Limitations: some examples (cont.)
- Names
  - Nickname conversion (e.g., Beth/Elizabeth or Bill/William)
  - Hispanic naming conventions (e.g., Alberto Ruis De La Rosa)
    - Do not fit into standard data fields
    - Clustering of Hispanic names
- Date of birth misreporting (MM/DD/YYYY)
  - Allow matches on separate components
    - MM/DD or MM/YYYY
    - For year, allow +/- 1
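The name and date-of-birth allowances above can be sketched in code. This is one possible reading of the slide, not the NCHS implementation: the nickname table holds only the two examples given, and the agreement levels and their ordering are assumptions made for illustration.

```python
# Only the two nickname examples from the slide; a real table would be far larger.
NICKNAMES = {"BETH": "ELIZABETH", "BILL": "WILLIAM"}


def canonical_first_name(name):
    """Map a known nickname to its formal form; otherwise return the name uppercased."""
    key = name.strip().upper()
    return NICKNAMES.get(key, key)


def dob_agreement(a, b):
    """Compare two (month, day, year) tuples and return an assumed agreement level."""
    a_month, a_day, a_year = a
    b_month, b_day, b_year = b
    if a == b:
        return "exact"
    if a_month == b_month and a_day == b_day:
        # MM/DD agree; distinguish a year that is off by at most one.
        return "mmdd_year_pm1" if abs(a_year - b_year) <= 1 else "mmdd"
    if a_month == b_month and a_year == b_year:
        return "mmyyyy"
    return "none"


print(canonical_first_name("Bill"))
print(dob_agreement((3, 15, 1930), (3, 15, 1931)))
```

Each agreement level would typically carry a different weight when potential matches are scored, with exact agreement counting most and partial agreement counting less.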