Research With Medicare Claim Data Xinhua Yu, MD PhD Division of Epidemiology and Biostatistics School of Public Health University of Memphis June 22, 2012
Outline Medicare program overview Bill processing and claim data Data structure and important variables Requesting CMS Medicare data Data analysis Research applications Discussion
1. Medicare Overview
Medicare Program National health insurance for age >=65, or people with certain disabilities, or people with ESRD etc. 1965 - Title XVIII of the Social Security Act 7/1/1966 - Medicare Program started 2003, Medicare Prescription Drug, Improvement, and Modernization Act (MMA) 2006 prescription drug program (Part D) started
Medicare Coverage (Entitlement) Part A , or Hospital Insurance (HI) Part B, or Supplemental Medical Insurance (SMI) Part “C”, or Medicare Advantage Plans (HMO, PPO) Part D, or Prescription Drug Coverage
Medicare Part A Benefits Hospital care Skilled nursing facility (SNF) care Home health care skilled nursing and rehabilitation care patient confined to home Hospice care (added in 1983) For terminally ill patients with a life expectancy of 6 months or less
Part A Eligibility Elderly: a person is eligible if they or their spouse worked 40 or more quarters in their lifetime and paid Medicare tax while working For those who did not work 40 quarters, enrollment is possible by paying a monthly premium Disabled: a person who has received Social Security disability benefits for 24 months ESRD: persons with end-stage renal disease ALS: persons with Amyotrophic Lateral Sclerosis (Lou Gehrig’s Disease)
Part B Benefits Physician services (including nurse practitioners, physician assistants etc), and services provided by other providers (e.g., health departments) Facility charges for hospital outpatient services and ambulatory care centers Note: a person who is seen in a hospital or hospital outpatient setting will generally generate two claims, one from the facility and one from the physician Durable Medical Equipment Must pay a premium to be enrolled in part B
Medicare Funding and Payment Part A: Medicare Hospital Insurance Trust Fund (Medicare tax) 98% of people aged >=65 are enrolled in Part A Part B and D: Supplementary Medical Insurance Trust Fund (beneficiary premiums and congressional appropriation) 96% of elderly Part A beneficiaries are enrolled in Part B ~60% of the elderly are enrolled in Part D Deductible and coinsurance
Types of Medicare Program Fee-for-service (FFS) or traditional Medicare program Medicare managed care (now Medicare Advantage plan, Part C) began in 1985 Risk based: the insurance company receives a capitated payment, and the plan assumes financial risk Cost based 12-16% of beneficiaries are in managed care Higher on the West Coast (CA, OR etc.) Medicare claims are likely incomplete for these managed care enrollees, so they are often excluded from the analysis
2. Bill Processing and Claim Data
Bill and Claims Claims are bills for services given to Medicare enrollees Claims are processed sequentially through a hierarchical system Knowing this process helps in understanding the contents of Medicare data and the validity of data fields
Type of Services Institutional: Hospital Inpatient, Hospital Outpatient, Skilled Nursing Care, Home Health Care, Hospice Non-Institutional: Physician, Laboratory and Other Supplier Services; Durable Medical Equipment
Claim Processing Flow (diagram) Parties: Medicare beneficiary; institutional and non-institutional providers; Medicare Administrative Contractor (MAC, formerly Fiscal Intermediary or Carrier); CWF Host; CMS Provider: treats the beneficiary, submits the claim, receives payment/denial MAC: enter claim into system; perform consistency and utilization edits; calculate payment; deny claims based on Medicare policy; return denied claims to provider CWF Host (claims received daily): update entitlement data; check claims for entitlement, deductible, remaining benefit, and duplicates; authorize full payment, partial payment, denial, or request additional data; send response to MAC CMS: receives entitlement data (daily) and claims data (weekly); update EDB with entitlement data; add claims to National Claims History Repository (NCHR)
Claim Forms Uniform bill: UB-92/UB-04, for institutional providers (e.g., hospitals, Skilled Nursing Facilities, home health, hospice) Facility (institutional) claims Used to be processed through Fiscal Intermediaries CMS-1500 form: for non-institutional providers (e.g., physicians, lab, ambulance services, medical equipment bills) Non-institutional claims Used to be processed through Carriers 23 Medicare Administrative Contractors (MACs) process both bills 15 MACs for Part A and B, 4 MACs for DME, and 4 for home health and hospice Components are different between these two forms
Research Claim Files SAF: Standard Analytical Files, i.e., claim based files Contain “final action” claims Inpatient, outpatient, physician services etc. MedPAR: Medicare Provider Analysis and Review Each observation contains aggregated data of all facility claims related to one episode of care An episode of care is either a hospital or skilled nursing facility stay.
SAFs and MedPAR Each SAF contains “final action” claims All adjustment (partial pay, denial, amendment) are rolled up into one record SAF is available for each type of services For inpatient services SAF is more detailed (e.g., attending physician ID) But MedPAR is easier to work with 99% of inpatient SAF contain only one record for each hospital stay, thus essentially the same as MedPAR Requesting SAF costs more than requesting MedPAR
Example: Emergency Room Visit ER services are considered outpatient services But ER is usually attached to a hospital Billed using facility forms (UB-92/04) Outpatient SAF What if ER results in a hospital admission? Becomes Part A (hospitalization services) Inpatient SAF/ MedPAR Physician services are Part B Carrier files will have them So you need all files to capture diagnosis, procedures, and discharge destination for an ER visit
3. Medicare Data Structure http://www.ccwdata.org/data-dictionaries/index.htm
3.1 Beneficiary Summary File
Beneficiary Summary File (Denominator File) A calendar year file (cross-sectional file) All eligible Medicare beneficiaries who ever enrolled (>= 1day) in Medicare Limited by the criteria you requested Served as the denominator for calculating rate or prevalence Contains basic demographic, coverage, HMO, and part D enrollment information (discussed later)
Demographic Variables Encrypted beneficiary ID Encrypted from HIC (11 digit unique identifier that is related to SSN) Can be linked with multiple claim files Date of birth (age) There are disabled people with age <65 Sex Race Sources: social security administration (SSA), railroad board (RRB)
Age It is better to calculate age yourself from the date of birth The age variable in the file is calculated as of Dec. 31 of the previous year, and thus misclassifies those turning 65 during the study year as 64 Ages of the very old are unreliable Medicare has a higher percentage of people aged >100 than the census There are people with age >120, which is very unlikely, i.e., some deaths are missed These records could be excluded from the analysis (a very small population)
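Computing age from the date of birth can be sketched as follows. This is a minimal illustration, assuming the birth date has already been parsed into a Python `date`; the function name `age_on` and the choice of reference date are assumptions, not part of the CMS file layout.

```python
from datetime import date


def age_on(birth_date: date, ref_date: date) -> int:
    """Completed years of age on ref_date."""
    years = ref_date.year - birth_date.year
    # Subtract one year if the birthday has not yet occurred in ref_date's year.
    if (ref_date.month, ref_date.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


# A beneficiary born mid-1944 is 64 at the end of 2008 but 65 at the end of 2009.
birth = date(1944, 6, 15)
age_end_2008 = age_on(birth, date(2008, 12, 31))
age_end_2009 = age_on(birth, date(2009, 12, 31))
```

Choosing the reference date explicitly (start of year, end of year, or date of an index event) keeps the age definition consistent with the study design.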
Race/ethnicity Since 1994, race codes were: white, black, Asian, Hispanic, Native American, other, unknown The sensitivity of the Hispanic code is estimated at about 35%, i.e., only about one third of Hispanics are recorded as Hispanic But specificity is very high, i.e., those recorded as Hispanic are almost certainly Hispanic Many people report themselves as other No penalty for doing that Research Triangle Institute (RTI) race variable Higher sensitivity (60%+) for identifying the Hispanic population
Sex Sex is coded 1=male 2=female There are no missing values for this field Persons with missing information have it filled according to the rule: if age is less than 65 and sex missing then sex=male if age is greater than or equal to 65 and sex is missing then sex=female Thus there are “female” people with prostate disease
Mortality Date of death Date of death validation indicator (“V”) If date of death is not empty, the beneficiary has died The fact of death is 100% validated, but only about 96% of death dates are validated Survival time may be over-estimated if an unvalidated date of death is recorded as the end of the month Sources: SSA and claim info (e.g., hospital discharge status is dead)
Medicare Enrollment Status Medicare Status Code (MSC) combines current entitlement and ESRD 10= Aged w/out ESRD 11= Aged w/ ESRD 20= Disabled w/out ESRD 21= Disabled w/ ESRD 31= ESRD only Often we excluded those with ESRD as they have different health care utilization patterns Disabled with age <65 are often excluded as well Many of them are in Medicaid as well
State Buy-in Medicaid paying Medicare premiums All states exercise the option of paying Medicare premiums for at least some people This can take 3 forms: State pays premiums only (5%) State pays premiums and cost sharing (45%) State provides full Medicaid benefits (50%) Monthly indicator (buy-in Part A, B, or both) Those with state buy-in can be assumed to have lower income
Benefit Coverage/Enrollment Indicator Monthly entitlement/buy-in indicator Not entitled (0) Part A only (1) Part B only (2) Part A and Part B (3) Part A, State buy-in (A) Part B, State buy-in (B) Parts A and B, State buy-in (C) Also summary month counts for part A,B and buyin 94% have both Part A and Part B Often we limited the study to this population Part A is entitled, while part B is not required
Examples: bene_mdcr_entlmt_buyin_ind CCCCCCCCCCCC (12 months, A&B SBI) 333333333333 (12 months A&B) 111111333333 (6 mon A, then 6 mon A&B) 111111111111 (12 months A) 333300000000 (4 mon A&B, 8 mon not elig) 000000000033 (10 mon not elig, 2 mon A&B) 333333330000 (8 mon A&B, 4 mon not elig)
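The monthly indicator strings above can be summarized with a few counts. The sketch below assumes the twelve monthly codes have been concatenated into one 12-character string, as in the examples; the function name and the returned keys are hypothetical.

```python
AB_CODES = {"3", "C"}          # Part A and Part B, without or with state buy-in
BUYIN_CODES = {"A", "B", "C"}  # any state buy-in


def summarize_entitlement(indicator: str) -> dict:
    """Count months by coverage type from a 12-character monthly indicator."""
    if len(indicator) != 12:
        raise ValueError("expected one code per calendar month (12 characters)")
    return {
        "months_ab": sum(code in AB_CODES for code in indicator),
        "months_buyin": sum(code in BUYIN_CODES for code in indicator),
        "months_not_entitled": indicator.count("0"),
        "full_year_ab": all(code in AB_CODES for code in indicator),
    }


summary = summarize_entitlement("111111333333")
```

A flag such as `full_year_ab` is one way to restrict a study to beneficiaries with both Part A and Part B for the whole year.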
HMO indicator Monthly HMO enrollment indicator Those in the HMO often have incomplete claim history Claims are not required to be submitted to CMS or not released from CMS a summary count of Months HMO coverage No information on the actual managed care types and plans 12-16% of HMO enrollment
Examples of Monthly HMO Indicators 000000000000 (never in MCO) 111111111111 (12 months non-lock-in) 00000CC00000 (months 6 & 7 in risk MCO) CCCCCCCCCCCC (12 months in risk MCO) 00000CCCCCCC (months 6-12 in risk MCO)
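To see which calendar months carry managed care enrollment, the indicator string can be scanned position by position. This is a sketch under the assumption that `0` means not enrolled and any other code means some form of HMO/MCO enrollment; the function name is hypothetical.

```python
def hmo_months(indicator: str) -> list:
    """Return the calendar months (1-12) with any HMO/MCO enrollment."""
    if len(indicator) != 12:
        raise ValueError("expected one code per calendar month (12 characters)")
    return [month for month, code in enumerate(indicator, start=1) if code != "0"]


months = hmo_months("00000CC00000")
never_enrolled = not hmo_months("000000000000")
```

Beneficiaries with a non-empty list are the ones whose claim history may be incomplete for those months.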
Dual Eligible Status Eligible for both Medicare and Medicaid Medicaid is means based, i.e., primarily for people with income below some standard, or needs based Some dual eligibles are in HMO or managed care The dual eligible variable is better than state buy-in for identifying low income patients, because it identifies more of them
Beneficiary Residency Available in Research Identifiable file (RIF) State, county and ZIP code of residence are the mailing address for official correspondence From SSA data Some persons have their mail sent to another person (e.g., son, daughter, guardian) Analyses comparing state of treatment with state of residency generally show high concordance Always use denominator residence information Residence info on other claims is not validated
3.2 Institutional Claims
Type of Claims Institutional (facility) claims: UB-92 /UB-04 forms Inpatient Outpatient Skilled nursing facility Home health agency Hospice Non-institutional claims: CMS 1500 form Physician (and other providers) services Lab tests and diagnostic exams Durable medical equipment (DME) Stand-alone ambulatory services
UB 92 /UB-04 Form Patient demographics Provider (hospital) ID and location (zip) Admission/discharge date Disease diagnosis and procedure: ICD-9 codes Detailed services (revenue centers in SAF) Payment and coinsurance Discharge destination
Hospitalization/MedPAR Medicare Provider Analysis and Review Short-stay/Long stay hospitals Short stay 85% Long stay hospital 2% Skilled Nursing Facility (SNF) 13% Reimbursement for SNF is different (per diem based) One record per hospital stay in MedPAR One stay may consist of several records in Inpatient SAF, but these are small proportion Categorized payment info in MedPAR Original revenue center codes in SAF
Finding Provider (Hospitals) Organization NPI Intelligence free identifier HIPAA compliant PRVDR_NUM variable 6 columns: SSA state (2)+type of facility(4) Traditional acute care hospitals: 0001-0879 Critical access hospitals: 1300-1399 Critical access hospitals may not use PPS Short stay hosp, long stay hosp, and skilled nursing facility (SS_LS_SNF_IND_CD) Need to separate them in analysis
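Splitting PRVDR_NUM into its state and facility-type parts can be sketched as below. Only the two ranges listed above are handled; everything else is returned as "other". The function name and the sample provider numbers are made up for illustration.

```python
def classify_provider(prvdr_num: str) -> tuple:
    """Split a 6-character provider number into (SSA state code, facility type)."""
    if len(prvdr_num) != 6:
        raise ValueError("expected a 6-character provider number")
    state, facility = prvdr_num[:2], prvdr_num[2:]
    if not facility.isdigit():
        # Some facility codes contain letters; they fall outside the ranges handled here.
        return state, "other"
    number = int(facility)
    if 1 <= number <= 879:
        return state, "acute care hospital"
    if 1300 <= number <= 1399:
        return state, "critical access hospital"
    return state, "other"


# Hypothetical provider numbers, for illustration only.
example_acute = classify_provider("440049")
example_cah = classify_provider("441301")
```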
Length of Stay/Admission and Discharge Dates LOS = discharge date – admission date Add one day if admission and discharge fall on the same day (so a same-day stay counts as 1) LOS for SNF is different SNF is paid per diem based on resource utilization groups (RUGs) and has a limit on days of stay
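The length-of-stay rule can be written as a small function. This is a minimal sketch for hospital stays only, assuming both dates are already parsed; it does not cover the separate SNF rules, and the function name is hypothetical.

```python
from datetime import date


def length_of_stay(admission: date, discharge: date) -> int:
    """Days between admission and discharge; a same-day stay counts as 1."""
    if discharge < admission:
        raise ValueError("discharge date precedes admission date")
    days = (discharge - admission).days
    return days if days > 0 else 1


same_day = length_of_stay(date(2009, 3, 1), date(2009, 3, 1))
four_nights = length_of_stay(date(2009, 3, 1), date(2009, 3, 5))
```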
Diagnosis, Procedures and DRGs Clinical information available in four sources: Medicare Severity Diagnosis Related Group (MS-DRG) (1 per stay, per record) ICD-9 diagnoses (up to 10 codes: 1 primary, 8 secondary, 1 injury code) ICD-9 coded Procedures (up to 6 per claim) Admission diagnosis code Diagnoses and procedures are consistent with DRG. However, not all DRGs require specific diagnoses
Example: AMI Almost all persons with primary discharge diagnosis of 410 have following DRGs: 231-236: CABG with PTCA 237-238: Major Cardiovascular Procedure Diagnosis, procedure and DRGs can be used to define distinct population
ICD-9 V codes “Supplementary Classification of Factors Influencing Health Status and Contact with Health Services” 23% of hospitalizations have some V code 2.8% have a V code as their primary reason for hospitalization Examples: V56.0 Renal dialysis V58.1 Chemotherapy V58.61 Long-term use of anticoagulants V59.4 Kidney donor V67.4 Follow-up examination after treatment of a fracture V70.2 General psychiatric examination
Hospital Charges& Payments MedPAR contains 34 fields describing charges Total charges Total accommodation charges Total departmental charges Specific charges for accommodation sub-types and specific departments or groups of departments Patient’s payments Inpatient deductible coinsurance amount CMS total reimbursements bill total per diem Primary Payer (other than CMS) amount
Estimating Payments from MedPAR Total paid by CMS: total reimbursements + bill total per diem Total paid by the beneficiary: inpatient deductible + coinsurance amount+blood deductible Total paid by all sources: total reimbursement+ bill total per diem + inpatient deductible + coinsurance amount +blood deductible + primary payer amount Note: Physician charges/payments are not in the MedPAR
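The three totals above are plain sums, which the sketch below makes explicit. The field names are hypothetical stand-ins and would need to be mapped to the actual MedPAR variable names in a given extract; the dollar amounts are made up.

```python
from dataclasses import dataclass


@dataclass
class MedparPayment:
    # Hypothetical field names; map them to the MedPAR variables in your extract.
    total_reimbursement: float
    bill_total_per_diem: float
    inpatient_deductible: float
    coinsurance: float
    blood_deductible: float
    primary_payer: float

    def paid_by_cms(self) -> float:
        return self.total_reimbursement + self.bill_total_per_diem

    def paid_by_beneficiary(self) -> float:
        return self.inpatient_deductible + self.coinsurance + self.blood_deductible

    def paid_by_all(self) -> float:
        return self.paid_by_cms() + self.paid_by_beneficiary() + self.primary_payer


# Made-up amounts for one stay.
stay = MedparPayment(5000.0, 200.0, 1068.0, 0.0, 0.0, 300.0)
cms_total = stay.paid_by_cms()
all_sources = stay.paid_by_all()
```

These totals cover facility payments only, since physician charges and payments are not in MedPAR.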
Charge/Reimbursement Ratio Most hospitals are on Prospective Payment systems (PPS) per stay payment based on DRG (include labor and non-labor cost, with some geographic and risk adjustment) Claim PPS_IND_CD Charge and reimbursement ratio for specific hospitals may not be meaningful But population wise, we often use this (or derived) ratio to obtain estimated payment in hospital discharge data (e.g., cost/discharge ratio)
Categorized Cost Variables In MedPAR Cost unit: e.g., Intensive care unit indicator Coronary care unit indicator Diagnostic Radiology CT/MRI DME use Indicators for certain service use: Pharmacy Physical therapy Laboratory Emergency room
Discharge Destination Information provided by hospital Home/self care Other short-term general hospital Skilled nursing facility (SNF) Intermediate care facility Other institution Home health service care Left AMA Home IV drug therapy Died
Additional Comments on MedPAR People admitted to hospitals through ER or outpatient visit (planned or unplanned) will appear in MedPAR/inpatient SAF, often not in the outpatient claims Check admission type variable Info in MedPAR is care received, not care needed Some disease diagnoses may be missing, or some conditions may not be diagnosed or recorded (e.g., hypertension) Combining with other claims, MedPAR is often a start point (e.g., studying the follow up care for those with CABG surgery)
Outpatient Claim File Facility claims, use UB-92 /04 forms Data structure is the same as inpatient SAF CMS provides data in two files: Base claim Revenue centers (detailed info and charge) Can be linked by bene_id and claim_id If you request CCW data Chronic condition files: condition, span and health care cost/values
Basic Claim File Patient demographics From and through date Provider number Attending and operating physician NPI ICD 9 diagnosis and procedures (up to 25) Total charge and payment Discharge status
Physician NPI National Provider Identifier (NPI) Unique ID (intelligence free identifier) Note: in old data, physician UPIN etc. Not encrypted in Research Identifiable Files(RIF) Can be linked with AMA master file and other commercial physician data Usually attending physician NPI is used Operating physician NPI may be useful in surgery if they are different
Revenue Center files Multiple records per claim One basic claim record is linked to many revenue records (up to 450 revenue codes) Matched with claim file by bene_id and claim_id Indicated by the number of line variable as well Variables include revenue center, procedure performed, modifiers, service units, charge and payment per unit
Revenue Centers Are institutional cost centers for which separate charges are billed Facilities are not required to have every revenue center reported because overall payment is based on DRG Examples: 0141 Private room, medical/surgical 0258 Pharmacy, IV solution 0305 Laboratory, hematology 0350 CT scan, general classification 0382 Whole blood 0961 Professional fees, psychiatric
HCPCS: Health Care Common Procedure Coding System Also called HCFA common procedure coding system Includes: Current Procedural Terminology (CPT) and some additional level II and III codes created by CMS More detailed than ICD-9 procedures Changes over time: some codes are added, some are abandoned Used in billing for revenue center, physician services, etc. In outpatient claim files, ICD-9 procedure codes are neither complete nor validated It is better to use HCPCS in revenue center files when searching for procedures and outcomes
HCPCS examples Level 1: CPT codes: 00100 -01999 Anesthesia 10040 - 69990 Surgery 70010 - 79999 Radiology 80049 - 89399 Pathology and Laboratory 90281 - 99199 Medicine 99201 - 99499 Evaluation and Management
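A range lookup over the CPT sections listed above can be sketched as follows. The ranges are copied from the list; the function name is hypothetical, and codes that are not five digits (for example Level 2 codes such as G0008) simply return `None`.

```python
CPT_RANGES = [
    (100, 1999, "Anesthesia"),
    (10040, 69990, "Surgery"),
    (70010, 79999, "Radiology"),
    (80049, 89399, "Pathology and Laboratory"),
    (90281, 99199, "Medicine"),
    (99201, 99499, "Evaluation and Management"),
]


def classify_cpt(code: str):
    """Return the CPT section for a 5-digit numeric code, or None if not covered."""
    if len(code) != 5 or not code.isdigit():
        return None
    number = int(code)
    for low, high, section in CPT_RANGES:
        if low <= number <= high:
            return section
    return None


section = classify_cpt("52601")
```

Because codes are added and retired over time, a lookup like this is only a coarse grouping and should be checked against the code set for the study years.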
Example: HCPCS Level 2 Codes A0000 - A0999 Transportation Services including Ambulance A4000 - A8999 Medical and Surgical Supplies A9000 - A9999 Administrative, Miscellaneous and Investigational B4000 - B9999 Enteral and parenteral therapy Preventive services Influenza vaccine 90724 Influenza vaccine administration G0008 Chemotherapy: J codes Level 3 codes
HCPCS Modifiers Can have up to 4 modifiers Related to charge and payment Level 1 – numeric: e.g., 21 - Prolonged Evaluation and Management Services 26 - Professional Component Level 2 - alpha or alpha-numeric: e.g., TC - Technical Component LT = left, RT = right
Services Unit and Payment For each revenue center and associated procedure Number of services units, payment per unit Charge and payment for each revenue center Patient deductible, coinsurance, and provider responsibility, in addition to CMS payment Good for detailed analysis at revenue center level
3.3 Non-Institutional File
Carrier Files Physician services, lab test, exam, and supplier Part B services Billed using CMS 1500 form Include two linkable files Basic claim file Line item file for detailed services and payment Stand-alone ambulatory surgical center also in this file
Carrier Basic Claim File Bene_id and claim_id Patient demographics From and through date Claim ICD-9 diagnosis codes (up to 12) Principal diagnosis indicated Claim total charge, CMS allowed charge, and CMS payment Patient portion of payment (deductible) as well
Carrier Line Item File Linked with Carrier_claim file by bene_id and claim_id Multiple line records are matched with one claim Up to 13 line items Each line item is one record in the line file Usually the largest file of all claim data Date of services ICD-9 diagnosis and HCPCS procedure for each line of services Physician information Charge and payment
Line Diagnosis and Procedure Can have up to 13 line items for each claim Line diagnosis should be included in the claim diagnosis as well HCPCS codes and modifiers are used in physician services Most useful for searching procedures performed Lab tests and diagnostic exams are identified by HCPCS
BETOS codes Purposefully aggregated based on HCPCS codes Berenson-Eggers Type of Service (BETOS) codes Useful in tracking different types of services Charge & payment Utilization patterns Relatively stable
Examples of BETOS codes M1A = Office visits - new M1B = Office visits - established M2A = Hospital visit - initial M2B = Hospital visit - subsequent M2C = Hospital visit - critical care M3 = Emergency room visit M4A = Home visit M4B = Nursing home visit M5A = Specialist - pathology M5B = Specialist - psychiatry M5C = Specialist - ophthalmology M5D = Specialist - other M6 = Consultations P0 = Anesthesia
Physician Information Referring physician and performing physician NPI Performing physician NPI is complete Although a referring physician ID is required, self-referral is allowed NPI is not encrypted Group NPI for group practice Its usefulness is complicated
Example: Line Physician Specialty 01 = General practice 02 = General surgery 03 = Allergy/immunology 04 = Otolaryngology 05 = Anesthesiology 06 = Cardiology 07 = Dermatology 08 = Family practice 10 = Gastroenterology 11 = Internal medicine
Example: Line Place of Service 11 = Office 12 = Home 21 = Inpatient hospital 22 = Outpatient hospital 23 = Emergency room - hospital 24 = Ambulatory surgical center 31 = Skilled nursing facility
Line Type of Services Distinguish different type of services 1 = medical care 2 = surgery 3 = Consultation 4 = Diagnostic radiology 5 = Diagnostic laboratory 6 = Therapeutic radiology 7 = Anesthesia Etc.
Line Charge and Payment CMS payment, beneficiary payment, provider payment, and primary payer codes Line Allowed Charge Amount - the charges allowed by CMS Line NCH Payment Amount - the amount paid by CMS CMS actual payment is generally 80% of allowed charge for physician services. Patients have copay/part B deductible, coinsurance etc. For Lab test, they are the same
Line Service Units Carrier Line Miles/Time/Units/Services (MTUS) count Actual counts of service units Ambulances are based on miles Carrier Line Miles/Time/Units/Services indicator code e.g., 0=not allowed unit, 1=transportation, 2= anesthesia time, 3 = number of services, 4= oxygen volumes, 5=blood units Majority are 3
3.4 Part D data
Medicare Part D data Medicare drug benefit Data is processed and managed differently from traditional Medicare Part A/B files Files: Prescription drug event (PDE) file (can be linked with Medicare claims) Drug characteristics Plan characteristics
Part D Enrollment Monthly patient enrollment information (in beneficiary summary file) Contract ID, Plan benefit package ID, segment ID and cost share group Can be linked to plan, drug, prescriber and pharmacy characteristics
Medicare Part D enrollment, 2010
Monthly Drug Plan Contract ID Unique to each plan Indicates what types of plan, based on the first letter of the contract ID: H: local MA-PD, PACE, cost plans and demo R: Regional MA-PD S: PDP (prescription drug plan) N: Not Part D enrolled (no part D data) E: Employee-sponsored plans 0: not enrolled in Medicare (no part D data)
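The first-letter rule above maps directly onto a lookup table. In this sketch the function names and the sample contract IDs are hypothetical; only the letter meanings come from the list.

```python
CONTRACT_TYPES = {
    "H": "local MA-PD, PACE, cost plans and demo",
    "R": "regional MA-PD",
    "S": "PDP (prescription drug plan)",
    "N": "not Part D enrolled",
    "E": "employer-sponsored plan",
    "0": "not enrolled in Medicare",
}
NO_PART_D_DATA = {"N", "0"}  # prefixes flagged above as having no Part D data


def plan_type(contract_id: str) -> str:
    """Describe the plan type from the first character of the contract ID."""
    if not contract_id:
        return "unknown"
    return CONTRACT_TYPES.get(contract_id[0].upper(), "unknown")


def lacks_part_d_data(contract_id: str) -> bool:
    """True when the prefix indicates no Part D data for that month."""
    return bool(contract_id) and contract_id[0].upper() in NO_PART_D_DATA


# Hypothetical contract IDs, for illustration only.
pdp = plan_type("S1234")
missing = lacks_part_d_data("N")
```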
Cost Share Group Indicate whether subsidy is provided 00 = Not Medicare enrolled for the month XX = Enrolled in Medicare A and/or B, but no MIIR record for the month 01 = Bene is deemed with 100% premium-subsidy and no copayment 02 = Bene is deemed with 100% premium-subsidy and low copayment 03 = Bene is deemed with 100% premium-subsidy and high copayment 04 = Bene with LIS, 100% premium-subsidy and high copayment 05 = Bene with LIS, 100% premium-subsidy and 15% copayment 06 = Bene with LIS, 75% premium-subsidy and 15% copayment 07 = Bene with LIS, 50% premium-subsidy and 15% copayment 08 = Bene with LIS, 25% premium-subsidy and 15% copayment 09 = No premium subsidy nor cost sharing = not LIS 10 -13 = not in Part D
Part D Event Data Contain drug filled (not actually taken) Detailed medication data based on pharmacy bills, but not exactly the same as the pharmacy claim and so differs from point-of-service Post-transaction adjustments between plan and pharmacy Plan-to-plan adjustments for misenrollees Plan-to-CMS adjustments for some demonstration projects 37 variables: prescription date, national drug codes (NDC), dosage, brand names, generic names, days supply, payment, and coverage indication (gaps?)
Drug Codes PROD_SRVC_ID: National drug codes (11 bytes)
Other Drug Information Both brand names and generic names Need to use fuzzy search for drug names First Data Bank therapeutic class Useful in grouping drugs: e.g., anti-diabetic drugs Drug dosage Drug strength
Medication Days and Supply Prescription Service Date (SRVC_DT) Prescription initiation date Prescription Days Supply (DAYS_SUPLY_NUM) Key variable to construct medication adherence measures Median/Mode: 30 days Prescription amount (Quantity Dispensed) Not very useful so far
Medication Adherence Several measures have been proposed Medication Possession Ratio (MPR) Proportion of Days Covered (PDC) Proportion of days supply during a specified time period or over a period of refill intervals One year? Months? Study period? Overlapping days? Medication Gaps: The proportion of days without medication during a specified time period or interval
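One common way to operationalize these measures is sketched below, using the service date and days supply of each fill. The choices here are assumptions, not the only definitions in use: PDC counts each calendar day at most once (overlapping supply is not shifted forward), MPR sums days supply for fills dated within the period and can exceed 1, and the gap proportion is taken as 1 minus PDC. Function names and fill data are hypothetical.

```python
from datetime import date, timedelta


def covered_days(fills, start: date, end: date) -> set:
    """Distinct calendar days within [start, end] covered by any fill."""
    days = set()
    for service_date, days_supply in fills:
        for offset in range(days_supply):
            day = service_date + timedelta(days=offset)
            if start <= day <= end:
                days.add(day)
    return days


def pdc(fills, start: date, end: date) -> float:
    """Proportion of Days Covered: overlapping days are counted once."""
    period = (end - start).days + 1
    return len(covered_days(fills, start, end)) / period


def mpr(fills, start: date, end: date) -> float:
    """Medication Possession Ratio: total days supply over the period length."""
    period = (end - start).days + 1
    supplied = sum(supply for service_date, supply in fills if start <= service_date <= end)
    return supplied / period


# Two made-up 30-day fills; the second is refilled 10 days early.
fills = [(date(2009, 1, 1), 30), (date(2009, 1, 21), 30)]
start, end = date(2009, 1, 1), date(2009, 3, 1)  # a 60-day window
pdc_value = pdc(fills, start, end)
mpr_value = mpr(fills, start, end)
gap_value = 1 - pdc_value
```

In this example the early refill makes MPR reach 1.0 while PDC stays below it, which is why the handling of overlapping days needs to be stated in any adherence analysis.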
Medication Utilization Management Information Quantity limit: plans limit the numbers (or amounts) of a drug in a given time period Prior authorization: preapproval is required before coverage Step therapy (maximum step number): specified drugs should be tried before moving to other drugs About one third prescriptions subject to utilization management
Payment and Coverage Gross drug cost (total cost): includes patient payment, other true out-of-pocket payment, low income cost-share (subsidy), patient liability reduction due to other payer amount, covered Part D payment, and uncovered payment
Characteristics Data Drug characteristics Linked by drug codes, including strength, dosage, brand names, generic names etc. (appended to PDE file) Plan characteristics Includes plan type, benefit design, premium, cost-sharing and service area of Part D plans Prescriber (provider) and pharmacy files Basic characteristics of providers and pharmacies
Note on Requesting Drug Data Need to plan carefully which variables and which files you want to request CMS requires justification for every variable included in the request CMS charges differently based on how many variables you request
3.5 Comments
Other Claim Data Home health agency (HHA) and Hospice data are facility claim data, similar to inpatient and outpatient SAF But the payment systems are different DME files have the same structure as Carrier files Part D (drug) data follow a different structure Can be linked with Medicare claims by bene_id
Advantages of Medicare Claim Data Claim data include services covered and received, payment, and disease information, in addition to patient demographics Almost complete elderly population in the US Only limited by your inclusion criteria and study design Can be combined with other data such as census and large surveys or cohort study Data available timely (usually available after June next year)
Limitations of Medicare Disease diagnosis and procedure but no other clinically important information Cancer staging, histology Lab test or diagnostic exam results No disease severity Duration of disease is unclear (e.g., for chronic disease such as diabetes, hypertension, unknown starting point) ICD-9 procedure codes are used in facility claim, while CPT/HCPCS codes are used in non-facility claims, complicating matching
Limitations (cont.) Only covered services are included Uncovered services are not reported No information on Part B services for managed care enrollees Hospitalization data for managed care enrollees are limited and of unknown quality Charge and payment information is accurate, but diagnoses may be incomplete if they have no impact on payment Fraud exists Outliers are not necessarily fraud
4. Requesting CMS Data
Before You Request… Having a research proposal Know the data and how to analyze them Having sufficient funding CMS Medicare data cost a lot (~$14,000 for one year of total claims) Allow some waiting time Requesting process, CMS Privacy Board Review, and data purchase may take 3 or more months
Researcher’s Tasks Research proposal Research goals Data analysis plan Identify possible data source Is Medicare data appropriate for this project? Are there any other data sources that are a better fit for the project? Research identifiable files Most commonly used Contain more information
Role of ResDAC Email ResDAC about potential Medicare projects Identify appropriate data source Estimate sample size Estimate cost Prepare requesting package http://dev.resdac.umn.edu/Medicare/requesting_data_NewUse.asp Prepare for several rounds of modification Consultation for free Also helpful when writing research grant that uses CMS data
Requesting Package Written Data Request Letter Study plan /protocol/executive summary Data Use Agreement (DUA) IRB approval / HIPAA waiver Evidence of Funding Specification Worksheet CMS Cost Estimate CMS Disclaimer User Agreement ResDAC Review Letter
Data Use Agreement (DUA) Legal contract for use of CMS data use data only for the purpose cited in the request not to release CMS data to other organizations details safeguards to prevent unauthorized access obtain CMS review of findings prior to publication return or destroy data by retention date Need to renew or extend the DUA with CMS every year
CMS Review Criteria Potential for benefit to Medicare beneficiaries or the Medicare program Potential benefit outweighs risk to beneficiary privacy Compliance with the terms of the DUA Request data covered for release under the Privacy Act Does not result in product that will be marketed Manuscripts, presentations, any release of findings will be submitted to CMS first Highlight specific sections/tables HIPAA waiver criteria will be met
Other ResDAC Services All sorts of CMS associated data CMS data file content Data extraction methodology Data request process Reading in the data Data element interpretation Data source, cost estimate during grant writing
CCW data Chronic Conditions Warehouse (CCW) 21 chronic conditions 100% and 5% enhanced sample Chronic Conditions Flagged Control Group available Data available from 1999 to present http://www.ccwdata.org/index.htm
Chronic Conditions Acute Myocardial Infarction Alzheimer's Disease Alzheimer's Disease, Related Disorders, or Senile Dementia Atrial Fibrillation Cataract Chronic Kidney Disease Chronic Obstructive Pulmonary Disease Congestive Heart Failure Diabetes Glaucoma Hip/Pelvic Fracture
Chronic Conditions Ischemic Heart Disease Major Depression Osteoporosis Stroke / Transient Ischemic Attack Breast Cancer Colorectal Cancer Prostate Cancer Lung Cancer Endometrial Cancer Osteoarthritis Rheumatoid Arthritis
5. Data Analysis
Create Study Cohort Study cohort can be defined: Geographically: state, region By time: calendar year Demographically: age, race, sex Clinically: having certain diagnosis (e.g., diabetic patients) Having certain procedures (e.g., CABG patients) Combination of the above
Example: West TN Beneficiaries Study cohort: All elderly people residing in west TN in 2009 List of county in West TN was used Beneficiary county only Age: >=65 in 2009 People in HMO: excluded Must have both Part A and Part B enrollment
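The inclusion criteria above can be expressed as a single filter over beneficiary summary records. This is a sketch under several assumptions that the slide does not settle: age is evaluated on January 1 of the study year, "both Part A and Part B" means all 12 months, and HMO exclusion means zero HMO months. Record keys, county codes, and data are hypothetical.

```python
from datetime import date

AB_CODES = {"3", "C"}  # Part A and Part B, without or with state buy-in


def age_on(birth_date: date, ref_date: date) -> int:
    years = ref_date.year - birth_date.year
    if (ref_date.month, ref_date.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


def in_cohort(bene: dict, counties: set, ref_date: date) -> bool:
    """Apply residence, age, Part A/B, and HMO criteria to one record."""
    return (
        bene["county"] in counties
        and age_on(bene["birth_date"], ref_date) >= 65
        and all(code in AB_CODES for code in bene["entitlement"])
        and all(code == "0" for code in bene["hmo"])
    )


# Hypothetical county codes and records, for illustration only.
counties = {"C01", "C02"}
no_hmo, full_ab = "0" * 12, "3" * 12
benes = [
    {"bene_id": "A", "county": "C01", "birth_date": date(1940, 5, 1), "entitlement": full_ab, "hmo": no_hmo},
    {"bene_id": "B", "county": "C01", "birth_date": date(1950, 5, 1), "entitlement": full_ab, "hmo": no_hmo},
    {"bene_id": "C", "county": "C02", "birth_date": date(1935, 5, 1), "entitlement": "111111333333", "hmo": no_hmo},
    {"bene_id": "D", "county": "C01", "birth_date": date(1930, 5, 1), "entitlement": "C" * 12, "hmo": "00000CC00000"},
    {"bene_id": "E", "county": "C99", "birth_date": date(1930, 5, 1), "entitlement": full_ab, "hmo": no_hmo},
]
cohort_ids = [b["bene_id"] for b in benes if in_cohort(b, counties, date(2009, 1, 1))]
```

Requiring 12 full months excludes people who die or newly enroll during the year; a study of mortality would need a looser rule, such as continuous coverage until death.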
Example: BPH Cohort Patients who had benign prostate hyperplasia (BPH) surgical procedures from 2002-2008 ICD-9 codes: TURP (60.29), TUMT (60.96) etc. CPT codes: TURP (52601), TUMT (53850) etc. Note: BPH diagnosis code (600) was not used as a required diagnosis, as some BPH procedures did not have a 600 diagnosis, but other related diagnosis Need to search inpatient, outpatient, and Carrier (physician services) files for the above procedure codes TUMT are often performed in office setting, not in hospitals
Research Finder File Finder file is used to request CMS data Finder file defines your study cohort Better to be broad at the beginning (i.e., request more than needed) E.g., may include HMO, ESRD, etc. For geographically defined data, submitting a list of states and counties is sufficient For clinically defined data, submit the ICD-9 and CPT/HCPCS codes, and define which claim files will be searched and how these codes will be searched A clear algorithm is needed (consult ResDAC) Pre-defined cohorts with patient’s SSN IRB issues
Linking with Different Files All claims data (including drug data) can be linked by bene_ID Basic claim files and revenue center or Carrier line item files can be linked by bene_ID and claim_ID Some roll-up may be needed in revenue center or line item files Often we find specific services in revenue center or line item files and then link these with basic claims
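The roll-up and link step might look like the sketch below, in plain Python. Line items are first collapsed to one record per (bene_ID, claim_ID) pair and then attached to the matching basic claim. The field names (`hcpcs`, `payment`, `from_date`) are hypothetical.

```python
from collections import defaultdict


def roll_up_lines(lines):
    """Collapse line items to one record per (bene_id, claim_id)."""
    rolled = defaultdict(lambda: {"hcpcs_codes": set(), "line_payment": 0})
    for line in lines:
        key = (line["bene_id"], line["claim_id"])
        rolled[key]["hcpcs_codes"].add(line["hcpcs"])
        rolled[key]["line_payment"] += line["payment"]
    return rolled


def link_claims(base_claims, lines):
    """Attach rolled-up line items to basic claims that have any."""
    rolled = roll_up_lines(lines)
    linked = []
    for claim in base_claims:
        key = (claim["bene_id"], claim["claim_id"])
        if key in rolled:
            linked.append({**claim, **rolled[key]})
    return linked


# Toy records for illustration only.
base_claims = [
    {"bene_id": "B1", "claim_id": "C1", "from_date": "2005-03-10"},
    {"bene_id": "B1", "claim_id": "C2", "from_date": "2005-04-02"},
]
line_items = [
    {"bene_id": "B1", "claim_id": "C1", "hcpcs": "52601", "payment": 100},
    {"bene_id": "B1", "claim_id": "C1", "hcpcs": "99213", "payment": 50},
]

linked = link_claims(base_claims, line_items)
```

Claim C2 has no line items in this toy data, so it is dropped; keeping such claims instead would be an equally reasonable design choice depending on the analysis.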
Matching Between Different Claims To link physician services with surgical procedures performed in hospitals (thus creating a complete episode), we need to match the date of service between different claim files, in addition to bene_ID and clinical services However, dates of service in the claim files may be slightly off Fuzzy matching that allows +/- 3 days (or more) is usually sufficient
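A minimal sketch of the fuzzy date match, using only the standard library. Claims are paired when bene_ID and procedure agree and the service dates differ by at most `window_days`. The record layout and the `procedure` label are hypothetical; in practice the clinical match would be made on procedure codes.

```python
from datetime import date, timedelta


def fuzzy_match(facility_claims, physician_claims, window_days=3):
    """Pair facility and physician claims whose dates fall within the window."""
    window = timedelta(days=window_days)
    pairs = []
    for fac in facility_claims:
        for phy in physician_claims:
            same_patient = fac["bene_id"] == phy["bene_id"]
            same_service = fac["procedure"] == phy["procedure"]
            close_dates = abs(fac["service_date"] - phy["service_date"]) <= window
            if same_patient and same_service and close_dates:
                pairs.append((fac["claim_id"], phy["claim_id"]))
    return pairs


# Toy claims for illustration only.
facility = [
    {"claim_id": "F1", "bene_id": "B1", "procedure": "TURP",
     "service_date": date(2005, 3, 10)},
]
physician = [
    {"claim_id": "P1", "bene_id": "B1", "procedure": "TURP",
     "service_date": date(2005, 3, 12)},   # 2 days off: matches
    {"claim_id": "P2", "bene_id": "B1", "procedure": "TURP",
     "service_date": date(2005, 3, 20)},   # 10 days off: no match
    {"claim_id": "P3", "bene_id": "B2", "procedure": "TURP",
     "service_date": date(2005, 3, 10)},   # different patient: no match
]

matched_pairs = fuzzy_match(facility, physician)
```

The nested loop is fine for illustration; on full claim files one would group by bene_ID first to avoid comparing every pair.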
Linking with Other Files Area based files, using state, county, and zip codes E.g., census, area resource file Individual based files, e.g., existing cohort study Need SSN or HIC Linking by name and DOB may be possible too
Defining Outcomes Easier to define Yes/No type of outcomes AMI hospitalization (yes/no)? Urinary stricture after BPH surgery? (yes/no) Usually need to search multiple claims files MedPAR for hospitalization, outpatient and Carrier files for physician services or complications Matching the final outcome files to de-duplicate Match by diagnosis, procedure and date of services Mortality data is usually valid Some dates of death may not be validated
Time to Event Outcome Starting time is often the date of surgery Admission date can also be used if there is no date of surgery For outcomes resulting in hospitalization or ER visit, the event time is the admission date on the claim For outcomes resulting only in physician office visits or small procedures (e.g., complications), search both outpatient and Carrier claims A matching and de-duplication process is needed Event time is the earliest from-date on the matched claims Censored at the date of death or the last day of claim files (e.g., 12/31/2009)
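The time-to-event rule can be sketched as below, assuming the matched and de-duplicated event dates are already available per patient. The function name and argument layout are hypothetical. Counting only events strictly after the start date is also an assumption; same-day events could reasonably be handled differently.

```python
from datetime import date


def time_to_event(start_date, event_dates, death_date=None,
                  study_end=date(2009, 12, 31)):
    """Return (days of follow-up, event indicator) for one patient."""
    # Censor at death or at the last day of the claim files, whichever is first.
    censor_date = min(d for d in (death_date, study_end) if d is not None)
    # Keep events after the start date and on or before the censoring date.
    valid = [d for d in event_dates if start_date < d <= censor_date]
    if valid:
        return (min(valid) - start_date).days, 1   # earliest from-date wins
    return (censor_date - start_date).days, 0


surgery = date(2009, 1, 1)
with_event = time_to_event(surgery, [date(2009, 3, 2), date(2009, 2, 1)])
died_no_event = time_to_event(surgery, [], death_date=date(2009, 6, 30))
censored_at_end = time_to_event(surgery, [])
```

Here `start_date` would be the surgery date, or the admission date when no surgery date is available.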
Comorbidities, Pre-existing Conditions, and Complications Can be difficult to distinguish between comorbidities and complications if they appear in the same claim Need prior clinical knowledge to define complications Algorithms exist for identifying comorbidities Combine with other claims (outpatient, physician services) to obtain complications
Identifying Comorbidities Charlson Score, the most common comorbidity index, can be applied to claims data Weighted sum of comorbidities Total 22 comorbidities Calibrated to predict 1-year mortality Most often, we categorize scores into 0, 1, 2, 3 or more, due to small sample size in the 3+ group
Charlson Score Distribution
General Algorithm for Obtaining Charlson Score Any related diagnosis in inpatient and outpatient services claims (facility claims) before the hospitalization (i.e., excluding the current one) Diagnosis appearing in at least two separate physician services (Carrier) claims (different dates) Often excluding lab tests and diagnostic exams to avoid “rule out” diagnoses Combine the indicators of Charlson Score related diagnoses and sum them with weights
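The algorithm above can be sketched as follows. This is an illustration under stated assumptions, not a full Charlson implementation: only four conditions are included, each claim is assumed to be already mapped from ICD-9 codes to a condition label by an upstream step, and the `source`, `condition`, and `date` fields are hypothetical. Lab and diagnostic claims are assumed to have been removed beforehand.

```python
from datetime import date

# Illustrative subset of Charlson conditions and weights, not the full index.
WEIGHTS = {"mi": 1, "chf": 1, "diabetes": 1, "metastatic_tumor": 6}


def charlson_score(claims, index_date):
    """Weighted sum of conditions confirmed before the index hospitalization."""
    facility_conditions = set()
    carrier_dates = {}
    for claim in claims:
        cond = claim["condition"]
        # Skip the current hospitalization and any later claims.
        if claim["date"] >= index_date or cond not in WEIGHTS:
            continue
        if claim["source"] in ("inpatient", "outpatient"):
            facility_conditions.add(cond)          # one facility claim suffices
        elif claim["source"] == "carrier":
            carrier_dates.setdefault(cond, set()).add(claim["date"])
    # Carrier diagnoses count only if seen on at least two different dates.
    carrier_conditions = {c for c, d in carrier_dates.items() if len(d) >= 2}
    confirmed = facility_conditions | carrier_conditions
    return sum(WEIGHTS[c] for c in confirmed)


def charlson_category(score):
    """Collapse the score into 0, 1, 2, 3+."""
    return "3+" if score >= 3 else str(score)


index = date(2005, 1, 10)
claims = [
    {"source": "inpatient", "condition": "mi", "date": date(2004, 5, 1)},
    {"source": "carrier", "condition": "diabetes", "date": date(2004, 6, 1)},
    {"source": "carrier", "condition": "chf", "date": date(2004, 7, 1)},
    {"source": "carrier", "condition": "chf", "date": date(2004, 8, 15)},
    {"source": "inpatient", "condition": "metastatic_tumor", "date": index},
]
score = charlson_score(claims, index)
```

In this toy data, MI counts (facility claim), CHF counts (two Carrier dates), diabetes does not (one Carrier date), and the tumor diagnosis on the index date is excluded.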
Other Methods for Calculating Comorbidities Directly search for important comorbidities E.g., AMI, stroke, diabetes, etc. Elixhauser comorbidities method (AHRQ) Predicting inpatient LOS, hospitalization charge, death based on CA data 30 comorbidities, directly used as indicators in the model Generally, same algorithm as that of the previous slide, with some modification in conditions
Forming Analytic Files Combining patient socio-demographics, original disease status and procedures, date of services, comorbidities, time to event and outcomes, and charges and payment into one record Inclusion and exclusion criteria Warning: don’t forget those having no events They come from the denominator files Often we search multiple claims for outcomes
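One way to avoid losing patients with no events is to build the analytic file from the denominator records and attach outcomes to them, rather than starting from the outcome claims. A minimal sketch with hypothetical field names:

```python
def build_analytic_file(denominator, outcomes):
    """One row per beneficiary in the denominator, with or without an event."""
    outcome_by_id = {o["bene_id"]: o for o in outcomes}
    rows = []
    for bene in denominator:
        outcome = outcome_by_id.get(bene["bene_id"])
        rows.append({
            **bene,
            "event": 1 if outcome else 0,
            "event_date": outcome["event_date"] if outcome else None,
        })
    return rows


# Toy records for illustration only.
denominator = [
    {"bene_id": "B1", "age": 70, "sex": "M"},
    {"bene_id": "B2", "age": 75, "sex": "M"},
]
outcomes = [{"bene_id": "B1", "event_date": "2006-04-01"}]

analytic = build_analytic_file(denominator, outcomes)
```

B2 has no outcome claim but still appears in the result with `event` set to 0, which is the behavior the warning on this slide asks for.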
5. Applications
Comparative Effectiveness Analysis Compare outcomes among different treatments Example: comparing rates of repeated treatment and complications among elderly patients who have undergone BPH surgery For minimally invasive BPH surgery, there is a higher rate of re-treatment for BPH in the long term (e.g., >5% for TUMT in five years)
BPH Cohort Age >=66 at the time of surgery BPH surgeries: TURP, TUMT, TUNA, Laser Did not die within 30 days of surgery Having both Part A and B enrollment Not enrolled in any HMO during the follow-up This may be too restrictive Residing in the 50 States (thus excluding US territories)
Defining First BPH Surgery Our main outcome is repeated BPH surgery But our cohort starts in 2001, and some patients may already have had a BPH surgery before then Thus the “first” BPH surgery in our data may in fact be a repeated surgery This would reduce the observed rate of repeated surgery We requested an additional year (2000) Search for possible BPH surgery, remedial treatment, or complications in that year If any exist, these patients may have had BPH surgery before They are excluded from the analysis
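The look-back exclusion could be sketched as below. It assumes the BPH-related claims (surgery, remedial treatment, or complications) have already been identified, and the field names and function name are hypothetical.

```python
from datetime import date


def drop_prior_bph(cohort_ids, bph_related_claims,
                   lookback_start=date(2000, 1, 1),
                   lookback_end=date(2000, 12, 31)):
    """Remove patients with any BPH-related claim in the look-back year."""
    prior = {
        c["bene_id"]
        for c in bph_related_claims
        if lookback_start <= c["service_date"] <= lookback_end
    }
    return [bene_id for bene_id in cohort_ids if bene_id not in prior]


# Toy data for illustration only.
cohort = ["B1", "B2", "B3"]
bph_claims = [
    {"bene_id": "B2", "service_date": date(2000, 6, 15)},   # in look-back year
    {"bene_id": "B3", "service_date": date(2003, 2, 1)},    # index surgery period
]

clean_cohort = drop_prior_bph(cohort, bph_claims)
```

B2 is dropped because of the claim in 2000; B3's claim falls after the look-back year and does not trigger exclusion.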
Defining Outcomes Second BPH surgery A list of BPH complications Two types of outcomes Yes/no outcomes, separated by period (e.g., 1 year, 3 years, and 5 years) Time to event outcomes Only the first occurrence of each outcome is included (e.g., claims with the same complication may be found at different times) Censoring Censored at date of death or last date of claim file (12/31/2008) The last year (2008) of claims was used as follow-up to ensure everybody has at least one year of follow-up. Thus BPH surgeries that occurred in this year were not included in the study cohort
Comorbidities Searching ICD-9 diagnoses in all claim files (MedPAR, outpatient, Carrier) One year before the date of BPH surgery Thus one year of data was used for look-back rather than for the study cohort Age now starts at 66 at the time of surgery Charlson Score was calculated, and classified into 0, 1, 2, 3+
Socio-economic data Patient race, age, sex, and state buy-in status Census data were linked with denominator files based on beneficiary residential zip codes Zip code level income, percent with high school education, and percent Black were used
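The zip code linkage could be sketched as a simple lookup. The field names (`zip`, `median_income`, `pct_high_school`, `pct_black`) and the values are hypothetical, and truncating to the first five digits is an assumption about how the zip codes are stored.

```python
CENSUS_FIELDS = ("median_income", "pct_high_school", "pct_black")


def add_census(denominator, census_by_zip):
    """Attach zip-level census variables to each beneficiary record."""
    linked = []
    for bene in denominator:
        zip5 = bene["zip"][:5]                    # assumption: 5-digit match
        area = census_by_zip.get(zip5, {})
        row = dict(bene)
        for field in CENSUS_FIELDS:
            row[field] = area.get(field)          # None when the zip is unmatched
        linked.append(row)
    return linked


# Toy data with made-up values, for illustration only.
census_by_zip = {
    "00001": {"median_income": 40000, "pct_high_school": 80.0, "pct_black": 30.0},
}
denominator = [
    {"bene_id": "B1", "zip": "000011234"},
    {"bene_id": "B2", "zip": "00002"},
]

with_census = add_census(denominator, census_by_zip)
```

Unmatched zip codes are kept with missing census values rather than dropped, so the denominator stays complete.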
Questions?