Published by Dennis Long. Modified over 9 years ago.
1
Making Large Data Sets Work for You: Advantages and Challenges
Lesley H Curtis, Soko Setoguchi, Bradley G Hammill
2
Presenter disclosure information
Lesley H Curtis
Large Data Sets: An Overview
FINANCIAL DISCLOSURE: None
UNLABELED/UNAPPROVED USES DISCLOSURE: None
3
Agenda
- Large Data Sets: An Overview
- Prescription Drug Data: Advantages, Availability, and Access
- Linking Large Data Sets: Why, How, and What Not to Do
- Practical Examples
4
Which large data sets?
- Relevant for cardiovascular research
- Available to researchers
- Potential for linkage

- Claims data (federal and commercial)
- Inpatient registries
- Longitudinal cohort studies
5
Claims data
- Derived from payment of bills
- Payor-centric
- Examples
  - Medicare
  - Medicaid
  - Thomson-Reuters
  - United Health Care
6
Medicare claims data
- Inpatient services (Part A)
- Outpatient services (Part B)
- Physician services (Carrier, Part B)
- Durable medical equipment
- Home health care
- Skilled nursing facilities
- Hospice
7
Medicare claims data elements
- What data are available
  - Demographics
  - Service dates
  - Diagnoses
  - Procedures
  - Hospital / Physician
- What data are not available
  - Physiological measures
  - Test results
  - Times of admission, procedures, etc.
  - Medications
8
Medicare claims data coverage
- National scope
- What patients will be represented?
  - Patients enrolled in traditional (fee-for-service) Medicare
- What patients will not be represented?
  - Patients receiving care through the Veterans Health Administration
  - Patients enrolled in Medicare managed care plans
9
Medicare claims data quality
- Main point
  - Reliability of specific claims data elements depends on importance for reimbursement
- Good data on…
  - Major procedures
  - Hospitalizations
  - Mortality
- Inconsistent data on…
  - Comorbidities and illness severity
  - Procedures with low reimbursement rates
10
Acquiring CMS claims data
- All requests begin with ResDAC (www.resdac.umn.edu)
- Cost
  - $15K per year of inpatient+denominator data
  - $20K per year of 5% data across all files
  - $30K+ per year of data for custom requests
- Detailed approval process
  - Prepare request packet for ResDAC review (4-6 weeks)
  - Review by CMS privacy board (4 weeks)
  - Request processed by contractor (6-8 weeks)
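
The figures on this slide can be combined into a rough planning estimate. The sketch below is an illustration only: it assumes the three approval steps run strictly one after another and that the quoted per-year prices apply unchanged to every year requested. The function and dictionary names are hypothetical, and custom requests are left out because the slide gives only a floor ($30K+).

```python
# Rough planning arithmetic using the figures quoted on the slide.
# Assumption: approval steps run sequentially, with no overlap or rework.
APPROVAL_STEPS_WEEKS = {
    "ResDAC review of request packet": (4, 6),
    "CMS privacy board review": (4, 4),
    "Contractor processing": (6, 8),
}

# Quoted price per year of data, in US dollars.
COST_PER_YEAR = {
    "inpatient+denominator": 15_000,
    "5% all files": 20_000,
}


def approval_window_weeks(steps=APPROVAL_STEPS_WEEKS):
    """Return (shortest, longest) total wait in weeks across all steps."""
    shortest = sum(low for low, _ in steps.values())
    longest = sum(high for _, high in steps.values())
    return shortest, longest


def estimated_cost(request_type, years):
    """Return the quoted per-year price times the number of data years."""
    return COST_PER_YEAR[request_type] * years


if __name__ == "__main__":
    print(approval_window_weeks())             # (14, 18)
    print(estimated_cost("5% all files", 3))   # 60000
```

Under these assumptions the approval process alone takes roughly 14 to 18 weeks before any data arrive.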
11
Preparing for CMS claims data
- Make space
  - 16 GB for 100% denominator and inpatient files
  - 57 GB for 5% denominator, inpatient, outpatient, and carrier* files
- Manage expectations
  - Time to process files
  - Transforming raw claims into usable information
    - Coding algorithms
    - Coding changes
  - Learning curve
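
To make "coding algorithms" concrete, here is a minimal sketch of one: flagging claims that carry a heart failure diagnosis. Everything about the record layout is an assumption for illustration (the keys `bene_id` and `dx_codes` are hypothetical, not actual CMS field names), as is the convention that ICD-9-CM codes are stored without the decimal point. The code list itself (428.x for heart failure) would need revisiting whenever coding changes.

```python
# Illustrative coding algorithm: flag claims carrying a heart failure diagnosis.
# Assumptions (not from the slides): each claim is a dict with hypothetical
# keys "bene_id" and "dx_codes", and diagnosis codes are ICD-9-CM strings
# stored without the decimal point (for example "4280" for 428.0).
HEART_FAILURE_PREFIXES = ("428",)  # ICD-9-CM 428.x, heart failure


def has_heart_failure(claim, prefixes=HEART_FAILURE_PREFIXES):
    """Return True if any diagnosis code on the claim starts with a listed prefix."""
    return any(
        code.strip().startswith(prefixes) for code in claim.get("dx_codes", [])
    )


def beneficiaries_with_heart_failure(claims):
    """Return the set of beneficiary IDs with at least one flagged claim."""
    return {claim["bene_id"] for claim in claims if has_heart_failure(claim)}


if __name__ == "__main__":
    # Made-up example records, for illustration only.
    sample_claims = [
        {"bene_id": "A", "dx_codes": ["4280", "25000"]},
        {"bene_id": "B", "dx_codes": ["41401"]},
        {"bene_id": "A", "dx_codes": ["4019"]},
    ]
    print(beneficiaries_with_heart_failure(sample_claims))  # {'A'}
```

Keeping the code list in one named constant makes it easier to audit and to update when code sets change.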
12
The Learning Curve
13
Claims data
- Derived from payment of bills
- Payor-centric
- Examples
  - Medicare
  - Medicaid
  - Thomson-Reuters
  - United Health Care
14
Commercial claims data elements
- What data are typically available
  - Demographics
  - Service dates
  - Diagnoses
  - Procedures
  - Medications
  - Hospital / Physician
- What data may not be available
  - Physiological measures
  - Test results
15
Commercial claims data coverage
- National scope
- What patients will be represented?
  - Individuals who are commercially insured
- What patients will not be represented?
  - The uninsured
  - Medicare managed care?
16
Commercial claims data quality
- Similar to Medicare claims data
  - Reliability of specific claims data elements depends on importance for reimbursement
- Good data on…
  - Major procedures
  - Hospitalizations
- Inconsistent data on…
  - Mortality
  - Comorbidities and illness severity
  - Procedures with low reimbursement rates
17
Preparing for commercial claims data
- Cost
  - $25-70K depending on size and scope of data request
- Size
  - 100 GB per year of data
  - Analysis sample sizes will differ from advertised sample sizes
- Manage expectations!
18
Registry data
- Observational cohorts of patients undergoing specific treatments or having specific conditions
- Purpose may be to assess…
  - Quality of care
  - Provider performance
  - Treatment safety/effectiveness
- Of interest today are hospital-based registries
19
OPTIMIZE-HF registry
- Hospital-based quality improvement program and internet-based registry for heart failure
- 2002-2005: 50,000 patients; >250 hospitals
- Transitioned to GWTG-HF in 2005
20
Registry data coverage
- Only patients treated at participating hospitals will be included
  (+) All patients at these hospitals included regardless of payor
  (-) Participating hospitals may not be representative of hospitals nationwide

% of group in selected states

State          US Elderly   Medicare FFS   OPTIMIZE-HF
California     10.1%        7.7%           13.8%
Florida        7.4%         7.0%           8.7%
Michigan       3.4%         4.0%           9.5%
New York       6.6%         6.1%           3.5%
Pennsylvania   5.2%         4.4%           6.7%
Texas          6.0%         6.5%           5.4%
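
One way to read the state shares on this slide is as a representation ratio: a state's share of OPTIMIZE-HF patients divided by its share of a reference population. The sketch below is only an illustration of that arithmetic; the ratio is not a measure used in the presentation, and the names `SHARES` and `representation_ratio` are hypothetical. A value above 1 suggests the state is over-represented in the registry, a value below 1 under-represented.

```python
# State shares (percent of each group), copied from the slide.
SHARES = {
    "California":   {"us_elderly": 10.1, "medicare_ffs": 7.7, "optimize_hf": 13.8},
    "Florida":      {"us_elderly": 7.4,  "medicare_ffs": 7.0, "optimize_hf": 8.7},
    "Michigan":     {"us_elderly": 3.4,  "medicare_ffs": 4.0, "optimize_hf": 9.5},
    "New York":     {"us_elderly": 6.6,  "medicare_ffs": 6.1, "optimize_hf": 3.5},
    "Pennsylvania": {"us_elderly": 5.2,  "medicare_ffs": 4.4, "optimize_hf": 6.7},
    "Texas":        {"us_elderly": 6.0,  "medicare_ffs": 6.5, "optimize_hf": 5.4},
}


def representation_ratio(state, reference="us_elderly"):
    """Return the registry share divided by the reference-population share."""
    row = SHARES[state]
    return row["optimize_hf"] / row[reference]


if __name__ == "__main__":
    for state in SHARES:
        vs_elderly = representation_ratio(state, "us_elderly")
        vs_ffs = representation_ratio(state, "medicare_ffs")
        print(f"{state:<13} vs US elderly: {vs_elderly:.2f}  vs Medicare FFS: {vs_ffs:.2f}")
```

By this measure Michigan is the most over-represented of the listed states relative to the US elderly population (about 2.8 times its share), while New York is the most under-represented (about half its share).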
21
Registry data quality
- Good data on…
  - Many of the things not included in Medicare data: labs, medications, treatment timing, process measures, contraindications (if collected)
- Inconsistent data on…
  - Post-hospitalization follow-up care
  - Outcomes, particularly long-term
22
Accessing registry data
- Networking and partnering
  - Many registries require that analyses be performed at selected analytical centers, which may have long queues
- Approval process via steering or executive committee
23
NHLBI longitudinal cohort studies
- Atherosclerosis Risk in Communities Study (ARIC)
- Cardiovascular Health Study (CHS)
- Framingham Heart Study
- Jackson Heart Study
- Multi-Ethnic Study of Atherosclerosis
- Women’s Health Initiative
24
Cardiovascular Health Study (CHS)
- Prospective, observational study of CV disease in the elderly (Washington Co., Maryland; Forsyth Co., NC; Sacramento Co., CA; and Pittsburgh, PA)
- Baseline exams occurred in 1989-90
- Minority cohort added at Year 5
- Annual exams, with ‘major’ exams occurring at year 5 (1992-93) and year 9 (1996-97); the last exam was year 11 (1998-99)
- 5,201 participants at baseline; 687 additional minority participants; 5,888 in total
25
Cardiovascular Health Study data elements
- What data are available
  - Demographics
  - Medical, personal history
  - Physiological measures, test results
  - QOL, depression
  - Cognitive function
- What data are not available
  - Service dates
  - Procedures
  - Hospital/physician
26
Cardiovascular Health Study data quality
- Main point
  - Data collected are of high quality
- Good data regarding…
  - Cardiovascular risk factors
  - Cardiovascular endpoints
  - General health
- Limited data on…
  - Non-cardiovascular risk factors
  - Non-cardiovascular endpoints
27
Accessing NHLBI cohort studies
- Via the NHLBI data repository
  - HIPAA identifiers, geography removed
- Via Coordinating Center for identifiable data
- Size
  - 20 MB per year of data
28
NHLBI-Medicare linked data sets
- CMS linked with…
  - CHS (1991-2004, 2005-2009 pending)
  - Framingham (2000-2009 pending)
  - Jackson Heart Study (2000-2009 pending)
  - Multi-Ethnic Study of Atherosclerosis (2000-2009 pending)
  - Atherosclerosis Risk in Communities
  - Women’s Health Initiative
29
Conclusion
- Large data sets abound
- Do yourself a favor…manage expectations!
30
Contact Information
Lesley Curtis
Lesley.curtis@duke.edu