National Hospital Discharge Survey: A Hands-On Workshop Using Public-Use Data Files
Michelle N. Podgornik, MPH
2006 Data Users Conference
July 11, :30am-10:00am
Session #23
NHDS Background
- National probability survey of discharges from non-federal, short-stay hospitals in the United States
- Conducted annually since 1965
- Latest data available: 2004
NHDS Design
Three-stage design:
- Primary Sampling Units (PSUs)
- Hospitals
- Discharges
Patient Data
- Age
- Sex
- Race
- Source of payment
- Discharge status
- Marital status
Facility Data
- Geographic region
- Bed size
- Ownership
Medical Data
- Diagnoses
- Surgical and non-surgical procedures
- International Classification of Diseases, 9th Revision, Clinical Modification (ICD-9-CM)
Additional Data
- Days of care
- Month of discharge
- Diagnosis-related groups (DRGs)
- Analysis weight
Session Goals
At the end of this session, you will be able to:
- Understand some of the advantages and limitations of using downloadable NHDS data
- Successfully download data files from the Internet and create a SAS dataset for analysis
- Perform simple analyses using SAS
Using Online Data Files
Advantages:
- Obtain current data and data file documentation
- Obtain diagnosis-related group (DRG) information
Using Online Data Files
Limitations:
- Only single-year data files from 1996 through 2004 are available
- Variables necessary to run SUDAAN are not publicly available
Downloading Data
- Download data and data file layout from the NCHS website
- Fixed-width ASCII files are available for each data collection year since 1996
Extracting Data
- Downloaded data files must be “unzipped” using extraction software (e.g., WinZip)
- Your computer may have extraction software pre-installed; if not, a free evaluation version of WinZip is available at:
Data File Documentation
Each document includes:
- A description of the NHDS, including survey methodology
- File layout
- Parameters and equations used to calculate relative standard errors, a measure of the reliability of an estimate
Data File Documentation (cont’d)
Each document also includes:
- ICD-9-CM code changes
- Census population estimates
- Unweighted and weighted frequencies of selected variables
- Medical Abstract Form
Select “Extract All…” from the drop-down menu
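The same extraction step can also be scripted instead of done through the menu. The following is a minimal Python sketch (not part of the workshop's SAS exercises) using the standard-library `zipfile` module; the function name `extract_archive` and the file names in the usage note are hypothetical.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, dest_dir):
    """Extract every member of a .zip archive into dest_dir.

    Returns the list of member names found in the archive.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)  # create the target folder if needed
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        return archive.namelist()
```

A call such as `extract_archive("downloaded_file.zip", "nhds_data")` would unpack a downloaded archive into a folder named `nhds_data`; both names are placeholders for whatever you saved locally.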
Hands-On Exercises
- Double-click: My Computer
- Double-click: Local Disk (C:)
- Double-click: 2006 Data Users Conference
- Double-click: Exercises
Hands-On Exercises
- Creating a SAS dataset
- Generating simple unweighted and weighted frequencies
- Calculating first-listed, any-listed, and all-listed diagnoses
- Calculating all-listed procedures
Exercise #1 Creating a SAS dataset
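To make the idea of reading a fixed-width ASCII file concrete, here is a minimal Python sketch of the same step (the workshop itself does this in SAS). The column positions in `LAYOUT` are hypothetical placeholders, not the NHDS layout; the real positions come from the file layout in the data file documentation.

```python
# Hypothetical layout: field name -> (start, end), 1-based inclusive columns.
# Replace these placeholder positions with those from the file layout document.
LAYOUT = {
    "age": (1, 3),
    "sex": (4, 4),
    "weight": (5, 9),
    "dx1": (10, 14),
}


def parse_record(line, layout=LAYOUT):
    """Slice one fixed-width line into a dict of stripped string fields."""
    return {
        name: line[start - 1:end].strip()
        for name, (start, end) in layout.items()
    }


def read_fixed_width(path, layout=LAYOUT):
    """Read every non-blank line of a fixed-width ASCII file into a list of dicts."""
    with open(path, encoding="ascii") as handle:
        return [
            parse_record(line.rstrip("\n"), layout)
            for line in handle
            if line.strip()
        ]
```

Fields are kept as strings so that codes with leading zeros are not altered; convert numeric fields such as the weight explicitly before doing arithmetic.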
Exercise #2 Generating simple unweighted frequencies
Exercise #3 Generating simple weighted frequencies
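The difference between unweighted and weighted frequencies can be sketched in a few lines of Python (an illustration only; the workshop exercises use SAS). Unweighted frequencies count sample records, while weighted frequencies sum the analysis weight to estimate discharges. The field names `sex` and `weight` and the records below are made up for the example.

```python
from collections import Counter, defaultdict


def unweighted_freq(records, field):
    """Count sample records in each category of `field`."""
    return Counter(rec[field] for rec in records)


def weighted_freq(records, field, weight_field="weight"):
    """Sum the analysis weight in each category of `field`."""
    totals = defaultdict(float)
    for rec in records:
        totals[rec[field]] += float(rec[weight_field])
    return dict(totals)


# Made-up records for illustration only; these are not NHDS data.
sample = [
    {"sex": "1", "weight": 120},
    {"sex": "2", "weight": 80},
    {"sex": "2", "weight": 150},
]

print(unweighted_freq(sample, "sex"))
print(weighted_freq(sample, "sex"))
```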
Exercise #4 Calculating first-listed diagnoses
Definition
First-listed diagnosis: Principal diagnosis (if specified) or the diagnosis listed first on the medical record face sheet
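As a rough illustration of this definition, the Python sketch below (not the workshop's SAS code) looks only at the first diagnosis field of each record. The record structure, with a `dx` list and a `weight`, and the code strings are assumptions made for the example.

```python
def first_listed_estimate(records, code):
    """Sum the weights of records whose first diagnosis field equals `code`."""
    return sum(
        rec["weight"]
        for rec in records
        if rec["dx"] and rec["dx"][0] == code
    )


# Made-up records for illustration only; "dx" holds diagnosis code strings
# in the order they are listed on the record.
sample = [
    {"weight": 100, "dx": ["4280", "25000"]},
    {"weight": 50, "dx": ["25000", "4280"]},
    {"weight": 70, "dx": ["486"]},
]

print(first_listed_estimate(sample, "4280"))
```

Only the first record contributes here, because the second record lists the same code in the second position.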
Exercise #5 Calculating any-listed diagnoses
Definition
Any-listed diagnosis: The occurrence of a diagnosis at least once in a record, regardless of position
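One way to picture an any-listed count is the Python sketch below (an illustration, not the workshop's SAS exercise): a record is counted once if the code appears in any diagnosis position. The `dx` list, the `weight` field, and the code strings are assumptions made for the example.

```python
def any_listed(records, code):
    """Return (sample records, weighted estimate) for records listing `code` anywhere.

    Each record is counted at most once, regardless of the position of the code.
    """
    matching = [rec for rec in records if code in rec["dx"]]
    return len(matching), sum(rec["weight"] for rec in matching)


# Made-up records for illustration only.
sample = [
    {"weight": 100, "dx": ["4280", "25000"]},
    {"weight": 50, "dx": ["486", "25000", "4280"]},
    {"weight": 70, "dx": ["486"]},
]

print(any_listed(sample, "4280"))
```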
Exercise #6 Calculating all-listed diagnoses
Definition
All-listed diagnosis: Total number of times (up to seven) that a diagnosis appears in a record
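The Python sketch below (illustrative only; the workshop uses SAS) shows how an all-listed count differs from a per-record count: every matching diagnosis field is counted, so one record can contribute more than once. Matching by code prefix, the `dx` and `weight` fields, and multiplying mentions by the record weight are assumptions made for this example.

```python
def all_listed(records, prefix, max_fields=7):
    """Return (unweighted mentions, weighted mentions) of codes starting with `prefix`.

    Every matching diagnosis field is counted, up to `max_fields` per record.
    """
    mentions = 0
    weighted = 0.0
    for rec in records:
        hits = sum(1 for dx in rec["dx"][:max_fields] if dx.startswith(prefix))
        mentions += hits
        weighted += hits * rec["weight"]
    return mentions, weighted


# Made-up records for illustration only.
sample = [
    {"weight": 100, "dx": ["4280", "4281", "25000"]},  # two matching fields
    {"weight": 50, "dx": ["25000", "4280"]},           # one matching field
    {"weight": 70, "dx": ["486"]},                     # no match
]

print(all_listed(sample, "428"))
```

The first record contributes two mentions, which is what separates this count from an any-listed count of the same records.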
Exercise #7 Calculating all-listed procedures
Definition
All-listed procedure: Total number of times (up to four) that a procedure appears in a record
Reliability of Estimates
- Estimates should be based on at least 30 sample records
- Estimates should also have a relative standard error (RSE) of less than 30 percent
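The two criteria above can be written as a simple check. This Python sketch is an illustration, not the workshop's SAS code; it takes the standard error as an input, since the parameters and equations for computing it are given in the data file documentation. The function names are hypothetical.

```python
def relative_standard_error(estimate, standard_error):
    """Return the RSE as a percentage: 100 * standard error / estimate."""
    if estimate <= 0:
        raise ValueError("estimate must be positive")
    return 100.0 * standard_error / estimate


def is_reliable(n_sample_records, rse_percent, min_records=30, max_rse=30.0):
    """Apply both criteria: at least 30 sample records and an RSE below 30 percent."""
    return n_sample_records >= min_records and rse_percent < max_rse


# Made-up numbers for illustration only.
rse = relative_standard_error(estimate=1000, standard_error=250)
print(rse, is_reliable(n_sample_records=45, rse_percent=rse))
```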
How To Increase Reliability
- Combine multiple years of data until you have at least 30 raw cases in cells of interest
- RSE improves with the number of years combined
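A small Python sketch of the pooling idea follows (illustrative only, not the workshop's SAS code). It adds years, most recent first, until the raw count for a cell reaches the threshold. The yearly counts are made up, and starting from the most recent year is an assumption of this example.

```python
def pool_years(raw_counts_by_year, min_records=30):
    """Pool years, most recent first, until the raw count reaches min_records.

    Returns (years used, pooled raw count). If the threshold is never reached,
    all years are returned together with their combined count.
    """
    pooled = 0
    used = []
    for year in sorted(raw_counts_by_year, reverse=True):
        used.append(year)
        pooled += raw_counts_by_year[year]
        if pooled >= min_records:
            break
    return used, pooled


# Made-up raw (unweighted) counts for one cell of interest.
counts = {2002: 9, 2003: 12, 2004: 14}
print(pool_years(counts))
```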
Other Sources of Public-Use Data
Multi-year data file CD-ROM:
- Admission month is available on the multi-year data file but not on the single-year data files
- Diagnosis-related group (DRG) information is available on the single-year data files but not on the multi-year data file
- The order of variables is slightly different
How to Obtain More Information
- Visit the NHDS website
- Contact the Hospital Care Statistics Branch by calling or by e-mailing
To request electronic copies of this PowerPoint presentation and the SAS exercises, please send an e-mail to Michelle Podgornik at