Analysis of Complex Survey Data Katherine M. Keyes


Purpose of this class
– Teach you how to analyze complex survey data using SUDAAN
– Provide you with the tools to: 1) find datasets that fit your research interests; 2) download and manage those datasets; 3) do your own analyses

Structure of the class
1:00-2:00 Lecture
2:00-3:30 Guided exercise
3:30-3:45 Break
3:45-5:00 Independent research project

Today’s schedule
– Introduction to each other
– Key concepts in complex surveys
– Introduction to the NHANES (focus on describing the complexities in sample and design weights)
– Preparing an analytic dataset: locate variables; download data files; append and merge datasets; clean and recode data; format and label variables; save datasets

Who am I?

Who are you?

What is ‘complex survey data’ Complex survey data usually refers to sample designs in which respondents have been sampled in a way that is multi-stage, stratified, unequally weighted, and/or clustered. Because of these design elements, the sample is no longer a simple random sample, which violates the independence assumptions of basic large-sample statistics.

What is ‘complex survey data’ Because of this, we need to take into account the design elements when estimating standard errors.

Two types of weights commonly used
– SAMPLE WEIGHTS: adjust for oversampling of certain, typically hard-to-reach, groups (e.g., young people) and for informative nonresponse
– DESIGN WEIGHTS: adjust the standard errors for the nonrandom probability of selection into the sample
TAKE HOME MESSAGE:
– Sample weights primarily affect the ESTIMATES
– Design weights affect the STANDARD ERRORS and not the ESTIMATES
– We need SUDAAN to incorporate the design weights.
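
To make the first half of the take-home message concrete, here is a minimal sketch of how sample weights change a point estimate. The data are invented toy values, and `weighted_mean` is a hypothetical helper, not a SUDAAN or NHANES function.

```python
def weighted_mean(values, weights):
    """Weighted mean: each respondent counts in proportion to the
    number of people in the population he or she represents."""
    total_weight = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical toy data: y = 1 if the respondent has the condition, else 0.
# The first two respondents come from an oversampled group and each
# represents only 100 people; the last two each represent 900 people.
y = [1, 1, 0, 0]
w = [100, 100, 900, 900]

unweighted = sum(y) / len(y)    # treats every respondent equally
weighted = weighted_mean(y, w)  # corrects for the oversampling

print(unweighted, weighted)
```

In this toy sample the oversampled group makes up half of the respondents but only a tenth of the population they represent, so the unweighted proportion overstates the prevalence.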

Design weights: what are they
– Strata: larger geographic unit
– Primary Sampling Units (PSUs): generally single counties or groups of small counties
– Households

Introduction to the data we will be using in this class National Health and Nutrition Examination Survey “A program of studies designed to assess the health and nutritional status of adults and children in the United States. The survey is unique in that it combines interviews with physical examinations.”

Introduction to the data we will be using in this class
Table columns: Years | Survey name | Age range
Survey names: NHES I, NHES II, NHES III, NHANES I, NHANES II, NHANES III Phase I, NHANES III Phase II, followed by six continuous NHANES cycles

Domains of inquiry in the NHANES Demographic background Housing characteristics Smoking Consumer behavior Income Food security Tracking and tracing Acculturation Arthritis Audiometry Blood pressure Cardiovascular disease Dermatology Diabetes Dietary screener Dietary behavior Early childhood Health insurance Hospital utilization and access to care Immunization Kidney conditions Occupation Oral health Osteoporosis

Domains of inquiry in the NHANES Physical activity and physical fitness Physical functioning Respiratory Health and Disease Sleep disorders Weight history Reproductive health Illegal drug use Depression Alcohol use Pesticide use Bowel health

Physical exam includes measures of: Arthritis Audiometry Bone density (DXA) Anthropometry Oral Glucose Tolerance Test Oral Health Physician’s Exam Respiratory Health

Laboratory components include measures of: Venipuncture Urine collection Bone mineral status markers Diabetes profile Infectious disease profile Oral HPV C-reactive protein Thyroid profile Standard biochemical profile Kidney disease profile Pregnancy test Prostate Specific Antigen Nutritional biochemistries and hematologies STD profile Blood lipids Environmental health profile

DNA Blood samples for DNA purification were collected from participants age 20 or more years in survey years and These are restricted access data

Landmark findings and public health results
High blood lead levels – Lead out of gasoline
Low folate levels – Mandatory food fortification
Rising levels of obesity – Public health action plan
Racial/ethnic disparities in Hepatitis B – Universal vaccination of all infants and children

NHANES not for you? The concepts we will discuss apply to many other publicly available datasets, and you are encouraged to use these data for your in-class project if your research questions are not covered in the NHANES. Where can I find other publicly available datasets? – ICPSR:

SAMPLE WEIGHTING IN THE NHANES

Design weights: variable names
– Strata: SDMVSTRA
– PSU: SDMVPSU
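
As a rough illustration of why the strata and PSU variables matter for standard errors, the sketch below computes a design-based standard error for a weighted mean using Taylor linearization with PSUs treated as sampled with replacement. This is an assumption about one common approach, not a reproduction of SUDAAN's internals, and the function name and toy data are hypothetical. The `strata` and `psu` arguments play the roles of SDMVSTRA and SDMVPSU.

```python
from collections import defaultdict
from math import sqrt


def design_based_mean_se(y, w, strata, psu):
    """Weighted mean and its linearized standard error.

    Assumes a stratified, clustered design with PSUs sampled with
    replacement within strata (a common approximation)."""
    total_w = sum(w)
    mean = sum(wi * yi for wi, yi in zip(w, y)) / total_w

    # Sum each observation's linearized score to the PSU level.
    psu_totals = defaultdict(lambda: defaultdict(float))
    for yi, wi, h, j in zip(y, w, strata, psu):
        psu_totals[h][j] += wi * (yi - mean) / total_w

    # Between-PSU variability within each stratum drives the variance.
    variance = 0.0
    for h, totals in psu_totals.items():
        n_h = len(totals)
        if n_h < 2:
            raise ValueError(f"stratum {h!r} needs at least two PSUs")
        z_bar = sum(totals.values()) / n_h
        variance += n_h / (n_h - 1) * sum((z - z_bar) ** 2 for z in totals.values())
    return mean, sqrt(variance)


# Hypothetical toy data: 2 strata, 2 PSUs each, equal weights.
y = [1, 1, 0, 0, 1, 0, 1, 0]
w = [1.0] * 8
strata = [1, 1, 1, 1, 2, 2, 2, 2]
psu = [1, 1, 2, 2, 1, 1, 2, 2]

mean, se = design_based_mean_se(y, w, strata, psu)
naive_se = sqrt(mean * (1 - mean) / len(y))  # ignores the design
print(mean, se, naive_se)
```

In the toy data the point estimate is the same either way, but respondents within the PSUs of stratum 1 are identical to each other, so the design-based standard error comes out larger than the naive one.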

Sample weights in the NHANES
– If only data from the interviewed sample are used, the appropriate SAS variable is WTINT2YR.
– If data from the medical examination are used, the appropriate SAS variable is WTMEC2YR.
– Some data are only collected on subsamples of NHANES participants. These data are generally not publicly available or are only released a few years after the main interview data.
– If you are using data on a subsample of NHANES participants, appropriate subsample weights must be used; they are included on any data file where relevant.

Combining NHANES samples For NHANES , SDMVSTRA is numbered 1 to 13; for NHANES SDMVSTRA is numbered 14-28; for NHANES SDMVSTRA is numbered 29-43; etc. Therefore, two-year NHANES cycles can be combined without any recoding of this variable.

Combining NHANES samples: For the and survey periods, Mexican Americans were oversampled but non-Mexican American Hispanics were not oversampled. Therefore, estimates for Hispanics who are not Mexican American are generally unreliable and should not be analyzed. Further, estimates for ‘all Hispanics’ should not be calculated.

Combining NHANES samples: , The sample design of NHANES is different from the sample designs for earlier cycles. Adolescents were no longer oversampled. Non-Mexican American Hispanics were oversampled, allowing for estimates of “all Hispanics” (but smaller subgroups remain unreliable).

Summary: combining samples The NHANES sample designs for the periods and were similar, such that combining data cycles within these periods does not present any analytic issues. When combining with the data, however, data users should not create estimates for total Hispanics for the data period. For non-Hispanic white, non-Hispanic black, and Mexican American sample domains, rescaling the sample weights to create four-year weights should be sufficient. However, users should check estimates carefully to see whether the four-year estimates and sampling errors are consistent with each set of 2-year estimates.

Reweighting the data when combining samples
– When combining two or more 2-year cycles of the continuous NHANES, the user must calculate new sample weights before beginning any analysis of the data.
– A set of four-year weights has already been created for the data (e.g., for the MEC sample it is WTMEC4YR).
– For four-year estimates for , one can create a new variable for a four-year weight by assigning ½ of the 2-year weight for if the person was sampled in or assigning ½ of the 2-year weight for if the person was sampled in
– For an estimate for the 6 years of , a 6-year weight variable can be created by assigning 2/3 of the 4-year weight for if the person was sampled between or assigning 1/3 of the 2-year weight for if the person was sampled in
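
The rescaling rules above can be sketched in a few lines. The function names are hypothetical, and the arguments stand in for the 2-year and 4-year weights (for the MEC sample, WTMEC2YR and WTMEC4YR). Which cycles a participant belongs to is left to the caller, since that depends on the survey years being combined.

```python
def four_year_weight(wt_2yr):
    """Four-year weight when combining two 2-year cycles:
    half of the participant's 2-year weight."""
    return wt_2yr / 2


def six_year_weight(wt_2yr=None, wt_4yr=None):
    """Six-year weight when combining three 2-year cycles.

    Pass wt_4yr for participants sampled in the two cycles that already
    have a released 4-year weight; pass wt_2yr for participants sampled
    in the remaining cycle."""
    if wt_4yr is not None:
        return wt_4yr * 2 / 3
    if wt_2yr is not None:
        return wt_2yr / 3
    raise ValueError("supply either wt_2yr or wt_4yr")


# Hypothetical participants with invented weights.
combined_4yr = four_year_weight(3000.0)           # half of the 2-year weight
combined_6yr_early = six_year_weight(wt_4yr=1500.0)  # two thirds of the 4-year weight
combined_6yr_late = six_year_weight(wt_2yr=3000.0)   # one third of the 2-year weight
print(combined_4yr, combined_6yr_early, combined_6yr_late)
```

The fractions reflect how many 2-year cycles each original weight already covers, so that the rescaled weights across all combined cycles still sum to roughly one population rather than two or three.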

LAB #1: PREPARING AN ANALYTIC DATASET Open the Word document “Lab 1: Preparing an analytic dataset”