

Statistical Weights and Methods for Analyzing HINTS Data HINTS Data Users Conference January 21, 2005 William W. Davis, Ph.D. Richard P. Moser, Ph.D. National Cancer Institute

HINTS Survey Carried Out by Westat
• List of telephone exchanges purchased
• Exchanges and numbers sampled using random digit dialing (RDD)
• Unwanted exchanges (e.g., business exchanges) screened out
• Exchanges with high minority representation were over-sampled (HINTS stratification)
• For more information, see L. Rizzo’s document on our website, “NCI HINTS Sample Design and Weighting Plan”

HINTS Statistical Weight
• Statistical weight: the number of people in the population that a sampled person represents
• HINTS statistical weights are derived from:
  • Selection probabilities
  • Number of telephones in the household
  • Response rates
  • Post-stratification adjustment

HINTS: Race Ethnicity

Race/Eth    N      %       Wgt N         Wgt %    Diff %
Hispanic    …      …       23,340,…      …        0.9%
White       …      …       143,031,…     …        -1.1%
Afr Amer    …      …       20,905,…      …        1.3%
Others      312    4.9%    12,028,337    5.7%     -0.8%
Missing     301    4.7%    10,148,812    4.8%     -0.1%
Total       …      …       209,454,…     …

Reflects the planned oversampling of minority exchanges.

Minorities Were Oversampled: Boxplot of Statistical Weights

Older Folks Participated at a Higher Rate

Females Participated at a Higher Rate

Participation Increased with Education

HINTS: Weighted vs. Unweighted Analyses
Unweighted HINTS analyses would have:
• Too many African Americans and Hispanics
• Too many 45+ and too few … year olds
• Too many females and too few males
• Too many people with high education

Variance/Bias Tradeoff for Mean

Estimate      Mean    Confidence Interval
Unweighted    …       …
Weighted      …       …

• The unweighted mean is biased
• The weighted mean has a larger variance
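The contrast between the two estimates can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the four responses and their weights are made-up toy values, not HINTS data, and the function names are hypothetical.

```python
def unweighted_mean(values):
    """Plain average: every respondent counts once."""
    return sum(values) / len(values)


def weighted_mean(values, weights):
    """Each respondent counts as many times as the people they represent."""
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)


# Hypothetical toy sample (NOT HINTS data): the first two respondents come
# from an oversampled group, so each carries a small weight.
heard = [1, 1, 0, 0]            # 1 = yes, 0 = no
weights = [1.0, 1.0, 3.0, 3.0]  # assumed statistical weights

print(unweighted_mean(heard))         # treats the sample as the population
print(weighted_mean(heard, weights))  # re-balances toward the population
```

In this toy sample the oversampled group answers "yes" more often, so the unweighted mean overstates the population proportion; the weighted mean corrects the centering at the cost of extra variance.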

HINTS Design Effect

Restriction                 CV    (1 + CV²)^(1/2)
African American females    …     …
Hispanic males              …     …

• CV is the coefficient of variation of the statistical weights
• 1 + CV² is called the design effect
• CIs are 17-31% larger due to the weights
• Small price to pay for correct centering
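As a rough illustration of the quantities on this slide, the sketch below computes the CV of a set of weights, the design effect 1 + CV², and its square root (the factor by which CI widths grow). It assumes the CV uses the population form of the standard deviation (dividing by n); the slide does not say which form was used, and the weights here are toy values, not HINTS weights.

```python
import math


def weight_cv(weights):
    """Coefficient of variation of the weights (population SD / mean).

    Dividing by n rather than n - 1 is an assumption made for this sketch.
    """
    n = len(weights)
    mean = sum(weights) / n
    variance = sum((w - mean) ** 2 for w in weights) / n
    return math.sqrt(variance) / mean


def design_effect(weights):
    """1 + CV^2, the quantity the slide calls the design effect."""
    return 1.0 + weight_cv(weights) ** 2


def ci_inflation(weights):
    """(1 + CV^2)^(1/2): multiplicative increase in CI width."""
    return math.sqrt(design_effect(weights))


toy_weights = [1.0, 1.0, 3.0, 3.0]  # hypothetical weights
print(weight_cv(toy_weights), design_effect(toy_weights), ci_inflation(toy_weights))
```

If the 17-31% figure refers to the (1 + CV²)^(1/2) column, it would correspond to CVs of roughly 0.61 to 0.85, since 1.17² ≈ 1.37 and 1.31² ≈ 1.72.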

Replicate weights
• What are replicate weights?
  • The 50 HINTS replicate weights (fwgt1-fwgt50) were obtained by deleting 1/50th of the subjects in the full sample (and re-weighting)
• Why do we need replicate weights?
  • They are used to estimate the variance of estimates obtained from the full sample, for example a mean or a regression coefficient
• For more information see the SUDAAN manual or Korn, E.L. and Graubard, B.I. (1999). Analysis of Health Surveys. John Wiley, p. 29.

Examples of HINTS Weights

Sub    fwgt      fwgt1      fwgt2
1      14,367    14,693     14,…
2      …,694     111,069    111,…
3      …,767     0          14,…
4      …,467     19,301     0

• Full sample weight (fwgt) and 2 replicate weights for 4 sampled people
• First two subjects are in both replicates while the other two are not
• The sum of each column of weights is the same: 209,454,391

Jackknife Estimate of Variance
• Full sample estimate
• Replicate estimate
• Jackknife estimate of variance
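The formulas on this slide did not survive in the transcript. The sketch below shows one common form of the replicate-weight jackknife, assuming the variance is a multiplier times the sum of squared differences between each replicate estimate and the full-sample estimate. The default multiplier (R - 1)/R is an assumption for illustration, not a value taken from the slides, and the data are toy values.

```python
def weighted_mean(values, weights):
    """Weighted mean; stands in for any estimator computed with weights."""
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)


def jackknife_variance(values, full_weights, replicate_weights, multiplier=None):
    """Jackknife variance of a weighted mean from replicate weights.

    replicate_weights is a list of R weight vectors (like fwgt1..fwgt50).
    multiplier defaults to (R - 1) / R, which is an assumption of this sketch.
    """
    n_reps = len(replicate_weights)
    if multiplier is None:
        multiplier = (n_reps - 1) / n_reps
    full_estimate = weighted_mean(values, full_weights)
    rep_estimates = [weighted_mean(values, rw) for rw in replicate_weights]
    return multiplier * sum((est - full_estimate) ** 2 for est in rep_estimates)


# Toy example (NOT HINTS data): 4 subjects, 4 replicates. Each replicate
# zeroes out one subject and re-weights the rest so the column sum is unchanged.
y = [1.0, 2.0, 3.0, 4.0]
full = [1.0, 1.0, 1.0, 1.0]
reps = [[0.0 if i == r else 4.0 / 3.0 for i in range(4)] for r in range(4)]

variance = jackknife_variance(y, full, reps)
print(variance, variance ** 0.5)  # variance and standard error of the mean
```

With equal weights and one subject deleted per replicate, this reduces to the familiar s²/n for a sample mean, which is a convenient sanity check on the implementation.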

HINTS Example
• “I’m going to read you a list of organizations. Before being contacted for this study, had you ever heard of: a. The National Institutes of Health?”
• Estimate the population proportion (and give 95% confidence intervals (CIs)) by race-ethnicity/gender

Heard of NIH? – 95% CIs

SAS and SUDAAN Procedures

Analysis type          SAS                                 SUDAAN
                       (not designed for survey analysis)  (designed for survey analysis)
Mean                   MEANS                               DESCRIPT
Crosstab               FREQ                                CROSSTAB
Multiple regression    REG or GLM                          REGRESS
Logistic regression    LOGISTIC                            RLOGIST

Comparing Results with Logistic Regression: SAS vs. SUDAAN
• SAS Proc Logistic (unweighted; weighted)
• SUDAAN Proc Rlogist (weighted)
• Proc Logistic (Standalone)
• Model: Internet = Age Education Race
• Where:
  • Outcome = Ever accessed internet (Yes=1)
  • Age (continuous)
  • Education (4 levels; ref = LT High School)
  • Race (5 levels; ref = Hispanic)

SUDAAN Syntax

    proc rlogist data=test design=jackknife;
    weight fwgt;
    jackwgts fwgt1-fwgt50/adjjack= ;
    class educ newrace;
    reflev educ=1 newrace=1;
    model internet= spage educ newrace ;
    run;

Note: Design, Weight, Jackwgts, and Adjjack statements are used regardless of procedure in SUDAAN

Results: Education
Some College vs. LT High School (ref)

Value                 SAS Proc Logistic    SAS Proc Logistic    SUDAAN Proc Rlogist
                      Unweighted           Weighted
Log Odds (Odds)       2.23 (9.30)          2.04 (7.71)          2.04 (7.71)
Standard Error (β)    …                    …                    …
95% CI (β)            (2.01, 2.45)         (1.85, 2.23)         (1.74, 2.34)

Note: Larger standard error and corresponding CI with SUDAAN

Results: Race
White vs. Hispanic (ref)

Value              SAS Proc Logistic    SAS Proc Logistic    SUDAAN Proc Rlogist
                   Unweighted           Weighted
Log Odds (Odds)    0.62 (1.86)          1.04 (2.84)          1.04 (2.84)
Standard Error     …                    …                    …
95% CI             (0.43, 0.81)         (0.85, 1.24)         (0.74, 1.35)
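The relationship between the reported log odds, odds, and interval widths can be checked with a short sketch. It assumes the intervals are symmetric Wald intervals with a 1.96 multiplier; that is an assumption of this example, not something the slides state. The inputs are the White vs. Hispanic values from the results table, and the helper names are hypothetical.

```python
import math

Z_95 = 1.96  # assumed normal multiplier for a 95% Wald interval


def odds_from_log_odds(beta):
    """Convert a log odds (logistic regression coefficient) to an odds ratio."""
    return math.exp(beta)


def implied_standard_error(ci_low, ci_high, z=Z_95):
    """Back out the standard error from a symmetric CI on the log-odds scale."""
    return (ci_high - ci_low) / (2.0 * z)


def odds_ratio_ci(ci_low, ci_high):
    """Transform a CI for the log odds into a CI for the odds ratio."""
    return math.exp(ci_low), math.exp(ci_high)


# White vs. Hispanic (ref), weighted analyses from the results table
sas_ci = (0.85, 1.24)     # SAS Proc Logistic, weighted
sudaan_ci = (0.74, 1.35)  # SUDAAN Proc Rlogist

print(odds_from_log_odds(1.04))  # close to the reported 2.84; beta is rounded
print(implied_standard_error(*sas_ci), implied_standard_error(*sudaan_ci))
print(odds_ratio_ci(*sudaan_ci))
```

Both weighted analyses give the same point estimate; only the standard error differs, because SUDAAN uses the replicate weights while weighted SAS Proc Logistic does not account for the survey design.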

Summary
• HINTS unweighted estimates are biased
• HINTS weights vary by race/ethnicity, gender, age and education
• HINTS replicate weights can be used to obtain valid confidence intervals
• We compare weighted and unweighted analyses using means and logistic regression