Secondary Data Analysis Linda K. Owens, PhD Assistant Director for Sampling and Analysis Survey Research Laboratory University of Illinois.

Slides:



Advertisements
Similar presentations
9. Weighting and Weighted Standard Errors. 1 Prerequisites Recommended modules to complete before viewing this module  1. Introduction to the NLTS2 Training.
Advertisements

Multiple Indicator Cluster Surveys Survey Design Workshop
Complex Surveys Sunday, April 16, 2017.
Chapter 7 Sampling Distributions
Dr. Chris L. S. Coryn Spring 2012
2.2: Sampling methods (pp. 17 – 20) Probability sampling: methods that can specify the probability that a given sample will be selected. Randomization:
Sampling Prepared by Dr. Manal Moussa. Sampling Prepared by Dr. Manal Moussa.
Ratio estimation with stratified samples Consider the agriculture stratified sample. In addition to the data of 1992, we also have data of Suppose.
A new sampling method: stratified sampling
Probability Sampling.
Formalizing the Concepts: Simple Random Sampling.
Sampling ADV 3500 Fall 2007 Chunsik Lee. A sample is some part of a larger body specifically selected to represent the whole. Sampling is the process.
SAMPLING METHODS. Reasons for Sampling Samples can be studied more quickly than populations. A study of a sample is less expensive than studying an entire.
Trends in Chronic Diseases by Demographic Variables, Hawaii’s Older Population, Hawaii Health Survey (HHS) K. Kromer Baker 1, A. T. Onaka 1, B. Horiuchi.
Understanding and Using NAMCS and NHAMCS Data Data Tools and Basic Programming Techniques 2010 National Conference on Health Statistics August 16, 2010.
Joint Canada/U.S. Health Survey Catherine Simile, National Center for Health Statistics Patrice Mathieu, Statistics Canada Ed Rama, Statistics Canada NCHS.
How survey design affects analysis Susan Purdon Head of Survey Methods Unit National Centre for Social Research.
Sample Design.
Aspects of the National Health Interview Survey (NHIS) Chris Moriarity National Conference on Health Statistics August 16, 2010
Complexities of Complex Survey Design Analysis. Why worry about this? Many government studies use these designs – CDC National Health Interview Survey.
McGraw-Hill/Irwin McGraw-Hill/Irwin Copyright © 2009 by The McGraw-Hill Companies, Inc. All rights reserved.
Definitions Observation unit Target population Sample Sampled population Sampling unit Sampling frame.
DESIGN FEATURES OF NCHS SURVEYS By Iris Shimizu Mathematical Statistician Office of Research and Methodology, NCHS Disclaimer: The opinions in this presentation.
CRIM 430 Sampling. Sampling is the process of selecting part of a population Target population represents everyone or everything that you are interested.
Sampling: Theory and Methods
“Analyzing Health Equity Using Household Survey Data” Owen O’Donnell, Eddy van Doorslaer, Adam Wagstaff and Magnus Lindelow, The World Bank, Washington.
Design Effects: What are they and how do they affect your analysis? David R. Johnson Population Research Institute & Department of Sociology The Pennsylvania.
Scot Exec Course Nov/Dec 04 Survey design overview Gillian Raab Professor of Applied Statistics Napier University.
1 Introduction to Survey Data Analysis Linda K. Owens, PhD Assistant Director for Sampling & Analysis Survey Research Laboratory University of Illinois.
1 Hair, Babin, Money & Samouel, Essentials of Business Research, Wiley, Learning Objectives: 1.Understand the key principles in sampling. 2.Appreciate.
LECTURE 3 SAMPLING THEORY EPSY 640 Texas A&M University.
JENNIFER SAYLOR, PHD, RN, ANCS-BC UNIVERSITY OF DELAWARE SEPTEMBER 14, 2012 Essentials of Complex Data Analysis Utilizing National Survey.
Sampling Methods.
Sampling Design and Analysis MTH 494 Lecture-30 Ossam Chohan Assistant Professor CIIT Abbottabad.
Current Population Survey Sponsor: Bureau of Labor Statistics Collector: Census Bureau Purpose: Monthly Data for Analysis of Labor Market Conditions –CPS.
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 1-1 Statistics for Managers Using Microsoft ® Excel 4 th Edition Chapter.
Sampling Design and Analysis MTH 494 LECTURE-12 Ossam Chohan Assistant Professor CIIT Abbottabad.
Growing Challenges to State Telephone Surveys of Health Insurance Coverage: Minnesota as a Case Study Supported by a grant from the Minnesota Department.
1 Introduction to Survey Data Analysis Linda K. Owens, PhD Assistant Director for Sampling & Analysis Survey Research Laboratory University of Illinois.
5-4-1 Unit 4: Sampling approaches After completing this unit you should be able to: Outline the purpose of sampling Understand key theoretical.
National Health Care Surveys Budget Update/Options NCHS Board of Scientific Counselors April 24, 2008 Jane E. Sisk Centers for Disease Control.
1 Using National Hospital Ambulatory Medical Care Survey (NHAMCS) data for injury analysis Linda McCaig Ambulatory Care Statistics Branch Division of Health.
Basic Business Statistics, 10e © 2006 Prentice-Hall, Inc.. Chap 7-1 Chapter 7 Sampling Distributions Basic Business Statistics.
Sampling Design and Procedures 7 th Session of Marketing Reseach.
Introduction to Secondary Data Analysis Young Ik Cho, PhD Research Associate Professor Survey Research Laboratory University of Illinois at Chicago Fall,
RESEARCH DATA CENTER Types of Data. Major NCHS Surveys and Data Systems National Health and Nutrition Examination Survey (NHANES) National Health Interview.
Chapter 6: 1 Sampling. Introduction Sampling - the process of selecting observations Often not possible to collect information from all persons or other.
Introduction to Survey Sampling
Basic Business Statistics, 11e © 2009 Prentice-Hall, Inc. Chap 7-1 Chapter 7 Sampling and Sampling Distributions Basic Business Statistics 11 th Edition.
Statistics Canada Citizenship and Immigration Canada Methodological issues.
Basic Business Statistics
CASE STUDY: NATIONAL SURVEY OF FAMILY GROWTH Karen E. Davis National Center for Health Statistics Coordinating Center for Health Information and Service.
1 of 22 INTRODUCTION TO SURVEY SAMPLING October 6, 2010 Linda Owens Survey Research Laboratory University of Illinois at Chicago
Sampling technique  It is a procedure where we select a group of subjects (a sample) for study from a larger group (a population)
© 2007 Thomson, a part of the Thomson Corporation. Thomson, the Star logo, and Atomic Dog are trademarks used herein under license. All rights reserved.
1 STAT 500 – Statistics for Managers STAT 500 Statistics for Managers.
Chapter 12 Vocabulary. Matching: any attempt to force a sample to resemble specified attributed of the population Population Parameter: a numerically.
Using Data from the National Survey of Children with Special Health Care Needs Centers for Disease Control and Prevention National Center for Health Statistics.
NHANES Analytic Strategies Deanna Kruszon-Moran, MS Centers for Disease Control and Prevention National Center for Health Statistics.
Sample Design of the National Health Interview Survey (NHIS) Linda Tompkins Data Users Conference July 12, 2006 Centers for Disease Control and Prevention.
Sampling. Census and Sample (defined) A census is based on every member of the population of interest in a research project A sample is a subset of the.
Presented by: Khaleel S. Hussaini PhD Bureau Chief, Public Health Statistics Division of Public Health Preparedness Judy Bass Arizona’s BRFSS Coordinator.
1. 2 DRAWING SIMPLE RANDOM SAMPLING 1.Use random # table 2.Assign each element a # 3.Use random # table to select elements in a sample.
Working with the ECLS-B Datasets Weights and other issues.
Population and samples
SAMPLE DESIGN.
Meeting-6 SAMPLING DESIGN
Introduction to Survey Data Analysis
Random sampling Carlo Azzarri IFPRI Datathon APSU, Dhaka
Presentation transcript:

Secondary Data Analysis Linda K. Owens, PhD Assistant Director for Sampling and Analysis Survey Research Laboratory University of Illinois

Survey Research Laboratory 2 of 27 What is secondary data? Data collected by a person or organization other than the users of the data

Survey Research Laboratory 3 of 27 Advantages of Secondary Data Unobtrusive Fast & inexpensive Avoid data collection problems Provide bases for comparison

Survey Research Laboratory 4 of 27 Disadvantages of Secondary Data Data availability Level of observation Quality of documentation Data quality control Outdated data

Survey Research Laboratory 5 of 27 Data Sources  Inter-university Consortium for Political and Social Research (ICPSR)  National Center for Health Statistics (NCHS)  Center for Medicare and Medicaid Services (CMS)  US Census Bureau

Survey Research Laboratory 6 of 27 Examples of Directly Downloadable Data from NCHS: National Health and Nutrition Examination Survey (NHANES) National Ambulatory Medical Care Survey (NAMCS) National Hospital Ambulatory Medical Care Survey (NHAMCS) National Hospital Discharge Survey (NHDS) National Home and Hospice Care Survey (NHHCS) National Nursing Home Survey (NNHS) National Survey of Ambulatory Surgery (NSAS) National Employer Health Insurance Survey (NEHIS) National Vital Statistics System (NVSS) National Health Interview Survey (NHIS) Data Sources (cont.)

Survey Research Laboratory Survey Documentation & Analysis Web-based analysis and documentation of 27

Survey Research Laboratory 8 of 27 Data Available for Use with Survey Documentation and Analysis (SDA): Aging Data Longitudinal Study of Aging, 70 Years and Older, National Survey of Self-Care and Aging: Follow-Up, 1994 National Health and Nutrition Examination Survey II: Mortality Study, 1992 National Hospital Discharge Survey, National Health Interview Survey, 1994, Second Supplement on Aging Criminal Justice Data International Crime Data Homicide Data National Crime Victimization Survey Data Corrections Data Data Sources (cont.)

Survey Research Laboratory 9 of 27 Data Available for Use with Survey Documentation and Analysis (continued): Substance Abuse Data Drug Abuse Warning Network Monitoring the Future National Household Survey on Drug Abuse National Pregnancy and Health Survey National Treatment Improvement Evaluation Study Treatment Episode Data Set Uniform Facility Data Set Washington, DC Metropolitan Area Drug Study (DC*MADS) Data Sources (cont.)

Survey Research Laboratory 10 of 27 Evaluation of Data Sources Purpose of the study Sponsor/collector of the data Mode of data collection Sampling procedures Consistency of data with other sources

Survey Research Laboratory 11 of 27 Evaluation of Data Sources (cont.) Documentation Number of observations Number of variables Coding scheme Summary statistics

Survey Research Laboratory 12 of 27 Types of Survey Sample Design Simple Random Sampling Systematic Sampling Complex sample designs ▪stratified designs ▪cluster designs ▪mixed mode designs

Survey Research Laboratory 13 of 27 Types of Survey Sample Design Simple Random Sampling  Each member of the population has an equal and known chance of being selected  Simple Random Sample With Replacement (SRSWR)  Simple Random Sample Without Replacement (SRSWOR)

Survey Research Laboratory 14 of 27 Types of Survey Sample Design Systematic Random Sampling  the selection of every k th element from a sampling frame with the sampling interval k (=N/n).

Survey Research Laboratory 15 of 27 Types of Survey Sample Design Stratified sample  The population is first divided into non- overlapping subpopulations: strata such as gender, race or SES.  Sample from each stratum.  Proportionate vs. disproportionate  Works most effectively when the variance of the dependent variable is smaller within the stratum than in the sample as a whole.

Survey Research Laboratory 16 of 27 Types of Survey Sample Design Cluster sample  Elements are selected in groups or clusters  PSU: Primary Sampling Unit. This is the first unit that is sampled in the design. For example, school districts from Chicago may be sampled and then schools within districts may be sampled.  Homogeneity within cluster: Intracluster correlation (ICC)

Survey Research Laboratory 17 of 27 Why complex survey design? Increased efficiency Decreased costs Sometimes the only option available

Survey Research Laboratory 18 of 27 Complex Survey Design Complex designs with clustering and unequal selection probabilities generally increase the sampling variance. Not accounting for the impact of complex sample design can lead to Type I error.

Survey Research Laboratory 19 of 27 Sample Weights “pweight” or selection weight: Used to adjust for differing probabilities of selection (=N/n). In theory, simple random samples are self-weighted In practice, simple random samples are likely to also require adjustments for non-response

Survey Research Laboratory 20 of 27 Types of Sample Weights Post-stratification weights: Typically used to adjust for minor differences in nonresponse by demographic subgroup. Bring the sample proportions in demographic subgroups into agreement with the population proportion in the subgroups. Requires auxiliary dataset to use as a comparison. Not a fix for bad sample design

Survey Research Laboratory Post-Stratification Weights Example 21 of 27 Sample Percent Population Percent Weight Male42%49%1.16 Female58%51%.879

Survey Research Laboratory 22 of 27 Types of Sample Weights (cont.) Non-response weights: Designed to inflate the weights of survey respondents to compensate for nonrespondents with similar characteristics. Only useful if nonresponse varies by stratum (unless inflating sample size to population size).

Survey Research Laboratory 23 of 27 Types of Sample Weights (cont.) “Blow-up” (expansion) weights: Weights sum to population total Provide estimates for the total population of interest

Survey Research Laboratory 24 of 27 Types of Sample Weights (cont.) Replicate weights: A series of weight variables that are used instead of PSUs and strata in an effort to protect the respondents' identity. Pweight and the replicate weights must be used for the correct calculation of the point estimate and its standard error.

Survey Research Laboratory Summary of Weights Weight for probability of selection Adjust for non-response Post-stratify Expand or contract to population/sample totals 25 of 27

Survey Research Laboratory 26 of 27 Syntax Examples of Design-based Analysis in STATA, SUDAAN & SAS STATA svyset strata strata svyset psu psu svyset pweight finalwt svyreg fatitk age male black hispanic SUDAAN proc regress data=”c:\nhanes.sav” filetype=spss desgn=wr; nest strata psu; weight finalwt subpgroup sex race; levels 2 3; model fatintk = age sex race;

Survey Research Laboratory 27 of 27 Syntax Examples of Design-based Analysis in STATA, SUDAAN & SAS SAS proc surveyreg data=nhanes; strata strata; cluster psu; class sex race; model fatintk = age sex race; weight finalwt;