Using the 2008 OFHS Public Use File: A Self-Guided Tutorial (Stata Version)

Introduction
This tutorial is intended for persons who wish to use the 2008 OFHS Public Use File (PUF). The PUF excludes any information that could, intentionally or unintentionally, identify a respondent. Geographic information below the county level has been removed. The dataset is a record of the responses to the survey questions at the respondent level. This version of the dataset is in a format that requires Stata, a statistical analysis package from StataCorp. The dataset is also available for SAS and SPSS. There is a separate tutorial for SAS users.

Stata Users: Prerequisites
–User has Stata Release 9 or higher.
–User has experience writing Stata programs.
–User has an understanding of basic statistics, including analysis of univariate data using nominal and ordinal level variables.
–User is comfortable with statistical terms such as proportions, standard error, confidence level, and confidence interval.

OFHS Background
The 2008 OFHS is the largest state-sponsored health survey in the U.S. Previous surveys were completed in 1998 and [...]. The 2008 survey had a sample size of 50,993. The sample was stratified so that there are enough respondents to support some analysis for each county in the state.

Documents that you may download before you get started:
–OFHS Questionnaire
–OFHS Codebook
These documents are available on the OFHS web site. Look on the Downloads page.

What you need to know about the survey:
–Survey Design
–Survey Questions
–Imputation of Missing Values
–Weighting of Responses
–Constructed Variables

Survey Design
The survey is a stratified random sample of Ohio’s non-institutional population.
–Conducted through telephone interviews.
    Land lines (49,000 respondents)
    Cell phones (2,000 respondents)
–Random Digit Dialing (land lines) within exchange numbers associated with each county.
    Exchanges are the first 3 digits of a seven-digit phone number.
    The last four digits within each exchange are randomly selected.

Survey Design
–Cell phones
    Exchanges are at the state level.
–Oversamples
    African Americans: some exchanges in the 6 largest urban counties have a higher proportion of African Americans in the population. These higher-proportion exchanges were sampled at a higher rate.
    Asians and Hispanics: the survey was supplemented with lists of persons with Hispanic or Asian surnames.
–Household clusters
    Each household/family forms a cluster within the sample.
    –One adult and one child are randomly selected within the family.
    –Each response includes information on the adult, and on the child (if there are any children).
    –The adult who is most knowledgeable about the child’s health responds for the child.

Survey Design The population of persons within each of the strata (State, County, telephone exchange, household, etc.) is already known or is collected as a part of the survey. A weight is established for each child and adult which reflects the inverse of the probability of being selected for the survey. Indicators of the strata and the weights are used in the STATA programs. We will come back to this later on.

Survey Questions
In the survey questionnaire there are different kinds of questions. They include:
–Questions that help to establish the weights for the survey.
    How many children are in the family?
    How many phone numbers are in the home?

Survey Questions
–Questions that identify the demographic and socioeconomic characteristics of the individuals and the family.
    Age, gender, race, ethnicity
    Family income, employment, occupation
    Education

Survey Questions
–Questions that identify the insurance status of the adult and child respondents.
    Source of coverage (job based, Medicare, Medicaid, etc.)
    If no insurance, the length of time without insurance
    Difficulty in getting insurance
    Types of coverage (dental, prescriptions, vision, mental health)

Survey Questions
–Health status of adult and child
    General health status
    Chronic health conditions
    Special health care needs
    Functional disability
    Height and weight

Survey Questions
–Health care access, utilization, satisfaction, and unmet needs
    Usual source of care
    Care coordination
    Specialists
    Emergency room use
    Hospitalizations
    Types of unmet needs

Survey Questions
Questions are at multiple levels.
–Anchor questions are questions that are asked of everyone.
–Qualifying questions are questions that help to narrow down who should be responding to an in-depth question.
–In-depth questions probe the dimensions of the respondent’s experience with a particular phenomenon.

Example of Question Levels
Anchor Question
D43. //Have you/Has person in S1// ever been told by a doctor or any other health professional that //you/he// had diabetes or sugar diabetes?
    01 YES
    02 (Skip to D45) NO
    03 [VOLUNTEERED:] BORDERLINE
    98 DK
    99 REFUSED
D43a. //Have you/Has person in S1// ever been told by a doctor or any other health professional that //you/he/she// had TYPE 1 CHILD ONSET DIABETES or TYPE 2 ADULT ONSET DIABETES?
    [INTERVIEWER NOTE: PROBE FOR TYPE, AND IF RESPONDENT SAYS ‘BORDERLINE’ CODE AS ‘03’]
    //Display response option 97, only if S15 = 02, 99. //
    97 (Skip to D45) [VOLUNTEERED:] YES, “GESTATIONAL” OR “ONLY WHEN PREGNANT” MENTIONED
    01 YES - TYPE I (JUVENILE)
    02 YES - TYPE II (ADULT ONSET)
    03 [VOLUNTEERED:] BORDERLINE DIAGNOSIS ONLY
    04 (Skip to D45) NO, NEVER DIAGNOSED WITH DIABETES
    98 (Skip to D45) DK
    99 (Skip to D45) REFUSED

Example of Question Levels
Qualifying Question
D43b. //If (s15 = 02) then ask:// //Was your/Was person in S1’s// DIABETES only during a time associated with a pregnancy?
    [INTERVIEWER: PROBE FOR PROPER CODE]
    01 (Skip to D45) YES ONLY WHEN PREGNANT
    02 NO
    98 (Skip to D45) DK
    99 (Skip to D45) REFUSED
In-Depth Question
D44. //Is your/Is person on S1’s// blood sugar or glucose level, which affects diabetes, USUALLY under control or where a physician wants it, even if medication is required: Always, Usually, Sometimes, Rarely, or Never?
    01 ALWAYS
    02 USUALLY
    03 SOMETIMES
    04 RARELY
    05 NEVER
    98 DK
    99 REFUSED

Question Levels
Notice in the example that there are instructions to skip to another question depending on the answer (for example, if the answer is no). The anchor and qualifying questions screen out persons who should not answer the in-depth questions. When a question is not asked of a respondent, the result is a missing value for that respondent which is MISSING BY DESIGN.

Missing Values
Some data are missing in the survey because the respondent refused to answer the question, or did not know the answer. These kinds of missing values need to be treated differently than those that are ‘missing by design’.
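
One way to keep the two kinds of missing data apart in Stata is to recode the "don't know" and "refused" codes to extended missing values, leaving the system missing value (.) for questions that were skipped. The sketch below is only an illustration: the variable name d44 is hypothetical (check the OFHS Codebook for the actual name), and it assumes that the file stores codes 98 and 99 as in the questionnaire and that skipped questions are stored as system missing.

```stata
* Hypothetical variable name: d44 (the in-depth diabetes question).
* Show every category, including missing values, before recoding.
tabulate d44, missing

* Recode "don't know" (98) and "refused" (99) to extended missing values,
* writing the result to a new variable so the original is untouched.
recode d44 (98 = .d) (99 = .r), generate(d44_clean)

* .d and .r are now listed separately from system missing (.).
tabulate d44_clean, missing
```

Keeping the original variable makes it easy to compare the recoded counts against the codebook frequencies.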

Missing Values
There are some types of questions which are very important to the survey design or for public policy issues, for which it is not acceptable to have values missing. These include questions like:
–Number of children in the family (design)
–Family income (public policy)

Imputation of Missing Values
Where it is important for the survey to not have any missing values, the survey statisticians have replaced each missing value by imputing it from the responses of other survey respondents who answered the other questions in the survey similarly. Survey statisticians use very sophisticated models and processes to do imputation, and the practice is well accepted. When using this survey to do analysis, it is expected that the user will choose the form of the variable which includes the imputed values. These variables are labeled and typically have a suffix of “_imp”.
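
A quick way to find the imputed variables and confirm that one of them is complete might look like the sketch below. It relies only on the “_imp” suffix described above and on h87_imp, the imputed variable used in the poverty example later in this tutorial; whether every imputed variable in the file follows the suffix convention is an assumption to verify against the codebook.

```stata
* List all variables whose names end in _imp.
describe *_imp

* Count observations where the imputed variable is still missing.
* For a fully imputed variable this count is expected to be zero.
count if missing(h87_imp)
```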

Weighting
Weights for each adult and child response, which reflect the inverse of the probability of being selected for the survey, are constructed and should be used in all analyses. When the weights are used, the results accurately reflect the entire population.

Weighting If the weights for children in the OFHS were summed up across all responses, the total would be equal to the child population of Ohio. The same is true of the adult weights. The variable name for the adult weight is “wt_a”. The variable name for the child weight is “wt_c”.
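
The claim that the weights sum to the population can be checked directly. This is a minimal sketch using the weight variables named above; the totals it prints should be compared against an external population figure, which is not supplied here.

```stata
* Sum the child weights; the total should approximate Ohio's child population.
quietly summarize wt_c
display "Sum of child weights: " %12.0fc r(sum)

* Same check for the adult weights.
quietly summarize wt_a
display "Sum of adult weights: " %12.0fc r(sum)
```

summarize ignores missing values, so records without a child weight do not affect the child total.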

Constructed Variables There are many variables in the OFHS file that are constructed from the responses to the survey questions that make it easier to use the OFHS. These variables include: –BMI – Body mass index. BMI is an indicator of adult and child obesity constructed from height and weight. The formula is complicated, especially for children. We make it easier for the user to do analysis of obesity by pre-calculating it.

Constructed Variables –Insurance Type – In many instances, respondents to the survey had more than one source of insurance. For example, many seniors have insurance from their private pension plans and Medicare. For the purpose of creating an unduplicated count of the population by their insurance status, we have created a variable which imposes a hierarchy of insurance sources to classify the population.
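
To illustrate what imposing a hierarchy means, the sketch below assigns each person to the first category that applies, so that nobody is counted twice. It is not the OFHS construction: the indicator variables (has_medicaid, has_medicare, has_job, has_direct, has_other, insured) are hypothetical names, and the ordering is an assumption taken from the numbering of the eight insurance categories reported in this tutorial's results tables.

```stata
* Hypothetical 0/1 indicators; the real OFHS source variables differ.
generate byte ins_type = .

* Each later rule applies only to persons not yet classified.
replace ins_type = 1 if has_medicaid == 1 & has_medicare == 1
replace ins_type = 2 if missing(ins_type) & has_medicaid == 1
replace ins_type = 3 if missing(ins_type) & has_medicare == 1
replace ins_type = 4 if missing(ins_type) & has_job == 1
replace ins_type = 5 if missing(ins_type) & has_direct == 1
replace ins_type = 6 if missing(ins_type) & has_other == 1
replace ins_type = 7 if missing(ins_type) & insured == 1
replace ins_type = 8 if missing(ins_type) & insured == 0
```

Because every rule after the first is restricted to missing(ins_type), a senior with both Medicare and job-based coverage lands in a Medicare category and is not counted again under job-based coverage.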

Using Stata with the OFHS
Step 1. Download and unzip the Stata dataset.
Step 2. Open the dataset in Stata.
Step 3. Set survey design parameters in Stata.
Step 4. Build and run your first OFHS Stata program.

Download and Unzip the Stata dataset
You will find the OFHS Public Use Dataset at:
Right-click on the file name and select ‘save target as’. Save the ZIP file to the directory where you will store the data (c:\statadata\ofhs2008). After the file has been saved, run WinZip, saving the unzipped file to the same directory.
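
Once the file is unzipped, opening it (Step 2) could look like the sketch below. The directory is the one suggested above; the file name ofhs2008_puf.dta is a placeholder, since the actual name is not given here, and should be replaced with the name of the file you unzipped.

```stata
* Move to the directory where the unzipped dataset was saved.
cd "c:\statadata\ofhs2008"

* Placeholder file name: substitute the actual name of the unzipped .dta file.
use "ofhs2008_puf.dta", clear

* Confirm the number of observations and variables that were loaded.
describe, short
```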

Setting survey design parameters
After you open the data in Stata, you will have to set the survey design parameters prior to running any analyses. To do this, type the following command in the command window in Stata. (Note: You will have to do this EVERY time you open the data.)
If conducting analyses on adults:
svyset masterid [pweight=wt_a], strata(stratum) singleunit(certainty) vce(linearized)
If conducting analyses on the child population:
svyset masterid [pweight=wt_c], strata(stratum) singleunit(certainty) vce(linearized)
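
After running one of the svyset commands above, it can be worth confirming what Stata has recorded before estimating anything. A minimal sketch, assuming svyset has already been run in the current session:

```stata
* Typing svyset with no arguments displays the current design settings
* (sampling unit, weight variable, strata, single-unit handling).
svyset

* Summarize the design: number of strata and sampling units per stratum.
svydescribe
```

Checking the displayed weight variable (wt_a or wt_c) is a simple guard against running a child analysis with the adult settings still in place, or the reverse.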

Build and run your first OFHS Stata Program
You should only use procedures in Stata that support the use of complex survey designs, including:
–svy: mean (estimates means)
–svy: prop (estimates proportions)
–svy: tabulate (provides tables)
–A detailed list of commands that support the use of complex survey designs can be found by going to the Help menu in Stata (found in the toolbar), choosing Stata command, and typing “svy estimation”.

svy: tabulate
svy: tab i_type_c, ci
Here is a simple program which calculates the percent of children by Insurance Type. It includes a 95% confidence interval around each proportion. Note that you have already entered all of the sampling design parameters (at the beginning of your session). Remember that to analyze any adult variables, you will have to re-enter your design parameters, using the svyset command for adults given under “Setting survey design parameters”.

svy: tab results (with a little cutting and pasting and formatting of values)
Child Insurance Type        Proportion   Std. Error   95% C.I. Lower Bound   95% C.I. Upper Bound
1: Medicaid & Medicare           1.94%        0.17%                  1.64%                  2.30%
2: Medicaid, No Medicare        30.92%        0.55%                 29.84%                 32.01%
3: Medicare, No Medicaid         0.64%        0.09%                  0.50%                  0.83%
4: Job-based Coverage           53.29%        0.57%                 52.16%                 54.42%
5: Directly Purchased            2.55%        0.18%                  2.22%                  2.93%
6: Other                         0.63%        0.09%                  0.47%                  0.84%
7: Insured Type Unknown          5.99%        0.29%                  5.45%                  6.57%
8: Uninsured                     4.04%        0.21%                  3.65%                  4.48%
Total                          100.00%
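
Some of the hand formatting can be avoided by asking svy: tabulate for percentages and a fixed display format. The sketch below assumes the child design parameters have already been set with svyset; the choice of two decimal places is simply a match to the table above.

```stata
* Report cell percentages with standard errors and 95% confidence
* intervals, displayed with two decimal places.
svy: tabulate i_type_c, percent se ci format(%9.2f)
```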

svy: tabulate
Now you might add some domain analysis to this, breaking out insurance status for children by poverty level. The last replace sets poverty200 back to missing where h87_imp is missing, because Stata treats a missing value as larger than any number.
generate poverty200=.
replace poverty200=0 if h87_imp>4
replace poverty200=1 if h87_imp<=4
replace poverty200=. if h87_imp==.
svy: tab i_type_c if poverty200==0, se ci
svy: tab i_type_c if poverty200==1, se ci

svy: tabulate with an if statement
Child Insurance Type if FPL>=201%   Proportion   Std. Error   95% C.I. Lower Bound   95% C.I. Upper Bound
1: Medicaid & Medicare                   0.43%        0.09%                  0.29%                  0.65%
2: Medicaid, No Medicare                 7.64%        0.45%                  6.80%                  8.57%
3: Medicare, No Medicaid                 0.57%        0.11%                  0.39%                  0.84%
4: Job-based Coverage                   80.40%        0.63%                 79.14%                 81.60%
5: Directly Purchased                    3.48%        0.29%                  2.95%                  4.10%
6: Other                                 0.64%        0.13%                  0.43%                  0.94%
7: Insured Type Unknown                  4.56%        0.34%                  3.95%                  5.27%
8: Uninsured                             2.28%        0.20%                  1.91%                  2.72%
Total                                  100.00%

Child Insurance Type if FPL<201%    Proportion   Std. Error   95% C.I. Lower Bound   95% C.I. Upper Bound
1: Medicaid & Medicare                   3.77%        0.35%                  3.14%                  4.51%
2: Medicaid, No Medicare                58.93%        0.86%                 57.23%                 60.61%
3: Medicare, No Medicaid                 0.74%        0.13%                  0.52%                  1.04%
4: Job-based Coverage                   20.67%        0.68%                 19.37%                 22.03%
5: Directly Purchased                    1.42%        0.19%                  1.10%                  1.84%
6: Other                                 0.62%        0.14%                  0.40%                  0.96%
7: Insured Type Unknown                  7.69%        0.48%                     0%                  8.70%
8: Uninsured                             6.16%        0.40%                  5.43%                  6.98%
Total                                  100.00%
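
An alternative to the if qualifier is Stata's subpop() option, which keeps the full sample in the variance calculation while restricting the estimates to the domain. Stata's survey documentation generally recommends this form for subpopulation estimates. The sketch below assumes poverty200 has been created as shown in this tutorial; above200 is a helper variable introduced here for illustration. The proportions should match the if-based results, while the standard errors may differ slightly.

```stata
* Helper indicator (illustrative): 1 for children at or above 201% FPL,
* 0 otherwise, missing where poverty200 is missing.
generate byte above200 = (poverty200 == 0) if !missing(poverty200)

* Children below 201% FPL (poverty200 == 1).
svy, subpop(poverty200): tabulate i_type_c, se ci

* Children at or above 201% FPL.
svy, subpop(above200): tabulate i_type_c, se ci

* Both groups side by side, with column proportions.
svy: tabulate i_type_c poverty200, column se ci
```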

The END