DDI AND THE DATA PRODUCER
Prepared for Expert Seminar
Finnish Social Science Data Archive
Tampere, Finland
September 1-2, 2000
2 NATIONAL SURVEY OF FAMILY GROWTH - NSFG
  Purpose: collect data on factors affecting pregnancy and women’s health in the United States
  Survey of males added for the first time in current cycle
  Previous surveys conducted in 1973, 1976, 1982, 1988, and 1995
  New survey planned for with pretest in January, 2001
3 NSFG Research Topics
  the number of children women have had and the number they expect in the future
  intended and unintended births
  sexual intercourse
  marriage and cohabitation
  contraceptive use
  infertility, impaired fecundity, and sterilization
4 NSFG Research Topics - 2
  breastfeeding, maternity leave, and child care
  adoption, stepchildren, and foster children
  health insurance coverage
  family planning and other medical services
  smoking by women years of age
  HIV testing
  pelvic inflammatory disease and douching
  sex education
5 COMPUTER-ASSISTED INTERVIEWING AND XML
  TADEQ (Tool for the Analysis and Documentation of Electronic Questionnaires)
  –Fourth Framework Research Project of the EU
  –Partners:
    Statistics Netherlands
    Technical University of Vienna
    Office for National Statistics (UK)
    Statistics Finland
    Instituto Nacional de Estatistica (Portugal)
6 COMPUTER-ASSISTED INTERVIEWING AND XML
  Goal of TADEQ project: create an ‘open’ tool that uses a ‘neutral’ format to describe how CAI questionnaires (in Blaise) are conducted and to produce human-readable textual documentation.
  That neutral format is XML.
7 COMPUTER-ASSISTED INTERVIEWING AND XML
  Goal of National Survey of Family Growth (NSFG) project:
  Work with Blaise programmers at Survey Research Center to output DDI tags from CAI instrument
  Eliminate as much ‘hand-editing’ as possible to create this variable-level markup
  How this might work for the user…..
8 (4965) 1995 NATIONAL SURVEY OF FAMILY GROWTH
  File Documentation
  Respondent File (Section H Questionnaire Items)
  VARIABLE NAME | COLUMN LOCATION | HISP | NON-HISPANIC WHITE | NON-HISPANIC BLACK | TOTAL | QXTEXT AND CODE CATEGORIES
  HLPPRG
  (During any of your relationships,) have you (or your husband/or your husband or partner) ever been to a doctor or other medical care provider to talk about ways to help you become pregnant?
  Inapplicable: R has never had sex (RHADSEX coded 2, 7, 8, or 9); R has had sex but never since her first menstrual period (CI-22 SEXAFMEN coded 2); answer was not ascertained, R refused to report, or R did not know when she had sex after her first period (CI-23 WNSEXAFM coded 9997, 9998, or 9999); or R already reported receiving medical help to get pregnant in pregnancy history (HLPGETPG coded 1)
  Blank = Blank, inapplicable
  = YES
  = NO
  = Not Ascertained
  = Refused
  = Don’t Know
9 Brief Example of DDI Markup Section H: Infertility Services and Reproductive Health
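The markup shown on this slide did not survive in the transcript. As a rough illustration only, the sketch below builds variable-level markup for HLPPRG in Python. The element names (var, labl, qstn, qstnLit, universe, catgry, catValu) follow the DDI codebook DTD as commonly documented; the helper build_var, the universe wording, and the category codes CODE_YES and CODE_NO are placeholders invented for this example, since the codebook page above does not show the numeric codes.

```python
import xml.etree.ElementTree as ET


def build_var(name, label, question, universe, categories):
    """Return a DDI-style <var> element for one survey variable.

    `categories` is a list of (code, label) pairs. This helper is
    hypothetical; it is not part of any DDI or Blaise tool.
    """
    var = ET.Element("var", {"ID": name, "name": name})
    ET.SubElement(var, "labl").text = label
    qstn = ET.SubElement(var, "qstn")
    ET.SubElement(qstn, "qstnLit").text = question
    ET.SubElement(var, "universe").text = universe
    for code, cat_label in categories:
        catgry = ET.SubElement(var, "catgry")
        ET.SubElement(catgry, "catValu").text = code
        ET.SubElement(catgry, "labl").text = cat_label
    return var


hlpprg = build_var(
    name="HLPPRG",
    label="Received Medical Help To Get Pregnant?",
    question=(
        "Have you ever been to a doctor or other medical care provider "
        "to talk about ways to help you become pregnant?"
    ),
    # Placeholder wording; the real universe statement would come from
    # the instrument's routing logic.
    universe="Respondents not coded inapplicable",
    # Placeholder codes: the source codebook page omits the numeric values.
    categories=[("CODE_YES", "YES"), ("CODE_NO", "NO")],
)

ddi_xml = ET.tostring(hlpprg, encoding="unicode")
print(ddi_xml)
```

In the workflow the slides describe, a fragment like this would be emitted by the CAI instrument rather than typed by hand.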
11 Female Respondent File
  Section H: Infertility Services and Reproductive Health
  Female Respondent File Contents
  Pregnancy File
  Male Respondent File
  Combined Male-Female Respondent File
  Trend File
  NSFG, Cycle 6
  Ever Received Help to Get Pregnant Series
  Ever Received Help to Prevent Miscarriage Series
  Douching Series
  Health Problems Related to Childbearing Series
  Census Bureau’s Disability Series
  HIV Testing and AIDS Series
12 Female Respondent File, Section H
  Ever Received Help to Get Pregnant Series
  Female Respondent File Contents
  Section H Contents
  Pregnancy File
  Male Respondent File
  Combined Male-Female Respondent File
  Trend File
  NSFG, Cycle 6
  HLPPRG--HA1 Received Medical Help To Get Pregnant?
  HOWMANYR--HA2 # of H/P with whom R Sought Medical Help
  SEEKWWHO--HA3 Which H/P Did R Seek Medical Help With
  TYPALLP0--HA5 Infertility Services Received-1st
  TYPALLP1--HA5 Infertility Services Received-2nd
  TYPALLP2--HA5 Infertility Services Received-3rd
  TYPALLP3--HA5 Infertility Services Received-4th
  TYPALLP4--HA5 Infertility Services Received-5th
  TYPALLP5--HA5 Infertility Services Received-6th
  WHOTEST--HA5a Who had infertility testing?
  WHARTIN--HA5b Inseminated with whose sperm?
  OTMEDHE0--HA5c Other Infertility Services-1st
  OTMEDHE1--HA5c Other Infertility Services-2nd
  OTMEDHE2--HA5c Other Infertility Services-3rd
  OTMEDHE3--HA5c Other Infertility Services-4th
  [more…]
14 HLPPRG--HA1 Received Medical Help to Get Pregnant
  Inapplicable Respondents
  Full Question Text
  Cycle 5 Frequencies
  Female Respondent File Contents
  Section H Contents
  Back to Variable HLPPRG
  Back to Variables List
  NSFG, Cycle 6
  R has never had sex (RHADSEX coded 2, 7, 8, or 9)
  R has had sex but never since her first menstrual period (CI-22 SEXAFMEN coded 2)
  Answer was not ascertained, R refused to report, or R did not know when she had sex after her first period (CI-23 WNSEXAFM coded 9997, 9998, or 9999)
  R already reported receiving medical help to get pregnant in pregnancy history (HLPGETPG coded 1)
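The four inapplicable conditions listed on this slide can be read as a single boolean test. The sketch below is one possible rendering in Python, using the variable names and codes given on the slide. The function name, the use of None for items a respondent was never asked, and the sample values in the usage lines are assumptions made for illustration.

```python
def hlpprg_inapplicable(rhadsex, sexafmen, wnsexafm, hlpgetpg):
    """Return True if HLPPRG is inapplicable for this respondent.

    Arguments are the coded values of RHADSEX, SEXAFMEN, WNSEXAFM and
    HLPGETPG; None stands for an item that was not asked (an assumption
    of this sketch, not something the codebook states).
    """
    if rhadsex in (2, 7, 8, 9):            # R has never had sex
        return True
    if sexafmen == 2:                      # never since first menstrual period
        return True
    if wnsexafm in (9997, 9998, 9999):     # date not ascertained/refused/unknown
        return True
    if hlpgetpg == 1:                      # help already reported in pregnancy history
        return True
    return False


# Hypothetical respondents; the non-flagged codes are illustrative only.
never_had_sex = hlpprg_inapplicable(2, None, None, None)
asked_question = hlpprg_inapplicable(1, 1, 1190, 2)
print(never_had_sex, asked_question)
```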
15 HLPPRG--HA1 Received Medical Help to Get Pregnant
  Full Question Text
  Inapplicable Respondents
  Cycle 5 Frequencies
  Female Respondent File Contents
  Section H Contents
  Back to Variable HLPPRG
  Back to Variables List
  NSFG, Cycle 6
  HA-1.
  IF TIMESMAR = 1, MARSTAT = MARRIED OR SEPARATED, AND LIFEPRTS = 1, ASK: Have you or your husband ever been to a doctor or other medical care provider to talk about ways to help you become pregnant?
  IF TIMESMAR = 1, MARSTAT = WIDOWED OR DIVORCED, AND LIFEPRTS = 1, ASK: Did you or your husband ever go to a doctor or other medical care provider to talk about ways to help you become pregnant?
  IF TIMESMAR 1 AND LIFEPRTS > 1, ASK: During any of your relationships, have you or your husband or partner at the time ever been to a doctor or other medical care provider to talk about ways to help you become pregnant?
  IF TIMESMAR = 0 AND LIFEPRTS = 0, ASK: Have you ever been to a doctor or other medical care provider to talk about ways to help you become pregnant?
  IF TIMESMAR = 0 AND LIFEPRTS 1, ASK: During any of your relationships, have you or your partner at the time ever been to a doctor or other medical care provider to talk about ways to help you become pregnant?
  YES
  NO (HLPMC)
  REFUSED (HLPMC)
  DON'T KNOW (HLPMC)
  NOTE: DO NOT COUNT IF MAIN PURPOSE OF VISIT WAS FOR SOMETHING OTHER THAN SEEKING HELP TO BECOME PREGNANT.
  FLOW CHECK H-2: IF LIFEPRTS > 1, ASK HOWMANYR. ELSE, GO TO TYPALLPG.
16 HLPPRG--HA1 Received Medical Help to Get Pregnant
  Cycle 5 Frequencies
  Inapplicable Respondents
  Full Question Text
  Female Respondent File Contents
  Section H Contents
  Back to Variable HLPPRG
  Back to Variables List
  NSFG, Cycle 6
18 INTCTFAM: Intact status of childhood family
  Recode rules
  Inapplicable Respondents
  Imputation
  Cycle 5 Frequencies
  Distribution by Sex
  Back to Variable INTCTFAM
  Back to Variables List
  If R lived with both biological parents at birth (VAR FAMTYP01=1) and that living situation did not change, or is current (VAR CMCHFM01=0), then INTCTFAM=1
  If R lived with both adoptive parents at birth (VAR FAMTYP01=2) and that living situation did not change, or is current (VAR CMCHFAM1=0), then INTCTFAM=2
  Else, if R's parental living situation was anything else, or ever changed, INTCTFAM=3
  Code categories:
  1=two biological parents from birth
  2=two adoptive parents from birth
  3=anything other than two biological or two adoptive parents from birth
  NSFG, Cycle 6
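The recode rules on this slide map directly onto a small function. The following Python sketch is one reading of them, not the production recode program. The slide spells the change-date variable two ways (CMCHFM01 and CMCHFAM1); the sketch assumes both refer to the same item and passes it as a single argument, with 0 meaning the living situation did not change or is current.

```python
def recode_intctfam(famtyp01, cmchfm01):
    """Recode INTCTFAM from the first family type and its change date.

    famtyp01: 1 = both biological parents at birth,
              2 = both adoptive parents at birth, anything else = other.
    cmchfm01: 0 = living situation did not change, or is current.
    """
    unchanged = cmchfm01 == 0
    if famtyp01 == 1 and unchanged:
        return 1  # two biological parents from birth
    if famtyp01 == 2 and unchanged:
        return 2  # two adoptive parents from birth
    return 3      # anything else, or the situation ever changed


# The non-zero change value below is a made-up example code.
examples = [recode_intctfam(1, 0), recode_intctfam(2, 0),
            recode_intctfam(1, 1000), recode_intctfam(3, 0)]
print(examples)
```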
19 INTCTFAM: Intact status of childhood family
  Inapplicable respondents
  Recode Rules
  Imputation
  Cycle 5 Frequencies
  Distribution by Sex
  Back to Variable INTCTFAM
  Back to Variables List
  Non-blank for all Rs.
  NSFG, Cycle 6
20 INTCTFAM: Intact status of childhood family
  Imputation
  Recode Rules
  Inapplicable Respondents
  Cycle 5 Frequencies
  Distribution by Sex
  Back to Variable INTCTFAM
  Back to Variables List
  Method 2: Hot deck imputation - most frequently used method of imputation
  Imputation using the hot deck procedure requires the identification of a pool of donors (cases with complete data) with characteristics similar to those of the receptor (the case with a missing value). A donor is then selected from the pool randomly either with equal probability (unweighted hot deck) or with probability proportional to the fully adjusted sampling weight of the donor (weighted hot deck).
  The cases that could donate a value to an observation without data are called donor pools, or imputation classes. An imputation class needed to be sufficiently large so that the number of times a donor provided a value was minimized, but also sufficiently small so that the donors and receptors were adequately comparable. By creating a group of respondents with similar characteristics for variables believed to be correlated with the missing recode, imputed values are generally more consistent with the life-history information.
  Unweighted hot deck was used in the vast majority of hot deck imputations. Weighted hot deck imputation was used for a few variables with missing data on roughly 2-8 percent of cases.
  NSFG, Cycle 6
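The hot deck procedure described on this slide can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the NSFG imputation program: records are plain dictionaries, a missing value is None, and the imputation class is defined by whatever variables the caller lists. The field names sex and wt in the usage lines are invented. The sketch also does not limit how often one donor is reused, which the slide names as a design goal for class sizes.

```python
import random
from collections import defaultdict


def hot_deck_impute(records, target, class_vars, weight=None, rng=None):
    """Fill missing `target` values from a random donor in the same class.

    weight=None gives the unweighted hot deck (equal probability);
    otherwise donors are drawn with probability proportional to the
    named weight field (weighted hot deck). Returns new dicts and
    leaves the input records untouched.
    """
    rng = rng or random.Random()
    pools = defaultdict(list)  # imputation class -> list of donors
    for rec in records:
        if rec.get(target) is not None:
            pools[tuple(rec[v] for v in class_vars)].append(rec)

    imputed = []
    for rec in records:
        new = dict(rec)
        if new.get(target) is None:
            donors = pools.get(tuple(rec[v] for v in class_vars))
            if donors:  # leave the value missing if the class has no donor
                if weight is None:
                    donor = rng.choice(donors)
                else:
                    weights = [d[weight] for d in donors]
                    donor = rng.choices(donors, weights=weights, k=1)[0]
                new[target] = donor[target]
        imputed.append(new)
    return imputed


# Made-up records: one donor and one receptor per imputation class.
records = [
    {"id": 1, "sex": "F", "INTCTFAM": 1, "wt": 2.0},
    {"id": 2, "sex": "F", "INTCTFAM": None, "wt": 1.0},
    {"id": 3, "sex": "M", "INTCTFAM": 3, "wt": 1.5},
    {"id": 4, "sex": "M", "INTCTFAM": None, "wt": 1.0},
]
unweighted = hot_deck_impute(records, "INTCTFAM", ["sex"], rng=random.Random(0))
weighted = hot_deck_impute(records, "INTCTFAM", ["sex"], weight="wt",
                           rng=random.Random(0))
print([r["INTCTFAM"] for r in unweighted])
```

Passing a seeded random.Random makes a run repeatable, which helps when imputed files need to be regenerated.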
21 INTCTFAM: Intact status of childhood family
  Cycle 5 frequencies
  Recode Rules
  Inapplicable Respondents
  Imputation
  Distribution by Sex
  Back to Variable INTCTFAM
  Back to Variables List
  NSFG, Cycle 6
22 INTCTFAM: Intact status of childhood family
  Distribution by Sex
  Recode Rules
  Inapplicable Respondents
  Imputation
  Cycle 5 Frequencies
  Back to Variable INTCTFAM
  Back to Variables List
  NSFG, Cycle 6
23 IMPORTANCE OF DDI TO INSTITUTE FOR SOCIAL RESEARCH
  Committee established with goal that Survey Research Center adopts a common data description standard based on XML and DDI for its codebooks
  Get data producers at Institute for Social Research to produce original documentation using DDI standards
  Educate staff in XML and DDI