Published by Lionel Morrison. Modified over 6 years ago.
1
New Infrastructure for Harmonized Longitudinal Data with MIDUS and DDI 3.2
A project that MIDUS is working on with Colectica, using DDI 3.2 to deliver harmonized datasets. IASSIST, Toronto
2
Overview of Presentation
Background on MIDUS
Importance of DDI for Harmonization
Facilitating discovery and complex analysis
Current Project Goals
Implementation of Project Goals
Upgrading MIDUS from DDI 3.1 to 3.2
Building on the MIDUS-Colectica Repository
3
Background on MIDUS
Baseline: 1995-96
Harvard University
MacArthur Foundation
N=7,108
Ages 25-74
Satellite studies of stress, cognition
6
MIDUS: Unique Characteristics
Multiple waves (9-10 year interval)
Multiple samples/cohorts
  1. National (MIDUS Core)
  2. Milwaukee
  3. Japan (MIDJA)
  4. National (MIDUS Refresher)
  5. Milwaukee (Refresher)
Multidisciplinary design
Aging as integrated bio-psycho-social process
Result
N=11,500
34,000 variables
7
MIDUS: Unique Characteristics
Multiple waves (9-10 year interval)
Multiple samples/cohorts
Multidisciplinary design
Wide use of MIDUS – Open Data philosophy
#1 data download at NACDA
Top 10 data download at ICPSR
530+ publications
8
Status of Current DDI Efforts
MIDUS Metadata Repository
9
Moving Forward
National Institute on Aging project: “Facilitating Secondary Analysis and Archiving of MIDUS through DDI”
Timeline: The proposal is in response to RFA-AG “Secondary Analyses and Archiving of Social and Behavioral Datasets in Aging”.
10
Current Project goals Under a DDI 3.2 rubric…
1. Harmonization (internal, post-hoc)
  Clarify related nature of longitudinal and cross-cohort survey variables (RepresentedVariable)
  Provide information/procedures for reconciliation
2. Custom Data Extract (CDE)
  Allow researchers to focus on variables of interest
  Facilitate accurate merges across numerous datasets
11
Harmonization Concordance table
MIDUS P1 concordance table (Google Spreadsheet)
Includes “Comparability notes” and “Comparability class”
Example: Variable A1PA30 “time since last BP test”
Comparability notes: “M1 is not directly comparable with M2, MKE, MR, MKER, M3: M1 responses were coded as number of months, while other waves broke out number (amount) and unit (days, weeks, months, years) into 2 separate variables.”
Offer code/algorithm for reconciliation
The table also includes notes and descriptions of how the variables differ. (Read example.)
We’re making this table available to our users and researchers as an Excel spreadsheet, so it can be used “manually,” but it is large and unwieldy. Ideally this information will be integrated into our DDI codebooks and the Colectica repository. Aside from the goal of harmonizing, this also accomplishes the much more mundane goal of identifying the same variables across datasets, which facilitates longitudinal and cross-cohort analysis.
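The reconciliation described in the comparability note for A1PA30 could be sketched roughly as follows. This is an illustration only, not the MIDUS reconciliation code: the conversion factors, the function names, and the placeholder names for the later-wave amount and unit variables are all assumptions.

```python
# Hypothetical conversion factors (average month length of 30.4375 days).
# The actual MIDUS reconciliation rules may differ.
MONTHS_PER_UNIT = {
    "days": 1 / 30.4375,
    "weeks": 7 / 30.4375,
    "months": 1.0,
    "years": 12.0,
}


def to_months(amount, unit):
    """Convert an (amount, unit) pair to months; return None if either is missing."""
    if amount is None or unit is None:
        return None
    key = unit.strip().lower()
    if key not in MONTHS_PER_UNIT:
        raise ValueError(f"unknown unit: {unit!r}")
    return amount * MONTHS_PER_UNIT[key]


def harmonize_record(record, amount_var, unit_var, target_var):
    """Return a copy of the record with a months-coded variable added.

    amount_var and unit_var name the two later-wave variables (placeholders
    here); target_var is the new variable comparable to the M1 coding.
    """
    out = dict(record)
    out[target_var] = to_months(record.get(amount_var), record.get(unit_var))
    return out
```

For example, `harmonize_record({"AMOUNT": 2, "UNIT": "years"}, "AMOUNT", "UNIT", "MONTHS")` adds a `MONTHS` value of 24.0, which can then be compared with the M1 months coding.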
12
Custom Data Extract Customized dataset
Search variables, use shopping basket
Include variables from across all MIDUS projects
Merge different datasets
Different formats (csv, SPSS, SAS, Stata)
Associated DDI codebook
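The merge behind a custom extract might look something like the sketch below. It assumes every dataset shares one respondent identifier (called `ID` here, a placeholder) and that the "shopping basket" is a mapping from dataset name to chosen variables; neither detail comes from the slides.

```python
def build_extract(datasets, basket, id_var="ID"):
    """Merge selected variables from several datasets on a shared respondent ID.

    datasets: mapping of dataset name -> list of row dicts
    basket:   mapping of dataset name -> list of variable names to keep
    """
    merged = {}
    for name, variables in basket.items():
        for row in datasets[name]:
            target = merged.setdefault(row[id_var], {})
            for var in variables:
                target[var] = row.get(var)

    # Outer merge: respondents absent from a dataset get None for its variables.
    all_vars = [var for variables in basket.values() for var in variables]
    return [
        {id_var: rid, **{var: rec.get(var) for var in all_vars}}
        for rid, rec in sorted(merged.items())
    ]
```

Keeping only the basket variables is what lets a researcher work with a small, focused file instead of every variable in every dataset.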
13
Development Milestones
1. Metadata Quality Report
2. Harmonization
3. Web-based Discoverability
4. Data Extraction
14
Step 1. Metadata Quality Report
Compare the harmonization spreadsheet to the Repository
Check for:
  Missing information
  Inconsistent labels
  Inconsistent data types
Update the metadata to improve quality
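A check of this kind could be sketched as below. The row layout (`name`, `label`, `type`) and the dictionary standing in for the repository are assumptions for illustration; the sketch does not use the Colectica API.

```python
def quality_report(spreadsheet_rows, repository):
    """Compare spreadsheet rows with repository metadata.

    spreadsheet_rows: list of dicts with keys "name", "label", "type"
    repository:       dict mapping variable name -> dict with "label" and "type"
    Returns a list of (variable name, issue) tuples.
    """
    issues = []
    for row in spreadsheet_rows:
        name = row["name"]
        repo = repository.get(name)
        if repo is None:
            issues.append((name, "missing from repository"))
            continue
        if not repo.get("label"):
            issues.append((name, "missing label"))
        elif repo["label"].strip().lower() != row["label"].strip().lower():
            # Labels are compared case-insensitively, ignoring outer whitespace.
            issues.append((name, "inconsistent label"))
        if repo.get("type") != row["type"]:
            issues.append((name, "inconsistent data type"))
    return issues
```

The resulting list is the raw material for the report; fixing each flagged item in the metadata is the "update" step.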
15
Step 2. Harmonization
Use the harmonization spreadsheet
Create a RepresentedVariable for each row
Store these in the repository
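As a rough sketch of the row-to-RepresentedVariable step, each spreadsheet row can be turned into one record that ties together the wave-specific variables. The class below is a simplified stand-in, not the DDI 3.2 schema or a Colectica type, and the spreadsheet column names are assumed; only the wave labels (M1, M2, MKE, MR, MKER, M3) come from the comparability note quoted earlier.

```python
from dataclasses import dataclass, field

# Wave/sample labels as used in the comparability notes; treating them as
# spreadsheet column headers is an assumption.
WAVE_COLUMNS = ("M1", "M2", "MKE", "MR", "MKER", "M3")


@dataclass
class RepresentedVariableSketch:
    """Simplified stand-in for a DDI RepresentedVariable."""
    name: str
    label: str = ""
    comparability_class: str = ""
    comparability_notes: str = ""
    wave_variables: dict = field(default_factory=dict)  # wave label -> variable name


def rows_to_represented_variables(rows):
    """Create one RepresentedVariableSketch per spreadsheet row."""
    result = []
    for row in rows:
        # Keep only the waves in which the variable actually appears.
        waves = {w: row[w] for w in WAVE_COLUMNS if row.get(w)}
        result.append(
            RepresentedVariableSketch(
                name=row["name"],
                label=row.get("label", ""),
                comparability_class=row.get("comparability_class", ""),
                comparability_notes=row.get("comparability_notes", ""),
                wave_variables=waves,
            )
        )
    return result
```

Storing one such record per row is what makes it possible to look up the same question across waves and cohorts.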
16
Step 3. Web-based Discoverability
Build on top of Colectica Portal
Searching and information retrieval out-of-the-box
Add cross-reference tables for easy discoverability
Choose variables or groups of variables to include in the data extract
17
Step 4. Data Extraction
Store master data in Colectica Repository
Based on a user’s selected variables, generate:
  Datasets: CSV, R, SAS, SPSS, Stata
  HTML and PDF codebooks
  DDI XML
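To make the extraction step concrete, here is a minimal sketch that writes only the selected variables to CSV and produces a plain-text variable list as a stand-in for a codebook. The function names are hypothetical, and the sketch covers only CSV; the other dataset formats, the HTML/PDF codebooks, and the DDI XML are out of its scope.

```python
import csv
import io


def export_csv(rows, variables):
    """Return CSV text containing only the selected variables, in the given order."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf,
        fieldnames=variables,
        extrasaction="ignore",  # silently drop variables that were not selected
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def export_codebook(variables, labels):
    """Return a plain-text listing of each selected variable and its label."""
    return "\n".join(f"{var}: {labels.get(var, '(no label)')}" for var in variables)
```

Generating the data file and its documentation from the same variable selection keeps the two consistent with each other.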
18
Progress
Complete: Metadata Quality Report
Complete: Harmonization
Upcoming: Web-based Discoverability, Data Extraction
Approximately 3800 RepresentedVariables
19
Acknowledgement This research project is supported by a grant from the National Institute on Aging (R03-AG046312).
20
NADDI 2015 – April “Research Data Management: Facilitating Discoverability using Open Metadata Standards”
21
NADDI 2015 – April 8 - 10
“Research Data Management: Facilitating Discoverability using Open Metadata Standards”
University of Wisconsin - Madison
22
Thank you
midus.wisc.edu
www.colectica.com
Jeremy Iverson – Colectica
Barry Radler – UW-Madison
Dan Smith – Colectica