
1 Collaborative Data Management for Longitudinal Studies
Stephen Brehm [coauthors: L. Philip Schumm & Ronald A. Thisted]
University of Chicago
(Supported by National Institute on Aging Grant P01 AG18911-01A1)

2 Agenda
1. Background on Study
2. Problem – Data Management Deficiencies
3. Solution – Collaborative Data Management
4. Stata Programs – maketest & makedata

3 Background on Study
NIH-funded Longitudinal Study
Loneliness & Health
Thousands of Measures
–Loneliness
–Depression
230 Subjects
Repeated Yearly

4 Problem – Data Management Deficiencies
Code Not Modular
–Difficult to manage the data cleaning code
–Limited code reuse from year to year
–Difficult to collaborate among interns
No Established Set of Data Cleaning Steps
–Difficult for research assistants (turnover)
–Inconsistent data cleaning techniques
–Data cleaning code difficult to read

5 Problem – Data Management Deficiencies
[Diagram: five boxes labeled "Research Assistant" and one labeled "Core File Set"]

6-8 Solution – Collaborative Data Management
Process
–Established Steps
–File System Layout
–Automated Tests
–Collaboration
Concepts
–Module (Ex: loneliness)
–Batch (Ex: yr1, yr2, yr3)
–“Data Certification”
Stata Programs
–maketest
–makedata

9 Solution – Collaborative Data Management
Set of Files for Each Module
–acquire-[module].do & fix-[module].do
–test-[module].do
–derive-[module].do
–label-[module].do
[Diagram: Acquire & Fix, Derive, Test, Label; Year-Specific; 60% Code Reuse – Files Shared Between Years]
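The per-module file set above can be sketched as follows. This is an illustration in Python, not the Stata code used in the talk. The function name `module_files` is hypothetical, and the step order simply follows the order in which the files are listed on the slide; the talk does not state the order in which they run.

```python
# Steps in the order the slide lists them (execution order is an assumption).
STEPS = ["acquire", "fix", "test", "derive", "label"]


def module_files(module):
    """Return the .do file names for one module, following [step]-[module].do."""
    return ["{}-{}.do".format(step, module) for step in STEPS]


# "loneliness" is the module example given in the talk.
for name in module_files("loneliness"):
    print(name)
```

Keeping the step name in the file name means the files for any module can be derived from the module name alone, which is what makes a shared naming pattern workable across years.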

10 Stata Program – maketest
Purpose:
–Auto-generation of Data Certifying Tests
Functionality:
–Tests Variable Type
–Checks Consistency of Value Labels
–Verifies Existence of Variable
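The three kinds of checks listed above can be illustrated with a small Python sketch. It is not maketest and does not reproduce its output; the metadata layout, the function name `certify`, and the variable names are all hypothetical, chosen only to show what checking existence, type, and value labels means.

```python
# Hypothetical metadata layout: variable name -> (storage type, value labels).
def certify(actual, expected):
    """Compare actual metadata to expected; return a list of failure messages."""
    failures = []
    for name, (exp_type, exp_labels) in expected.items():
        if name not in actual:
            # Existence check
            failures.append("{}: variable missing".format(name))
            continue
        act_type, act_labels = actual[name]
        if act_type != exp_type:
            # Type check
            failures.append("{}: type {}, expected {}".format(name, act_type, exp_type))
        if act_labels != exp_labels:
            # Value-label consistency check
            failures.append("{}: value labels differ".format(name))
    return failures


expected = {
    "lonely1": ("numeric", {1: "never", 2: "rarely", 3: "sometimes", 4: "often"}),
    "age": ("numeric", None),
    "subject_id": ("string", None),
}
actual = {
    "lonely1": ("numeric", {1: "never", 2: "rarely", 3: "sometimes"}),
    "age": ("string", None),
}

for message in certify(actual, expected):
    print(message)
```

An empty result would mean the data pass all three checks for every expected variable.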

11 Stata Program – maketest
Syntax:
–maketest [varlist] using filename [, REQuire(varlist) append replace]
Example:
–maketest using filename.do, replace
Options:
–using: specifies the file to write
–REQuire: requires the presence of the variables in the list
–append: add to the existing test .do file
–replace: overwrite the existing .do file

12 Stata Program – makedata
“Bringing it all together”

13 Stata Program – makedata
Syntax:
–makedata [namelist], Pattern(string) [replace clear Noisily Batch(namelist) TESTonly]
Example:
–makedata ats, p("acquire-*.do") b(yr1) clear replace
Options:
–p: pattern – file naming convention
–replace: overwrite existing data file
–clear: clear current data in memory
–Noisily: full output (default = summary)
–b: batch – year, wave, center
–TESTonly: only run tests step
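One plausible reading of how the pattern, the module names, and the batch fit together is sketched below in Python. This is an assumption, not documented makedata behavior: the talk does not say how the "*" in the pattern is filled in or where batch files live. The function name `resolve` and the batch-as-directory layout are hypothetical.

```python
from pathlib import PurePosixPath


def resolve(modules, pattern, batch=None):
    """Substitute each module name for "*" in the pattern.

    Assumption: each batch (e.g. yr1) is a directory holding that batch's files.
    """
    paths = []
    for module in modules:
        filename = pattern.replace("*", module)
        if batch:
            paths.append(str(PurePosixPath(batch, filename)))
        else:
            paths.append(filename)
    return paths


# Mirrors the example call: module ats, pattern "acquire-*.do", batch yr1.
print(resolve(["ats"], "acquire-*.do", batch="yr1"))
```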

14 Other Applications
–Beyond Longitudinal Data
–Teaching Data Cleaning with Stata
Contact Information
–Stephen Brehm: sbrehm@uchicago.edu
–L. Philip Schumm: pschumm@uchicago.edu
–Ronald A. Thisted: thisted@health.bsd.uchicago.edu
Supported by National Institute on Aging Grant P01 AG18911-01A1

