EPI 218 Database Management for Clinical Research Tables, Relationships, Normalization, Data Types, and Data Dictionaries Michael A. Kohn, MD, MPP 1 August.


EPI 218 Database Management for Clinical Research Tables, Relationships, Normalization, Data Types, and Data Dictionaries Michael A. Kohn, MD, MPP 1 August 2013

Clinical Research*
Choose the study design, and define the study population, predictor variables, and outcome variables; measure these variables and anticipate problems with measurement; analyze the results.
In this course, we discuss the “nitty gritty” of collecting, storing, updating, and monitoring the study measurements.
*Private companies that make data management systems for clinical research understand “clinical research” to include only RCTs preparatory to FDA drug or device approval, not observational studies.

Outline Housekeeping Data Tables Rows = Records; Columns = Fields Normalization of Data Tables Start Lab 1

Housekeeping Epi 218

Course website: Lectures and Labs will be in China Basin Landing 6702 with overflow into 6704, 8:30 – 10:30 “Learn MS Access 2000” video Username: ucsfdbclass Password: access2000 (We can also loan you the video on CD.)

Platforms
Access (Labs 1, 2 and 3)
REDCap (Lab 4)
QuesGen (Lab 5)
OnCore (Lab 6)
May use other data management platforms for final project:
-- SurveyMonkey
-- FileMaker Pro
-- OnCore
-- OpenClinica
-- Other

Microsoft Access
Integrated desktop database management platform
Uses SQL (Structured Query Language)
Has an outstanding graphical query design tool
Incorporates an excellent report writer
Based on the principles of the Relational Model
Relationships diagram has integrated referential integrity
Very flexible, infinitely customizable
NOT browser based; it is a desktop application
Using advanced features usually requires hiring a developer

Microsoft Access Changed user interface between Access 2003 and Access If you are running Access 2003 or an earlier version, use the lab instructions for Access If you are running Access 2007 or 2010, use the lab instructions for Access DEB Terminal Server 185-RDS1.epi-ucsf.org has Access The others have Access 2003.

DEB Terminal Server Provides a remote Windows desktop with Microsoft Office Professional Remote Desktop client software freely available for the Mac and already part of Windows us/download/details.aspx?id= us/download/details.aspx?id=18140 Obtain DEB Terminal Server username and password from Instructions available on course syllabus page

REDCap Web-based research data collection system developed at Vanderbilt Available free through UCSF Academic Research Systems You are both the Principal Investigator and User 1. Model= “Do-it-yourself”

QuesGen
Web-enabled research data collection and management platform developed (with UCSF input) by a private company based in Burlingame
More full-featured and customizable than REDCap, but primarily “pay-us-to-do-it” rather than “do-it-yourself”
User accounts for Epi 218 students

Learning Objectives
Develop a multi-table, relational database for a research study using Microsoft Access
Query a database for monitoring and analyzing research data
Learn about REDCap: basic functions, advantages and limitations
Understand the advantages and costs of other web-based platforms such as QuesGen
Hear about data management for large-scale clinical trials in industry

Requirements Turn in all 5 labs on time Labs are due by midnight the following Thursday (Lab 1 due 8/8 at midnight) Complete Final Project Due 9/18/2013

Final Project: Part A Send in or Demonstrate Your Study Database Due 9/18/2013
Send in a copy of your research study database*. We prefer a database that you are currently using or will use for a research study. However, a demonstration or pilot database is acceptable.
*If you are unable to package your database in a file to e-mail, you can send us a link or work out another way to review your database.

Final Project: Part A Send in or Demonstrate Your Study Database Due 9/18/2013
If you are doing secondary analysis of data collected by someone else, obtain the data collection forms* used in the original data collection, and set up a new database that you would use for a follow-up study.
*Often easily obtained by doing a Google search or e-mailing the author of the original study.

Final Project: Part B Submit Your Data Management Plan Due 9/18/2013
General description of database
Data collection and entry
Error checking and data validation
Analysis (e.g., export to Stata)
Security/confidentiality
Back up

Final Project Due 9/18/2013 Start thinking about this now. Build your own study database as you work through the labs. Use extra time in lab to work on your study database. Set up appointments with course faculty early.

TICR Professional Conduct Statement: Clarifications for this class
I will maintain the highest standards of academic honesty.
I will neither give nor receive aid in examinations or assignments unless such cooperation is expressly permitted by the instructor.
I will conduct research in an unbiased manner, report results truthfully, and credit ideas developed and work done by others.
I will not use answer keys from prior years.
I will write answers in my own words, and, when collaboration is permitted, acknowledge collaborators when answers are jointly formulated.
For Epi 218: Just don’t turn in somebody else’s work as your own.

Data Tables
Rows = Records = Entities
Columns = Fields = Attributes

DCR Chapter 16 Exercise 2
The PHTSE (Pre-Hospital Treatment of Status Epilepticus) Study was a randomized blinded trial of lorazepam, diazepam, or placebo in the treatment of pre-hospital status epilepticus. The primary endpoint was termination of convulsions by hospital arrival. To enroll patients, paramedics contacted base hospital physicians by radio. The following are base-hospital physician data collection forms for 2 enrolled patients:
Lowenstein DH, Alldredge BK, Allen F, Neuhaus J, Corry M, Gottwald M, et al. The prehospital treatment of status epilepticus (PHTSE) study: design and methodology. Control Clin Trials 2001;22(3):
Alldredge BK, Gelb AM, Isaacs SM, Corry MD, Allen F, Ulrich S, et al. A comparison of lorazepam, diazepam, and placebo for the treatment of out-of-hospital status epilepticus. N Engl J Med 2001;345(9):631-7.

Display the data from these 2 data collection forms in a 2-row data table.
SubjectID | KitNumber | AdminDate | AdminTime | SzStopPreHosp | SzStopPreHospTime | HospArrTime | HospArrSzAct | HospArrGCSV
189 | A322 | 3/12/1994 | 17:39 | FALSE | (blank) | 17:48 | TRUE | (blank)
410 | B536 | 12/1/1998 | 01:35 | TRUE | 01:39 | 01:53 | FALSE | 4

Create a 9-field data dictionary for the data table
Field Name | Data Type | Description | Validation Rule
SubjectID | Integer | Unique Subject Identifier |
KitNumber | Text(5) | 5-character Investigational Pharmacy Code |
AdminDate | Date | Date Study Drug Administered |
AdminTime | Time | Time Study Drug Administered |
SzStopPreHosp | Yes/No | Did seizure stop during pre-hospital course? |
SzStopPreHospTime | Time | Time seizures stopped during pre-hosp course (blank if seizure did not stop) |
HospArrTime | Time | Hospital Arrival Time |
HospArrSzAct | Yes/No | Was there continued Seizure Activity on Hospital Arrival? | Check against SzStopPreHosp
HospArrGCSV | Integer | Verbal GCS on Hospital Arrival (blank if seizure continued) | Between 1 and 5
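As a rough illustration of how a data dictionary like this one maps onto a table definition, the sketch below builds the nine fields in SQLite through Python's standard sqlite3 module. This is not the Access workflow used in the labs. The table name PhtseForm, the storage of Yes/No fields as 0/1, the storage of dates and times as text exactly as written on the forms, and the extra test row (SubjectID 999) are all assumptions made for the example. Only the "Between 1 and 5" rule is enforced; the "Check against SzStopPreHosp" rule is left out because the slide does not spell out its logic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One column per data dictionary field. Types are SQLite stand-ins
# (assumption): Yes/No becomes 0/1, Date and Time are kept as text.
conn.execute("""
    CREATE TABLE PhtseForm (
        SubjectID         INTEGER PRIMARY KEY,              -- unique subject identifier
        KitNumber         TEXT CHECK (length(KitNumber) <= 5),
        AdminDate         TEXT,
        AdminTime         TEXT,
        SzStopPreHosp     INTEGER NOT NULL CHECK (SzStopPreHosp IN (0, 1)),
        SzStopPreHospTime TEXT,                             -- blank if seizure did not stop
        HospArrTime       TEXT,
        HospArrSzAct      INTEGER NOT NULL CHECK (HospArrSzAct IN (0, 1)),
        HospArrGCSV       INTEGER CHECK (HospArrGCSV BETWEEN 1 AND 5)
    )
""")

insert_sql = "INSERT INTO PhtseForm VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)"

# The two enrolled patients from the data collection forms.
forms = [
    (189, "A322", "3/12/1994", "17:39", 0, None, "17:48", 1, None),
    (410, "B536", "12/1/1998", "01:35", 1, "01:39", "01:53", 0, 4),
]
conn.executemany(insert_sql, forms)

# Hypothetical bad record: a verbal GCS of 9 breaks the "Between 1 and 5" rule.
try:
    conn.execute(insert_sql, (999, "X000", "1/1/1999", "00:00", 1, "00:05", "00:10", 0, 9))
    gcs_rejected = False
except sqlite3.IntegrityError:
    gcs_rejected = True

# Subjects whose verbal GCS was left blank (seizure continued on arrival).
blank_gcs = [row[0] for row in conn.execute(
    "SELECT SubjectID FROM PhtseForm WHERE HospArrGCSV IS NULL")]
```

A blank (NULL) HospArrGCSV passes the range check, which matches the dictionary's note that the field is left blank when the seizure continued.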

JIFee: Jaundice and Infant Feeding Study
Methods:
Design: Nested double cohort study
Setting: Kaiser
Subjects: Infants with neonatal jaundice and randomly selected non-jaundiced infants
Predictor Variable: Presence or absence of jaundice
Outcome Variable: Neuropsychological score (ranging from 55 to 145) at age 5
Analysis: ?
Newman, T. B., P. Liljestrand, et al. (2006). "Outcomes among newborns with total serum bilirubin levels of 25 mg per deciliter or more." N Engl J Med 354(18):

Infant Jaundice Study Data
1. Approximately 400 children
2. 5 examiners (doctors)
3. Approximately 700 neuropsychological examinations, measuring weight, height, and “NPScore” (IQ)
4. Some children to be examined more than once
5. No examiner to see the same child twice
6. If child died before age 5, store age and circumstances of death

Demonstration: Creating a Data Table Label columns and enter rows of data in datasheet view Where is predictor on data collection form?

Demonstration: Data Dictionary Table design view: field (=column) names, data types, definitions, validation rules (More on data types, free-text vs. coded responses, later)

Acceptable table showing one set of exam results per participant. (BabyExamForFigure3)

Demonstration
Disallowed values
Duplicate primary keys
This automatic error checking and data validation IS why you need to enter your data into a computer; it is NOT why you need a relational DBMS. Many single-table products (FileMaker Pro, SAS FSP, even Excel) can do error checking and data validation.
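The two checks in this demonstration can be sketched on a single table, again using SQLite from Python rather than Access. The column list for BabyExam and the scores are hypothetical; the 55 to 145 range is the neuropsychological score range given for the study.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Single table, one row per participant. The column list is an assumption.
conn.execute("""
    CREATE TABLE BabyExam (
        SubjectID INTEGER PRIMARY KEY,                         -- no duplicates allowed
        Name      TEXT NOT NULL,
        NPScore   INTEGER CHECK (NPScore BETWEEN 55 AND 145)   -- disallowed values rejected
    )
""")
conn.execute("INSERT INTO BabyExam VALUES (1, 'Helen', 104)")


def try_insert(row):
    """Attempt one insert and report whether the database accepted it."""
    try:
        conn.execute("INSERT INTO BabyExam VALUES (?, ?, ?)", row)
        return "accepted"
    except sqlite3.IntegrityError as exc:
        return f"rejected ({exc})"


results = {
    "out_of_range": try_insert((2, "Ryan", 300)),      # score outside 55-145
    "duplicate_key": try_insert((1, "Zachary", 98)),   # SubjectID 1 already used
    "valid": try_insert((3, "Jackson", 120)),
}
for label, outcome in results.items():
    print(label, "->", outcome)
```

Nothing here requires more than one table, which is the point of the slide: validation alone is not the argument for a relational DBMS.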

Demonstration: Same Table in Excel, Stata Excel Stata Etc Rows = Records = Entities Columns = Fields = Attributes Access and Stata have a special row at the top for column headings (=field names); Excel just uses the first row.

Normalization

Table of Study Subjects
Row = Individual Infant
Columns = ID#, Name, DOB, Sex, Jaundice
If some infants have more than one exam, what do you do?

Undesirable table showing multiple exam results per study participant. (BabyExamForFigure4)

Demo Find highest IQ Score Find all exams done in April

Common Error If you find yourself creating multiple columns for the same measurement, e.g., Date1, Score1, Date2, Score2, Date3, Score3, … Or if your table is more than about 30 columns wide, It is time to restructure your table.
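One way to see why repeated column groups are awkward is to restructure them in a few lines of plain Python. The sketch below turns a wide layout (Date1, Score1, Date2, Score2) into one row per exam. The subject IDs, dates, scores, and the helper name wide_to_long are made up for illustration.

```python
# Hypothetical wide table: one row per subject, one column pair per exam.
wide_rows = [
    {"SubjectID": 1, "Date1": "2010-04-03", "Score1": 102,
     "Date2": "2010-09-17", "Score2": 110},
    {"SubjectID": 2, "Date1": "2010-04-21", "Score1": 95,
     "Date2": None, "Score2": None},
]


def wide_to_long(rows, max_exams=2):
    """Return one dict per exam, skipping empty DateN/ScoreN pairs."""
    long_rows = []
    for row in rows:
        for i in range(1, max_exams + 1):
            date, score = row.get(f"Date{i}"), row.get(f"Score{i}")
            if date is None and score is None:
                continue  # this subject had no i-th exam
            long_rows.append(
                {"SubjectID": row["SubjectID"], "ExamDate": date, "Score": score}
            )
    return long_rows


exams = wide_to_long(wide_rows)

# With one row per exam, both demo questions become one-liners.
scored = [e for e in exams if e["Score"] is not None]
highest = max(scored, key=lambda e: e["Score"])
april = [e for e in exams if e["ExamDate"] and e["ExamDate"][5:7] == "04"]
```

In the wide layout, the same two questions would need a separate comparison for every DateN and ScoreN column.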

Undesirable table with participant-specific data duplicated for each exam. (Note problem with Helen’s DOB.) (ExamBabyForFigure5)

Demo Find highest IQ Score Find all exams in a particular month What is Helen’s birth date? What happened to Alejandro, Ryan, Zachary, and Jackson?

If some infants have multiple exams, “normalize” the records into two tables, one for subjects and one for examinations. Normalization

Data normalized into two tables: one (“Baby”) with rows comprising subject-specific information; the other (“Exam”) with rows comprising exam-specific information. Note that Helen can only have one birth date. Subjects with no exams, e.g. Alejandro, still appear in the database. “SubjectID” functions as the primary key in the “Baby” table and as the foreign key in the “Exam” table.
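A minimal sketch of the two-table layout, using SQLite from Python in place of Access, shows how the demo questions are answered once the data are normalized. The table and key names follow the slide. The remaining columns, the birth dates, exam dates, and scores are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Baby (
        SubjectID INTEGER PRIMARY KEY,      -- primary key
        Name      TEXT NOT NULL,
        DOB       TEXT,                     -- stored once per subject
        Sex       TEXT,
        Jaundice  INTEGER
    );
    CREATE TABLE Exam (
        ExamID    INTEGER PRIMARY KEY,
        SubjectID INTEGER NOT NULL REFERENCES Baby(SubjectID),  -- foreign key
        ExamDate  TEXT,
        NPScore   INTEGER
    );
    INSERT INTO Baby VALUES (1, 'Helen', '2005-01-06', 'F', 1);
    INSERT INTO Baby VALUES (2, 'Alejandro', '2005-02-11', 'M', 0);
    INSERT INTO Exam VALUES (1, 1, '2010-01-20', 104);
    INSERT INTO Exam VALUES (2, 1, '2010-04-15', 111);
""")

# Find the highest score, with the subject's name pulled in by a join.
top = conn.execute("""
    SELECT b.Name, e.NPScore
    FROM Exam AS e JOIN Baby AS b ON b.SubjectID = e.SubjectID
    ORDER BY e.NPScore DESC LIMIT 1
""").fetchone()

# Find all exams done in April.
april = conn.execute(
    "SELECT ExamID FROM Exam WHERE strftime('%m', ExamDate) = '04'"
).fetchall()

# Subjects with no exams still appear in the Baby table.
no_exams = conn.execute("""
    SELECT b.Name
    FROM Baby AS b LEFT JOIN Exam AS e ON e.SubjectID = b.SubjectID
    WHERE e.ExamID IS NULL
""").fetchall()
```

Because DOB lives only in Baby, a subject with several exams cannot end up with conflicting birth dates.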

Figure 7. Relationships diagram showing the one-to-many relationship between the table of subjects (“Baby”) and the table of measurements (“Exam”).

Demonstration Inability to create integrity violations with normalized tables. This IS why you need a multi-table relational DBMS.
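The referential integrity behavior can be approximated outside Access as well. The sketch below uses SQLite from Python, where foreign key enforcement has to be switched on per connection with PRAGMA foreign_keys. The reduced column lists, the data values, and the helper name attempt are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.executescript("""
    CREATE TABLE Baby (
        SubjectID INTEGER PRIMARY KEY,
        Name      TEXT NOT NULL
    );
    CREATE TABLE Exam (
        ExamID    INTEGER PRIMARY KEY,
        SubjectID INTEGER NOT NULL REFERENCES Baby(SubjectID),
        NPScore   INTEGER
    );
    INSERT INTO Baby VALUES (1, 'Helen');
    INSERT INTO Exam VALUES (1, 1, 104);
""")


def attempt(sql, params=()):
    """Run one statement; report whether an integrity rule blocked it."""
    try:
        conn.execute(sql, params)
        return "ok"
    except sqlite3.IntegrityError:
        return "blocked"


# An exam for a subject who is not in Baby would be an orphan record.
orphan_exam = attempt("INSERT INTO Exam VALUES (?, ?, ?)", (2, 99, 110))

# Deleting a subject who still has exams would orphan those exams.
delete_parent = attempt("DELETE FROM Baby WHERE SubjectID = ?", (1,))

# A second exam for an existing subject is fine (one-to-many).
second_exam = attempt("INSERT INTO Exam VALUES (?, ?, ?)", (3, 1, 111))
```

Both blocked statements are the kind of integrity violation a single flat table cannot prevent.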

Outline Housekeeping Data Tables Rows = Records; Columns = Fields Normalization of Data Tables Start Lab 1