Published by Mervin Riley. Modified over 6 years ago.
1
Clinical database management: From raw data through study tabulations to analysis datasets
Inge Christoffer Olsen, PhD
Diakonhjemmet Hospital, Norway
Thank you for your kind introduction and for the opportunity to give this talk. The title of the talk is "Clinical database management: From raw data through study tabulations to analysis datasets". (Speaker note: say a little about my background: CRO, academia, SAS, Stata.)
2
Introduction
“An experiment is a question which science poses to Nature and a measurement is the recording of Nature's answer” (Max Planck)
I will begin with this quote from the famous physicist Max Planck. It means that we cannot understand Nature without measurements. We need to take care of our measurements!
3
Background
Patient
Study
electronic Case Report Form (eCRF)
Study database (SDB)
4
Objective
Make the study database:
Transparent
Logical
Transferable
Ready to analyse
5
CDISC
Clinical Data Interchange Standards Consortium (CDISC)
Open, multidisciplinary, neutral, non-profit
The CDISC mission is to develop and support global, platform-independent data standards that enable information system interoperability to improve medical research and related areas of healthcare.
Standards are developed in cooperation with international pharmaceutical, academic and governmental stakeholders.
Essential for FDA regulatory submissions of new pharmaceuticals.
6
Standards
Protocol
Clinical Data Acquisition Standards Harmonization (CDASH): eCRF standard
Laboratory Data Model (LAB): standard for exchanging lab results
Study Data Tabulation Model (SDTM): structures CRF data within pre-specified domains
Analysis Data Model (ADaM): standards for analysis-ready datasets
7
Idea
Protocol → CDASH → SDTM → ADaM → Statistical analyses → Report
8
Strengths
Standardized programs
Recognizable
Transferable
Potentially very efficient: shown to decrease the resources needed by 60% or more if implemented from the protocol stage onwards
9
Weaknesses
Rigid
Demanding to program
Designed to be transferable
Text variables
No variable labels, value labels or other Stata-specific features
Extremely long format
Not suitable for “non-programmers”
10
Example: SDTM Findings class from a Myeloid Leukemia RCT
Note: no treatment column.
STUDYID DOMAIN USUBJID XRSEQ XRTESTCD XRTEST XRCAT XRORRES XRORRESU XRSTRESC XRSTRESN XRSTRESU XRSTAT VISITNUM VISIT
TESTSTUDY XR TESTSTUDY 1 DINTENS In-Patient Days in Intensive Care Unit HOSPITALISATION DAYS 21 Day 21
TESTSTUDY 2 DREASON In-Patient Days Due to Other Reasons
TESTSTUDY 3 DSTUDY In-Patient Days Due to Study Treatment 22
TESTSTUDY 4 HOSPITAL Days Hospitalized in this Course
TESTSTUDY 5 BLAST Blasts in Blood TREATMENT RESPONSE .
TESTSTUDY 6 BM Bone Marrow NOT DONE
TESTSTUDY 7 NEUT Neutrophils
TESTSTUDY 8 PLAT Platelets
TESTSTUDY 9 RESPONSE Treatment Response NR NO RESPONSE
TESTSTUDY 10 CONDITIO Patient Condition FOLLOW-UP ALIVE 600 Safety Follow up Assessment
TESTSTUDY 11 NEVER Never Had a Remission Y
TESTSTUDY 12 THERAPY Further Anticancer Therapy
TESTSTUDY 13 RELAPSE / SURVIVAL DEAD 701 Survival / Relapse Assessment (MONTH 1)
TESTSTUDY 14
TESTSTUDY 15
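To illustrate why the one-row-per-test layout is hard to work with directly, here is a minimal sketch in Python (not the Stata code used in the talk) that collapses long findings records into one row per subject and visit. The subject IDs and result values are invented for illustration, and `pivot_findings` is a hypothetical helper, not part of any CDISC tool.

```python
from collections import defaultdict

# Hypothetical long-format findings records. The column names follow the
# slide; the subject IDs and result values are invented for illustration.
long_records = [
    {"USUBJID": "S001", "VISITNUM": 21, "XRTESTCD": "DINTENS", "XRSTRESC": "0"},
    {"USUBJID": "S001", "VISITNUM": 21, "XRTESTCD": "HOSPITAL", "XRSTRESC": "22"},
    {"USUBJID": "S001", "VISITNUM": 21, "XRTESTCD": "RESPONSE", "XRSTRESC": "NR"},
    {"USUBJID": "S002", "VISITNUM": 21, "XRTESTCD": "HOSPITAL", "XRSTRESC": "18"},
]


def pivot_findings(records):
    """Collapse one-row-per-test records into one row per subject and visit."""
    wide = defaultdict(dict)
    for rec in records:
        key = (rec["USUBJID"], rec["VISITNUM"])
        # Each test code becomes its own column holding the character result.
        wide[key][rec["XRTESTCD"]] = rec["XRSTRESC"]
    return [
        {"USUBJID": subj, "VISITNUM": visit, **tests}
        for (subj, visit), tests in sorted(wide.items())
    ]


wide_rows = pivot_findings(long_records)
for row in wide_rows:
    print(row)
```

Tests that were not recorded for a subject simply have no column in that subject's row, so a real implementation would still need to decide how to represent missing results.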
11
Idea
Use the basics from CDISC, but without the rigidity of that approach
Use only the SDTM and ADaM utilities
Long, but not so long
Keep Stata labelling features
Make accessible for non-programmers
12
Overview
Raw output from eCRF: imported into Stata
Tabulation Datasets (TD): all data in semi-standardized datasets; no manipulation
Analysis Datasets (AD): formatted datasets ready for analyses; possibly manipulated
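The separation between the three layers could be sketched as follows. This is an illustration in Python rather than the Stata workflow described in the talk. The raw column names, the `TD_NAMES` mapping, the dataset names `tdvs`/`advs`, and the derived variables (`aval`, `base`, `chg`) are all assumptions made for the example.

```python
from datetime import date

# Hypothetical raw eCRF export; column names and values are invented.
raw_export = [
    {"subject": "S001", "visit": "Baseline", "visit_dt": "2015-03-02", "sbp": "142"},
    {"subject": "S001", "visit": "Month 3", "visit_dt": "2015-06-01", "sbp": "131"},
    {"subject": "S002", "visit": "Baseline", "visit_dt": "2015-03-09", "sbp": ""},
]

# Hypothetical mapping from raw column names to semi-standardized TD names.
TD_NAMES = {"subject": "usubjid", "visit": "visit", "visit_dt": "vsdtc", "sbp": "sbp"}


def to_tabulation(rows):
    """Tabulation step: rename columns only; values are copied unchanged."""
    return [{TD_NAMES[k]: v for k, v in row.items()} for row in rows]


def to_analysis(td_rows):
    """Analysis step: convert types and derive variables (manipulation)."""
    baseline = {r["usubjid"]: r["sbp"] for r in td_rows if r["visit"] == "Baseline"}
    ad_rows = []
    for r in td_rows:
        aval = float(r["sbp"]) if r["sbp"] else None
        base_raw = baseline.get(r["usubjid"], "")
        base = float(base_raw) if base_raw else None
        # Change from baseline is only defined when both values are present.
        chg = aval - base if aval is not None and base is not None else None
        ad_rows.append({
            "usubjid": r["usubjid"],
            "visit": r["visit"],
            "adt": date.fromisoformat(r["vsdtc"]),
            "aval": aval,
            "base": base,
            "chg": chg,
        })
    return ad_rows


tdvs = to_tabulation(raw_export)  # no manipulation
advs = to_analysis(tdvs)          # possibly manipulated
```

Keeping all derivations in the second function means the tabulation dataset can always be checked against the raw export value by value.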
13
Examples of TDs
tddm: Demographics
tdsv: Study visits
tdds: Disposition (important events during the study, such as ICF date, randomisation date, study end date, withdrawal date, etc.)
tdtrt: Study treatment information
tdie, tdae, tdcm, tdlb, tdvs, etc.
14
Examples of ADs
adsl: Subject-level analysis dataset (treatment, populations, baseline adjustment variables)
adds: Disposition (for the patient flow figure)
adbl: Baseline (demographics and baseline characteristics)
addisact: Disease activity measures (imputed), for primary and secondary analyses
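As a rough illustration of how a subject-level dataset such as adsl might be assembled from tabulation datasets, here is a sketch in Python. The variable names (`randdt`, `ndoses`, `randfl`, `saffl`), the data values, and the population definitions (randomised = has a randomisation date; safety = randomised and received at least one dose) are assumptions for the example. The talk does not specify how its populations are defined.

```python
# Hypothetical tabulation datasets keyed by subject ID; all values are invented.
tddm = {
    "S001": {"age": 54, "sex": "F"},
    "S002": {"age": 61, "sex": "M"},
    "S003": {"age": 47, "sex": "F"},
}
tdds = {
    "S001": {"icfdt": "2015-02-20", "randdt": "2015-03-02"},
    "S002": {"icfdt": "2015-02-27", "randdt": "2015-03-09"},
    "S003": {"icfdt": "2015-03-01", "randdt": ""},  # never randomised
}
tdtrt = {
    "S001": {"trt": "A", "ndoses": 6},
    "S002": {"trt": "B", "ndoses": 0},
}


def build_adsl(dm, ds, trt):
    """Build one row per subject with treatment and population flags."""
    adsl = []
    for usubjid in sorted(dm):
        disp = ds.get(usubjid, {})
        treat = trt.get(usubjid, {})
        # Assumed definitions; a real study takes these from the protocol.
        randomised = bool(disp.get("randdt"))
        dosed = treat.get("ndoses", 0) > 0
        adsl.append({
            "usubjid": usubjid,
            "age": dm[usubjid]["age"],
            "sex": dm[usubjid]["sex"],
            "trt": treat.get("trt"),
            "randfl": "Y" if randomised else "N",
            "saffl": "Y" if randomised and dosed else "N",
        })
    return adsl


adsl = build_adsl(tddm, tdds, tdtrt)
for row in adsl:
    print(row)
```

The tabulation inputs are left untouched; every derived flag lives only in the analysis dataset.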
15
Discussion
I have presented a setup for clinical study databases based on the CDISC standards.
I find the organisation simple and clear.
There is a clear distinction between manipulated and non-manipulated data.
Any transfer of the database should be accompanied by a document describing its content.
Researchers used to one large file might find the number of datasets overwhelming and confusing.
16
The end Thank you!