Most Common Issues in Define.xml files

Presentation transcript:

Most Common Issues in Define.xml files
NJ CDISC User Group
Sergiy Sirichenko
October 21, 2015

Abbreviations
- CT – CDISC Controlled Terminology
- VLM – Value Level Metadata

Major problems in Define.xml
- Usage of outdated Define.xml v1.0
- Inconsistency in metadata
- Missing study specific metadata
- Lack of expertise

Outdated Define.xml v1.0 is still used
- Define.xml v1.0 has many limitations built into the standard
- “The first” versions are never perfect
- Define.xml v1.0 is 11 years old
- Is anybody still using SDTM IG 3.1.1?
- Define.xml v2.0 is robust enough to handle current submission needs
- A separate presentation or webinar will be dedicated to this topic

Lack of structural consistency in v1.0
- Structural consistency of metadata in Define.xml v2.0 helps prevent errors
- Example: the variable Source value determines which other attributes are expected
  - “CRF” -> Pages are expected
  - “Derived” -> Computational Algorithm is expected
- Define.xml v1.0 allows entering CRF pages for derived variables, leaving expected attributes missing, etc.
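
The Source-dependent expectations above can be checked mechanically. The following is only a minimal sketch: the metadata dictionary and its keys ("origin", "pages", "algorithm") are hypothetical and are not part of any Define.xml schema or tool.

```python
def check_origin_consistency(var):
    """Return a list of problems found in one variable's metadata (a dict).

    The keys "origin", "pages" and "algorithm" are illustrative assumptions.
    """
    problems = []
    origin = (var.get("origin") or "").strip()
    pages = var.get("pages") or []
    algorithm = (var.get("algorithm") or "").strip()

    # "CRF" -> Pages are expected
    if origin == "CRF" and not pages:
        problems.append("Origin is CRF but no pages are given")
    if origin == "Derived":
        # "Derived" -> Computational Algorithm is expected
        if not algorithm:
            problems.append("Origin is Derived but no computational algorithm is given")
        # CRF pages make no sense for a derived variable
        if pages:
            problems.append("Origin is Derived but CRF pages are given")
    return problems
```

A derived variable that carries CRF pages and no algorithm, which v1.0 would accept, yields two findings here.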

Limited and confusing VLM in v1.0
- In v1.0, Value Level Metadata does not provide a reference to the variable it applies to
- It cannot handle multiple conditions
- A confusing and complex hierarchical VLM structure is used instead
- Example: the LB domain has VLM assigned to LBCAT; LBCAT has VLM for LBSPEC, LBSPEC -> LBMETHOD, etc. Properties of the LBORRES (or other?) variable are described at some point in this tree structure
- v2.0 has an explicit single expression with multiple conditions assigned to a particular variable
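
To illustrate the v2.0 idea of one expression with several conditions attached to a named variable, here is a rough sketch. The data structure, the test code "GLUC" and the specimen values are made up for illustration; only the comparator names (EQ, NE, IN, NOTIN) are borrowed from the v2.0 vocabulary.

```python
# Each comparator takes the record's actual value and the list of check values.
COMPARATORS = {
    "EQ": lambda actual, values: actual == values[0],
    "NE": lambda actual, values: actual != values[0],
    "IN": lambda actual, values: actual in values,
    "NOTIN": lambda actual, values: actual not in values,
}


def matches(record, conditions):
    """True when the record satisfies every (variable, comparator, values) condition."""
    return all(
        COMPARATORS[comparator](record.get(variable), values)
        for variable, comparator, values in conditions
    )


# Hypothetical value-level item: it names its target variable explicitly
# and applies only when both conditions hold.
vlm_item = {
    "target": "LBORRES",
    "where": [
        ("LBTESTCD", "EQ", ["GLUC"]),
        ("LBSPEC", "IN", ["SERUM", "PLASMA"]),
    ],
    "data_type": "float",
    "length": 8,
}
```

Compared with the v1.0 tree, the target variable and all conditions are stated in one place, so nothing has to be inferred from the position in a hierarchy.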

Some sponsors try to mimic v2.0
- The goal is to use functionality of v2.0
- Example: v1.0 does not have attributes for NCI Codes, so the sponsor added NCI Codes as a part of the Decode value

V2.0:
Permitted Value (Code)
mmol/L [C64387]
ng/mL [*]

V1.0:
Code Value    Code Text
mmol/L        mmol/L [C64387]
ng/mL         ng/mL [*]

- It’s invalid usage of the v1.0 standard! Why not switch to v2.0 instead?

Some sponsors use a custom stylesheet
- Often done to mimic the functionality of v2.0
- Regulatory reviewers like consistency, so please use the standard stylesheet provided by CDISC

Non-relevant metadata
- Variable Role is used for standards development, but does not add any value to study metadata
- Example: STUDYID and USUBJID can only be “Identifier”
- Does anyone actually use this info?
- The Define.xml 2.0 stylesheet doesn’t display it

Order of datasets and variables
- Alphabetical order of datasets
  Example: AE, CM, DM, …
  Correct: logical order as defined by the standard (by Class, then by domain name)
- Random order of variables
  Example:

  Order #  Variable  Label
  1        AECAT     Category for Adverse Event
  2        AEDECOD   Dictionary-Derived Term
  3        AEGRPID   Group ID
  4        AESEQ     Sequence Number
  5        AETERM    Reported Term for the Adverse Event
  6        DOMAIN    Domain Abbreviation
  7        STUDYID   Study Identifier
  8        USUBJID   Unique Subject Identifier
  9        AEBODSYS  Body System or Organ Class
  10       AEOUT     Outcome of Adverse Event
  …

  Correct: the order in which variables are present in the dataset
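
A simple way to catch the variable-order problem is to compare the order listed in define.xml with the order in the dataset itself. This is a sketch only; the function name is hypothetical and both inputs are assumed to be plain lists of variable names obtained elsewhere.

```python
from itertools import zip_longest


def first_order_mismatch(define_vars, dataset_vars):
    """Return (position, name_in_define, name_in_dataset) for the first
    position where the two orders differ, or None when they agree.

    Positions are 1-based; a missing name on either side is reported as None.
    """
    pairs = zip_longest(define_vars, dataset_vars)
    for position, (in_define, in_dataset) in enumerate(pairs, start=1):
        if in_define != in_dataset:
            return position, in_define, in_dataset
    return None
```

For the alphabetized AE example, the first mismatch would already be reported at position 1 if the dataset starts with STUDYID.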

Missing or invalid Origin
- No references to CRF pages
  Example: Origin = “CRF” instead of “CRF Page 12, 41, 57”
- Inconsistencies in Origin/Comments
  Example: RFSTDTC has Origin = “CRF”
  - No annotations on the CRF (where they would be expected)
  - Comments: “First dose of study medication” (it looks like a Derived variable)
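
Missing page references are easy to detect when Origin is held as text. The sketch below assumes the Origin string follows the “CRF Page 12, 41, 57” pattern shown above; the regular expression and the function name are illustrative, not a rule from any validator.

```python
import re

# Accepts "CRF Page 12" or "CRF Pages 12, 41, 57"; a bare "CRF" does not match.
CRF_PAGES = re.compile(r"^CRF\s+Pages?\s+(\d+(?:\s*,\s*\d+)*)$", re.IGNORECASE)


def crf_pages(origin):
    """Return the page numbers referenced by an Origin string, or [] if none."""
    match = CRF_PAGES.match(origin.strip())
    if not match:
        return []
    return [int(page) for page in match.group(1).split(",")]
```

An empty result for a variable whose Origin starts with “CRF” points to the first issue on this slide.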

Missing or invalid Derivations
- Example 1: AGE: “Calculation: = Min DOV - BRTHDTC in AGEU”
  - What is DOV?
  - How can I use a Character value (BRTHDTC) in an arithmetic formula?
  - How were missing or partially missing dates handled?
  - Derivations should be provided in terms of available data
- Example 2: “ZX021_AE_DURATION” ???
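
For contrast, a derivation written in terms of available data might look like the sketch below. It is not the sponsor's algorithm: the choice of a reference date variable (RFSTDTC is used in the usage note purely as an assumption) and the rule that partial or missing dates give a missing AGE are both illustrative.

```python
from datetime import date


def parse_complete_date(dtc):
    """Return a date for a complete ISO 8601 date or datetime string, else None."""
    if not dtc or len(dtc) < 10:
        return None  # missing or partial date such as "1980-05"
    try:
        return date.fromisoformat(dtc[:10])
    except ValueError:
        return None


def derive_age_years(brthdtc, refdtc):
    """Age in completed years at the reference date; None when it cannot be derived."""
    birth = parse_complete_date(brthdtc)
    ref = parse_complete_date(refdtc)
    if birth is None or ref is None or ref < birth:
        return None
    had_birthday = (ref.month, ref.day) >= (birth.month, birth.day)
    return ref.year - birth.year - (0 if had_birthday else 1)
```

A matching define.xml text could then read along the lines of “AGE = completed years between BRTHDTC and RFSTDTC; missing if either date is incomplete”, which answers all three questions raised above.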

Invalid Value Level Metadata
- VLM should be described at the same level of detail as regular variables: Codelist, DataType, Length, Origin, Derivation, etc.
- A common issue is missing or invalid metadata at the Value Level
- Consider VLM items as new variables with properties independent of the “host” variable
- Example: Treatment Emergent Flag in SUPPAE has length=1, not 200 like the QVAL variable
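
The length example can be verified directly against the data. This sketch assumes the SUPP-- records are available as dictionaries with QNAM and QVAL keys; the QNAM values used in the usage note are illustrative.

```python
def max_length_by_qnam(records):
    """Return {QNAM: longest QVAL length observed} for a list of SUPP-- records."""
    lengths = {}
    for rec in records:
        qnam = rec["QNAM"]
        lengths[qnam] = max(lengths.get(qnam, 0), len(rec["QVAL"]))
    return lengths
```

For records such as `{"QNAM": "AETRTEM", "QVAL": "Y"}`, the observed length is 1, which is what the Value Level Metadata for that item should state rather than the length of QVAL as a whole.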

Duplicate records
- Code List Term
- Variables
- Order Number

External dictionaries
- Info on external dictionaries (MedDRA, WHODrug) is not provided correctly
  - For example, as comments on a variable (not machine-readable)
- ISO8601 is defined as an External Dictionary
  - It’s a data format associated with all date, datetime, etc. variables
  - No specific reference to ISO8601 is needed if the Data Type is defined correctly

Missing study specific metadata
- Study specific information is crucial for reviewers
- However, it is missing in most submission packages
- The value of define.xml, SDRG, and aCRFs is to explain what is unique in this particular study

Missing Codelists
- Codelists are limited to variables which are assigned to standard CT
- Study specific Codelists are commonly missing for variables such as:
  - Category (--CAT), Subcategory (--SCAT)
  - EXTRT, ARMCD, --TESTCD/--TEST, QNAM, TPT
  - RDOMAIN in CO and RELREC domains
  - XXTOX, …

Merged Codelists
- Caused by confusion between the Standard CT Codelist and the study Variable Codelist
- Example: Define.xml has one codelist (UNIT) assigned to all --DOSU, --VAMTU, --ORRESU, --STRESU variables
- This codelist includes all unique terms across all study “units” variables and has 450 items, while, for example, the EXDOSU variable is populated with the single term “mg”
- A reference to a 450-term codelist is not relevant

What is a define.xml Codelist?
- A Define.xml Codelist describes the data collection process and should be limited to the terms used for data collection of a specific data element (a particular Variable or Value Level)
- For example, LBSTRESU, EGORRESU, EXDOSU usually have separate Codelists based on the same (UNIT) standard CT
- If data is collected as free text, then a Codelist may not be applicable
- Common examples are CMDOSU, CMDOSFRQ, CELOC, etc.

Missing terms in Codelist
- Term is present in data (SD0037 check)
- Programming error
  - Due to misspelling, leading space characters, etc.
  - Due to missing Decoded value for some items (CodeList vs. EnumeratedItem)
- Codelist was populated based on collected data, but some options from the CRF page are not included
  - Example: only race “WHITE” is collected, while 6 options are present on the CRF
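
The data-versus-codelist comparison, including the spelling and leading-space cases, can be sketched as follows. The function is a hypothetical helper and does not reproduce the logic of the SD0037 check; it only shows the idea of reporting each unmatched value together with a likely intended term.

```python
def missing_terms(data_values, codelist_terms):
    """Return [(value, near_match)] for data values absent from the codelist.

    near_match is the codelist term that equals the value once surrounding
    whitespace and letter case are ignored, or None if there is no such term.
    """
    allowed = set(codelist_terms)
    normalized = {term.strip().casefold(): term for term in codelist_terms}
    findings = []
    for value in sorted(set(data_values) - allowed):
        findings.append((value, normalized.get(value.strip().casefold())))
    return findings
```

A finding with a near match suggests a programming error in the data or the codelist; a finding without one suggests a term that was never added. The opposite problem on this slide (CRF options missing from the codelist) cannot be found from data alone and needs the CRF.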

Missing or invalid Value Level Metadata
- Content of SUPPQUAL domains must be described

Missing description of --SPID
- --SPID is often a Key Variable in a domain
- A clear and detailed description is required to understand study data
- Why was --SPID introduced? How was it derived? …
- Sponsors often copy the Notes text from the CDISC IG. It’s a completely invalid approach! Study specific information is expected.
- SDTM IG text: “Sponsor-defined reference number. Perhaps pre-printed on the CRF as an explicit line identifier or defined in the sponsor’s operational database. Example: Line number on a CRF Page.”

Missing description of variables
- Study specific variables are the most important
- RFPENDTC, RFSTDTC, RFXSTDTC, --GRPID, --LNKID, --SPID, …
- SDTM text is not a variable description!
- See the --SPID slide as an example

Invalid Key Variables
- Too long a list of variables
  Example: “STUDYID, USUBJID, EXSPID, EXTRT, EXCAT, EXDOSTXT, EXDOSU, EXDOSFRM, EXDOSFRQ, EXDOSTOT, EXROUTE, EXSTDTC, EXENDTC, EXSTDY, EXENDY, EXTPT, EXTPTNUM, EXTPTREF, VISIT”
- Inconsistency between Key Variables and domain Structure
  Example: Structure: “One record per event”
  Key Variables: “USUBJID, AETERM, AEDECOD, AESTDTC, AESEV, AESER, AEACN, VISIT”
- Usage of --SEQ as a Key Variable
  Example: “USUBJID, AESEQ”
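
One way to test a candidate key list is to check that it identifies each record uniquely and then try to shorten it. The sketch below assumes records are dictionaries keyed by variable name; the function name is hypothetical.

```python
from collections import Counter


def duplicate_keys(records, key_variables):
    """Return the key value combinations that occur in more than one record."""
    counts = Counter(
        tuple(rec.get(name) for name in key_variables) for rec in records
    )
    return [key for key, n in counts.items() if n > 1]
```

An empty result means the list works as a key. If it is still empty after a variable is removed from the list, that variable is not needed in the key, which helps trim lists like the EX example above.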

Non-compliance with eCTD
- Define.xml file is located in a different folder than the datasets
  Example: define.xml in …\tabulation, data in …\tabulation\sdtm
- File name is not “define.xml”
  Example: “define_study_001_sdtm.xml”
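
Both points can be checked before packaging. This is a minimal sketch with a hypothetical function name; it works on path strings only and does not touch the file system, and the folder names in the usage note are placeholders.

```python
from pathlib import PureWindowsPath


def define_location_problems(define_path, dataset_paths):
    """Return a list of problems with the define file's name and folder."""
    problems = []
    define = PureWindowsPath(define_path)  # accepts both "\\" and "/" separators
    if define.name != "define.xml":
        problems.append(f"file name is {define.name!r}, expected 'define.xml'")
    for path in dataset_paths:
        if PureWindowsPath(path).parent != define.parent:
            problems.append(f"{path} is not in the same folder as the define file")
    return problems
```

Calling it with a define file in `tabulation` and a dataset in `tabulation\sdtm` reports the folder mismatch described on this slide.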

Contact info: Sergiy Sirichenko ssirichenko@pinnacle21.net