Validation of CDISC data sets, current practice and future

Validation of CDISC data sets, current practice and future
Jozef Aerts, University of Applied Sciences FH Joanneum, XML4Pharma

Who is Jozef Aerts?
- CDISC volunteer since 2002
- Teaching Medical Informatics in Graz
- Owner of XML4Pharma, a software and consultancy company
- Member of several CDISC development teams

Validation of CDISC datasets - current practice SDTM SEND ADaM define.xml

Validation of CDISC datasets - current practice
- Based on (arbitrary) interpretation of the standards documents and implementation guides
- By a single company
- With extensions from FDA and PMDA

Validation of CDISC datasets - the problem
- CDISC never published any validation rules (except recently for ADaM)
- These rules are not machine-readable
- CDISC Implementation Guides do not provide clear rules

The old solution
- A company picked up on the lack of a CDISC initiative and developed validation software
- Originally as "open source"
- The new license, however, strongly limits user rights
- Available as "Enterprise" and "Community" editions
- The tool is also used by FDA and PMDA

The old solution - problems
- The tool used by FDA and PMDA reflects the interpretation of the standards and IGs by one company
  - Not necessarily the interpretation of CDISC (teams)
  - Leading to many discussions about what exactly the rules are
- Rule implementations are not transparent
  - You cannot see how a rule is implemented (unless you look into the source code)
- Many false-positive errors
- FDA/PMDA rules sometimes contradict the IG
  - Some of them look more like the result of a "complaint box"
(Image: copyright Charles Hope, Flickr, https://www.flickr.com/photos/charleshope/4056571043/)

The old solution - problems - examples https://www.pinnacle21.net/forum

Solution for the problems with the old solution
- It is advised to document any deviations (including false-positive errors) in the "Reviewer's Guide"
- Is this the ideal world?

The future …
- CDISC has established "Validation Rule Teams"
  - For ADaM: rules already published
  - For SDTM: work in progress
- These rules still come as Excel worksheets or PDF documents
  - So not really machine-readable (despite some first attempts), and therefore possibly open to different interpretations
- Idea/principle: what is not in the set of rules is not a rule
  - Validation tools should not add other or new rules
  - Except when mandated by the FDA, PMDA, …

CDISC SDTM validation rules (provisional)

The role of define.xml
- Define.xml is "the sponsor's truth"
- Helps the reviewers understand the submission
- One cannot validate submission datasets without define.xml
- Should validation be a 2-step mechanism?
  - Step 1: validate contents of define.xml against the SDTM/ADaM/SEND standard
  - Step 2: validate contents of the datasets against define.xml
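
The 2-step mechanism can be sketched in a few lines of Python. This is only an illustration under strong assumptions: the standard is reduced to a hypothetical, abridged set of variable names per domain (`STANDARD_VARIABLES`), the define.xml is assumed to be already parsed into a similar dictionary, and all function names are invented for this sketch. Real standards also allow custom domains and supplemental qualifiers, which this sketch ignores.

```python
from typing import Dict, List, Set

# Hypothetical, heavily abridged stand-in for the published standard.
STANDARD_VARIABLES: Dict[str, Set[str]] = {
    "DM": {"STUDYID", "DOMAIN", "USUBJID", "AGE", "SEX"},
}


def validate_define_against_standard(
    define_variables: Dict[str, Set[str]],
    standard: Dict[str, Set[str]],
) -> List[str]:
    """Step 1: every variable declared in define.xml must exist in the standard."""
    findings = []
    for dataset in sorted(define_variables):
        allowed = standard.get(dataset)
        if allowed is None:
            findings.append(f"Step 1: {dataset} is not a dataset of the standard")
            continue
        for variable in sorted(define_variables[dataset] - allowed):
            findings.append(f"Step 1: {dataset}.{variable} is not a variable of the standard")
    return findings


def validate_dataset_against_define(
    dataset: str,
    records: List[Dict[str, str]],
    define_variables: Dict[str, Set[str]],
) -> List[str]:
    """Step 2: every variable found in the data must be declared in define.xml."""
    declared = define_variables.get(dataset)
    if declared is None:
        return [f"Step 2: {dataset} is not described in define.xml"]
    findings = []
    for number, record in enumerate(records, start=1):
        for variable in sorted(set(record) - declared):
            findings.append(
                f"Step 2: {dataset} record {number}: {variable} is not declared in define.xml"
            )
    return findings


def validate_submission(
    define_variables: Dict[str, Set[str]],
    datasets: Dict[str, List[Dict[str, str]]],
    standard: Dict[str, Set[str]],
) -> List[str]:
    """Run step 1, then step 2, and return all findings in that order."""
    findings = validate_define_against_standard(define_variables, standard)
    for dataset in sorted(datasets):
        findings += validate_dataset_against_define(dataset, datasets[dataset], define_variables)
    return findings
```

In this sketch the datasets are never compared with the standard directly; the define.xml sits in between, which matches the idea of define.xml as "the sponsor's truth".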

Machine-readable rules
- Machine-readable rules should also be human-readable
  - Reason: TRANSPARENCY
- Must have a precondition and a postcondition
  - When does the rule apply?
  - What is the consequence of the rule being violated (Error, Warning, …)?
- In the case of XML, XML Schema is always the first step
- But we are still using SAS-XPT!
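
One possible way to picture a rule with a precondition, a postcondition and a severity is the small Python sketch below. The `Rule` class, the rule ID `DEMO001` and its wording are hypothetical and are not taken from any published CDISC, FDA or PMDA rule set; they only illustrate the structure.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

Record = Dict[str, str]


@dataclass(frozen=True)
class Rule:
    rule_id: str
    severity: str                            # consequence of a violation, e.g. "Error" or "Warning"
    message: str                             # human-readable description of the violation
    precondition: Callable[[Record], bool]   # when does the rule apply?
    postcondition: Callable[[Record], bool]  # what must hold when it applies?

    def check(self, record: Record) -> Optional[str]:
        """Return a finding if the rule applies and is violated, otherwise None."""
        if not self.precondition(record):
            return None  # rule does not apply to this record
        if self.postcondition(record):
            return None  # rule applies and is satisfied
        return f"{self.severity} {self.rule_id}: {self.message}"


# Hypothetical example rule, for illustration only.
DEMO_RULE = Rule(
    rule_id="DEMO001",
    severity="Error",
    message="IDVARVAL is empty although IDVAR is populated",
    precondition=lambda record: record.get("IDVAR", "") != "",
    postcondition=lambda record: record.get("IDVARVAL", "") != "",
)
```

Keeping the message, the severity and both conditions together in one object is what makes such a rule readable by a person and executable by software at the same time.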

Machine-readable rules: define.xml
- Step 1: validation against the XML Schema (also see "XML Schema Validation for Define.xml White Paper")
  - ~ 50% of the rules
- Step 2: validation against Schematron
  - ~ 40% of the rules

Machine-readable rules: Submission datasets
- Hmmm - it's still all in XPT …
- Validation rules programmed in SAS?
  - Not vendor neutral …
  - Anyone volunteering to do so?
- We need something else …

Machine-readable rules in XQuery
- XQuery = query language for XML documents
- Is a W3C standard (W3C = World Wide Web Consortium)
- However: only applicable to XML
- But we can easily transform XPT to CDISC Dataset-XML
- FDA and PMDA will have to move to CDISC Dataset-XML some day anyway …
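
To give an impression of what such a transformation produces, here is a minimal Python sketch that writes records into a Dataset-XML-like structure. It assumes the XPT file has already been read into a list of dictionaries (reading XPT itself is not shown), that the OIDs follow the pattern `IG.<domain>` and `IT.<domain>.<variable>`, and that empty values are simply omitted. The function name is invented, and mandatory attributes of a real Dataset-XML file (for example on the `ODM` and `ClinicalData` elements) are left out for brevity.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

ODM_NS = "http://www.cdisc.org/ns/odm/v1.3"
DATA_NS = "http://www.cdisc.org/ns/Dataset-XML/v1.0"

# Use readable prefixes when the tree is serialized.
ET.register_namespace("odm", ODM_NS)
ET.register_namespace("data", DATA_NS)


def records_to_dataset_xml(domain: str, records: List[Dict[str, str]]) -> ET.Element:
    """Build one ItemGroupData element per record and one ItemData per non-empty value."""
    odm = ET.Element(f"{{{ODM_NS}}}ODM")
    clinical_data = ET.SubElement(odm, f"{{{ODM_NS}}}ClinicalData")
    for sequence, record in enumerate(records, start=1):
        group = ET.SubElement(
            clinical_data,
            f"{{{ODM_NS}}}ItemGroupData",
            {
                "ItemGroupOID": f"IG.{domain}",  # assumed OID pattern
                f"{{{DATA_NS}}}ItemGroupDataSeq": str(sequence),
            },
        )
        for variable, value in record.items():
            if value == "":
                continue  # assumption: empty values are not written
            ET.SubElement(
                group,
                f"{{{ODM_NS}}}ItemData",
                {"ItemOID": f"IT.{domain}.{variable}", "Value": value},
            )
    return odm
```

The resulting tree can be serialized with `ET.tostring(odm, encoding="unicode")` and then queried with any XML technology, which is the point of moving away from XPT.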

CDISC Dataset-XML CDISC's alternative to SAS-XPT

Validation rules in XQuery
- Human-readable & machine-executable
- No "wiggle room" for different interpretations
- Proven technology (W3C standard), vendor neutral, independent of the software language (Java, C#, C++, PHP, …)
- But only applicable to XML documents …

Validation rules in XQuery - Example
Rule FDAC066: Invalid IDVAR: IDVAR must have a valid value of variables from the referenced domain
- Get the define.xml
- Get the RELREC, CO and SUPPxx dataset definitions
- Iterate over the RELREC, CO and SUPPxx datasets
- Get the physical location
- Get the IDVAR variable

Validation rules in XQuery - Example
Rule FDAC066: Invalid IDVAR: IDVAR must have a valid value of variables from the referenced domain
- Iterate over all the records in the dataset
- Get the record number
- Get the value of the IDVAR
- Check whether it is defined in the define.xml
- Error message
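
The steps of rule FDAC066 can be mirrored in Python as a rough sketch. This is not the XQuery implementation shown in the presentation. It assumes that the records of each dataset are already available as dictionaries (so looking up the physical location is skipped), that the referenced domain is taken from the RDOMAIN variable, and that `SAMPLE_DEFINE` is a made-up, stripped-down define.xml lacking many mandatory attributes. All function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Set

ODM = "{http://www.cdisc.org/ns/odm/v1.3}"

# Made-up, minimal define.xml fragment for illustration only.
SAMPLE_DEFINE = """<ODM xmlns="http://www.cdisc.org/ns/odm/v1.3">
  <Study OID="S1"><MetaDataVersion OID="MDV1">
    <ItemGroupDef OID="IG.AE" Name="AE"><ItemRef ItemOID="IT.AE.AESEQ"/></ItemGroupDef>
    <ItemGroupDef OID="IG.SUPPAE" Name="SUPPAE"><ItemRef ItemOID="IT.SUPPAE.IDVAR"/></ItemGroupDef>
    <ItemDef OID="IT.AE.AESEQ" Name="AESEQ"/>
    <ItemDef OID="IT.SUPPAE.IDVAR" Name="IDVAR"/>
  </MetaDataVersion></Study>
</ODM>"""


def variables_by_dataset(define_root: ET.Element) -> Dict[str, Set[str]]:
    """Map each dataset name in the define.xml to the names of its variables."""
    names = {d.get("OID"): d.get("Name") for d in define_root.iter(f"{ODM}ItemDef")}
    result: Dict[str, Set[str]] = {}
    for group in define_root.iter(f"{ODM}ItemGroupDef"):
        oids = [ref.get("ItemOID") for ref in group.findall(f"{ODM}ItemRef")]
        result[group.get("Name")] = {names[oid] for oid in oids if oid in names}
    return result


def is_target_dataset(name: str) -> bool:
    """The rule applies to the RELREC, CO and SUPPxx datasets only."""
    return name in ("RELREC", "CO") or name.startswith("SUPP")


def check_idvar(
    dataset: str,
    records: List[Dict[str, str]],
    defined: Dict[str, Set[str]],
) -> List[str]:
    """Report each record whose IDVAR is not a variable of the referenced domain."""
    messages = []
    for number, record in enumerate(records, start=1):
        idvar = record.get("IDVAR", "")
        rdomain = record.get("RDOMAIN", "")
        if idvar == "":
            continue  # nothing to check for this record
        if idvar not in defined.get(rdomain, set()):
            messages.append(
                f"FDAC066: {dataset} record {number}: IDVAR '{idvar}' "
                f"is not defined for domain {rdomain} in the define.xml"
            )
    return messages
```

A caller would parse the define.xml once with `variables_by_dataset(ET.fromstring(SAMPLE_DEFINE))`, select the datasets with `is_target_dataset`, and run `check_idvar` on the records of each of them.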

This is work in progress …
- ~90% of FDA and PMDA rules done (SDTM/SEND)
  - But ~10% of these rules are nonsense
- ~30% of CDISC-ADaM rules done
  - Need help from the community as I am not an ADaM specialist
- A few CDISC-SDTM rules done
  - But overlap with FDA rules
- Available at: http://xml4pharmaserver.com/RulesXQuery/index.html
- Also available through a RESTful web service
  - "give me the latest version of rule XYZ …"

Shouldn't the rules be in the IG itself?
- Do we really need separate documents with the rules?
- Why aren't the rules described in the IG itself?
- But the IGs are PDFs, not machine-readable …
- But highly structured, so they could be …

The future? Machine-readable Implementation Guides? Some first attempts … - rules

The future? Machine-readable Implementation Guides? Some first attempts … - codelists

Conclusions
- Current validation rules and their implementation are unsatisfactory
  - Not transparent
  - One-sided interpretation
  - Many false-positive errors
- CDISC teams want to take back control over the validation rules
  - Publishing CDISC-owned validation rules
- In future, CDISC hopes to publish validation rules in human-readable, machine-executable format through SHARE
- Ideally, validation rules should be within a machine-readable IG

Links / Disclaimer
- http://cdisc-end-to-end.blogspot.com
- http://cdiscguru.blogspot.com
- The information in this presentation contains statements that are the personal opinion of Jozef Aerts and not necessarily the opinion of Jozef Aerts.
- None of the pictures in this presentation is owned or requires a license from APA PictureDesk, a Vienna-based "Picture Troll" company.

Thank you for your attention!
Contact: Jozef.Aerts@XML4Pharma.com