Accessing the metadata from the define.xml using XSLT transformations
Lex Jansen, Octagon Research Solutions, Inc.
PhUSE 2010 Berlin
© 2008 Octagon Research Solutions, Inc. All Rights Reserved.
Contents
Introduction
define.xml: Regulatory landscape
Data Definition Tables (pdf / xml)
What is the define.xml
Displaying the define.xml
XSLT
Introduction
This presentation is NOT about CREATING a define.xml file.
It is about how the information (metadata) in a define.xml file can be USED.
Before we can USE the metadata from the define.xml file, we need to be able to ACCESS that metadata.
This presentation focuses on XML technologies (XSLT) to access that metadata.
define.xml: Regulatory landscape
Regulatory Landscape (FDA)
July 2004 – FDA adds Study Data Specifications v1.0 to draft eCTD Guidance. This specification references the CDISC SDTM for data tabulation datasets
March 2005 – Study Data Specifications v1.1: Updates Specifications for Data Set Documentation
- data definitions
- annotated case report forms (CRFs)
"The specification for the data definitions for datasets provided using the CDISC SDTM is included in the Case Report Tabulation Data Definition Specification (define.xml) developed by the CDISC define.xml Team"
"… Include a reference to the style sheet as defined in the specification and place the corresponding style sheet in the same folder as the define.xml file …"
November Study Data Specifications v1.5:
"For datasets not prepared using the CDISC SDTM specifications, consult Appendix 2 for information concerning the preparation of a define.pdf data definition file."
Appendix 2 specifies a define.pdf specification similar to the 1999 guidance
Data Definition Tables in PDF
Study Data Specifications: "For datasets not prepared using the CDISC SDTM specifications, consult Appendix 2 for information concerning the preparation of a define.pdf data definition file"
"Sponsors should also provide a link to the appropriate annotated case report form file (blankcrf.pdf)"
Data Definition Tables in XML
As of January 1, 2008: follow the eCTD guidance and document submitted data by including data definition tables (define.xml) and annotated case report forms (blankcrf.pdf)
Displaying the define.xml … with a stylesheet
define.xml
Case Report Tabulation Data Definition Specification (CRT-DDS, or define.xml):
Production version:
Extension of the CDISC Operational Data Model (ODM), an XML specification to facilitate the archival and interchange of the metadata and data for clinical research
Maintained by CDISC's XML Technologies Team (formerly known as the ODM team)
New define.xml version 2 in development with additional metadata support for SDTM and ADaM (results metadata)
XML schema definitions (XSD) describe the structure of the define.xml
define.xml – Specifications
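Because the define.xml is an ODM extension, any namespace-aware XML parser can reach its metadata. The following is a minimal sketch in Python (standard library only, not XSLT) that lists the datasets in a define.xml. The namespace URIs are the ones assumed here for define.xml 1.0 on top of ODM 1.2, and the SAMPLE fragment and the function name list_datasets are hypothetical; check the namespaces declared in your own file.

```python
import xml.etree.ElementTree as ET

# Namespaces assumed for define.xml 1.0 / ODM 1.2; verify against your file.
NS = {
    "odm": "http://www.cdisc.org/ns/odm/v1.2",
    "def": "http://www.cdisc.org/ns/def/v1.0",
}

# Tiny hypothetical fragment; a real define.xml carries many more attributes.
SAMPLE = """<ODM xmlns="http://www.cdisc.org/ns/odm/v1.2"
     xmlns:def="http://www.cdisc.org/ns/def/v1.0">
  <Study OID="STUDY1">
    <MetaDataVersion OID="MDV1" Name="Demo">
      <ItemGroupDef OID="IG.DM" Name="DM" def:Label="Demographics">
        <ItemRef ItemOID="IT.STUDYID"/>
        <ItemRef ItemOID="IT.AGE"/>
      </ItemGroupDef>
      <ItemGroupDef OID="IG.AE" Name="AE" def:Label="Adverse Events">
        <ItemRef ItemOID="IT.STUDYID"/>
      </ItemGroupDef>
    </MetaDataVersion>
  </Study>
</ODM>"""


def list_datasets(define_xml):
    """Return (name, label, number of variables) for each ItemGroupDef."""
    root = ET.fromstring(define_xml)
    label_attr = "{%s}Label" % NS["def"]  # def:Label in ElementTree notation
    rows = []
    for group in root.iterfind(".//odm:ItemGroupDef", NS):
        n_vars = len(group.findall("odm:ItemRef", NS))
        rows.append((group.get("Name"), group.get(label_attr), n_vars))
    return rows


for name, label, n_vars in list_datasets(SAMPLE):
    print(name, label, n_vars)
```

The same element and attribute names (ItemGroupDef, ItemRef, def:Label) are what an XSLT stylesheet would address with XPath expressions.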
Displaying the define.xml
define.xml contains metadata and is machine readable
define.xml becomes human readable with an XSL stylesheet
… and looks even fancier with a different stylesheet
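A browser picks the stylesheet to apply from an xml-stylesheet processing instruction at the top of the XML file. As a rough sketch, assuming a hypothetical stylesheet file name define1-0-0.xsl, the Python snippet below (standard library only) checks which stylesheet a define.xml refers to; the function name stylesheet_references is made up for this illustration.

```python
from xml.dom import minidom

# Hypothetical define.xml header; the href value is only an example name.
SAMPLE = """<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="define1-0-0.xsl"?>
<ODM xmlns="http://www.cdisc.org/ns/odm/v1.2"/>"""


def stylesheet_references(xml_text):
    """Return the data of every top-level xml-stylesheet instruction."""
    doc = minidom.parseString(xml_text)
    return [
        node.data
        for node in doc.childNodes
        if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
        and node.target == "xml-stylesheet"
    ]


for reference in stylesheet_references(SAMPLE):
    print(reference)
```

Swapping the href for a different stylesheet changes the rendering without touching the metadata itself.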
XSLT
eXtensible Stylesheet Language Transformations (XSLT) is a language that lets you convert XML documents into other XML documents, into HTML documents, into any other text-based document (like a SAS program), or even (by way of XSL-FO) into a PDF file
XSLT is a language "for transforming the structure and content of an XML document"
XSL transformations are like Rubik's cube! (diagram: XML transformed into HTML, PDF, TEXT)
XSL: The mandatory "hello world" XML
XSL: The mandatory "hello Berlin" XML
XSL: XSL stylesheet
XSL: Example: XML + XSL = HTML
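The XML + XSL = HTML idea can be sketched as follows. The greeting document and the stylesheet are hypothetical stand-ins written for this illustration, not the ones shown on the slides. The Python wrapper assumes the third-party lxml package for the actual transformation; the import is deferred so that the rest of the snippet runs with the standard library alone.

```python
import xml.etree.ElementTree as ET

# Hypothetical "hello Berlin" input document.
HELLO_XML = "<greeting>Hello Berlin</greeting>"

# Hypothetical XSLT 1.0 stylesheet that wraps the greeting in HTML.
HELLO_XSL = """<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>
  <!-- Match the document root and emit a small HTML page. -->
  <xsl:template match="/">
    <html>
      <body>
        <h1><xsl:value-of select="greeting"/></h1>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>"""


def transform(xml_text, xsl_text):
    """Apply an XSLT 1.0 stylesheet; needs the third-party lxml package."""
    from lxml import etree  # imported lazily so the rest runs without lxml

    stylesheet = etree.XSLT(etree.fromstring(xsl_text))
    return str(stylesheet(etree.fromstring(xml_text)))


# Both inputs are plain XML, so the standard library can at least confirm
# they are well-formed before they go to an XSLT processor.
ET.fromstring(HELLO_XML)
ET.fromstring(HELLO_XSL)
```

With lxml installed, print(transform(HELLO_XML, HELLO_XSL)) should produce an HTML page with the greeting in a heading. Any other XSLT 1.0 processor can apply the same stylesheet.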
Other examples of using the define.xml metadata
Examples: DATASET TEMPLATES from the define.xml
Use dataset and variable information (type, length, label) to create zero-observation datasets that can serve as data conversion targets
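The dataset-template idea can be sketched as follows. This is an illustration in Python rather than the XSLT shown in the presentation, and it makes loud assumptions: the namespace URIs are those assumed for define.xml 1.0, the SAMPLE fragment and the function name dataset_templates are hypothetical, every non-text DataType is mapped to a numeric variable of length 8, and labels are assumed not to contain double quotes.

```python
import xml.etree.ElementTree as ET

# Namespaces assumed for define.xml 1.0 / ODM 1.2; verify against your file.
NS = {"odm": "http://www.cdisc.org/ns/odm/v1.2"}
DEF_LABEL = "{http://www.cdisc.org/ns/def/v1.0}Label"

# Hypothetical fragment with one dataset and two variables.
SAMPLE = """<ODM xmlns="http://www.cdisc.org/ns/odm/v1.2"
     xmlns:def="http://www.cdisc.org/ns/def/v1.0">
  <Study OID="STUDY1"><MetaDataVersion OID="MDV1" Name="Demo">
    <ItemGroupDef OID="IG.DM" Name="DM">
      <ItemRef ItemOID="IT.USUBJID"/>
      <ItemRef ItemOID="IT.AGE"/>
    </ItemGroupDef>
    <ItemDef OID="IT.USUBJID" Name="USUBJID" DataType="text" Length="20"
             def:Label="Unique Subject Identifier"/>
    <ItemDef OID="IT.AGE" Name="AGE" DataType="integer" Length="8"
             def:Label="Age"/>
  </MetaDataVersion></Study>
</ODM>"""


def dataset_templates(define_xml):
    """Return one SAS DATA step per ItemGroupDef, each creating 0 observations."""
    root = ET.fromstring(define_xml)
    # Index the variable definitions by OID so ItemRef elements can be resolved.
    items = {i.get("OID"): i for i in root.iterfind(".//odm:ItemDef", NS)}
    programs = []
    for group in root.iterfind(".//odm:ItemGroupDef", NS):
        lines = ["data %s;" % group.get("Name")]
        for ref in group.iterfind("odm:ItemRef", NS):
            item = items[ref.get("ItemOID")]
            # Simplification: text is character, everything else numeric 8.
            if item.get("DataType") == "text":
                length = "$%s" % item.get("Length")
            else:
                length = "8"
            lines.append('  attrib %s length=%s label="%s";'
                         % (item.get("Name"), length, item.get(DEF_LABEL)))
        # STOP ends the step before any observation is written.
        lines += ["  stop;", "run;"]
        programs.append("\n".join(lines))
    return programs


print("\n\n".join(dataset_templates(SAMPLE)))
```

The variables are created when the DATA step is compiled, so the STOP statement leaves an empty dataset that still carries the type, length, and label attributes.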
Examples: PROC FORMAT from the define.xml
Use codelist information (codes/decodes) to create a PROC FORMAT
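A comparable sketch for the codelist case, again in Python rather than XSLT and under stated assumptions: the ODM 1.2 namespace is assumed, the SAMPLE codelists and the function name proc_format are hypothetical, and each CodeList Name is assumed to be a valid SAS format name already.

```python
import xml.etree.ElementTree as ET

# ODM 1.2 namespace as assumed for define.xml 1.0; verify against your file.
NS = {"odm": "http://www.cdisc.org/ns/odm/v1.2"}

# Hypothetical fragment with one character and one numeric codelist.
SAMPLE = """<ODM xmlns="http://www.cdisc.org/ns/odm/v1.2">
  <Study OID="STUDY1"><MetaDataVersion OID="MDV1" Name="Demo">
    <CodeList OID="CL.SEX" Name="SEX" DataType="text">
      <CodeListItem CodedValue="F">
        <Decode><TranslatedText>Female</TranslatedText></Decode>
      </CodeListItem>
      <CodeListItem CodedValue="M">
        <Decode><TranslatedText>Male</TranslatedText></Decode>
      </CodeListItem>
    </CodeList>
    <CodeList OID="CL.NY" Name="NY" DataType="integer">
      <CodeListItem CodedValue="0">
        <Decode><TranslatedText>No</TranslatedText></Decode>
      </CodeListItem>
      <CodeListItem CodedValue="1">
        <Decode><TranslatedText>Yes</TranslatedText></Decode>
      </CodeListItem>
    </CodeList>
  </MetaDataVersion></Study>
</ODM>"""


def proc_format(define_xml):
    """Build a PROC FORMAT step with one VALUE statement per CodeList."""
    root = ET.fromstring(define_xml)
    lines = ["proc format;"]
    for codelist in root.iterfind(".//odm:CodeList", NS):
        # Character formats get a $ prefix and quoted start values.
        is_char = codelist.get("DataType") == "text"
        prefix = "$" if is_char else ""
        lines.append("  value %s%s" % (prefix, codelist.get("Name")))
        for item in codelist.iterfind("odm:CodeListItem", NS):
            code = item.get("CodedValue")
            decode = item.findtext("odm:Decode/odm:TranslatedText",
                                   namespaces=NS)
            start = '"%s"' % code if is_char else code
            lines.append('    %s = "%s"' % (start, decode))
        lines.append("  ;")
    lines.append("run;")
    return "\n".join(lines)


print(proc_format(SAMPLE))
```

Decodes that contain double quotes would need escaping before the generated step is submitted to SAS.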
Running XSLT with SAS (Experimental)
Using Xalan
Last words
In case you get serious about XSLT …
Get a good XML editor
–Oxygen
–XMLSpy (has some issues in validating define.xml)
–Check out the define.xml white paper on
Find this paper and more than 11,000 other SAS papers at