Qian Zhao, J&J Consumer Companies, Inc. Jun (John) Wang, J&J Consumer China Ltd Ruofei Hao, J&J Consumer Companies, Inc. PharmaSUG 2015 Paper #BB11.

Slides:



Advertisements
Similar presentations
Effecting Efficiency Effortlessly Daniel Carden, Quanticate.
Advertisements

Experience and process for collaborating with an outsource company to create the define file. Ganesh Sankaran TAKE Solutions.
Axio Research E-Compare A Tool for Data Review Bill Coar.
The INFILE Statement Reading files into SAS from an outside source: A Very Useful Tool!
The Assembly Language Level
Cross Check between Define.xml and blankcrf.pdf
CDISC (CDASH/SDTM) integration into OC/RDC
Database Management An Introduction.
Assemblers Dr. Monther Aldwairi 10/21/20071Dr. Monther Aldwairi.
Chapter 6 The Normal Distribution and Other Continuous Distributions
Chapter 8 Arrays and Strings
Basic And Advanced SAS Programming
Using Proc Datasets for Efficiency Originally presented as a Coder’s NESUG2000 by Ken Friedman Reviewed by Karol Katz.
Let SAS Do the Coding for You! Robert Williams Business Info Analyst Sr. WellPoint Inc.
Joy Oberoi Grade 12. Introduction THEATRE BOOKING SYSTEM (TBS) A system used to perform tasks that one would manually execute at a theatre It is online.
CBER CDISC Test Submission Dieter Boß CSL Behring, Marburg 20-Mar-2012.
Copyright © 2006, SAS Institute Inc. All rights reserved. Enterprise Guide 4.2 : A Primer SHRUG : Spring 2010 Presented by: Josée Ranger-Lacroix SAS Institute.
SAS Workshop Lecture 1 Lecturer: Annie N. Simpson, MSc.
© OCS Consulting The flexible extension to your IT team 1 Jim Groeneveld, OCS Consulting, ´s Hertogenbosch, Netherlands. PhUSE 2011 Comparing dataset metadata.
Term 2, 2011 Week 1. CONTENTS Types and purposes of graphic representations Spreadsheet software – Producing graphs from numerical data Mathematical functions.
Pairwise Alignment, Part I Constructing the Values and Directions Tables from 2 related DNA (or Protein) Sequences.
Antje Rossmanith, Roche 14th German CDISC User Group, 25-Sep-2012
Verification & Validation
Overview and feed-back from CDISC European Interchange 2008 (From April 21 st to 25 th, COPENHAGEN) Groupe des Utilisateurs Francophones de CDISC Bagneux.
Operator Precedence First the contents of all parentheses are evaluated beginning with the innermost set of parenthesis. Second all multiplications, divisions,
Chapter 6 Generating Form Letters, Mailing Labels, and a Directory
4/22/2017 5:36 PM EViews Training Creating Workfiles.
An Animated Guide©: Sending SAS files to Excel Concentrating on a D.D.E. Macro.
 A database is a collection of data that is organized so that its contents can easily be accessed, managed, and updated. What is Database?
Introduction to SAS. What is SAS? SAS originally stood for “Statistical Analysis System”. SAS is a computer software system that provides all the tools.
Definitions. Cell: Cell: Space in the intersection of a column (vertical division) and a row (horizontal division). Row: Row: A row runs horizontally.
1 Experimental Statistics - week 2 Review: 2-sample t-tests paired t-tests Thursday: Meet in 15 Clements!! Bring Cody and Smith book.
Chapter 8 Arrays and Strings
Multiple Uses for a Simple SQL Procedure Rebecca Larsen University of South Florida.
SAS Efficiency Techniques and Methods By Kelley Weston Sr. Statistical Programmer Quintiles.
SE: CHAPTER 7 Writing The Program
EDI communication system
SESSION 3.1 This section covers using the query window in design view to create a query and sorting & filtering data while in a datasheet view. Microsoft.
How to improve quality control in a data conversion process? By extended usage of metadata! Dimitri Kutsenko Entimo AG - Berlin/Germany.
The Use of Metadata in Creating, Transforming and Transporting Clinical Data Gregory Steffens Director, Data Management and SAS Programming ICON Development.
FOUNDATION IN INFORMATION TECHNOLOGY (CS-T-101) TOPIC : INFORMATION SYSTEM – SOFTWARE.
Summer SAS Workshop Lecture 3. Summer SAS Workshop Website
Common Sense Validation Using SAS Lisa Eckler Lisa Eckler Consulting Inc. TASS Interfaces, December 2015.
Lecture 5 functions 1 © by Pearson Education, Inc. All Rights Reserved.
Computing with SAS Software A SAS program consists of SAS statements. 1. The DATA step consists of SAS statements that define your data and create a SAS.
An Introduction to Proc Transpose David P. Rosenfeld HR Consultant, Workforce Planning & Data Management City of Toronto.
1 Chapter 3: Getting Started with Tasks 3.1 Introduction to Task Dialogs 3.2 Creating a Listing Report 3.3 Creating a Frequency Report 3.4 Creating a Two-Way.
CSC 405: Web Application Engineering II8.1 Web programming using PHP What have we learnt? What have we learnt? Underlying technologies of database supported.
“LAG with a WHERE” and other DATA Step Stories Neil Howard A.
Chapter 6: Modifying and Combining Data Sets  The SET statement is a powerful statement in the DATA step DATA newdatasetname; SET olddatasetname;.. run;
Business Information Management I 1. “Copyright and Terms of Service Copyright © Texas Education Agency. The materials found on this website are copyrighted.
Working Efficiently with Large SAS® Datasets Vishal Jain Senior Programmer.
Generation of real-time SDTM datasets and metadata through a generic SDTM converter mechanism CDISC (CDASH/SDTM) integration into OC/RDC Peter Van Reusel.
Mark Wheeldon, Formedix CDISC UK Network June 7, 2016 PRACTICAL IMPLEMENTATION OF DEFINE.XML.
SAS ® 101 Based on Learning SAS by Example: A Programmer’s Guide Chapters 3 & 4 By Tasha Chapman, Oregon Health Authority.
Chapter 11 Reading SAS Data
Practical Implementation of Define.xml
Chapter 6: Modifying and Combining Data Sets
Performing Mail Merges
SAS Programming Introduction to SAS.
Database Vocabulary Terms.
MAKE SDTM EASIER START WITH CDASH !
Traceability between SDTM and ADaM converted analysis datasets
PROC DOC III: Self-generating Codebooks Using SAS®
A SAS macro to check SDTM domains against controlled terminology
Fabienne NOEL CDISC – 2013, December 18th
Discrete Mathematics CMP-101 Lecture 12 Sorting, Bubble Sort, Insertion Sort, Greedy Algorithms Abdul Hameed
Introduction to DATA Step Programming: SAS Basics II
Functions continued.
Analysis Data Model (ADaM)
Presentation transcript:

Qian Zhao, J&J Consumer Companies, Inc. Jun (John) Wang, J&J Consumer China Ltd Ruofei Hao, J&J Consumer Companies, Inc. PharmaSUG 2015 Paper #BB11

2

 Data mapping and its QC can be a daunting task. When data are mapped, e.g., to SDTM, the variable values, variable names, variable types, and data structures, etc., may change.  Is it possible to write generic macros to perform data mapping and QC of the mapping that work for any data set? 3

4

 Assume the vital signs were collected in horizontal structure and then mapped to SDTM:  It is impossible to directly compare the two data sets. 5 VS_H (Vital Signs collected in horizontal structure) SUBJID USUBJIDVISITNUM WTVALWTUNITHTVALHTUNIT … → (mapped to) VS (transposed to conform with SDTM) USUBJIDVISITNUMVSTESTCDVSORRESVSORRESU VSSTRESCVSSTRESNVSSTRESU → WEIGHT HEIGHT …

 However, if we can carve out a portion, e.g., for WEIGHT:  The comparison (i.e., QC of the mapping) can be made for the carved-out portion. 6 VS_H (Vital Signs collected in horizontal structure) SUBJID USUBJIDVISITNUM WTVALWTUNITHTVALHTUNIT … → (mapped to) VS (transposed according to SDTM) USUBJIDVISITNUMVSTESTCDVSORRESVSORRESU VSSTRESCVSSTRESNVSSTRESU → WEIGHT HEIGHT …

 Libname data1 "e:\study1\original";  Libname data2 "e:\study1\mapped";  proc sort data=data1.vs_h out=vs_h;  where wtval^=.; * missing weight was not mapped to VS;  by usubjid visitnum;  run;  proc sort data=data2.vs out=vs;  where vstestcd='WEIGHT';  by usubjid visitnum;  run;  data check_mapping_result;  merge vs_h(in=_a) vs(in=_b);  * for simplicity in this paper, assuming key variables are the same;  by usubjid visitnum;  if round(wtval,0.1) ^= round(input(vsorres,best6.),0.1);  run; 7

 Looping through all pairs (with the proper conversion as needed, e.g., when checking a variable in standard unit): WTVAL vs VSORRES (where VSTESTCD=”WEIGHT”) WTVAL vs VSSTRESC (where VSTESTCD=”WEIGHT”) WTVAL vs VSSTRESN (where VSTESTCD=”WEIGHT”) WTUNIT vs VSORRESU (where VSTESTCD=”WEIGHT”) WTUNIT vs VSSTRESU (where VSTESTCD=”WEIGHT”) HTVAL vs VSORRES (where VSTESTCD=”HEIGHT”) HTVAL vs VSSTRESC (where VSTESTCD=”HEIGHT”) HTVAL vs VSSTRESN (where VSTESTCD=”HEIGHT”) HTUNIT vs VSORRESU (where VSTESTCD=”HEIGHT”) HTUNIT vs VSSTRESU (where VSTESTCD=”HEIGHT”) … the QC of mapped vital signs data can be completed. 8

 We can specify the above pairs in an external control file (data mapping QC specification), e.g., a spreadsheet:  and write a SAS macro: 9

 The macro parameters can be passed in by importing the data mapping QC specification (the external control file).  The number of loops is controlled by the number of rows in the specification.  Another way to view the external control file is that the columns define macro parameters the rows pass parameter values to each macro call 10

 Notice the four sets of columns in the QC specification: lib dt by+whr var 11

 These are the four-dimensions to identify a data field, i.e.: location (drive:\path, lib) data set (dt) row (observation, by+whr) column (variable name, var) 12

 With these four dimensions identified, the comparison between the original and the corresponding mapped data point can be performed. 13

 Also notice that the macro handles the part that is common, or unchanging, from data set to data set, e.g., the sorting, sub-setting, merging; while the control file handles the part that is changing or unknown to the macro.  In other words the macro handles the function of each column of the control file; while the control file handles the contents of each column. 14

15

 Assume we have the following data mapping specification to map VS data to SDTM:  We can write a macro against this specification to generate a mapping program for VS:  The above macro generates this mapping program: 16

 In this example, the macro serves as a compiler of the mapping specification.  For each column in the mapping specification, the macro knows its function, though not its contents, and therefore knows how to translate the specification into the programming language. 17

18

 Using external control files, standard programming is possible for what seems otherwise impossible. The macro can handle the parts that do not change, that are machine interpretable; The external control file handles the parts that are changing, or un-expressible to the macro. 19

20

We would like to thank Brenda Zimmerman and J. Tony McGuire for reviewing this paper and providing valuable comments. 21

Name: Qian Zhao Organization: J&J Consumer Companies, Inc. Address: 185 Tabor Road City, State ZIP: Morris Plains, NJ Work Phone: Fax:

Name: Jun (John) Wang Organization: J&J Consumer China Ltd Address: Worldwide Emerging Markets Innovation Center, 3285 Dongchuan Road, Minhang District City, State ZIP: Shanghai , China Work Phone: Fax:

Name: Ruofei Hao Organization: J&J Consumer Companies, Inc. Address: 185 Tabor Road City, State ZIP: Morris Plains, NJ Work Phone: Fax:

25

 Fully utilize the specification, e.g., no need to re- type the mapping expressions, variable labels, lengths, etc., in the program.  Maintain consistency between the specification and the program; therefore changes need to be made only in one place.  Make the mapping task easier and streamlined by filling a few blanks, e.g., variables to be mapped (MAPVAR), source variable (SRCVAR), mapping expression (VAR_MAP), and variable length (VARLEN). (VARORD, TGTVAR, VARLABEL and VARTY are SDTM standard and need not to be changed.) 26

 Allow making changes in a central location (i.e., not scattered in different SAS statements and/or steps).  Obtain better traceability from the source variable to the mapped variable.  Enable a higher level standardization due to reusability of the mapping specification template and the macro. Note: SDTMIG in the spreadsheet format can be downloaded and used as the base for the mapping specification template.  Potentially require lower level of SAS skills. 27