Using web service technologies for incremental, real-time data transfers from EDC to SAS
Andrew Newbigging
Vice President, Integrations Development
19th October 2010
Medidata Solutions, Inc. Proprietary - Medidata and Authorized Clients Only. This document contains proprietary information that shall be distributed, routed or made available only within Medidata, except with written permission of Medidata.
Agenda
Introduction
General Considerations
Web Services
Conclusion
Introduction
Data transfers from EDC to SAS are typically cumulative (all data), transferred in batch mode (infrequently), in files that use a proprietary SAS format.
How can we use modern technologies and standards to improve efficiency, reliability and information density?
Diagram: EDC to SAS
Cumulative data volumes in a clinical study
Average daily change
Cumulative vs. incremental

Cumulative
(-) Repeated re-transfer of unchanged data: inefficient and time-consuming
(-) Difficult to achieve real-time data transfer
(+) Entire data set always sent, so no data is lost if one transfer fails

Incremental
(+) Only data changes transferred: maximum efficiency
(+) Near real-time transfer possible
(-) Recovering from a transfer error requires a checksum/resend protocol
Data transfer formats
Desirable features:
Support any clinical study design and data
Human-readable
Self-describing (metadata)
Support for incremental or cumulative transfers
Open, not proprietary, format
Data transfer formats and standards
Formats compared: Text, SAS, CDISC SDTM, CDISC ODM
Criteria:
All studies (Y, N)
Human readable
Metadata ((Y))
Incremental / cumulative
Open standard
CDISC ODM – Clinical data structure
CDISC ODM - Example
Web services
Web services are application programming interfaces (APIs) that are accessed via the Hypertext Transfer Protocol (HTTP).
Simple Object Access Protocol (SOAP) is one style.
Representational State Transfer (REST) is our preferred approach.
REST
REST uses HTTP methods (verbs): GET, PUT, POST, DELETE
to access objects via Uniform Resource Identifiers (URIs):
https://innovate.mdsol.com/RaveWebServices/studies/Mediflex/datasets/regular/AE
returning HTTP status codes: 200 OK, 401 Unauthorized, 404 Not Found
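As a rough illustration of the request pattern above, the following Python sketch builds a GET request for the dataset URI with HTTP Basic authentication and maps the three status codes listed on the slide to their meanings. The helper names `build_request` and `describe_status` and the placeholder credentials are assumptions made for this example, not part of Rave Web Services. Nothing is sent over the network by this code.

```python
import base64
import urllib.request

# Base URI taken from the slide; the path layout below mirrors the example URI.
BASE_URL = "https://innovate.mdsol.com/RaveWebServices"


def build_request(study, dataset, username, password):
    """Build (but do not send) a GET request for one dataset of a study."""
    url = f"{BASE_URL}/studies/{study}/datasets/regular/{dataset}"
    # HTTP Basic authentication: base64 of "username:password".
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url, method="GET")
    request.add_header("Authorization", f"Basic {token}")
    return request


def describe_status(code):
    """Map the status codes named on the slide to their reason phrases."""
    known = {200: "OK", 401: "Unauthorized", 404: "Not Found"}
    return known.get(code, "Unexpected status")


# Hypothetical usage with placeholder credentials (no network access here).
request = build_request("Mediflex", "AE", "username", "password")
print(request.get_method(), request.full_url)
print(describe_status(404))
```

To actually send the request, pass it to `urllib.request.urlopen`; note that `urlopen` raises `urllib.error.HTTPError` for responses such as 401 and 404, so the status code is then read from the exception's `code` attribute.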
REST in the browser - 1
REST in the browser - 2
REST in the browser - 3
REST from the command line - 1

curl -u username:password -H 'Content-Type:text/xml' -v https://innovate.mdsol.com/RaveWebServices/studies/Mediflex/datasets/regular/AE

* About to connect() to innovate.mdsol.com port 443 (#0)
* Trying 70.42.99.224... connected
* Connected to innovate.mdsol.com (70.42.99.224) port 443 (#0)
* SSLv3, TLS handshake, Client hello (1):
* SSLv3, TLS handshake, Server hello (2):
* SSLv3, TLS handshake, CERT (11):
* SSLv3, TLS handshake, Server finished (14):
* SSLv3, TLS handshake, Client key exchange (16):
* SSLv3, TLS change cipher, Client hello (1):
* SSLv3, TLS handshake, Finished (20):
* SSL connection using RC4-MD5
* Server certificate:
* subject: O=*.mdsol.com; OU=Domain Control Validated; CN=*.mdsol.com
* start date: 2007-03-28 17:49:39 GMT
* expire date: 2017-04-03 14:34:46 GMT
* subjectAltName: innovate.mdsol.com matched
* issuer: C=US; ST=Arizona; L=Scottsdale; O=GoDaddy.com, Inc.; OU=http://certificates.godaddy.com/repository; CN=Go Daddy Secure Certification Authority; serialNumber=07969287
* SSL certificate verify ok.
REST from the command line - 2

* Server auth using Basic with user 'username'
> GET /RaveWebServices/studies/Mediflex/datasets/regular/AE HTTP/1.1
> Authorization: Basic ********************************
> User-Agent: curl/7.19.7 (universal-apple-darwin10.0) libcurl/7.19.7 OpenSSL/0.9.8l zlib/1.2.3
> Host: innovate.mdsol.com
> Accept: */*
> Content-Type:text/xml
>
< HTTP/1.1 200 OK
< Date: Tue, 14 Sep 2010 01:18:05 GMT
< Content-Type: text/xml
<
<?xml version="1.0" encoding="utf-8"?>
<ODM FileType="Snapshot" FileOID="96741552-97f4-4035-aad3-e9f12459ca20" CreationDateTime="2010-09-14T01:18:05.255-00:00" ODMVersion="1.3" xmlns:mdsol="http://www.mdsol.com/ns/odm/metadata" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns="http://www.cdisc.org/ns/odm/v1.3" …
Incremental requests
https://innovate.mdsol.com/RaveWebServices/studies/Mediflex/datasets/regular/AE?start=2010-09-01T15:00:00
ODM TransactionType: Insert, Update, Remove
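One way a client might consume such an incremental response is sketched below in Python: build the URL with the `start` query parameter, then apply each ItemData change to a local copy according to its TransactionType. The function names, the sample XML and its OIDs are hypothetical, and keying the local copy by ItemOID alone is a deliberate simplification; a real client would also need to key on subject, study event, form and item group.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# ODM 1.3 namespace, as shown in the response on the earlier slide.
ODM_NS = "{http://www.cdisc.org/ns/odm/v1.3}"


def incremental_url(dataset_url, start):
    """Append the 'start' timestamp; ':' is kept unescaped to match the slide."""
    return f"{dataset_url}?{urlencode({'start': start}, safe=':')}"


def apply_changes(local, odm_xml):
    """Apply Insert/Update/Remove transactions to a dict keyed by ItemOID."""
    root = ET.fromstring(odm_xml)
    for item in root.iter(f"{ODM_NS}ItemData"):
        oid = item.get("ItemOID")
        action = item.get("TransactionType")
        if action in ("Insert", "Update"):
            local[oid] = item.get("Value")
        elif action == "Remove":
            local.pop(oid, None)
    return local


# Hypothetical, heavily simplified incremental response (not real study data).
SAMPLE = """<ODM xmlns="http://www.cdisc.org/ns/odm/v1.3">
  <ClinicalData StudyOID="Mediflex" MetaDataVersionOID="1">
    <SubjectData SubjectKey="001"><StudyEventData StudyEventOID="SE">
      <FormData FormOID="AE"><ItemGroupData ItemGroupOID="AE">
        <ItemData ItemOID="AETERM" Value="Headache" TransactionType="Insert"/>
        <ItemData ItemOID="AESEV" Value="Severe" TransactionType="Update"/>
        <ItemData ItemOID="AEOUT" TransactionType="Remove"/>
      </ItemGroupData></FormData>
    </StudyEventData></SubjectData>
  </ClinicalData>
</ODM>"""

local_copy = {"AESEV": "Mild", "AEOUT": "Ongoing"}
print(apply_changes(local_copy, SAMPLE))
```

Treating Insert and Update identically keeps the sketch tolerant of a missed earlier transfer; a stricter client could instead flag an Update for an item it has never seen.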
REST and SAS Clinical Data Integration (CDI)
REST and SAS CDI
GET list of studies
GET study metadata
https://innovate.mdsol.com/RaveWebServices/studies/Mediflex/metadata/
GET clinical data
https://innovate.mdsol.com/RaveWebServices/studies/Mediflex/datasets/
Challenges

Consistency
Incremental transfers are more efficient, but how can the overall integrity of the transferred data be assessed?
Hash functions (MD5, SHA-1, etc.) are being investigated.

Metadata versions
To accommodate changes during a study (for example, a protocol amendment), CDISC ODM may have multiple metadata versions.
There are no constraints on changes between versions.
Extra care is needed to ensure that the correct metadata version is applied to each data point.
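The slide only says that hash functions are being investigated, so the following Python sketch is one possible approach rather than a description of any implemented protocol. It assumes both sides can serialize their current data to the same canonical form; if the digests match, the incrementally built copy agrees with the source. The function name `dataset_digest` and the sample records are hypothetical.

```python
import hashlib
import json


def dataset_digest(records, algorithm="sha1"):
    """Digest of a dataset that does not depend on the order records arrived in."""
    # Canonical form: keys sorted, no optional whitespace, UTF-8 encoded.
    canonical = json.dumps(records, sort_keys=True, separators=(",", ":"))
    digest = hashlib.new(algorithm)
    digest.update(canonical.encode("utf-8"))
    return digest.hexdigest()


# Hypothetical data: the source system's view and a copy built from increments.
source_view = {"AETERM": "Headache", "AESEV": "Severe"}
receiver_view = {"AESEV": "Severe", "AETERM": "Headache"}  # same data, other order
stale_view = {"AETERM": "Headache", "AESEV": "Mild"}       # missed one update

print(dataset_digest(source_view) == dataset_digest(receiver_view))
print(dataset_digest(source_view) == dataset_digest(stale_view))
```

A mismatch would then trigger a fallback, such as a cumulative re-transfer of the affected dataset. The algorithm is a parameter because the choice among MD5, SHA-1 and others is left open on the slide.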