Linking Research Data to Clinical Data – a Pilot
The University of Alabama at Birmingham
UAB Comprehensive Cancer Center
How We Got Here
  Institutional need to expose clinical data to the research community
  Institutional commitment to grid computing that we can leverage
  High-level awareness that distributed research groups have common research needs
    – Desire to leverage shared resources: infrastructure and data
    – CCC and CCTS scientists share common needs
    – UAB IT and HSIS have technical knowledge
    – Linking clinical data from the Health System to laboratory/research data is a logical extension of the Horizon system that is already in place
    – Need for a query interface to allow hypotheses to be tested
  caBIG helped us develop a project team composed of subject matter experts and technologists
    – Partnership between the CCC, CCTS, UAB-IT, the UAB School of Medicine and the UAB Health System
    – UAB has a culture of "build and buy", not "build or buy"
  Executive-level support for the development of a coordinated research support environment
UAB IT Grid Infrastructure
  Infrastructure Development
    – High Performance Computing (HPC) Expansion
    – UABgrid Development
    – Combine IT Investments with Grant Funding through CIS and Engineering Collaborations
  Infrastructure Services
    – High Capacity Networks (Internet2, NLR)
    – HPC Resource Aggregation via Grid
    – Web Collaboration Tools (e.g., Wiki, Trac)
    – Programmable Infrastructure Components
  Infrastructure Transparency
    – Open Development
    – Community Engagement & Participation
UAB Cyberinfrastructure (CI) Investments
  Common network user identity (BlazerID) for consistent identity across systems; member of InCommon
  Early Internet2 member, providing high-bandwidth network access to other research campuses
  High Performance Computing (HPC) investments to build investigative capacity for computational research
  Ongoing model of engagement to support research technology investments
    – caBIG and CTSA
  Expansion of network capacity via dark fiber UA-System RON, including 10 Gigabit Ethernet connectivity to ASA, SOX, SLR, NLR, Internet2
Migrating Workflows to Grid
  Statistical Genetics
    – R Statistical Package
    – Methodological Analysis Workflow
    – Many Isolated Computations
    – Work in Progress and Promising Results Developing
    – Successfully leveraged the Open Science Grid to demonstrate full-scale analysis using 1000 CPU hours in a 4-hour window
  Microbiology
    – DynamicBLAST – Grid Version of BLAST
    – Master-Worker Type Application
    – Maximize Throughput, Minimize Job Turn-around
    – Leading Model for Migrations
    – Work led by Enis Afgan and Dr. Puri Bangalore in CIS
  Proteomics
    – Developed a deployment diagram to facilitate dialog on caBIG and vendor support forums
    – Worked with LabKey vendor to identify caGrid 1.2 compatibility issues (led to upgrade of data service to caCORE SDK 4.0)
    – Deployed LabKey/CPAS on Production caGrid, publishing live data from research system
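The "many isolated computations" and master-worker items above describe workloads in which a coordinator hands out independent tasks and collects the results. The following is a minimal sketch of that pattern, not DynamicBLAST or the statistical genetics workflow itself: `analyze_chunk` is a hypothetical stand-in for one R or BLAST job, and local threads stand in for grid workers. The helper `average_concurrency` (also hypothetical) works out what the reported run implies: 1000 CPU hours inside a 4-hour window means roughly 250 CPUs busy on average.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def analyze_chunk(chunk):
    """Hypothetical isolated computation; stands in for one R or BLAST job."""
    return sum(x * x for x in chunk)


def run_master(chunks, workers=4):
    """Master: submit every chunk, then gather results as workers finish."""
    results = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Map each future back to the index of the chunk it is processing.
        futures = {pool.submit(analyze_chunk, c): i for i, c in enumerate(chunks)}
        for fut in as_completed(futures):
            results[futures[fut]] = fut.result()
    # Return results in submission order, regardless of completion order.
    return [results[i] for i in range(len(chunks))]


def average_concurrency(cpu_hours, wall_hours):
    """Average number of CPUs that must be busy to fit the work in the window."""
    return cpu_hours / wall_hours


if __name__ == "__main__":
    chunks = [list(range(i, i + 5)) for i in range(0, 20, 5)]
    print(run_master(chunks))            # [30, 255, 730, 1455]
    print(average_concurrency(1000, 4))  # 250.0
```

Because the tasks share no state, throughput scales with the number of workers until the master or the network becomes the bottleneck, which is why this style of job is a natural first candidate for grid migration.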
Research Initiative Support
  caBIG
    – UAB Comprehensive Cancer Center funded to connect to caBIG
    – Contributed to completion of Self-Assessment and Implementation Plan
    – Deploying Life Sciences Distribution to support research workflows
    – caBIG provides a very good model for service and infrastructure abstractions
  caGrid Features
    – Uses established, site-specific identifiers
    – Comprehensive authorization solution for any type of resource
    – Focused on data sharing, expanding grid beyond HPC
    – Open development model
  Requirements for UAB
    – Bring BlazerID system to NIST Level 2
    – Exploring Integration of caGrid GAARDS AuthX Infrastructure (GridGrouper)
caGrid Data Sharing Pilot Proposal
  Objectives
    – Build a proof of concept that allows data sharing between the existing campus grid environment and the existing HS HIT environment
    – Allow internal researchers to get access to clinical data in a protected environment
  Needs
    – Partnership between Campus IT, CCC, CCTS and HSIS to design and build an agreed-upon straw-man architecture to facilitate this data sharing
    – Leadership commitment to the pilot’s objective, including making resources available
    – Leadership can assist in identifying a first customer from the research community to help represent the data sharing needs
Overview of Health System HIT Environment
  Major HIT systems roles
    – Horizon – Enterprise Clinical portal (both Ambulatory and Hospital)
    – JCAPS technology being used to share data across all components of the HIT clinical systems (see Systems map)
    – Tool that can also be used to facilitate the data exchange into the Grid environment
    – The same toolset might be used to de-identify data
  Horizon environment
    – Separate data repositories for all clinical data (labs, documents, visits)
    – Internally developed J2EE services for accessing this data
    – LDAP authentication model compatible with a federated approach
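The slide above notes that the same toolset might be used to de-identify data before it reaches the Grid environment. As a rough illustration of what such a step involves, the sketch below drops direct identifiers from a record and replaces the medical record number with a keyed hash, so the same patient always maps to the same pseudonym. All field names (`name`, `mrn`, `subject_id`, and so on) and both functions are hypothetical; the slides do not describe the Horizon record layout or the JCAPS configuration, and a real de-identification process covers more than the handful of fields shown here.

```python
import hashlib
import hmac

# Hypothetical field names; the actual clinical record layout is not given.
DIRECT_IDENTIFIERS = {"name", "mrn", "ssn", "address", "phone"}


def pseudonym(mrn, secret_key):
    """Keyed hash of the MRN: stable per patient, not reversible without the key."""
    digest = hmac.new(secret_key, mrn.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def deidentify(record, secret_key):
    """Return a copy of the record without direct identifiers, plus a pseudonym."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    cleaned["subject_id"] = pseudonym(record["mrn"], secret_key)
    return cleaned


if __name__ == "__main__":
    key = b"example-key-kept-inside-the-firewall"  # assumption: managed by HSIS
    record = {"name": "Jane Doe", "mrn": "000123", "lab": "WBC", "value": 6.1}
    print(deidentify(record, key))
```

Keeping the key on the Health System side means researchers can group results by `subject_id` without being able to recover the original identifier.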
HSIS Systems Map
Pilot Benefits and Next Steps
  Benefits
    – Uses existing technologies already in hand
    – Validates the dynamic data sharing model without a large investment
    – Conceptual support for the pilot approach across a broad range of constituents (campus IT, HSIS, CCC, CCTS, School of Medicine…)
  Next Steps
    – Distribute an RFA to the UAB research community to identify a pilot customer
    – Develop a resource model drawing from the constituents and identify any unmet needs
    – Demonstrate pilot feasibility for adapting caGrid technologies to build a campus research information network
Pilot for UAB Brain SPORE
Proposed Pilot
  [Diagram labels: Clinical Data Service; HSIS firewall; UAB Grid environment]
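The diagram places a clinical data service between the HSIS firewall and the UAB Grid environment. The slides do not say how research and clinical records would be matched, so the sketch below assumes, purely for illustration, that both sides carry a shared pseudonymous `subject_id`. The function `link_records` and every field name are hypothetical; this is a toy in-memory join, not a caGrid data service.

```python
from collections import defaultdict


def link_records(research_rows, clinical_rows, key="subject_id"):
    """Attach to each research row the clinical rows that share its key."""
    clinical_by_subject = defaultdict(list)
    for row in clinical_rows:
        clinical_by_subject[row[key]].append(row)

    linked = []
    for row in research_rows:
        # Subjects with no clinical match get an empty list rather than an error.
        matches = clinical_by_subject.get(row[key], [])
        linked.append({**row, "clinical": matches})
    return linked


if __name__ == "__main__":
    research = [
        {"subject_id": "a1", "assay": "proteomics", "score": 0.82},
        {"subject_id": "b2", "assay": "proteomics", "score": 0.47},
    ]
    clinical = [
        {"subject_id": "a1", "lab": "WBC", "value": 6.1},
        {"subject_id": "a1", "lab": "HGB", "value": 13.4},
    ]
    for row in link_records(research, clinical):
        print(row["subject_id"], len(row["clinical"]))
```

In the pilot as drawn, a join like this would run behind the clinical data service, so that only linked, de-identified results cross the firewall.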