caBIG™ Architecture and Vocabularies and Common Data Elements (VCDE) Joint Workspace Face-to-Face Meeting, University of Utah, Salt Lake City, January 28-30, 2008

Introduction
- Goal: Provide an overview of the Arch/VCDE F2F meeting at Huntsman Cancer Institute at the University of Utah
- These slides will be posted with meeting notes at the ICR WS gforge site
- All meeting material, including presentations, is available at: http://gforge.nci.nih.gov/docman/index.php?group_id=357&selected_doc_group_id=2582&language_id=1
- Presenters can provide more information

Summary of Day 1
As covered at the ICR meeting on February 13, 2007; highlights:
- Idea of using design templates to guide/drive future caBIG/caGrid middleware development
- Summary of last year's activities and future goals of the Arch/VCDE Workspaces
- External entities using/testing caBIG technologies: National Public Health Grid / CDC
- Training activities: Silver to Grid Training Module
- Approval of Gold Compatibility Guidelines

Theme: The Expanding caBIG™ Community and the Impact on Technology

caCORE 4.0 Overview
Denise Warzel - NCI Center for Biomedical Informatics and Information Technology (CBIIT)
- Overview of the caCORE product line: what is new in 4.0?
- caCORE-like systems are about methodologies (not simply tools): model driven, agile development, object oriented, open source, service oriented architecture, XML schema transport, registered ISO metadata and use of controlled vocabularies
- caCORE products:
  - Help to build a framework for developers to build apps that are interoperable
  - Enable discovery and the ability to move data around
- Future plan for caCORE products:
  - Simplified and improved tools: “develop, access, consume”
  - Support HL7 Datatypes: building a roadmap for HL7 caBIG participants
  - Services to leverage semantic metadata
  - Collaborative terminology and caDSR metadata development: Semantic MediaWiki, workflow support
  - Simplified granular interoperability (Object and CDE level)
  - Reusable ‘plug-ins’ to support building interoperable grid services, e.g. validation services
  - Optimized caDSR and UIs to support faster/simplified access

caGrid 1.x
Scott Oster – Ohio State University
- Overview of caGrid 1.x
- caGrid 1.2 highlights (caCORE SDK 4.0 support, bug fixes, new portal and more)
- Work in progress:
  - Simplified alternative to GridFTP for binary data transfer (caGrid transfer service)
  - Enhancements to the Data Service query language to meet community needs (CQL 2.0)
  - Designing an approach for metrics collection/statistics
- Future focus (selected):
  - Incorporating outcomes from working groups such as ASBP, Workflow, HTP
  - Integrating with forthcoming registered UML/XML binding information (Introduce integration, caDSR grid service, GME enhancements)
  - Tighter integration with other tools/projects relevant to the “caBIG Process”, e.g. caCORE SDK
  - Continuing improvement for more complex service requirements

Theme: Domain Workspace Requirements and the Impact on caBIG™ Infrastructure: Data Services/Federated Query

caCORE SDK and caGrid Interface
Satish Patel - NCICB
Dave Ervin – Ohio State University
- Overview of SDK 4.0 and new features:
  - Re-architected system (concurrent connection to services, POJO)
  - Enhanced security (attribute level security using CSM)
  - Enhanced code generation
  - Performance improvement
  - Relaxed restrictions on object/data model development (no “id” attributes)
- New features in SDK 4.1:
  - Support for HL7 complex data types
  - Freestyle search
  - Graphical installer to generate a system using the SDK
- caGrid Data Services integration with caCORE SDK is improved:
  - Data service styles
  - Query processors based on a pluggable architecture
  - SDK/caGrid joint development efforts

Federated Query Impact on caCORE APIs and caGrid
Ian Fore - NCICBIIT
Dave Ervin - Ohio State University
- Limitations of different query layers (SQL -> Hibernate -> caCORE API/QBE -> CQL/DCQL)
- caTissue use cases and solutions using different query layers
- Planned features in CQL 2.0 based on use cases from TBPT and IVI:
  - Association population (going beyond targeted objects)
  - Typed attributes (date, boolean etc.) for binary operators (e.g. equal, not equal)
  - Query modifiers (distinct, min, max)
- DCQL 2.0 will build on CQL 2.0
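The three planned CQL 2.0 features can be illustrated with a small in-memory sketch. This is not CQL syntax and does not use any caGrid API: the `Participant` and `Specimen` classes, the `run_query` function and its parameters are all hypothetical names chosen for illustration, and the data is made up.

```python
import operator
from dataclasses import dataclass
from datetime import date


# Hypothetical domain objects; not taken from any caBIG model.
@dataclass(frozen=True)
class Participant:
    participant_id: int
    birth_date: date


@dataclass(frozen=True)
class Specimen:
    specimen_id: int
    tissue: str
    collected: date
    participant: Participant


OPERATORS = {
    "equal": operator.eq,
    "not_equal": operator.ne,
    "less_than": operator.lt,
    "greater_than": operator.gt,
}


def run_query(objects, attribute, op, value, populate=(), modifier=None):
    """Filter objects on one attribute, then populate associations or apply a modifier."""
    compare = OPERATORS[op]
    matches = []
    for obj in objects:
        actual = getattr(obj, attribute)
        # Typed attributes: a date is compared with a date, never with a string.
        if type(actual) is not type(value):
            raise TypeError(f"{attribute} expects {type(actual).__name__}")
        if compare(actual, value):
            matches.append(obj)
    if modifier is not None:
        # Query modifiers: (name, attribute) with name in distinct/min/max.
        name, target = modifier
        values = [getattr(obj, target) for obj in matches]
        if name == "distinct":
            return sorted(set(values))
        if name in ("min", "max"):
            return min(values) if name == "min" else max(values)
        raise ValueError(f"unknown modifier: {name}")
    # Association population: return requested associations with each target object.
    return [(obj, {a: getattr(obj, a) for a in populate}) for obj in matches]


# Made-up sample data.
P1 = Participant(1, date(1950, 1, 1))
P2 = Participant(2, date(1960, 6, 15))
SPECIMENS = [
    Specimen(10, "lung", date(2007, 3, 1), P1),
    Specimen(11, "lung", date(2007, 9, 1), P2),
    Specimen(12, "colon", date(2006, 5, 5), P1),
]
```

For example, `run_query(SPECIMENS, "collected", "greater_than", date(2007, 1, 1), populate=("participant",))` returns the two 2007 specimens together with their participants, while passing the string `"2007-01-01"` instead of a `date` raises `TypeError`.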

Bulk Data Transfer
Martin Morgan – Fred Hutchinson Cancer Research Center
Shannon Hastings – Ohio State University
- BDT (or HTP) requirements:
  - Data transfer and parsing related
  - ‘Workflow’ related (interactive, stateful, cooperative)
  - Implementation related (secure, strongly typed, interoperable)
- Available BDT solutions: GridFTP, WS-Enumeration, endpoint references
- Issues with existing solutions: installation/configuration/platform/usability issues for GridFTP
- New solution: “caGrid Transfer” with caGrid 1.2

Theme: Domain Workspace Requirements and the Impact on caBIG™ Infrastructure: Metadata

caDSR and Population Sciences
Paul Courtney - Pop Sci SIG Lead, Dartmouth Medical School
- Overview of Population Science (goals, tools, data of interest)
- How can current caDSR metadata serve population scientists?
  - Need recognition that the placement of a CDE within the construct of a questionnaire provides the context and semantics for that CDE
  - Forms-level metadata: about the questionnaire/survey tool as a whole, and about the administration of the questionnaire/survey tool (currently working on Form Builder to accommodate these needs)
- Future efforts:
  - Work to bring population scientists into the process of defining forms-level metadata requirements
  - Move the process of bringing in questionnaires from manual curation to UML modeling
  - Identification of population science/public health ontologies to be used
  - Applications that can link epi and socio-economic status (SES) data

caTissue Suite Dynamic Extensions
George Komatsoulis - NCICB
Denise Warzel - NCICB
Poornima Govindrao – Persistent Systems (caTissue Suite developer team)
Ian Fore - NCICB
- caTissue overview
- caTissue Suite Dynamic Extensions motivation:
  - Impossible to imagine research software that isn’t extensible
  - Commercial software for research often provides extensibility
- Current DE implementation:
  - Form builder UI
  - XMI import and export
  - System generated data entry forms to accept user input
  - Integrated with the caTissue Query interface to query across static and dynamic classes
  - Captures metadata for UML, caDSR data, UI controls, database
- Arch/VCDE Workspace related implications will be discussed in a working group (tooling, review and mentoring process) as decided in the break-out session

Theme: Domain Workspace Requirements and the Impact on caBIG™ Infrastructure: Security

- caGrid/GAARDS Security Overview (Stephen Langella – OSU)
  - Services and tools for the administration and enforcement of security policy in an enterprise Grid
- caBIG Clinical Trials Suite Requirements (Edmond Mulaire – SemanticBits)
  - CCTS needs single sign on (SSO)
- caXchange Requirements (Kalpesh Patel – Ekagra Software)
  - caXchange acts as a proxy for the message originator
  - Only a Grid authenticated user should be able to submit the message
  - caXchange must be able to act on behalf of the authenticated user
- WebSSO Solutions/Implementation (Kunal Modi – Ekagra Software)
  - WebSSO provides Single Sign On capabilities for web applications as well as grid services using a single solution
- Credential Delegation Service (Stephen Langella – OSU)
  - CDS, a WSRF-compliant Grid service, enables users/services (delegators) to delegate their Grid credentials to other users/services (delegatees) such that the delegatees may act on the delegator's behalf

geWorkbench/caGrid/TeraGrid Interface and Demo
- Introduction on TeraGrid Workgroup (Scott Oster - OSU)
  - TeraGrid is an NSF high end computing infrastructure
- Background on geWorkbench and the geWorkbench/caGrid/TeraGrid Project (Christine Hung – Columbia University)
  - geWorkbench: platform for data integration for genomics with tools to manage, analyze, annotate and visualize data
  - Description of steps to establish the geWorkbench/caGrid/TeraGrid interface
- Demo (Christine Hung – Columbia University, Ravi Madduri – Argonne National Lab)
  - Running geWorkbench’s Hierarchical Clustering Service using the caGrid/TeraGrid gateway

Security Working Group
George Komatsoulis - NCICB
Marsha Young - Booz Allen Hamilton
- Overview of working groups and current status
- caBIG™ initiatives for federated authentication and authorization
- caBIG™ Data Sharing and Security Framework (DSSF):
  - Determine which data can be shared
  - Identify necessary access and data security controls (authentication, authorization)

ICR Workflow Working Group Activities
Ravi Madduri – Argonne National Lab
- Background, goals, issues & activities
- Review existing workflow authoring tools and suggest a tool that can be extended to be used with caBIG services: Taverna
- Implement (and execute) a workflow for a specific scientific domain using this tool and existing caGrid data and analytical services: Demo
- Demo: A simple (yet typical) use case for microarray analysis:
  1. Locate the datasets of interest
  2. Obtain the data (caArray)
  3. Preprocess the data
  4. Cluster the data (GenePattern)
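The four demo steps form a linear pipeline in which each step consumes the output of the previous one. The sketch below shows that shape only: it does not call Taverna, caArray or GenePattern, the `CATALOGUE` contents are made up, every function name is hypothetical, and the "clustering" is a deliberately trivial threshold split standing in for a real analytical service.

```python
import math
from statistics import mean

# Hypothetical in-memory stand-in for a data service; values are made-up intensities.
CATALOGUE = {
    "exp-1": {
        "keywords": {"microarray", "lung"},
        "data": {"geneA": [8.0, 16.0], "geneB": [2.0, 4.0], "geneC": [8.0, 32.0]},
    },
    "exp-2": {
        "keywords": {"microarray", "colon"},
        "data": {"geneA": [2.0, 2.0], "geneB": [64.0, 64.0]},
    },
}


def locate(keyword):
    """Step 1: find the identifiers of datasets tagged with the keyword."""
    return sorted(ds for ds, entry in CATALOGUE.items() if keyword in entry["keywords"])


def obtain(dataset_id):
    """Step 2: fetch the raw expression values for one dataset."""
    return CATALOGUE[dataset_id]["data"]


def preprocess(raw):
    """Step 3: log2-transform every intensity."""
    return {gene: [math.log2(v) for v in values] for gene, values in raw.items()}


def cluster(profiles, threshold=3.0):
    """Step 4: toy grouping by mean log2 expression (not a real clustering algorithm)."""
    groups = {"high": [], "low": []}
    for gene in sorted(profiles):
        label = "high" if mean(profiles[gene]) >= threshold else "low"
        groups[label].append(gene)
    return groups


def run_workflow(keyword):
    """Chain the four steps for every dataset that matches the keyword."""
    return {ds: cluster(preprocess(obtain(ds))) for ds in locate(keyword)}
```

Calling `run_workflow("lung")` processes only `exp-1`; in a real workflow tool each function would be replaced by a call to a grid data or analytical service.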

caGrid Portal
Joshua Phillips – Ohio State University
- Goal of caGrid Portal:
  - Provide visualization of caGrid functionality
  - Demonstrate how caGrid supports semantic and syntactic interoperability
- Demo of current functionality:
  - Discovery
  - Metadata exploration
  - Status monitoring
  - Identity federation
  - Data service query
  - Query sharing
- Future direction:
  - Demonstrate use of semantic metadata (grid-join)
  - Support new CQL and DCQL features
  - Expose workflow functionality
  - Increase support for knowledge sharing features

caBIG/ONIX Collaboration
Max Wilkinson - Scientific IT Analyst, ONIX Platform Development, UK NCRI Informatics Coordination Unit
- Oncology Information Exchange: ONIX
- Goal:
  - Using technology to make use of information relating to cancer cause, prevention and cure
  - Using informatics to maximise the impact of cancer research through better data sharing
- Broadly similar goals to caBIG

Analytical Service Best Practices Working Group Activities
Baris Suzek – Georgetown University
Shannon Hastings – Ohio State University
- Charter & Objectives
- Issues & Solutions:
  - Model reuse
  - XSD reuse and/or generation
  - Process used for service development
  - Recommended process for future development by the caGrid team: top down (cleaner), bottom up
- Outstanding issues: generic parameters
- Next steps (was presented to the group on February 27th)

Breakout Sessions
- Working Session: Gold Review Process and Review Criteria
  - Level of CDE reuse
  - High impact/standard CDEs
  - Backbone model
- Dynamic Extensions
  - Convene a Working Group consisting of TBPT, VCDE and Arch reps
- Semantic Discovery and Query
  - Convene Semantic Query Working Group
  - Explore Semantic Web technologies to assist in the discovery of a set of data services that could collaborate to answer a query that was expressed in terms of concepts (rather than data types)

Birds of a Feather Session: Semantic MediaWiki
- Hands-On Introduction (Frank Hartel – NCICB)
  - Biomedical Grid Terminology (BiomedGT) is an open, collaboratively developed terminology for translational research
- Demo of BiomedGT MediaWiki for Collaborative Terminology Development (Harold Solbrig – Apelon)

Theme: Arch/VCDE Workspace Requirements and the Impact on caBIG™ Infrastructure: Metadata (Day 3)

caDSR/GME Mapping
Denise Warzel - NCICB
Scott Oster – Ohio State University
- Problem: Achieving caBIG interoperability goals on the grid requires not only sound handling of both syntax and semantics, but also a formal binding between them
- Previous “solution”: Require an XML Schema for each package that followed a namespace construction rule (implicit binding)
- Solution: Planned definition of mapping rules that specify how a given UML entity is represented in XML over the grid
  - Mapping maintained in the caDSR; lookup and query available through the caDSR grid service
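The difference between the implicit and the explicit binding can be sketched as follows. This is an illustration only: the namespace pattern, the `BindingRegistry` class and all identifiers are hypothetical and do not reproduce the actual caDSR or GME interfaces.

```python
from typing import NamedTuple


class XmlBinding(NamedTuple):
    """Where a UML class lives in XML: schema namespace plus element name."""
    namespace: str
    element: str


def implicit_binding(project, version, package, class_name):
    """Previous approach: derive the namespace from a fixed construction rule.

    The 'gme://project/version/package' pattern is an assumption made for this sketch.
    """
    return XmlBinding(f"gme://{project}/{version}/{package}", class_name)


class BindingRegistry:
    """Planned approach: the binding is stored explicitly and looked up on demand."""

    def __init__(self):
        self._bindings = {}

    def register(self, uml_class, binding):
        self._bindings[uml_class] = binding

    def lookup(self, uml_class):
        # Raises KeyError when no mapping has been registered for the class.
        return self._bindings[uml_class]


# Hypothetical example entries.
registry = BindingRegistry()
registry.register(
    "org.example.domain.Specimen",
    XmlBinding("http://example.org/schemas/specimen/1.0", "SpecimenRecord"),
)
```

With an explicit registry the XML element name and namespace no longer have to mirror the UML package and class names, which is what a construction rule forces.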

High-Impact Common Data Element Identification Process
CDE Leadership Group – VCDE Workspace
Mukesh Sharma – Washington University St. Louis
- Introduction:
  - ‘High impact’ or standard CDEs are pervasive through many developer projects and are ‘touch points’ for semantic interoperability
  - To permit interoperability, the classes and attributes in the UML models for different applications should be semantically annotated to be the same
- Approach:
  - Manual: Review models, find common objects/classes, expand “backbone” models, propose standards
  - Automated: Use caDSR metadata to identify CDEs
- Next steps:
  - Complete manual review (15/76 models completed to date)
  - Complete automated review using CRS v2.0 software
  - Determine extension of the backbone model based on findings
  - Creation of standard CDEs that may be shared more broadly
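One way to picture the automated approach is to count how many registered models reuse each CDE and to flag those above a threshold. The sketch below assumes the metadata has already been exported as a mapping from model name to CDE identifiers; the function name, the threshold and the sample identifiers are hypothetical, and this is not the method implemented in the CRS software.

```python
from collections import Counter


def high_impact_cdes(models, min_models=2):
    """Return CDE identifiers used by at least `min_models` models, most reused first."""
    usage = Counter()
    for cde_ids in models.values():
        # Count each CDE once per model, however often the model uses it.
        usage.update(set(cde_ids))
    candidates = [cde for cde, count in usage.items() if count >= min_models]
    return sorted(candidates, key=lambda cde: (-usage[cde], cde))


# Made-up sample data: model name -> CDE identifiers found in its metadata.
SAMPLE_MODELS = {
    "modelA": ["CDE-1", "CDE-2", "CDE-1"],
    "modelB": ["CDE-2", "CDE-3"],
    "modelC": ["CDE-2", "CDE-1"],
}
```

Here `high_impact_cdes(SAMPLE_MODELS)` ranks `CDE-2` (three models) ahead of `CDE-1` (two models) and drops `CDE-3`, which appears in only one model.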

Terminology Metadata: Extension of the Service Meta Model
Tom Johnson - Mayo Clinic
- Goal: Identify and model metadata needed to discover vocabularies on the grid
- Standards considered:
  - Dublin Core
  - ISO /3/6: classification, registries, admin
  - National Center for Biomedical Ontology (NCBO) BioPortal
  - and more
- Next steps:
  - Model harmonization with recommended ISO superclasses
  - Change caGrid tooling to capture additional metadata when registering a terminology
  - Create a custom discovery client for terminology services, to take advantage of additional metadata in support of identified use cases
- Vote taken on criteria identified, not model per se [APPROVED]

The Vocabulary Resources: LexBIG and EVS and NCBO Browser
Tom Johnson - Mayo Clinic
- Overview of vocabulary resources:
  - LexGrid: raw content; model and data storage that defines concepts and properties as well as relations and associations, and supports loaders/representations such as OWL, OBO, RRF, Protégé, XML
  - LexBIG API: allows you to fetch data
  - EVS caCORE API: for the distributed environment, where LexBIG is local and the caCORE externalization talks to the database
  - BioPortal: Web-based; features are driven by the infrastructure, and users can choose code systems in addition to text, etc.
- Future browser support:
  - OpenPortal: a collaborative effort to develop an open, site neutral and easily extensible web service allowing users to browse, search, and visualize ontologies stored in LexGrid repositories

The Vocabulary Review (the process and the resource together): LOINC
James Cimino – Columbia University
- Lab LOINC® (Logical Observations Identifiers, Names, Codes): a clinical terminology important for laboratory test orders and results
- Results of review process: Met most criteria (lacking primarily in documentation)
- Lessons learned from review process: generally good, but needs:
  - More active participation by the developer, especially with respect to documentation
  - Content available in a standard exchange format and QC'd to make sure all reviewers have access
  - Reviewers experienced with domain and vocabulary; evaluation experience is helpful because there is a steep learning curve
  - Notes and examples on the criteria matrix
- Vote: Lab LOINC approved as caBIG terminology [APPROVED]

caGrid Queries into the Ontologic Space
James Buntrock - Mayo Clinic
Harold Solbrig - Apelon
- Motivation:
  - Leverage additional semantics used for caGrid application development
  - Provide next generation of design time activities (e.g. CDE/Model Reuse)
  - Provide next generation semantically aware services for runtime activities (e.g. NLP)
- Semantic Query WG Charter:
  - Use cases for search, retrieval, and aggregation from one or more data nodes on caGrid leveraging the semantics in vocabulary
  - Utilize or inform future caGrid runtime and design components
- Semantic Query WG Deliverables:
  - White Paper that discusses the use cases and the modifications to the caGrid software and design activities
  - Review by VCDE and Arch Workspaces with recommendations
  - Construction of a prototype or proof of concept implementation
  - Evaluation of the benefits and costs of supporting semantic query capabilities on caGrid

Additional Information
- These slides will be posted with meeting notes at the ICR WS gforge site
- All Joint Arch/VCDE WS presentations are available at: http://gforge.nci.nih.gov/docman/index.php?group_id=357&selected_doc_group_id=2582&language_id=1
- Presenters can provide more information

Acknowledgements Elaine Freund Grace A. Stafford Brian Davis Li Kramer

Additional Slides

Introduction to Huntsman Cancer Institute at University of Utah
Welcome: Joyce A. Mitchell, PhD, FACMI, FACMG, Associate Vice President, Health Sciences Information Technology; Chair, Department of Bioinformatics, University of Utah
- Huntsman Cancer Institute's mission
- Cancer genetic research at HCI
- Utah Population Database:
  - Largest genetic database in the world (6.5 million individuals)
  - Large pedigrees enable genetics
  - Used in identification of major cancer genes such as BRCA1 and BRCA2
- Department of Biomedical Informatics and its activities

Middleware and Use of Design Templates in Translational Research
Joel Saltz, MD, PhD, Chair, Department of Biomedical Informatics, OSU College of Medicine
- Idea of using design templates to guide/drive future caBIG/caGrid middleware development
- Design templates for Translational Research:
  - Coordinated systems-level attack on a focused problem
  - Prospective clinical research study
  - Multiscale investigations that encompass genomics, epigenetics, (micro)anatomic structure and function
  - Secondary data analysis
  - Adaptive image guided intervention
  - Ad-hoc discovery, query, invocation of discrete services
- caGrid related middleware challenges:
  - Data and analytical services
  - Support for federated querying and grid services/workflow
  - Semantic infrastructure
  - Security
  - Governance of middleware development

Theme: The Evolution of the Architecture and VCDE Workspaces in the Context of the Expanding caBIG™ Community

Impact of the caBIG™ Enterprise on the Architecture and VCDE Workspaces: EY2 Challenges
Avinash Shanbhag - Architecture Workspace Lead, NCICB
George Komatsoulis – VCDE Workspace Lead, NCICB
- Perspective on past year:
  - Policies: compatibility/mentoring guidelines, compatibility review process, security policies, vocabulary review guidelines and bronze level certification
  - Infrastructure: caGrid 1.1 support for semantic/syntactic interoperability, deployment tools
  - Standards
  - Domain Workspace assistance: support for domain workspaces (ICR, CTMS..) and working groups (ASBP, TeraGrid, Security)
- Goals for the next year (no major change in direction):
  - Gold compatibility: review process and mentorship
  - High impact data standards: data standard submissions from Domain Workspaces (e.g. BRIDG)
  - Infrastructure enhancement per needs
  - Training and documentation
  - Security: bridging policies and technologies for service configuration
  - Federated Vocabulary Environment using caGrid
  - Community expansion: adoption/adaption support
  - Integration with other biomedical research grids/organizations: NCRI/ONIX (UK), National Health Information Network
- Plan: Working towards goals

External Communities Investigating/Leveraging caBIG™ Governance and Technology
Ken Hall – BearingPoint
Scott Halpine - SCI Group
- National Public Health Grid:
  - 23 programs in the local Health Departments (HDs)
  - 19 programs in the State Health Departments
  - There are 3000 local HDs and 50 State HDs
- Public Health Informatics challenges (not that different from caBIG):
  - Public health data widely distributed
  - Volume of public health data growing rapidly
  - Many cultural, social and political impediments to data sharing
  - Requires a stronger economic model for long-term financial sustainability
  - Uniquely dynamic, complex and global in scale
  - Many redundant systems, application silos and data silos
- Current thinking:
  - Explore/leverage existing Grid technologies and align with other nationwide health initiatives
  - Pilot study at the Centers for Disease Control and Prevention (CDC): a silver level compatible data service as a “proof of concept”

Updates to Silver Compatibility Checklist Guide to Mentors – VCDE/Architecture Workspaces
- Revision proposal and approval to allow Java primitive data types
- Change wording in checklist to: Class and Attribute datatypes must be approved by the VCDE and Architecture workspaces, and/or mapped to the equivalent datatype in the caDSR per the datatypes white paper [APPROVED]

Compatibility Guidelines version 3.0 Final Approval
Gold Compatibility Guidelines Working Group – VCDE/Architecture Workspaces
- Overview of changes to Compatibility Guidelines v3.0
- Responses to VCDE/Architecture Workspace participants
- Discussion and Vote [APPROVED]
- Next steps:
  - Send for review by NCI Senior Leadership
  - Release to caBIG community and collect comments
  - Kick-off of 4 Gold compatibility review process working groups (Vocabulary, Architecture, Information Model, Common Data Elements)
  - Update as new needs emerge (version 3.1..)
- Initial discussions around the responsibilities and issues relevant to working groups

Silver to Grid Training Module Development: Overview and Demos
Baris Suzek, Peter McGarvey – Protein Information Resource, Georgetown University
- Overview of a hands-on training module to cover all steps to develop a data service, from an idea to a caGrid data service
- Description of codebase and individual lessons for WS participants' input
- Challenges experienced, lessons learned and recommendations
- Demo Lesson: Practical Metadata Reuse
  - Finding and reusing different components, including standards, models and CDEs, with current tools and repositories: UML Model Browser, EA, SIW, caDSR..
- Demo Lesson: Using caGrid for Semantic Interoperability
  - Use caGrid Service APIs (live demo)
  - Discover information resources (Standard Vocabularies)
  - Query resources using a standard language, CQL (Standard APIs)
  - Identify ways to combine information from multiple resources (CDE)
- Next steps: Identification of volunteers to review individual sections

Compatibility Review Software Hands-On Training
Robert Freimuth - Mayo Clinic
Poornima Govindrao - Persistent Systems
- A system with the goal of making the process of compatibility review more efficient and reducing the administrative overhead
- Demonstration of workflow back and forth between developers and reviewers
- Hands-on mock review for participants

XC F2F action items for ICR
- Current Working Groups:
  - Continue HTP WG. The Grid response to HTP is caTransfer (stateless transfer service over an http/https service) and the Grid team would like to continue with the HTP WG.
  - Define more/provide a variety of workflows in the context of the use cases for continued software development (by the caGrid team). Include translational workflows.
  - ASBP WG should address where to draw the line between the benefits of semantic interoperability and the benefits of increased speed of analytical services.
- Interoperability:
  - Proactively define CDEs for the ICR domain
  - Define points and junctions where ICR will connect with other translational tools. Engage the workspaces in the identification process. CTMS and Imaging were specifically called out.

XC F2F action items for ICR (continued)
- Other:
  - Select ICR end-user volunteers to review the caGrid training (may be for next year’s program)
  - Engage in Dynamic Extensions uses and development
  - Create or identify tools for semi-automatic construction of workflows to develop pipelines
  - ICR and VCDE to engage in and identify projects/targets for determining standardization on transfer of structured chunks of data in order to use tools like Taverna more efficiently