caGrid Overview and Core Services, caGrid Knowledge Center, February 2011

caGrid
A Grid software middleware infrastructure consisting of services, toolkits, APIs, and a runtime environment
  Standards based, open source
  Building blocks to create interoperable, Grid-enabled systems
  Service Oriented Architecture
    Web Services Resource Framework standards
  Model Driven Architecture
    Object-oriented view, published information models, strongly typed services
    Rich metadata
A production Grid deployment of the core services provided by that infrastructure
  Security, Data Services Infrastructure, Service Development & Deployment, Metadata, Federated Query, Workflow, Advertisement & Discovery
Provides the software foundation that underlies the tools and applications of caBIG

Application Scenario
A clinician/researcher is involved in a multi-institutional clinical trial of a new targeted therapeutic
  Microarray, proteomic, and image data are collected from patients participating in the trial
The researcher wants to carry out a correlative analysis to assess the treatment
  Query and analyze microarray, image, and protein data from multiple patients to find interesting patterns
  Look for similar patterns in other microarray, protein, and image databases
Patients may have been seen at multiple institutions
Datasets may have been collected at different institutions

Application Scenario (diagram)
Location A: Microarray, Protein, Image data
Location B: Microarray, Protein, Image data
Location C: Microarray, Protein, Image data
Location C: Image Analysis
Location D: Image Analysis
Microarray and protein databases at other institutions
Challenges: different database systems, different data representations, security
Challenges: different invocations of programs, remote access, how to transfer data

caGrid Production Environment

Infrastructure Core Capabilities
Model-Driven and Metadata
  Enabling and supporting interoperable services
  Providing service-oriented metadata
Service development and deployment
  Tooling for bringing applications and data to the Grid
Advertisement and Discovery
  Publishing services to the Grid
  Enabling search for services based on service metadata
Security
  Integrating existing systems and applications with Grid security
  Lowering the burden of implementing Grid-wide and local policy
Facilitating Grid-wide operations
  Federated query, workflow execution
Making services and core infrastructure more accessible
  Graphical installation and configuration, higher-level object-oriented APIs, web portals, graphical administrative applications

Model Driven, Interoperable Services
Client and service APIs are object oriented and operate over well-defined and curated data types
Objects are defined in UML and converted into ISO/IEC 11179 Administered Components, which are in turn registered in the Cancer Data Standards Repository (caDSR)
Object definitions draw from controlled terminology and vocabulary registered in the Enterprise Vocabulary Services (EVS), so their relationships are semantically described
XML serialization of objects adheres to XML schemas registered in the Global Model Exchange (GME)

Global Model Exchange and Metadata Model Services
Global Model Exchange (GME)
  Provides support to store and retrieve schemas for types used in Grid services
  Developers should register the schemas defining types used in Grid services with the GME
Metadata Model Service (MMS)
  Provides support for developers to generate and add service metadata
  Developers can augment standard caGrid service metadata with information from metadata registries, such as the caDSR
  An external registry provides the means to add, modify, delete, or otherwise manage the UML models and their correspondence to XML Schemas, which the MMS leverages
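The store-and-retrieve role described for the GME can be pictured with a small sketch. This is not the GME API: the `SchemaRegistry` class, its method names, and the `gme://example.org/...` namespace are all hypothetical, and the registry is just an in-memory dictionary keyed by each schema's target namespace.

```python
import xml.etree.ElementTree as ET


class SchemaRegistry:
    """Toy in-memory schema store (hypothetical; not the real GME interface)."""

    def __init__(self):
        self._schemas = {}  # target namespace -> schema text

    def publish(self, xsd_text):
        """Store a schema under the targetNamespace declared on its root element."""
        root = ET.fromstring(xsd_text)
        namespace = root.get("targetNamespace")
        if namespace is None:
            raise ValueError("schema has no targetNamespace")
        self._schemas[namespace] = xsd_text
        return namespace

    def retrieve(self, namespace):
        """Return the schema text registered for a namespace."""
        return self._schemas[namespace]


# Assumed example schema; the namespace is made up for illustration.
GENE_XSD = (
    '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" '
    'targetNamespace="gme://example.org/1.0/demo.domain">'
    '<xs:element name="Gene" type="xs:string"/>'
    "</xs:schema>"
)

registry = SchemaRegistry()
namespace = registry.publish(GENE_XSD)
```

A service developer would publish once, and any client that later receives XML in that namespace could call `retrieve` to obtain the schema needed to validate or deserialize it.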

Service Development and Deployment: Introduce
A framework that enables fast and easy creation of Grid services
  Provides an easy-to-use graphical service authoring tool
  Hides all "grid-ness" from the developer
  Handles all core service architecture requirements for strongly typed and highly interoperable Grid services
Integration with other core Grid services and architecture components
  GAARDS Security Infrastructure
  Globus Index Service
  Global Model Exchange
  Metadata Model Service
  Cancer Data Standards Repository
Extension framework for integrating with other architecture components

Introduce Features
Supports modification of operations
  Adding operations
  Removing operations
  Updating operations
  Importing operations
Graphical configuration
  Advertisement
  Security
  Service metadata specification
  Service metadata editing
  Service configuration properties
Auto-generates code for the service
Auto-generates a client API for the service
Graphical deployment of the service
  Globus
  Tomcat
  JBoss

Advertisement and Discovery: Index Service
All services register their service metadata with the Index Service
Clients can discover services using a discovery API that facilitates inspection of data types
Leveraging semantic information in EVS (from which service metadata is drawn), services can be discovered by the semantics of their data types
Examples:
  "Find me all the services from Cancer Center X"
  "Which Analytical services take Genes as input?"
  "Find me all the services with some metadata mentioning the string 'macromolecules'"
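The three example questions above can be sketched against a toy in-memory registry. Nothing here is the Globus Index Service or the caGrid discovery API: `ServiceMetadata`, `Index`, the field names, and the example URLs are assumptions chosen only to show how metadata-driven lookup works.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceMetadata:
    """Hypothetical, much-simplified stand-in for caGrid service metadata."""
    url: str
    hosting_center: str
    input_types: set = field(default_factory=set)
    description: str = ""


class Index:
    """Toy registry that services advertise to and clients search."""

    def __init__(self):
        self._services = []

    def register(self, metadata):
        self._services.append(metadata)

    def by_center(self, center):
        # "Find me all the services from Cancer Center X"
        return [s.url for s in self._services if s.hosting_center == center]

    def by_input_type(self, type_name):
        # "Which Analytical services take Genes as input?"
        return [s.url for s in self._services if type_name in s.input_types]

    def by_text(self, needle):
        # "... some metadata mentioning the string 'macromolecules'"
        needle = needle.lower()
        return [s.url for s in self._services if needle in s.description.lower()]


index = Index()
index.register(ServiceMetadata(
    "https://a.example.org/analysis", "Cancer Center X",
    {"Gene"}, "Clusters gene expression profiles"))
index.register(ServiceMetadata(
    "https://b.example.org/data", "Cancer Center Y",
    set(), "Stores macromolecules and their annotations"))
```

In the real infrastructure the type match would rest on semantic annotations drawn from EVS rather than on a plain string such as `"Gene"`.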

Service Metadata: Data Service
Data Service Metadata
  Describes the Domain Model being exposed, in terms of a UML model linked to semantics
  Data types defined in terms of structure and semantics extracted from caDSR and EVS
  Auto-generated by the caGrid service authoring toolkit (Introduce)

Security Services
Authentication
  How to identify a client (or a service)
  Secure login
  Integrate the Grid with existing institutional login systems!
Enforce data sharing policies and access control
  Local policies
  Federated access
Trust Fabric
  How to trust a client, and at what level
  Dynamically adapt trust after a security breach

caGrid Security Infrastructure (GAARDS)
Dorian
  Allows accounts managed in external domains to be federated and managed in the Grid
  Allows users to use their existing credentials (external to the Grid) to authenticate to the Grid
Grid Grouper/CSM
  Provides a group-based authorization solution for the Grid
Grid Trust Service
  Supports applications and services in deciding whether or not signers of digital credentials can be trusted
  Supports the provisioning of trusted certificate authorities and corresponding certificate revocation lists
Provides services and tools for the administration and enforcement of security policy in an enterprise Grid

Secure Clinical Research Support with GAARDS
Use Dorian for Grid authentication
  Integrate with my LDAP user database and authentication
Use Grid Grouper (along with local mechanisms) for Grid authorization
  I let reviewers from institution X access patient data in the "Watson" research trial for review only
  Data entry personnel for the research trial have permission to add new data, but not update existing data
  I bar institution X from accessing any other data I'm sharing on the Grid
Use GTS to update the Grid trust fabric
  I trust institution Y after finalizing data sharing agreements for the Watson research
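The authorization rules in this scenario can be sketched as a default-deny, group-based policy. This is only an illustration of the idea, not the Grid Grouper or CSM API: `GroupPolicy`, the group names, the resource names, and the user identities are all hypothetical.

```python
class GroupPolicy:
    """Toy group-based, default-deny access policy (hypothetical API)."""

    def __init__(self):
        self._members = {}   # group name -> set of user identities
        self._grants = set()  # (group, resource, action) triples

    def add_member(self, group, user):
        self._members.setdefault(group, set()).add(user)

    def grant(self, group, resource, action):
        self._grants.add((group, resource, action))

    def is_permitted(self, user, resource, action):
        # Permit only if some group containing the user holds a matching grant.
        return any(
            user in self._members.get(group, ())
            for (group, res, act) in self._grants
            if res == resource and act == action
        )


policy = GroupPolicy()
policy.add_member("institution-x-reviewers", "alice@x.example.org")
policy.add_member("watson-data-entry", "bob@local.example.org")

# Reviewers from institution X may read Watson trial data, for review only.
policy.grant("institution-x-reviewers", "watson-trial", "read")
# Data entry personnel may add new records but hold no "update" grant.
policy.grant("watson-data-entry", "watson-trial", "add")
```

Because nothing is permitted without an explicit grant, institution X is barred from every other shared resource simply by never being granted access to it.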

caGrid Data Service Infrastructure
caGrid Data Services provide the capability to expose data resources to the Grid
  Specialization of caGrid Grid services to expose data through a common query interface
  Introduce extensions to create data services from information models and using caCORE SDK
Queries are made with caBIG Query Language (CQL) query objects
  A query specifies a target object (result) type and selects the instances that satisfy the specified properties and nested object properties
  Ability to return full objects, a set of attributes, a count of results, or distinct attribute values
Support for Bulk Data Transport for efficient transfer of large data volumes
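The shape of such a query (a target type, attribute constraints, nested object constraints, and a choice of result form) can be sketched over plain dictionaries. This is not the CQL schema or any caGrid class: `Query`, `matches`, `run`, the mode names, and the record layout are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Query:
    """Hypothetical CQL-like query: a target type plus constraints."""
    target: str                                      # object type to return
    attributes: dict = field(default_factory=dict)   # attribute -> required value
    nested: dict = field(default_factory=dict)       # association -> Query


def matches(obj, query):
    """True if obj has the target type and satisfies all constraints."""
    if obj.get("type") != query.target:
        return False
    if any(obj.get(k) != v for k, v in query.attributes.items()):
        return False
    for association, sub_query in query.nested.items():
        child = obj.get(association)
        if not isinstance(child, dict) or not matches(child, sub_query):
            return False
    return True


def run(objects, query, mode="objects", names=()):
    """Evaluate a query; mode selects the result form."""
    hits = [o for o in objects if matches(o, query)]
    if mode == "count":
        return len(hits)
    if mode == "attributes":
        return [{n: o.get(n) for n in names} for o in hits]
    if mode == "distinct":
        return sorted({o.get(names[0]) for o in hits})
    return hits


genes = [
    {"type": "Gene", "symbol": "TP53", "arm": "p",
     "chromosome": {"type": "Chromosome", "number": "17"}},
    {"type": "Gene", "symbol": "BRCA1", "arm": "q",
     "chromosome": {"type": "Chromosome", "number": "17"}},
    {"type": "Gene", "symbol": "EGFR", "arm": "p",
     "chromosome": {"type": "Chromosome", "number": "7"}},
]

# Genes whose associated Chromosome has number "17".
on_17 = Query("Gene", nested={"chromosome": Query("Chromosome", {"number": "17"})})
```

Calling `run(genes, on_17)` returns the full matching objects, while the `"attributes"`, `"count"`, and `"distinct"` modes correspond to the other three result forms listed above.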

Federated Query Processor Service
Provides a mechanism to perform basic distributed aggregations and joins of queries over multiple data services
  Can be used to express queries against any combination of caGrid data services, since each service uses CQL
Federated queries are expressed using DCQL, an extension to CQL
  Express joins, aggregations, and target data services
  Client API provides a means of expressing DCQL queries
The Federated Query Processor service partitions a DCQL query into queries to the respective data services, carries out joins and aggregations, and compiles the results
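The partition, join, and compile steps can be sketched with in-memory lists standing in for remote data services. This is not DCQL or the Federated Query Processor API: the plan layout, the function names, the service names, and the records are hypothetical.

```python
def query_service(records, criteria):
    """Stand-in for sending one query to a single data service."""
    return [r for r in records if all(r.get(k) == v for k, v in criteria.items())]


def federated_query(services, plan):
    """Partition a plan into per-service queries, join, and compile results.

    plan["target"] = (list of target service names, criteria)
    plan["join"]   = (service name, criteria, target key, foreign key)
    """
    target_services, target_criteria = plan["target"]
    join_service, join_criteria, target_key, foreign_key = plan["join"]

    # Sub-query to the joined service; keep only the join key values.
    joined = query_service(services[join_service], join_criteria)
    keys = {r[foreign_key] for r in joined}

    # Aggregate the same query over every target service, then apply the join.
    results = []
    for name in target_services:
        for record in query_service(services[name], target_criteria):
            if record[target_key] in keys:
                results.append(record)
    return results


services = {
    "microarray@A": [
        {"patient": "P1", "gene": "EGFR", "expression": "high"},
        {"patient": "P2", "gene": "EGFR", "expression": "low"},
    ],
    "microarray@B": [
        {"patient": "P3", "gene": "EGFR", "expression": "high"},
    ],
    "imaging@C": [
        {"patient": "P1", "finding": "lesion"},
        {"patient": "P3", "finding": "clear"},
    ],
}

# High-expression microarray records for patients whose images show a lesion.
plan = {
    "target": (["microarray@A", "microarray@B"], {"expression": "high"}),
    "join": ("imaging@C", {"finding": "lesion"}, "patient", "patient"),
}
results = federated_query(services, plan)
```

The client states the whole question once; the processor decides which service receives which sub-query and assembles the final answer.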

Workflow Service
Provides the capability to describe "orchestrations" of service invocations and data movement
Supports two workflow execution engines
  ActiveBPEL (deprecated in caGrid 1.4)
  Taverna
Coupled with semantic discovery, service metadata, and registration of data type structures in caGrid, provides a powerful framework for analyzing data
  Services can be dynamically discovered, and federated queries can be invoked as part of a workflow

Putting It Together for Example Scenario (diagram)
Location A: Microarray, Protein, Image data
Location B: Microarray, Protein, Image data
Location C: Microarray, Protein, Image data
Location C: Image Analysis
Location D: Image Analysis
Microarray and protein databases at other institutions
caGrid Service Interfaces
caGrid Environment: Registered Object Definitions, Advertisement, Discovery
Client steps: Log on (Grid credentials), Query and Analysis, Workflow