caGrid Service Metadata Scott Oster - Ohio State



Agenda
- Service Overview
- Metadata Infrastructure
- Common Metadata Models
- Portal Metadata Examples
- Metadata-Driven Query Infrastructure
- Lessons Learned

caGrid Community Involvement
- caGrid itself provides no real “data” or “analysis” to caBIG™; it is the enabling infrastructure that allows the community to do so
- The real “value” of the grid comes from bringing this information to the “end user”
- Community members develop end-user applications that consume the resources provided by the grid

What is a Community Provided caGrid Service?
- Silver compatible systems are exposed to the Grid as caGrid Services
- caDSR models are used for all data types, and transported over the grid in a common fashion
- Standardized, common pattern and mechanism for remote access
- Language and implementation technology independent
- Common security infrastructure for authentication and authorization
- Standardized service metadata models and metadata advertisement mechanisms
- Community provided service types:
  - Data Services: expose data to the grid in a unified way
  - Analytical Services: expose analytical operations to the grid

caGrid exposing Silver Systems
- Object Oriented APIs and data resources are developed using Object types and information models registered in the caDSR
- These “silver systems” are grid-enabled by defining a grid service interface that defines the functionality to be exposed to the grid
- The grid service interface uses the same Object types as the existing system, but leverages a platform and language neutral representation (XML) of them
- The grid service implementation maps service invocations to API calls or queries into the existing system

caGrid Metadata Infrastructure Goals
- Support a strongly typed grid
- Syntactic and Semantic interoperability
- Programmatic!
- Smooth transition from Application to Grid and back
- Leverage wealth of existing metadata
- Enable service Advertisement and Discovery

Metadata Services
- Cancer Data Standards Repository (caDSR)
  - caBIG projects register their data models as Common Data Elements (CDEs), which are semantically harmonized and then centrally stored and managed in the caDSR
  - The caDSR grid service provides: model discovery and traversal; caGrid standard metadata generation capabilities
- Enterprise Vocabulary Services (EVS)
  - EVS is a set of services and resources that address the need for controlled vocabulary
  - The EVS grid service provides: query access to the data semantics and controlled vocabulary managed by the EVS
- Global Model Exchange (GME)
  - GME is a DNS-like data definition registry and exchange service that is responsible for storing and linking together data models in the form of XML schema
  - The GME grid service provides: access to the authoritative structural representation of data types on the grid
- Globus Information Services: Index Service
  - The Globus Information Services infrastructure provides a generic framework for aggregation of service metadata, a registry of running Grid services, and a dynamic data-generating and indexing node, suitable for use in a hierarchy or federation of services
  - The Index grid service provides: yellow and white pages for the grid

caGrid Data Description Infrastructure
- Client and service APIs are object oriented, and operate over well-defined and curated data types
- Objects are defined in UML and converted into ISO/IEC Administered Components, which are in turn registered in the Cancer Data Standards Repository (caDSR)
- Object definitions draw from controlled terminology and vocabulary registered in the Enterprise Vocabulary Services (EVS), and their relationships are thus semantically described
- XML serialization of objects adheres to XML schemas registered in the Global Model Exchange (GME)

Advertisement and Discovery Overview
- Advertisement: The caGrid Grid Service Owner composes service metadata describing the service and publishes it to the grid. The service metadata describes properties of the grid service that caGrid users and other grid services may query.
- Discovery: A caGrid Researcher specifies search criteria describing a service. The researcher submits the discovery request to a discovery service, which identifies the services matching the criteria and returns the list to the researcher.

Advertisement and Discovery Process
- All services register their service location and metadata information to an Index Service
- The Index Service subscribes to the standardized metadata and aggregates their contents
- Clients can discover services using a discovery API which facilitates inspection of data types
- Leveraging semantic information in EVS (from which service metadata is drawn), services can be discovered by the semantics of their data types

Service Discovery Process
- Clients formulate a query over the caGrid standard metadata. Examples:
  - “Find me all the services from Ohio State’s Cancer Center”
  - “Which Analytical services take Genes as input?”
  - “Which Data services expose data relating to lung cancer?”
  - “Find me all the services with some metadata mentioning the string ‘macromolecules’”
- This query is sent to the caGrid Index Service, which returns the Address(es) of the services satisfying the query
- The client can then further interrogate the satisfying services by asking for all of their metadata or service descriptions
- Finally, the client invokes the desired services as appropriate
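The advertisement and discovery steps above can be sketched in a few lines of Python. This is only an illustrative in-memory model: ServiceMetadata, IndexService, mentions, the field names, and the example.org addresses are all hypothetical and are not the caGrid discovery API or its metadata schema.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceMetadata:
    """Hypothetical, heavily simplified stand-in for the standard metadata."""
    address: str
    kind: str                      # "data" or "analytical"
    hosting_center: str
    input_types: list = field(default_factory=list)
    concepts: list = field(default_factory=list)  # semantic terms of the data types


class IndexService:
    """Toy in-memory registry standing in for the remote Index Service."""

    def __init__(self):
        self._entries = []

    def register(self, metadata):
        # Advertisement: a service publishes its location and metadata.
        self._entries.append(metadata)

    def discover(self, predicate):
        # Discovery: return the addresses of the services satisfying the query.
        return [m.address for m in self._entries if predicate(m)]


def mentions(metadata, text):
    """True if any descriptive metadata field contains the given string."""
    haystack = [metadata.kind, metadata.hosting_center,
                *metadata.input_types, *metadata.concepts]
    return any(text.lower() in value.lower() for value in haystack)


index = IndexService()
index.register(ServiceMetadata("https://example.org/GeneData", "data",
                               "Ohio State", concepts=["Gene", "lung cancer"]))
index.register(ServiceMetadata("https://example.org/Cluster", "analytical",
                               "Ohio State", input_types=["Gene"]))
index.register(ServiceMetadata("https://example.org/Imaging", "data",
                               "Other Center", concepts=["macromolecules"]))

# Three of the example questions from the slide, expressed as predicates.
from_ohio_state = index.discover(lambda m: m.hosting_center == "Ohio State")
gene_analytics = index.discover(
    lambda m: m.kind == "analytical" and "Gene" in m.input_types)
keyword_hits = index.discover(lambda m: mentions(m, "macromolecules"))
```

In this sketch the index returns only addresses, mirroring the slide: the client would then contact each returned service for its full metadata before invoking it.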

Service Metadata: Core Model
- Common Service Metadata
- Provided by all services
- Details service’s capabilities, operations, contact information, hosting research center
- Service operation’s inputs and outputs defined in terms of structure and semantics extracted from caDSR and EVS
- Majority auto-generated by Introduce

Service Metadata: Service Security
- Service Security Metadata
- Provided by all services
- Details the service’s requirements on communication channel for each operation
- Can be used by client to programmatically negotiate an acceptable means of communication
- For example: Does operation X allow anonymous clients, or are credentials required?
- Auto-generated by Introduce
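A minimal sketch of how a client might use per-operation security metadata to negotiate a channel, as described above. The OperationSecurity fields, the negotiate function, and the operation names are assumptions made for illustration; the real security metadata model is richer and is not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperationSecurity:
    """Hypothetical per-operation requirements on the communication channel."""
    secure_channel_required: bool
    anonymous_allowed: bool


def negotiate(requirement, has_credentials):
    """Return channel settings the client can use, or None if none is acceptable."""
    if not requirement.anonymous_allowed and not has_credentials:
        return None  # credentials are required and the client has none
    return {
        "secure": requirement.secure_channel_required,
        # In this sketch the client authenticates only when the service demands it.
        "authenticated": not requirement.anonymous_allowed,
    }


# Hypothetical security metadata published by one service.
security_metadata = {
    "query": OperationSecurity(secure_channel_required=True, anonymous_allowed=True),
    "submitJob": OperationSecurity(secure_channel_required=True, anonymous_allowed=False),
}

anonymous_query = negotiate(security_metadata["query"], has_credentials=False)
anonymous_submit = negotiate(security_metadata["submitJob"], has_credentials=False)
trusted_submit = negotiate(security_metadata["submitJob"], has_credentials=True)
```

Here anonymous_submit is None, which answers the slide's question for that operation: credentials are required, so the client must obtain them before calling.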

Service Metadata: Data Service
- Data Service Metadata
- Provided by all data services
- Describes the Domain Model being exposed, in terms of a UML model linked to semantics
- Provides information needed to formulate the Object-Oriented Query
- As with common metadata, data types defined in terms of structure and semantics extracted from caDSR and EVS
- Auto-generated by Introduce

caGrid Portal: Service Map
- Google Maps integration enabled by Center Information in metadata
- Recent services and categorization discovered from Index Service

caGrid Portal: Metadata-driven Discovery
- Structured discovery queries can be constructed over the metadata model
- Keyword expansion with information from the controlled terminology available via the EVS

caGrid Portal: Service Details
- Each discovered service’s metadata can be perused
- Federated queries can be constructed graphically from auto-discovered potential semantic joins

Data Service Query Language
- Specifies a target object (result) type and selects the instances which satisfy the specified properties and nested object properties
- Allows path navigation
- Provides logical grouping
- Provides name/predicate/value filtering on properties of objects
- Recursively defined
- Ability to return full Objects, Set of attributes, count of results, or distinct attribute values
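The recursive structure listed above (attribute filters, navigation to nested objects, logical groups, and different result modes) can be illustrated with a small evaluator over in-memory dictionaries. This is a sketch under stated assumptions: the class names, the two supported predicates, the trailing-% handling of LIKE, and the sample data are all hypothetical, and a real data service translates the query to its backend rather than filtering in memory.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Attribute:
    name: str
    predicate: str  # only "EQUAL_TO" and "LIKE" are handled in this sketch
    value: str


@dataclass
class Association:
    role: str                 # property that navigates to the nested object
    constraint: "Constraint"  # recursion: any constraint on the nested object


@dataclass
class Group:
    logic: str     # "AND" or "OR"
    members: list  # nested Attribute, Association, or Group items


Constraint = Union[Attribute, Association, Group]


def matches(obj, constraint):
    """Recursively test one object (a dict here) against a constraint."""
    if isinstance(constraint, Attribute):
        actual = obj.get(constraint.name)
        if actual is None:
            return False
        if constraint.predicate == "LIKE":
            # Assumption: only a trailing % wildcard is supported.
            return actual.startswith(constraint.value.rstrip("%"))
        return actual == constraint.value
    if isinstance(constraint, Association):
        nested = obj.get(constraint.role)
        return nested is not None and matches(nested, constraint.constraint)
    results = [matches(obj, member) for member in constraint.members]
    return all(results) if constraint.logic == "AND" else any(results)


def run_query(objects, constraint, mode="objects", attribute=None):
    """Return full objects, a count, or distinct values of one attribute."""
    selected = [o for o in objects if matches(o, constraint)]
    if mode == "count":
        return len(selected)
    if mode == "distinct":
        return sorted({o[attribute] for o in selected})
    return selected


# Hypothetical target objects with one nested association each.
genes = [
    {"symbol": "BRCA1", "taxon": {"scientificName": "Homo sapiens"}},
    {"symbol": "Trp53", "taxon": {"scientificName": "Mus musculus"}},
    {"symbol": "TP53", "taxon": {"scientificName": "Homo sapiens"}},
]

# Symbol starts with BRCA, OR the associated taxon is Mus musculus.
query = Group("OR", [
    Attribute("symbol", "LIKE", "BRCA%"),
    Association("taxon", Attribute("scientificName", "EQUAL_TO", "Mus musculus")),
])

match_count = run_query(genes, query, mode="count")
distinct_symbols = run_query(genes, query, mode="distinct", attribute="symbol")
```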

Example CQL Query
- Return all Genes with a symbol beginning with BRCA that have an associated Taxon with a scientificName equal to “Homo sapiens”
- Slide callouts on the query: LIKE “BRCA%” and = “Homo sapiens”
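The query document itself did not survive in this transcript. As a hedged reconstruction, the sketch below builds an XML query of the kind the slide describes using Python's standard xml.etree.ElementTree. The namespace URI, the element and attribute names, the predicate spellings, and the gov.nih.nci.cabio.domain class names are assumptions modelled on CQL 1 conventions and should be checked against the actual CQL schema before use.

```python
import xml.etree.ElementTree as ET

# Assumed namespace for CQL 1 query documents (verify against the real schema).
CQL_NS = "http://CQL.caBIG/1/gov.nih.nci.cagrid.CQLQuery"


def build_gene_query():
    """Genes whose symbol is LIKE 'BRCA%' AND whose Taxon is 'Homo sapiens'."""
    query = ET.Element("CQLQuery", {"xmlns": CQL_NS})
    # Target: the object type to return (class name is an assumption).
    target = ET.SubElement(query, "Target",
                           {"name": "gov.nih.nci.cabio.domain.Gene"})
    # Group: both nested constraints must hold.
    group = ET.SubElement(target, "Group", {"logicRelation": "AND"})
    ET.SubElement(group, "Attribute",
                  {"name": "symbol", "predicate": "LIKE", "value": "BRCA%"})
    # Association: navigate from Gene to its Taxon and constrain it.
    taxon = ET.SubElement(group, "Association",
                          {"name": "gov.nih.nci.cabio.domain.Taxon",
                           "roleName": "taxon"})
    ET.SubElement(taxon, "Attribute",
                  {"name": "scientificName", "predicate": "EQUAL_TO",
                   "value": "Homo sapiens"})
    return query


cql_xml = ET.tostring(build_gene_query(), encoding="unicode")
print(cql_xml)
```

The nesting mirrors the slide's wording: the LIKE filter applies to the target Gene, while the equality filter sits inside the association to Taxon.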

Federated Query Processor
- Provides a mechanism to perform basic distributed aggregations and joins of queries over multiple data services
- As caGrid data services all use a uniform query language, CQL, the Federated Query Infrastructure can be used to express queries over any combination of caGrid data services
- Federated queries are expressed with a query language, DCQL, which is an extension to CQL to express such concepts as joins, aggregations, and target services
- Implemented as a stateful grid service, queries may be executed asynchronously and results retrieved at a later time
- Supports secure deployments wherein result ownership is enforced
- Coupled with semantic discovery capabilities of caGrid, provides a powerful framework for data discovery, mining, and integration
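To make the join and result-ownership ideas above concrete, here is a small sketch. It is not DCQL and not the Federated Query Processor API: federated_join, the FederatedQueryProcessor class, the user names, and the sample rows are hypothetical, and the join runs synchronously at submit time, whereas the real service may execute asynchronously.

```python
import uuid


def federated_join(target_rows, foreign_rows, target_attr, foreign_attr):
    """Keep target objects whose attribute matches a value from the foreign service."""
    foreign_values = {row[foreign_attr] for row in foreign_rows}
    return [row for row in target_rows if row[target_attr] in foreign_values]


class FederatedQueryProcessor:
    """Toy stateful processor: results are stored and returned only to their owner."""

    def __init__(self):
        self._results = {}

    def submit(self, owner, target_rows, foreign_rows, target_attr, foreign_attr):
        result_id = str(uuid.uuid4())
        rows = federated_join(target_rows, foreign_rows, target_attr, foreign_attr)
        self._results[result_id] = (owner, rows)
        return result_id  # the client retrieves the results later with this id

    def retrieve(self, requester, result_id):
        owner, rows = self._results[result_id]
        if requester != owner:
            raise PermissionError("results belong to another user")
        return rows


# Hypothetical results already returned by two separate data services.
gene_service_rows = [
    {"symbol": "BRCA1", "geneId": "G1"},
    {"symbol": "TP53", "geneId": "G2"},
]
expression_service_rows = [
    {"geneId": "G1", "tissue": "lung"},
    {"geneId": "G3", "tissue": "lung"},
]

processor = FederatedQueryProcessor()
result_id = processor.submit("alice", gene_service_rows, expression_service_rows,
                             "geneId", "geneId")
joined = processor.retrieve("alice", result_id)
```

Because every data service answers the same query language, the join step only needs the attribute values from each result set, not any knowledge of how each service stores its data.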

Lessons Learned
- Applications leveraging metadata will proliferate…
  - Therefore, having a common “base model” is important
  - Therefore, plan to assert its authenticity
  - Therefore, consider future sources of information, and how to differentiate between them
- You don’t know what your users will want to do tomorrow…
  - Therefore, design the model with extensibility in mind
  - Therefore, have a plan to decide what should be incorporated into a common/standard model and what is “application specific”
- In distributed systems, aggregated information is always out of date…
  - Therefore, only capture information which you can reliably use out of date given your scalability and performance needs