Semantic Grid Group Members: Phạm Đức Đệ Võ Bảo Hùng Hồ Phương
1 Outline Introduction Semantic Web S-OGSA Implementation ( e-Science & myGrid )
2 What is the Grid? The “Grid” ◦ flexible, secure, coordinated resource sharing among dynamic collections of individuals, institutions, and resources, known as virtual organizations.
3 What is the Semantic Grid ? An extension of the current Grid in which information and services are given well-defined and explicitly represented meaning, so that it can be shared and used by humans and machines, better enabling them to work in cooperation
4 Why do we need the Semantic Grid? “It is a truth universally acknowledged, that an application in possession of good middleware, must be in want of meaningful metadata.” -- Prof. C. Goble, Grid Semantic
5 Why do we need the Semantic Grid? Example: suppose a machine’s operating system is described as “SunOS” or “Linux.” To query for a machine that is “Unix” compatible, a user has to either: 1. Explicitly incorporate the Unix compatibility concept into the request requirements by requesting a disjunction of all Unix-variant operating systems, e.g., (OpSys=“SunOS” || OpSys=“Linux”); or 2. Wait for all interesting resources to advertise their operating system as Unix as well as either Linux or SunOS, e.g., (OpSys=“SunOS,” “Unix”), and then express a match as set-membership of the desired Unix value in the OpSys value set, e.g., hasMember(OpSys, “Unix”).
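6 Why do we need the Semantic Grid? Example (cont.): apply semantics. Knowledge base: “SunOS and Linux are types of Unix operating system.” Request: “Need a Unix-compatible OS.” With the knowledge base available, SunOS and Linux machines can be matched to the request without the user listing every Unix variant.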
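The matching idea in this example can be sketched in a few lines of Python. This is a minimal illustration, not the matchmaker of any real Grid middleware: the `SUBCLASS_OF` table, the resource names, and the `OperatingSystem` and `Windows` concepts are assumptions added for the sketch.

```python
# Hypothetical knowledge base: each concept maps to its direct parent concept.
SUBCLASS_OF = {
    "SunOS": "Unix",
    "Linux": "Unix",
    "Unix": "OperatingSystem",
    "Windows": "OperatingSystem",
}


def is_a(concept, target):
    """Return True if `concept` equals `target` or is a descendant of it."""
    while concept is not None:
        if concept == target:
            return True
        concept = SUBCLASS_OF.get(concept)  # climb one level; None at the root
    return False


def match(resources, required_os):
    """Select resources whose advertised OpSys is subsumed by the request."""
    return [name for name, ad in resources.items()
            if is_a(ad["OpSys"], required_os)]


# Hypothetical resource advertisements; each states only its concrete OS.
resources = {
    "node-a": {"OpSys": "SunOS"},
    "node-b": {"OpSys": "Linux"},
    "node-c": {"OpSys": "Windows"},
}

print(match(resources, "Unix"))
```

Neither the request nor the advertisements mention the Unix variants explicitly; the relationship lives only in the knowledge base.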
7 Semantic Web Current Web (WWW) ◦ Is a huge library of interlinked documents that are transferred by computers and presented to people ◦ Anyone can contribute to it ◦ Quality of information or even the persistence of documents cannot be generally guaranteed ◦ Contains a lot of information and knowledge, but machines usually serve only to deliver and present the content of documents describing the knowledge ◦ People have to connect all the sources of relevant information and interpret them themselves. Machines can process the content, but they cannot understand it.
8 Definition The Semantic Web is an extension of the current web in which the semantics of information and services on the web is defined, making it possible for the web to understand and satisfy the requests of people and machines to use the web content. --- Tim Berners-Lee Semantic Web
9 Ontology An ontology is a formal representation of knowledge as a set of concepts within a domain and the relationships between those concepts. It is used to reason about the properties of that domain, and may be used to describe the domain. Implemented with XML, XML Namespaces, XML Schema, RDF, RDF Schema and OWL.
10 Ontology example
11 Semantic Web Architecture (1)
12 Semantic Grid [Figure: diagram with axes “Scale of data and computation” and “Scale of interoperability”, positioning the Classical Web, Semantic Web, Classical Grid, and Semantic Grid. Based on an idea by Norman Paton.]
13 What is Semantic Grid An extension of the Grid Rich metadata is exposed and handled explicitly, shared, and managed via Grid protocols
14 What is Semantic Grid The Semantic Grid uses metadata to describe information in the Grid. Turning information into something more than just a collection of data means understanding the context, format, and significance of the data. Therefore: ◦ Understand information ◦ Discovery and reuse
15 S-OGSA A Grid usually consists of several different services defined by OGSA: ◦ VO management service ◦ Resource discovery and management service ◦ Job management service ◦ Security service ◦ Data management service. S-OGSA should (will) provide metadata and semantic services to those services.
16 S-OGSA The Solution: ◦ Attach semantics to Grid entities. ◦ Bind them together through a semantic binding service. ◦ Normal Grid services can become “semantic” through the semantic binding service.
17 S-OGSA
18 S-OGSA Defined by ◦ Information model: new entities ◦ Capabilities: new functionalities ◦ Mechanisms: how it is delivered [Figure: Model, Capabilities, and Mechanisms connected by the relations provide/consume, expose, and use.]
19 S-OGSA Model
20 S-OGSA Model Grid Entities: Grid resources and services. Knowledge Entities: entities that represent or operate with some form of knowledge (e.g. ontologies, rules, knowledge bases, …). Semantic Bindings: entities that represent the association of a Grid Entity with one or more Knowledge Entities.
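The three kinds of entities in the model can be sketched as plain Python data classes. This is an illustrative reading of the slide, not an S-OGSA schema: the class fields and the example identifiers are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GridEntity:
    """A Grid resource or service (field names are hypothetical)."""
    entity_id: str
    kind: str  # e.g. "resource" or "service"


@dataclass(frozen=True)
class KnowledgeEntity:
    """Something that represents or operates with knowledge."""
    entity_id: str
    form: str  # e.g. "ontology", "rule set", "knowledge base"


@dataclass
class SemanticBinding:
    """Associates one Grid Entity with one or more Knowledge Entities."""
    binding_id: str
    grid_entity: GridEntity
    knowledge_entities: list

    def __post_init__(self):
        # The model requires at least one Knowledge Entity per binding.
        if not self.knowledge_entities:
            raise ValueError("a semantic binding needs at least one knowledge entity")


# Hypothetical example: annotate a compute node with an OS ontology.
node = GridEntity("node-a", "resource")
os_ontology = KnowledgeEntity("os-ontology", "ontology")
binding = SemanticBinding("sb-1", node, [os_ontology])
```

Keeping the binding as its own object, rather than a field on the Grid Entity, mirrors the idea that existing Grid entities do not need to change to acquire semantics.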
21 METADATA as Semantic Annotations S-OGSA Model Example
22 From OGSA to the S-OGSA [Figure: the OGSA capabilities (Optimization, Execution Management, Resource Management, Data, Security, Information Management, Infrastructure Services) serving Application 1 … Application N, extended in Semantic-OGSA with Semantic Provisioning Services: Ontology, Reasoning, Knowledge, Metadata, Annotation, Semantic Binding.]
23 S-OGSA Capabilities [Figure: diagram spanning the Grid, Semantic Grid, and Knowledge areas. Entities shown: Grid Entity, Grid Resource, Grid Service, Semantic Aware Grid Service, Semantic Binding, Knowledge Entity, Knowledge Resource, Knowledge Service, Ontology, Rule set, Ontology Service, Reasoning Service, Semantic Provisioning Service, Semantic Binding Provisioning Service, Annotation Service, Metadata Service. Examples shown: WebMDS, OGSA-DAI, CAS, SAML file, DFDL file, JSDL file. Relations shown: is-a, uses, consume, produce, with multiplicities 0..m and 1..m.]
24 S-OGSA Capabilities Semantic Provisioning Services (SPS): provisioning and management of explicit semantics and its association with Grid entities; creation, storage, update, removal and access of different forms of knowledge and metadata ◦ Knowledge provisioning services: ontology services, reasoning services. ◦ Semantic binding provisioning services: metadata services, annotation services.
25 S-OGSA Capabilities Semantically Aware Grid Services ◦ Are able to consume Semantic Bindings and to take actions based on knowledge and metadata ◦ Sample actions: metadata-aware authorization of a given identity by a VO Manager service; execution of a search request over entries in a semantic resource catalogue; incorporation of a new concept into an ontology hosted by an ontology service
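As a rough illustration of the first sample action, metadata-aware authorization can be pictured as a check of the assertions bound to an identity against a policy. Everything named here (`BINDINGS`, `POLICY`, the property names and identities) is hypothetical and does not come from S-OGSA or any VO manager implementation.

```python
# Hypothetical semantic bindings: identity -> set of (property, value) assertions.
BINDINGS = {
    "CN=alice": {("memberOf", "ProteomicsLab"), ("hasRole", "Researcher")},
    "CN=bob": {("memberOf", "ExternalPartner")},
}

# Hypothetical VO policy: every listed assertion must hold for the action.
POLICY = {
    "submit-job": {("hasRole", "Researcher")},
    "read-catalogue": set(),  # no metadata required
}


def authorize(identity, action):
    """Allow the action only if the identity's metadata satisfies the policy."""
    required = POLICY.get(action)
    if required is None:
        return False  # unknown actions are denied
    # Subset test: all required assertions appear among the identity's bindings.
    return required <= BINDINGS.get(identity, set())


print(authorize("CN=alice", "submit-job"))
print(authorize("CN=bob", "submit-job"))
```

The service stays a normal Grid service; it only becomes "semantically aware" by consulting the bindings before acting.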
26 S-OGSA Mechanisms Treating Knowledge Entities and Semantic Bindings as Grid Resources ◦ Common Information Model (CIM) Resource Model ◦ Grid Entities: class CIM-ManagedElement in the CIM Model. ◦ Knowledge Entities: class S-OGSA-KnowledgeEntity ◦ Semantic Bindings: class S-OGSA-SemanticBinding, the association between a Grid Entity (CIM-ManagedElement) and a Knowledge Entity (S-OGSA-KnowledgeEntity)
27 S-OGSA Mechanisms
28 Access Patterns to Grid Resource Metadata [Figure: a resource-metadata-seeking client, a resource (exposing lifetime, state/properties, and a metadata access port), a Metadata Service, and an Ontology Service. Labelled interactions: Semantic Binding Ids retrieval request; Semantic Binding Ids; metadata retrieval/query request (4); query/retrieval result (5); obtain schema for Semantic Bindings.] Deliver metadata pointers through resource properties. Zero impact on existing protocols. Resource specific.
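The access pattern on this slide (obtain the semantic binding ids from the resource, then query the metadata service with them) can be sketched as follows. The class names, the `semanticBindingIds` property name, and the stored metadata are assumptions for illustration; no real Grid toolkit API is implied.

```python
class GridResource:
    """A resource that exposes metadata pointers as an ordinary property."""

    def __init__(self, name, binding_ids):
        self.name = name
        # Hypothetical property name; the pointers travel with the resource state.
        self._properties = {"semanticBindingIds": list(binding_ids)}

    def get_resource_property(self, key):
        return self._properties[key]


class MetadataService:
    """Stores the semantic bindings themselves, keyed by binding id."""

    def __init__(self, store):
        self._store = dict(store)

    def retrieve(self, binding_id):
        return self._store[binding_id]


def fetch_metadata(resource, metadata_service):
    """Step 1: ask the resource for binding ids. Step 2: resolve each id."""
    binding_ids = resource.get_resource_property("semanticBindingIds")
    return {bid: metadata_service.retrieve(bid) for bid in binding_ids}


service = MetadataService({"sb-1": {"OpSys": "Linux"}, "sb-2": {"owner": "lab-3"}})
resource = GridResource("node-b", ["sb-1", "sb-2"])
print(fetch_metadata(resource, service))
```

Because the resource only hands out identifiers through an existing property mechanism, clients that ignore the property are unaffected.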
29 Outline e-Science myGrid project Introduction myGrid Services and Architecture myGrid workbench
30 e-Science (1) ‘e-Science is about global collaboration in key areas of science, and the next generation of infrastructure that will enable it.’ ‘e-Science will change the dynamic of the way science is undertaken.’ John Taylor, DG of UK OST ‘[The Grid] intends to make access to computing power, scientific data repositories and experimental facilities as easy as the Web makes access to information.’ Tony Blair, 2002
31 e-Science (2) Requirements of e-Science Grid Application Projects determine services required by Grid middleware UK Projects focus more on Grid Data Services than Teraflop/s HPC systems
32 UK Initiative UK e-Science Initiative $180M Programme over 3 years $130M is for Grid Applications in all areas of science and engineering Particle Physics and Astronomy (PPARC) Engineering and Physical Sciences (EPSRC) Biology, Medical and Environmental Science $50M ‘Core Program’ to encourage development of generic ‘industrial strength’ Grid middleware
33 core program e-Science core program Network of e-Science Centres UK e-Science Grid Support for e-Science Applications Grid Network Issues Generic/Industrial Grid Middleware e-Health Grid ‘Grand Challenges’ Outreach/International Activities
34 Grid UK e-Science Grid
35 UK e-Science Grid All e-Science Centres donating resources plus four JCSR funded dedicated compute/data clusters – Supercomputers, clusters, storage, facilities All Centres run same Grid Software – Starting point is Globus 2 and Condor: Storage Resource Broker (SRB) being evaluated
36 Some UK e-Science Projects (1) Climateprediction.com (NERC) Oceanographic Grid (NERC) Molecular Environmental Grid (NERC) NERC DataGrid (NERC + OST-CP) Biomolecular Grid (BBSRC) Proteome Annotation Pipeline (BBSRC) High-Throughput Structural Biology (BBSRC) Global Biodiversity (BBSRC) GRIDPP (PPARC) ASTROGRID (PPARC) Comb-e-Chem (EPSRC) DAME (EPSRC) DiscoveryNet (EPSRC) GEODISE (EPSRC) myGrid (EPSRC) RealityGrid (EPSRC)
37 Some UK e-Science Projects (2) Biology of Ageing (BBSRC + MRC) Sequence and Structure Data (MRC) Molecular Genetics (MRC) Cancer Management (MRC + PPARC) Clinical e-Science Framework (MRC) Neuroinformatics Modeling Tools (MRC) Interdisciplinary Research Collaborations ‘Grand Challenge’ ◦ Advanced Knowledge Technologies ◦ Medical Images and Signals ◦ Equator ◦ DIRC (Dependability
38 Support for e-Science Projects Grid Support Centre in operation ◦ supported Grid middleware & users ◦ see National e-Science Institute ◦ Research Seminars ◦ Training Programme ◦ See National Certificate Authority ◦ Issue digital certificates for projects ◦ Goal is ‘single sign-on'
39 myGrid project
40 myGrid (1) The goal is to design, develop and demonstrate higher-level functionalities over an existing Grid infrastructure. An e-science research project. Develops open-source, high-level, service-based middleware. Uses database and computational analysis. The project is pioneering the use of semantic web technology to manage annotation, ontologies and semantic discovery.
41 myGrid (2) The ultimate goal is to supply a collection of services as a toolkit for building end applications.
42 Outline 1. Introduction 2. myGrid Services and Architecture Tools Forming and executing experiments Semantic service Supporting the e-science scientific method Applications and application services 3. myGrid workbench
43 myGrid service and architecture (1) The myGrid middleware framework employs a service-based architecture, first prototyped with web services but with an anticipated migration to OGSA. The primary services to support routine in silico experiments fall into four categories: services that are the tools that will constitute the experiments; services for forming and executing experiments; semantic services; services for supporting the e-science scientific method.
44 myGrid service and architecture (2)
45 Tools (1) Development of domain services that can deliver data and computational analysis. Bioinformatics services provide access to bioinformatics tools and data: database retrieval and analysis tools, including the EMBOSS application suite of over eight analysis tools, MEDLINE, SRS, OMIM, NCBI and WU BLAST sequence alignment tools, … Soaplab is a connector for command-line based systems and provides a universal glue to web services.
46 Tools (2) Text extraction services. AMBIT is a system for Acquiring Medical and Biological Information. AMBIT provides an information extraction service based on natural language processing.
47 myGrid service and architecture
48 Forming & executing experiments (1) FreeFluo workflow enactment engine: can handle WSDL-based web service invocation; supports two XML workflow languages, IBM’s Web Services Flow Language and XScufl. OGSA distributed query processor: distributed query processing; the query language is initially OQL; the initial prototype is to be released in August 2003.
49 Forming & executing experiments (2) myGrid information repository (mIR): an information model tailored to e-science. Includes experiment data and provenance records of its origin. Stores workflow specifications and information about people and projects. Metadata storage: o Annotations are stored as RDF triples in a triple store, such as the Jena Semantic Web Toolkit o Annotation is a key tool used to link related objects
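How triple-shaped annotations link related objects can be shown with plain Python tuples. This sketch deliberately avoids any RDF library; the identifiers, predicates, and the `query` helper are hypothetical and are not the mIR schema or the Jena API.

```python
# Hypothetical annotations as (subject, predicate, object) triples.
ANNOTATIONS = {
    ("data:result-7", "derivedFrom", "workflow:wf-1"),
    ("workflow:wf-1", "designedBy", "person:p-1"),
    ("person:p-1", "memberOf", "project:graves"),
    ("data:result-7", "partOf", "project:graves"),
}


def query(triples, subject=None, predicate=None, obj=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    )


# Everything annotated as related to the project, whatever the relation.
related = query(ANNOTATIONS, obj="project:graves")
print(related)
```

Because each annotation is an independent statement, new links between data, workflows, people, and projects can be added without changing the stored objects.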
50 Forming & executing experiments (3) myGrid information repository: an organisation has a single mIR. OGSA-DAI services support local and remote access to the repository. The first version of the mIR was built primarily over the relational database product DB2. The second version adds a federated architecture, using mediators and making extensive use of annotation and shared identifiers.
51 myGrid service and architecture (2)
52 discovery & metadata management (1) Registries and registry views. Service descriptions are centrally published. The idea of a registry is extended in three ways: o Personalised views over distributed registries o Extensible metadata storage o Additional semantic descriptions o DAML+OIL semantic descriptions o Semantic descriptions of workflows have been used to discover relevant workflows
53 discovery & metadata management (2) Discovery components To enable more sophisticated semantic discovery Indexing and searches over DAML+OIL A service browser module with the workbench Annotation components myGrid is using semantic web annotation tools
54 discovery & metadata management (3) Ontology services To provide a single point of reference for concepts and to support description logic reasoning of concept expressions DAML+OIL
55 myGrid service and architecture (2)
56 Service for supporting e-science (1) Notification services: used when new or updated data or analytical software becomes available. A notification service mediates asynchronous interaction between services. Services may register the types of notification events they are interested in. Notifications can be used to automatically trigger workflows, and can be defined with ontological descriptions in metadata.
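The register-then-notify interaction described here can be sketched as a small publish/subscribe class. This is an in-process illustration only; the class, the event type names, and the workflow name are assumptions, not the myGrid notification service interface.

```python
class NotificationService:
    """Mediates between publishers of events and registered subscribers."""

    def __init__(self):
        self._subscribers = {}

    def register(self, event_type, callback):
        """Register interest in one type of notification event."""
        self._subscribers.setdefault(event_type, []).append(callback)

    def publish(self, event_type, payload):
        """Deliver the event to every subscriber; return how many were notified."""
        callbacks = self._subscribers.get(event_type, [])
        for callback in callbacks:
            callback(payload)
        return len(callbacks)


triggered = []


def rerun_workflow(payload):
    # Hypothetical reaction: record that a workflow should run on the new data.
    triggered.append(("annotate-gene-workflow", payload["gene"]))


service = NotificationService()
service.register("new-data", rerun_workflow)
service.publish("new-data", {"gene": "gene-X"})
print(triggered)
```

The publisher never calls the workflow directly, which is what lets new data trigger workflows automatically without the data source knowing about them.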
57 Service for supporting e-science (2) Provenance management: provenance information is used to determine whether a notification service needs to be re-run. FreeFluo generates provenance logs in the form of XML files, which are stored in the mIR. Provenance attributes: start time, end time and service instance.
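A provenance record carrying the three attributes listed here can be sketched with the Python standard library. The XML element names and the `run_with_provenance` helper are assumptions for illustration; the actual FreeFluo log format is not shown in the slides.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone


def run_with_provenance(service_instance, func, *args):
    """Invoke `func` and return its result plus an XML provenance record."""
    start = datetime.now(timezone.utc)
    result = func(*args)
    end = datetime.now(timezone.utc)

    # Hypothetical element names for the attributes named on the slide.
    record = ET.Element("provenance")
    ET.SubElement(record, "serviceInstance").text = service_instance
    ET.SubElement(record, "startTime").text = start.isoformat()
    ET.SubElement(record, "endTime").text = end.isoformat()
    return result, ET.tostring(record, encoding="unicode")


# Stand-in for a service call: measure the length of a sequence.
result, log = run_with_provenance("sequence-length-1", len, "ACGT")
print(result)
print(log)
```

In this sketch the record is returned as a string; storing it alongside the result is what would let a later step decide whether a re-run is needed.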
58 Service for supporting e-science (3) Personalisation opportunities: different users can be provided with appropriate views of the mIR. The registry view gives a user’s perspective over the services.
60 Applications and application services Applications can interact with services directly or via a Gateway The Gateway provides an optional unified single point of programmatic access to the whole system To create client software
61 Outline 1. Introduction 2. myGrid Services and Architecture Tools Forming and executing experiments Semantic service Supporting the e-science scientific method Applications and application services 3. myGrid workbench
62 myGrid workbench Built on the NetBeans platform and Java. Example application: Graves’ Disease is caused by an autoimmune response against the thyroid, causing hyperthyroidism.
63 Graves Disease Autoimmune disease of the thyroid (1)
64 Graves Disease Autoimmune disease of the thyroid (2) As soon as the identity of the relevant genes is known the myGrid workbench is used to run workflows that gather information about those genes, help design new molecular biology experiments to focus on the genes of interest, and to predict the 3D structure of the protein products of the genes
66 Graves Disease Autoimmune disease of the thyroid (3)
67 Graves Disease Autoimmune disease of the thyroid (4)
68 Graves Disease Autoimmune disease of the thyroid (5) (1) The notification service informs the user, via a notification client in the workbench, that new data has been added to the mIR, which can be browsed in the workbench. (2) In this case it is the identity of a new gene with changed expression in Graves’ Disease.
69 Graves Disease Autoimmune disease of the thyroid (6) (3) The user can then discover workflows via a wizard in the workbench. The wizard itself makes use of a semantic find service, which finds relevant services and workflows in the myGrid registry using description logic reasoning over associated semantic descriptions. A registry browser is also available in the workbench to allow the user to browse more freely for a workflow or service using a hierarchical categorisation based on each individual semantic description (4).
70 Graves Disease Autoimmune disease of the thyroid (7) If an appropriate workflow does not exist, a new one can be created in the Taverna editor (5). The workflow and associated data are submitted to the FreeFluo enactor. The enactor provides a detailed provenance record, stored in the mIR, describing what was done, with what services, and when. This can also be viewed within the workbench (6).
THANK YOU FOR YOUR ATTENTION