Information Federation in Grid Information Services Mehmet S. Aktas Advisor: Prof. Geoffrey C. Fox Ph.D. Defense Exam May 3, 2007.



Talk Outline
- Use Cases and Challenges
- Research Issues
- Architecture
- Hybrid Grid Information Service
- Performance Evaluation
- Conclusions
- Contributions and Future Research Directions

Introduction
- Grid Information Services in Service Oriented Architectures
- 1) Large-scale, relatively static metadata, as in a catalog of all the world's services
  - Interaction-independent, slowly varying metadata
- 2) Small-scale, highly dynamic metadata, as in dynamic workflows for sensor integration and collaboration
  - Interaction-dependent, dynamic metadata
  - Dynamic Grid/Web Service Collections*
    - Dynamically assembled, relatively small number of services (sub-grid)
    - Gathered at any one time to support a specific task
    - Generate dynamic metadata and have a limited lifetime
[*] [ICCSE-05] Managing Dynamic Metadata as Context

Motivating Use Cases
- Geophysical Data Grids - CGL
  - Service Oriented Architecture for Geographical Information Systems supporting real-time data grids
- Pattern Informatics (PI) - UC Davis
  - Earthquake forecasting code developed by Prof. John Rundle (UC Davis) and collaborators; uses seismic archives.
- Interdependent Energy Infrastructure Simulation System (IEISS) - LANL
  - Models infrastructure networks (e.g. electric power systems and natural gas pipelines) and simulates their physical behavior and the interdependencies between systems.
- eSports System - CGL
  - Annotative collaboration application. Supports archiving, replay, and annotation of real-time video-conferencing streams.

A General Geographical Information System Grid Orchestration Scenario*
[*] Building and Applying Geographical Information System Grids, Special Issue on Geographical Information Systems and Grids based on GGF15 workshop, Concurrency and Computation: Practice and Experience

Background
- Specifications for interaction-independent metadata
  - UDDI Specification
  - Glue Specification
  - ebXML Specification
  - Web Registry Service Specification
- Specifications for interaction-dependent metadata
  - Point-to-point approach: Web Service Resource Framework (WSRF) Specification
  - Third-party approach: WS-Context Specification

Challenges
- Standardization and Unification Issues
  - Customized Grid Information Services
  - Fat clients
- Performance and Centralization Issues
  - Low performance
  - Low fault tolerance
- UDDI Specification Issues
  - Lack of an up-to-date, metadata-oriented registry
  - Lack of domain-specific metadata management
- WS-Context Specification Issues
  - Limited data model and communication protocol

Research Issues I
- Unification
  - How to combine different information services?
- Federation
  - How to federate different information services?
- Flexibility
  - How to accommodate a broad range of specific application domains?
- Interoperability
  - How to facilitate connection with a wide range of information service clients?

Research Issues II
- Performance
  - How to provide efficient information management strategies?
    - High-performance, scalable in-memory storage
    - Efficient request distribution
    - Adaptation to instantaneous client-demand changes
- Fault-tolerance
  - How to provide efficient replica-content placement strategies?
- Consistency
  - How to provide efficient consistency enforcement strategies?

Hybrid Grid Information Service
- Unification
- Federation
- Unified Schema
- Query/Publish API
- Flexibility
- Interoperability
- Extended UDDI
- WS-Context
- Glue
- …

[Figure: UDDI instance, WS-Context instance, and unified schema instance]


Support for interaction-independent metadata: Extended UDDI Service
- There are other extensions of UDDI
- Supports different types of metadata
  - User-defined metadata
  - Functional metadata
- Enables advanced query capabilities
  - Geo-spatial, metadata-oriented, domain-independent queries
- Provides additional capabilities
  - Up-to-date service registry information (leasing)
  - Dynamic aggregation of capabilities of services, e.g. geospatial capabilities
[GGF16-Semantic Grid Workshop] Web Service Information Systems and Applications
[SKG06 – IEEE Proceedings] XML Metadata Services
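
The leasing idea mentioned above can be sketched as a registry whose entries expire unless the publisher renews them. This is a minimal illustration, not the Extended UDDI implementation: the class name `LeasedRegistry`, its methods, and the injectable clock are all assumptions made for the example.

```python
import time


class LeasedRegistry:
    """Toy registry in which every entry carries a lease (hypothetical API)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable clock, handy for testing
        self._entries = {}       # key -> (value, expiry time)

    def publish(self, key, value, lease_seconds):
        """Store or overwrite an entry that stays valid for lease_seconds."""
        self._entries[key] = (value, self._clock() + lease_seconds)

    def lookup(self, key):
        """Return the value if its lease is still valid, otherwise purge it."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[key]   # stale: the publisher stopped renewing
            return None
        return value

    def renew(self, key, lease_seconds):
        """Extend the lease of a live entry; return False if it has expired."""
        value = self.lookup(key)
        if value is None:
            return False
        self._entries[key] = (value, self._clock() + lease_seconds)
        return True
```

With this scheme a service that disappears without deregistering simply ages out of the registry, which is one way to keep the published information up to date.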

Support for interaction-dependent metadata: WS-Context Service
- OASIS Standard
  - Context Manager Service
  - Data model and communication protocol
- Supports Dynamic Web Service Collections
  - Distributed state-based systems, e.g. workflow-style grids
  - Session metadata management, e.g. real-time replay and session-failure recovery capabilities
- Provides various capabilities
  - Notification capability
  - Up-to-date metadata registry (leasing)
[SKG05 – IEEE Proceedings] Information Services for Dynamically Assembled Semantic Grids
[FGCS] Fault Tolerant High Performance Information Services for Dynamic Collections of Grid and Web Services

Support for federated service metadata: Information Federation
- Federating Grid Information Services
  - Unified Schema and communication protocol
  - Extended UDDI, WS-Context and Glue Schemas
- Approach taken for Unified Schema [Schema Integration]
  - Schema Matching: identify overlapping information in two given schemas, S1 and S2
  - Schema Merging: use the identified overlapping information to guide the merge of S1 and S2
- Communication protocol
  - Publish: save_ (create, update), delete_; e.g. save_service, delete_service
  - Inquiry: find_, get_; e.g. find_metadata, get_metadataDetail
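
The publish/inquiry naming pattern above can be illustrated with a toy in-memory registry. Only the four operation names come from the slide; the class `ToyRegistry`, the argument lists, and the return values are assumptions for this sketch and do not reflect the actual service interface.

```python
import uuid


class ToyRegistry:
    """Hypothetical registry; only the operation names come from the slide."""

    def __init__(self):
        self._services = {}   # service key -> {"name": ..., "metadata": {...}}

    # Publish operations: save_* creates or updates, delete_* removes
    def save_service(self, name, metadata, service_key=None):
        key = service_key or str(uuid.uuid4())
        self._services[key] = {"name": name, "metadata": dict(metadata)}
        return key

    def delete_service(self, service_key):
        return self._services.pop(service_key, None) is not None

    # Inquiry operations: find_* returns matching keys, get_* returns details
    def find_metadata(self, name, value):
        return sorted(
            key for key, svc in self._services.items()
            if svc["metadata"].get(name) == value
        )

    def get_metadataDetail(self, service_key):
        svc = self._services.get(service_key)
        return None if svc is None else dict(svc["metadata"])
```

Calling `save_service` twice with the same key acts as an update, which mirrors the "save_ (create, update)" convention on the slide.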

Schema Matching: Identifying Matching Concepts
Extended UDDI concepts:
- serviceAttributeEntity: information about metadata associated to services
GLUE concepts (Site, Service, ComputingElement, StorageElement, ServiceData):
- site: information about a site where services, computing elements and storage elements are aggregated
- service: all information about a service
- serviceData: information associated to a service
Matches (Extended UDDI, cardinality, GLUE):
- ExtUDDI.businessEntity 1:N GLUE.site
- ExtUDDI.businessService 1:1 GLUE.service
- ExtUDDI.serviceAttributeEntity 1:1 GLUE.serviceData

Schema Merging: Unifying Schemas
Concepts:
- metadata: information about metadata associated to a service
- bindingTemplate: technical information about a service point
- tModel: description of specifications for services or taxonomies
- publisherAssertions: defines relationships between two business entities
- computingElement: all information required to manage computing resources
- storageElement: all information required to manage storage resources
- businessEntity: information about the party who publishes information about entities
- service: all information about a service
- site: all information about a concept to aggregate services and resources
Relationship labels in the diagram:
- site contains one to n computing elements
- has references to
- site contains one to n services
- site contains one to n storage elements
- business contains one to n services
- has references to
- service contains one to n metadata
- service contains one to n technical information
- business contains one to n sites
Example mappings:
Extended UDDI            | Unified Schema              | GLUE
ExtUDDI.businessEntity   | ExtUDDI&GLUE.businessEntity | -
-                        | ExtUDDI&GLUE.site           | GLUE.site
ExtUDDI.businessService  | ExtUDDI&GLUE.service        | GLUE.service
ExtUDDI.serviceAttribute | ExtUDDI.metadata            | GLUE.serviceData
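
A rough sketch of how matching results could drive a merge: 1:1 matches collapse into a single unified concept, while unmatched and 1:N concepts are carried over unchanged. The three correspondences come from the matching slide; the function `merge_schemas`, the `UNIFIED_NAMES` lookup table, and the list-of-names schema representation are assumptions made for illustration, not the procedure used in the thesis.

```python
# Correspondences from the matching slide: (ExtUDDI concept, GLUE concept, cardinality)
MATCHES = [
    ("businessEntity", "site", "1:N"),
    ("businessService", "service", "1:1"),
    ("serviceAttributeEntity", "serviceData", "1:1"),
]

# Unified names chosen for the 1:1 matches (assumed lookup table)
UNIFIED_NAMES = {
    ("businessService", "service"): "service",
    ("serviceAttributeEntity", "serviceData"): "metadata",
}


def merge_schemas(ext_uddi, glue, matches=MATCHES, names=UNIFIED_NAMES):
    """Merge two lists of concept names into a unified schema.

    Returns a dict mapping each unified concept to its source concepts.
    """
    unified = {}
    merged_ext, merged_glue = set(), set()
    for ext_c, glue_c, cardinality in matches:
        if cardinality == "1:1" and ext_c in ext_uddi and glue_c in glue:
            unified[names[(ext_c, glue_c)]] = [f"ExtUDDI.{ext_c}", f"GLUE.{glue_c}"]
            merged_ext.add(ext_c)
            merged_glue.add(glue_c)
    # Concepts without a 1:1 partner keep their own name in the unified schema
    for concept in ext_uddi:
        if concept not in merged_ext:
            unified.setdefault(concept, []).append(f"ExtUDDI.{concept}")
    for concept in glue:
        if concept not in merged_glue:
            unified.setdefault(concept, []).append(f"GLUE.{concept}")
    return unified
```

For example, merging `["businessEntity", "businessService", "serviceAttributeEntity"]` with `["site", "service", "serviceData"]` yields the four unified concepts `service`, `metadata`, `businessEntity`, and `site`.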

Key Design Features
- In-memory storage
  - High-performance metadata access/storage
- Access distribution
  - Redirecting client requests to an appropriate replica server
- Replica content placement for performance
  - Dynamic replication: moving/replicating metadata to where they are demanded
- Replica content placement for fault-tolerance
  - Permanent replication: replicating data on an appropriate replica server
- Consistency enforcement
  - Ensuring that all replicas of a data item are the same

In-Memory Storage
- Light-weight implementation of JavaSpaces
  - Data sharing, associative lookup
- Integrated in-memory storage capability
  - Ex: UDDI-type, WS-Context-type
  - Today's servers are capable of holding such small-size metadata in memory.
- Persistency
  - Newly inserted/updated metadata is backed up into the appropriate information service back-end.
  - If the physical memory is wiped out, at bootstrap the metadata is loaded from the last database backup into the in-memory storage.
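
Associative lookup plus backup and restore can be sketched with a small tuple-space-like store. This is not the JavaSpaces API and not the thesis implementation: the class `MetadataSpace`, the dict-based entries, the `None`-as-wildcard templates, and the JSON file standing in for the database back-end are all assumptions for the example.

```python
import json
import os
import tempfile


class MetadataSpace:
    """Toy tuple-space-like store with template matching (hypothetical)."""

    def __init__(self):
        self._entries = []

    def write(self, entry):
        self._entries.append(dict(entry))

    @staticmethod
    def _matches(entry, template):
        # A template field matches when it is None (wildcard) or equal
        return all(v is None or entry.get(k) == v for k, v in template.items())

    def read(self, template):
        """Return a copy of the first matching entry, leaving it in the space."""
        for entry in self._entries:
            if self._matches(entry, template):
                return dict(entry)
        return None

    def take(self, template):
        """Remove and return the first matching entry."""
        for i, entry in enumerate(self._entries):
            if self._matches(entry, template):
                return self._entries.pop(i)
        return None

    def backup(self, path):
        with open(path, "w", encoding="utf-8") as f:
            json.dump(self._entries, f)

    @classmethod
    def restore(cls, path):
        space = cls()
        with open(path, encoding="utf-8") as f:
            space._entries = json.load(f)
        return space


# Usage: write entries, back them up, then rebuild the space as at bootstrap
space = MetadataSpace()
space.write({"type": "WS-Context", "session": "s1", "value": "state-a"})
space.write({"type": "UDDI", "service": "wms", "value": "http://example.org/wms"})
backup_path = os.path.join(tempfile.mkdtemp(), "backup.json")
space.backup(backup_path)
recovered = MetadataSpace.restore(backup_path)
```

In a real deployment the backup would run periodically rather than on demand; the experiments later in the talk use a backup frequency of every 10 seconds.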

Experiment Results

Experiment Results

Message rate scalability investigation results

Message rate scalability investigation results

Access Distribution and Dynamic Replication
- Broadcast-based request dissemination
  - Pub-sub system for message broadcast
  - Requests are broadcast only to those servers that can answer
  - No need to keep track of metadata locations
- Replica-content placement for performance
  - Popular copies are moved/replicated to where they are demanded
  - Dynamic migration/replication algorithm*
  - Self-adaptation to changing client demands
[*] Rabinovich et al., A Dynamic Object Replication and Migration Protocol for an Internet Hosting Service, Proceedings of the 19th IEEE International Conference on Distributed Computing Systems,
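
Broadcast-based request dissemination can be sketched with a toy publish-subscribe broker. In this simplified version every subscribed server sees the request and only the holders of the metadata reply, so the client never tracks where metadata lives; the routing in the real system may be narrower. The classes `Broker` and `ReplicaServer`, the topic name, and the server names are hypothetical.

```python
class Broker:
    """Toy pub-sub broker: delivers each message to all topic subscribers."""

    def __init__(self):
        self._subscribers = {}   # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self._subscribers.setdefault(topic, []).append(callback)

    def publish(self, topic, message):
        # Collect the non-None replies of all subscribers, in subscription order
        replies = []
        for callback in self._subscribers.get(topic, []):
            reply = callback(message)
            if reply is not None:
                replies.append(reply)
        return replies


class ReplicaServer:
    """Server holding a subset of the metadata (hypothetical)."""

    def __init__(self, name, broker, store):
        self.name = name
        self.store = dict(store)
        broker.subscribe("metadata/request", self.on_request)

    def on_request(self, key):
        # Only a server that holds the metadata answers; others stay silent
        if key in self.store:
            return (self.name, self.store[key])
        return None


def query(broker, key):
    """Broadcast a request and return the first reply, or None if nobody answers."""
    replies = broker.publish("metadata/request", key)
    return replies[0] if replies else None
```

Because servers join by subscribing, adding or removing a replica requires no change on the client side.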

Access Distribution Experiment: Benchmark Methodology
Time = T1 + T2 + T3
Simulation parameters:
- Backup frequency: every 10 seconds
- Message size: 2.7 Kbytes
Test cases: one-broker case, two-broker case

Experiment Results
- The overhead of access distribution is only a few milliseconds.
- Continuous access distribution operation does not degrade performance.

Experiment Results
- The overhead of distribution remains the same regardless of the network distances between nodes.

Dynamic Replication Performance Experiment: Benchmark Methodology
Time = T1 + T2 + T3
Simulation parameters:
- Message size / message rate: 2.7 Kbytes / 10 msg/sec
- Replication decision frequency: every 100 seconds
- Deletion / replication threshold: 0.03 request/second and 0.18 request/second
- Registry size: 1000 metadata in Indianapolis
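
To make the thresholds concrete, here is a simplified decision rule evaluated once per decision period. It is not the Rabinovich et al. protocol, only a threshold check in the same spirit: the function `decide`, the per-site request counts, and the choice of the busiest site as replication target are assumptions. The three constants are the simulation parameters listed above.

```python
DELETION_THRESHOLD = 0.03      # requests/second, from the simulation parameters
REPLICATION_THRESHOLD = 0.18   # requests/second, from the simulation parameters
DECISION_PERIOD = 100          # seconds between decisions, from the parameters


def decide(request_counts, period=DECISION_PERIOD):
    """Decide what to do with one replica after a decision period.

    request_counts maps a requesting site to the number of requests it sent
    for this replica during the period (hypothetical bookkeeping).
    Returns ("delete", None), ("replicate", site) or ("keep", None).
    """
    total_rate = sum(request_counts.values()) / period
    if total_rate < DELETION_THRESHOLD:
        return ("delete", None)
    # Replicate toward the busiest site if its demand alone exceeds the threshold
    site, count = max(request_counts.items(), key=lambda item: item[1])
    if count / period > REPLICATION_THRESHOLD:
        return ("replicate", site)
    return ("keep", None)
```

With a 100-second period, a replica that received only 2 requests (0.02 request/second) would be dropped, while a site that sent 30 requests (0.30 request/second) would receive its own copy.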

Experiment Results
- The decrease in average latency shows that the algorithm manages to move replica copies to where they are demanded.

Replication and Consistency
- Permanent replication for fault tolerance
  - Each node keeps information about other servers
  - Replica server(s) selection
    - Load and proximity metrics
    - Selection algorithm by Rabinovich et al.
- Unicast-based replica-content placement
- Primary-copy approach
  - Updates are unicast to the primary-copy holder
  - Updates are broadcast by the primary-copy holder to a) permanent-copy holding servers and b) applications with high consistency requirements
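
The primary-copy update path described above can be sketched as follows. The classes `Replica` and `PrimaryCopy`, the callback-based notification of applications, and the version counter used to discard stale updates are assumptions added for illustration; the slide only states who sends updates to whom.

```python
class Replica:
    """Server holding a permanent copy (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.data = {}   # key -> (value, version)

    def apply(self, key, value, version):
        # Ignore stale updates so a replayed message cannot roll a value back
        current = self.data.get(key)
        if current is None or version > current[1]:
            self.data[key] = (value, version)


class PrimaryCopy(Replica):
    """Holder of the primary copy; orders updates and pushes them out."""

    def __init__(self, name, permanent_replicas, subscribers=()):
        super().__init__(name)
        self.permanent_replicas = list(permanent_replicas)
        self.subscribers = list(subscribers)   # callbacks of strict-consistency apps
        self._version = 0

    def update(self, key, value):
        # Step 1: the update arrives by unicast at the primary-copy holder
        self._version += 1
        self.apply(key, value, self._version)
        # Step 2a: the primary pushes it to the permanent-copy holders
        for replica in self.permanent_replicas:
            replica.apply(key, value, self._version)
        # Step 2b: and to applications with high consistency requirements
        for notify in self.subscribers:
            notify(key, value, self._version)
        return self._version
```

Routing every write through one holder gives all replicas the same update order, at the cost of making that holder a bottleneck for writes.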

Fault-tolerance Experiment: Benchmark Methodology
Time = T1 + T2 + T3
Simulation parameters:
- Backup frequency: every 10 seconds
- Message size: 2.7 Kbytes
Test cases: one-broker case, two-broker case

Experiment Results
- The overhead of replica-content placement is only a few milliseconds.
- The overhead of replica-content placement increases on the order of milliseconds as the fault-tolerance level increases.

Consistency Enforcement Experiment: Benchmark Methodology
Time = T1 + T2 + T3
Simulation parameters:
- Backup frequency: every 10 seconds
- Message size: 2.7 Kbytes
Test cases: one-broker case, two-broker case

Experiment Results
- The overhead of consistency enforcement is a few milliseconds.
- The cost of consistency enforcement remains the same regardless of the distribution of the network nodes.

Contributions
- Systems Research
  - Hybrid Grid Information Service Architecture
    - Unification, federation and interoperability of grid information services
    - Strategies for high-performance, scalable in-memory storage
    - Strategies for efficient distribution, replica-content placement, and consistency enforcement by utilizing pub-sub based messaging schemes
    - Self-adaptation to changing client demands
  - Extensions to semantics of UDDI and WS-Context Web Service Specifications
  - Detailed evaluation of the system components and algorithms
- Systems Software
  - An implementation of Extended UDDI Specification
    - Geographical Information Systems-specific, metadata-oriented
  - An implementation of WS-Context Specification
    - Session metadata management for collaboration grids, distributed state management for workflow-style grids
  - An implementation of Hybrid Grid Information Service Architecture

Future Research Directions
- Use the proposed approach to solve the OGF Grid Interoperation Now (GIN) problem for information services
- Investigate an information security mechanism for the decentralized Hybrid Service
  - Example motivating application case: Pattern Informatics application
- Apply the Hybrid Service to a broader range of application use cases
  - Web 2.0/Folksonomy information services