Managing Metadata in Service Architectures Mehmet S. Aktas Advisor: Prof. Geoffrey C. Fox.

Presentation transcript:

Managing Metadata in Service Architectures Mehmet S. Aktas Advisor: Prof. Geoffrey C. Fox

Outline
- Introduction
- Motivation
- Requirements
- Research Issues
- Architecture
- Performance Evaluation
- Conclusions
- Contribution

Context as Service Metadata
- Context
  - Interaction-independent: slowly varying, quasi-static service metadata
  - Interaction-dependent: dynamically generated metadata resulting from the interaction of services; information associated with a single service, a session (service activity), or both
- Dynamic Grid/Web Service Collections
  - Loosely assembled collections of services
  - Assembled to support a specific task
  - Generate metadata and have a limited lifetime

Motivating Cases
- Multimedia Collaboration Grids
  - Global Multimedia Collaboration System (Global MMCS)
  - Widely distributed services, session service metadata, session metadata, stream-specific metadata
  - Mostly read-only
- Workflow-style applications in Geographical Information System/Sensor Grids
  - Pattern Informatics (PI), UC Davis; Interdependent Energy Infrastructure Simulation System (IEISS), LANL
  - Widely distributed services
  - Conversation metadata, transient
  - Multiple writers

Problems with Grid Information Services
- Standardization and Unification Issues
  - Customized Grid Information Services
  - Differences in application requirements
  - Thick clients
- Performance and Centralization Issues
  - Low performance
  - Low fault tolerance
- Dynamic Metadata Management Issues
  - Point-to-point service communication approaches

Requirements for Grid Information Services
- Greater Interoperability
  - Unified platform for communication
  - Shared communication protocol
  - Thin clients
- Greater Capabilities
  - High performance
  - Fault-tolerant
- Dynamic Grid/Web Service Collections
  - Distributed state management
  - Collaboration session management

Research Issues I
- Unification of Grid Information Services
  - How to combine different information services?
- Federation of Grid Information Services
  - What is a common data model and communication protocol?
- Flexibility and extensibility
  - Accommodating a broad range of application domains: read-dominated, read/write-dominated
  - Ability to add/support more information services
- Interoperability
  - Being compatible with a wide range of applications

Research Issues II
- Performance
  - Efficient centralized metadata management strategies: high performance and persistency
  - Efficient decentralized metadata management strategies
    - Efficient request distribution strategies
    - Adaptation to instantaneous client-demand changes
- Fault-tolerance
  - Efficient replica-content creation strategies
- Consistency
  - How to provide consistency across the copies of the same data?

Hybrid Grid Information Service
[Figure: architecture diagram. Labels: Unification; Uniform Access; Extensibility; Interoperability; Extended UDDI; WS-Context; Federation; Unified Schema; Query/Publish XML API]

[Figure: UDDI instance; WS-Context instance; Unified schema instance]

[Figure labels: Decentralized; Fault-tolerant; Efficient distribution; Look-ahead caching; Consistency enforced]

Support for interaction-independent metadata: Extended UDDI Service
- It supports different types of metadata
  - Geographical Information System Metadata Catalog (functional metadata)
  - User-defined metadata ((name, value) pairs)
- It enables advanced query capabilities
  - Geo-spatial queries
  - Metadata-oriented queries
  - Domain-independent queries
- It provides additional capabilities
  - Up-to-date service registry information (leasing)
  - Dynamic aggregation of capabilities of services (e.g., geospatial capabilities)
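The leasing idea mentioned above can be illustrated with a small sketch. This is not the Extended UDDI API; `LeasedRegistry`, `publish`, and `live_services` are hypothetical names, and the 30-second lease is an arbitrary value chosen for the example. The assumption is simply that a registry entry stays visible only while its publisher keeps renewing it.

```python
class LeasedRegistry:
    """Toy registry in which every entry carries a lease (hypothetical API)."""

    def __init__(self, lease_seconds: float) -> None:
        self._lease = lease_seconds
        self._expiry = {}  # service name -> time at which its lease runs out

    def publish(self, service: str, now: float) -> None:
        # Publishing again renews the lease for an existing entry.
        self._expiry[service] = now + self._lease

    def live_services(self, now: float) -> list:
        # Drop entries whose lease has expired, then report what remains.
        self._expiry = {s: t for s, t in self._expiry.items() if t > now}
        return sorted(self._expiry)


registry = LeasedRegistry(lease_seconds=30)
registry.publish("wms", now=0)
registry.publish("wfs", now=0)
registry.publish("wms", now=20)   # "wms" renews, "wfs" does not
alive = registry.live_services(now=35)
```

Time is passed in explicitly so the behavior is easy to follow; a real service would presumably read a clock instead.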

Support for interaction-dependent metadata: WS-Context Service
- Context Manager Service
  - Data model and communication protocol
  - Session-related metadata
- It supports Dynamic Web Service Collections
  - Support for distributed state-based systems: collaboration grids, workflow-style grids
- It provides various capabilities
  - Asynchronous communication capability
  - Up-to-date service registry information (leasing)

Support for federated service metadata: Unified Information Service
- Federating Grid Information Services
  - Unified data model and communication protocol
  - Extended UDDI, WS-Context and Glue Schemas
- Approach taken
  - Union of schemas vs. separate schemas
  - Reuse common concepts (e.g., business, session, site => category)
  - Combine disjoint concepts (e.g., UDDI's tModel)
- It enables hybrid query capabilities
  - "Give me the list of services satisfying QoS requirements C:{a,b,c..} and participating in sessions S:{x,y,z..}"
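A hybrid query of the kind quoted above might look like the following sketch. The record layout and the function `hybrid_query` are hypothetical and are not taken from the unified schema; the sketch also assumes that a service must satisfy all listed QoS requirements and participate in all listed sessions, which the slide does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceRecord:
    name: str
    qos: frozenset       # UDDI-style, interaction-independent properties
    sessions: frozenset  # WS-Context-style, interaction-dependent sessions


def hybrid_query(records, required_qos, required_sessions):
    """Return names of services matching both kinds of criteria."""
    need_qos = frozenset(required_qos)
    need_sessions = frozenset(required_sessions)
    return [
        r.name
        for r in records
        if need_qos <= r.qos and need_sessions <= r.sessions
    ]


records = [
    ServiceRecord("video-mixer", frozenset({"a", "b"}), frozenset({"x"})),
    ServiceRecord("audio-mixer", frozenset({"a"}), frozenset({"x", "y"})),
    ServiceRecord("map-server", frozenset({"a", "b"}), frozenset({"y"})),
]
matches = hybrid_query(records, required_qos={"a", "b"}, required_sessions={"x"})
```

The point of the sketch is only that one query spans both metadata types, so the client does not have to contact two information services and join the results itself.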

Federating Grid Information Services
[Figure: Collaboration Grid and Sensor Grid; two HYBRID Services (WSDL interface, database), one with WS-Context and one with Ext-UDDI, connected as Publisher and Subscriber through a Topic Based Publish-Subscribe Messaging System]

Features of the Distributed System
- Cache Strategy
  - In-memory storage
- Access Distribution
  - Redirecting a client request to an appropriate replica server
- Look-ahead caching
  - Moving/replicating metadata to where they are wanted
- Replica Content Placement
  - Replicating data on an appropriate replica server
- Consistency enforcement
  - Ensuring that all replicas of a data item are the same

Tuple Spaces & Publish-Subscribe Paradigms
- Publish-Subscribe paradigm
  - Message-based asynchronous communication
  - Participants are decoupled both in space and in time
  - Open source NaradaBrokering software: topic-based publish/subscribe messaging system
- Tuple Spaces paradigm [Gelernter-99]
  - A data-centric asynchronous communication paradigm
  - Communication units are tuples (data structure)
  - JavaSpaces [Sun Microsystems]: object-oriented implementation specification
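To make the tuple-space idea concrete, here is a minimal in-memory sketch. It is not the JavaSpaces API; the class `TupleSpace` and its `write`, `read`, and `take` methods only mirror the usual tuple-space vocabulary, and using `None` as a wildcard field is an assumption of this example.

```python
class TupleSpace:
    """Toy tuple space with associative (template-based) lookup."""

    def __init__(self) -> None:
        self._tuples = []

    def write(self, entry: tuple) -> None:
        self._tuples.append(entry)

    @staticmethod
    def _matches(template: tuple, entry: tuple) -> bool:
        # A None field in the template matches any value in that position.
        return len(template) == len(entry) and all(
            t is None or t == e for t, e in zip(template, entry)
        )

    def read(self, template: tuple):
        """Return the first matching tuple without removing it."""
        for entry in self._tuples:
            if self._matches(template, entry):
                return entry
        return None

    def take(self, template: tuple):
        """Return the first matching tuple and remove it from the space."""
        for i, entry in enumerate(self._tuples):
            if self._matches(template, entry):
                return self._tuples.pop(i)
        return None


space = TupleSpace()
space.write(("context", "session-1", "speaker=alice"))
space.write(("context", "session-2", "speaker=bob"))
found = space.read(("context", "session-2", None))
taken = space.take(("context", "session-1", None))
```

Writers and readers never address each other directly; they only agree on the shape of the tuples, which is what makes the paradigm data-centric.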

Caching Strategy
- Light-weight implementation of JavaSpaces
  - Data sharing, associative lookup, and persistency
- Integrated caching capability for all types of service metadata
  - Ex: UDDI-type, WS-Context-type, Unified Schema-type metadata
  - We assume that today's servers are capable of holding such small-size metadata in cache.
- All metadata accesses happen in memory
- Persistency
  - All metadata is periodically backed up into the appropriate Information Service back-end for persistency
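The combination of in-memory access and periodic backup can be sketched as below. This is only an illustration under stated assumptions: `WriteBehindCache` is a hypothetical name, a plain dictionary stands in for the Information Service back-end, and the 10-second interval is borrowed from the backup frequency listed in the experiment parameters.

```python
class WriteBehindCache:
    """All reads and writes hit memory; dirty entries are backed up periodically."""

    def __init__(self, backend: dict, backup_interval: float) -> None:
        self._memory = {}
        self._dirty = set()          # keys changed since the last backup
        self._backend = backend      # stand-in for the persistent back-end
        self._interval = backup_interval
        self._last_backup = 0.0

    def put(self, key, value) -> None:
        self._memory[key] = value
        self._dirty.add(key)

    def get(self, key):
        return self._memory.get(key)

    def maybe_backup(self, now: float) -> int:
        """Flush dirty entries if the interval has elapsed; return how many."""
        if now - self._last_backup < self._interval:
            return 0
        for key in self._dirty:
            self._backend[key] = self._memory[key]
        flushed = len(self._dirty)
        self._dirty.clear()
        self._last_backup = now
        return flushed


backend = {}
cache = WriteBehindCache(backend, backup_interval=10.0)
cache.put("ctx-1", "<context/>")
early = cache.maybe_backup(now=5.0)    # too soon, nothing is written
late = cache.maybe_backup(now=10.0)    # interval elapsed, entry is persisted
```

The trade-off is the usual one for write-behind designs: requests never wait for the back-end, but updates made since the last backup are lost if the server fails before the next one.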

Persistency investigation

Performance investigation

Message rate scalability investigation

Message size scalability investigation

Access Distribution and Look-ahead Caching
- Broadcast-based request dissemination
  - Pub-sub system for message broadcast
  - Broadcast requests only to those servers that can answer
  - No need to keep track of metadata locations
- Dynamic migration/replication [Rabinovich et al, 1999]
  - Popular copies are moved/replicated to where they are wanted
  - Autonomous decisions, self-awareness
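One way to picture broadcast-based request dissemination is a topic per metadata key, so that a request reaches only the servers that hold that key. The sketch below is an assumption-laden toy and does not use the NaradaBrokering API; `TopicBroker` and `MetadataServer` are hypothetical names, and delivery is synchronous for simplicity.

```python
class TopicBroker:
    """Toy topic-based publish/subscribe broker (synchronous delivery)."""

    def __init__(self) -> None:
        self._subscribers = {}  # topic -> list of handler callables

    def subscribe(self, topic, handler) -> None:
        self._subscribers.setdefault(topic, []).append(handler)

    def publish(self, topic, message) -> list:
        # Deliver to subscribers of this topic only and collect their replies.
        return [handler(message) for handler in self._subscribers.get(topic, [])]


class MetadataServer:
    """Server that subscribes to the topic of every key it hosts."""

    def __init__(self, name: str, broker: TopicBroker) -> None:
        self.name = name
        self._broker = broker
        self._store = {}

    def host(self, key, value) -> None:
        if key not in self._store:
            self._broker.subscribe(key, self._answer)
        self._store[key] = value

    def _answer(self, key):
        return (self.name, self._store[key])


broker = TopicBroker()
s1 = MetadataServer("s1", broker)
s2 = MetadataServer("s2", broker)
s3 = MetadataServer("s3", broker)
s1.host("session-1", "<ctx 1/>")
s2.host("session-1", "<ctx 1/>")
s2.host("session-2", "<ctx 2/>")
answers = broker.publish("session-2", "session-2")
```

The client never learns where the metadata lives; it only names the topic, and the subscription table does the routing.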

Access Distribution Experiment: Test Methodology
Time = T1 + T2 + T3
Simulation parameters
  Backup frequency: every 10 seconds
  Message size: 2.7 Kbytes

Distribution experiment result
- The overhead of access distribution is only a few milliseconds.
- Continuous access distribution operation does not degrade the performance.
- The overhead of distribution remains the same regardless of the network distances between nodes.

Dynamic Replication Performance: Test Methodology
Time = T1 + T2 + T3
Simulation parameters
  Message size / message rate: 2.7 Kbytes / 10 msg/sec
  Replication decision frequency: every 100 seconds
  Deletion / replication threshold: 0.03 request/second and 0.18 request/second
  Registry size: 1000 metadata in Indianapolis
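The thresholds in the parameter list suggest a periodic decision of roughly the following shape. This is a simplified threshold rule written for illustration, not the algorithm of Rabinovich et al. or the one evaluated here; the function name and the three outcome labels are assumptions, while the numeric values are the ones listed in the simulation parameters.

```python
DELETION_THRESHOLD = 0.03        # requests/second, from the parameter list
REPLICATION_THRESHOLD = 0.18     # requests/second, from the parameter list
DECISION_PERIOD_SECONDS = 100    # replication decision frequency


def replication_decision(request_count: int,
                         period: float = DECISION_PERIOD_SECONDS) -> str:
    """Classify a replica by the request rate observed over one period."""
    rate = request_count / period
    if rate < DELETION_THRESHOLD:
        return "delete"      # too unpopular to keep a copy here
    if rate > REPLICATION_THRESHOLD:
        return "replicate"   # popular enough to copy closer to the demand
    return "keep"


cold = replication_decision(2)    # 0.02 requests/second
warm = replication_decision(10)   # 0.10 requests/second
hot = replication_decision(25)    # 0.25 requests/second
```

Because each server can evaluate this rule from its own request counters, the decision stays local, which matches the "autonomous decisions" point made for look-ahead caching.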

- The decrease in average latency shows that the algorithm manages to move replica copies to where they are wanted.

Replica content placement and consistency enforcement
- Replica-content placement
  - Each node keeps information about other servers
  - Selection of replica server(s): selection policy based on a) geographical (proximity) and b) topical (number of topics) information
- Consistency enforcement: primary-copy approach
  - Update distribution: updates labeled with synchronized timestamps are reflected (unicast) to the primary copy
  - Update propagation: the primary copy pushes (broadcast) updates only to those replica servers holding the context
[Figure: Hybrid Service 1, Hybrid Service 2, Hybrid Service 3, Hybrid Service 4]
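The primary-copy steps above can be sketched as follows. The class and method names are hypothetical, integer timestamps stand in for synchronized clocks, and rejecting any update that is not strictly newer is an assumption of this example rather than something stated on the slide.

```python
class Replica:
    """Holds timestamped copies of contexts."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.contexts = {}  # key -> (timestamp, value)

    def apply(self, key, timestamp, value) -> bool:
        current = self.contexts.get(key)
        if current is None or timestamp > current[0]:
            self.contexts[key] = (timestamp, value)
            return True
        return False  # stale update, ignored


class PrimaryCopy(Replica):
    """Receives every update first, then pushes it to holders of that context."""

    def __init__(self, name: str) -> None:
        super().__init__(name)
        self._holders = {}  # key -> replicas that hold a copy of the context

    def register_holder(self, key, replica: Replica) -> None:
        self._holders.setdefault(key, []).append(replica)
        current = self.contexts.get(key)
        if current is not None:
            replica.apply(key, *current)  # bring the new holder up to date

    def update(self, key, timestamp, value) -> list:
        if not self.apply(key, timestamp, value):
            return []
        pushed = []
        for replica in self._holders.get(key, []):
            replica.apply(key, timestamp, value)
            pushed.append(replica.name)
        return pushed


primary = PrimaryCopy("hs1")
r2, r3 = Replica("hs2"), Replica("hs3")
primary.register_holder("ctx-a", r2)     # hs3 does not hold ctx-a
pushed = primary.update("ctx-a", 1, "v1")
stale = primary.update("ctx-a", 0, "old")
```

Routing every write through one copy gives a single ordering point for updates, at the cost of an extra hop for writers that are far from the primary.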

Fault-tolerance experiment: Testing Setup
Time = T1 + T2 + T3
Simulation parameters
  Backup frequency: every 10 seconds
  Message size: 2.7 Kbytes

Fault-tolerance experiment result
- The overhead of replica creation is only a few milliseconds.
- Continuous replica creation operation does not degrade the performance.
- The overhead of replica creation increases on the order of milliseconds as the fault-tolerance level increases.

Consistency Enforcement Experiment: Test Methodology
Time = T1 + T2 + T3
Simulation parameters
  Backup frequency: every 10 seconds
  Message size: 2.7 Kbytes

Consistency Enforcement Test Result
- The overhead of consistency enforcement is a few milliseconds.
- Continuous operation does not degrade the performance.
- The cost of consistency enforcement remains the same regardless of the distribution of the network nodes.

Conclusions
- Efficient decentralized metadata strategies
  - TupleSpaces & Pub-Sub communication paradigms
  - Distribution
  - Replication for fault-tolerance
  - Replication for performance
  - Consistency Enforcement
- Efficient centralized metadata management strategies
  - In-memory storage based on the TupleSpaces paradigm

Contributions
- Federated Grid Information Service Architecture
  - Unified data model and communication protocol
  - Support for both interaction-independent and conversation-based service metadata
  - Support for greater interoperability
- Unified Grid Information Service Architecture
  - Flexible and extensible architecture
  - Support for High Performance and Fault-tolerance
  - Uniform access to all kinds of service metadata
- Efficient decentralized metadata systems can be built by integrating TupleSpaces and Publish-Subscribe paradigms
  - Fault-tolerance, distribution and consistency can be achieved with a few milliseconds of system processing overhead.
  - Self-awareness can be achieved in decentralized metadata management.
- Communication among services can be achieved with efficient mediator metadata strategies
  - A metadata management approach for Dynamic Web/Grid Service Collections
  - Collective operations such as queries on subsets of all available metadata in a service conversation.

Information Service | Usage Cases
WS-Context | Fast SOAP transfer in Mobile Computing (Sangyoon Oh's Thesis)
WS-Context, Extended UDDI | Geographical Information Service & Sensor Grids (Galip Aydin's Thesis)
WS-Context | Session Metadata Management (Hasan Bulut's Thesis)
WS-Context | Fault-Tolerant Registry (Harshawardhan Gadgil's Thesis)
WS-Context | VLab Project (Univ. of Minnesota, Florida State University)
Extended UDDI | Chemical Informatics and Cyberinfrastructure Collaboratory Project
WS-Context, Extended UDDI | Pattern Informatics (UC Davis); IEISS (LANL)

Selected Publication List focusing on a) Metadata, b) Information Services, and c) Metadata Discovery
- Mehmet S. Aktas, Geoffrey Fox, Marlon Pierce, Information Services for Dynamically Assembled Semantic Grids [SKG-05, 2005]
- Mehmet S. Aktas, Geoffrey Fox, Marlon Pierce, Managing Dynamic Metadata as Context [ICCSE, 2005]
- Mehmet S. Aktas et al., Web Service Information Systems and Applications [GGF-16, 2006]
- Mehmet S. Aktas, Geoffrey C. Fox, Marlon Pierce, Fault Tolerant High Performance Information Services for Dynamic Collections of Grid and Web Services [FGCS Journal, 2006]
- Mehmet S. Aktas, Sangyoon Oh, Geoffrey C. Fox, Marlon Pierce, XML Metadata Services [SKG-2006, Concurrency and Computation: Practice and Experience Journal-2007]
- Mehmet S. Aktas, Marlon Pierce, and Geoffrey C. Fox, Designing Ontologies and Distributed Resource Discovery Services for an Earthquake Simulation Grid [GGF11, 2004]
- Mehmet S. Aktas, M. Pierce, G. Fox, and D. Leake, A Web based Conversational Case-Based Recommender System for Ontology aided Metadata Discovery [GRID Workshop-2004]
- Sangyoon Oh, Mehmet S. Aktas, Geoffrey C. Fox, Marlon Pierce, Architecture for High-Performance Web Service Communications Using an Information Service [WSEAS Journal-2006]