Distributed Monitoring of Peer-to-Peer Systems, by Serge Abiteboul and Bogdan Marinoiu. Docflow meeting, Bordeaux.

Presentation transcript:

1 Distributed Monitoring of Peer-to-Peer Systems
By Serge Abiteboul, Bogdan Marinoiu
Docflow meeting, Bordeaux

2 Outline
- The Monitoring Problem & Approach
- A language for specifying monitoring tasks: P2PML
- P2PMonitor System
- ActiveXML Stream Algebra
- Architecture of P2PMonitor
- Monitoring Plan Generation & Query Rewriting
- Focus on Filtering
- Reusing running tasks
- Work in progress

3 The Monitoring Problem
P2P systems are:
- a popular support for content-sharing communities and distributed applications
- highly dynamic (intense communication, rapidly changing content, peers joining and leaving) and difficult to observe
Observation is important!
- error management & diagnosis
- statistics gathering & optimization issues, e.g. finding the « busiest » peer in a network
- business applications: billing & quality of service
- Web surveillance

4 Is it possible to observe & analyse a P2P system?
- Difficult (if not impossible) in a centralized way
- Yes, in a distributed manner

5 Approach
- Detect events at the (monitored) peer level: data changes, Web service calls -> alerters
- Each event is represented as an XML document
- XML stream
- (Distributed) XML stream processing system
- XML streams are published
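
As an illustration of the alerter idea above, here is a minimal Python sketch in which each observed Web service call becomes one XML document and the sequence of documents forms the stream. The element and attribute names (`event`, `type`, `method`, `callee`, `time`) and the function `ws_alerter` are assumptions made for this example; the slides do not give the actual event format.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, Iterator


def ws_alerter(calls: Iterable[dict]) -> Iterator[ET.Element]:
    """Turn each observed Web service call into one XML event document."""
    for call in calls:
        # Hypothetical layout: key properties as root attributes, details as children.
        event = ET.Element("event", {"type": "ws-call", "method": call["method"]})
        ET.SubElement(event, "callee").text = call["callee"]
        ET.SubElement(event, "time").text = str(call["time"])
        yield event


# The generator plays the role of an XML stream: events are produced one by one.
stream = ws_alerter([{"method": "GetTemp", "callee": "peer2", "time": 12}])
first = next(stream)
print(ET.tostring(first, encoding="unicode"))
```

Putting the main properties on the root element is a choice made here so that a downstream filter can decide on many events without reading the body.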

6 Outline
- The Monitoring Problem & Approach
- A language for specifying monitoring tasks: P2PML
- P2PMonitor System
- ActiveXML Stream Algebra
- Architecture of P2PMonitor
- Monitoring Plan Generation & Query Rewriting
- Focus on Filtering
- Reusing running tasks
- Work in progress

7 P2PML statement structure
XQuery FLWR flavour:
- For: maps streams to XML variables
- Let: assigns new XML variables
- Where: imposes conditions on events (filtering and join criteria)
- Return: generates reports / restructures XML
- By: specifies publication means:
  - channels, for publication inside the system
  - e-mails, Web pages, RSS feeds, for publication outside the system

8 P2PML statement example
for $c on local: outCOM
let $timeCall := $c.call.time and $duration := $c.response.time - $timeCall
where $c.call.method = “GetTemp” and $duration > 10 and $c.call.site = "
return {$timeCall} {$duration}
by channel “QoS:Alerts”
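
To make the semantics of this statement concrete, the following Python sketch mimics its clauses over a list of events. It is only an approximation under stated assumptions: events are modelled as nested dictionaries, the function name `slow_gettemp_reports` and the `channels` dictionary are hypothetical, and the condition on `$c.call.site` is left out because its value is truncated in the transcript.

```python
from typing import Iterable, Iterator


def slow_gettemp_reports(events: Iterable[dict], threshold: float = 10) -> Iterator[dict]:
    """Report GetTemp calls whose response took longer than `threshold`."""
    for c in events:                                    # for $c on ... outCOM
        time_call = c["call"]["time"]                   # let $timeCall := ...
        duration = c["response"]["time"] - time_call    # let $duration := ...
        # where: the site condition is truncated in the source, so it is omitted here
        if c["call"]["method"] == "GetTemp" and duration > threshold:
            yield {"timeCall": time_call, "duration": duration}   # return


# Hypothetical in-memory stand-in for a publication channel.
channels = {"QoS:Alerts": []}

events = [
    {"call": {"method": "GetTemp", "time": 100}, "response": {"time": 125}},
    {"call": {"method": "GetTemp", "time": 200}, "response": {"time": 203}},
    {"call": {"method": "SetTemp", "time": 300}, "response": {"time": 350}},
]
channels["QoS:Alerts"].extend(slow_gettemp_reports(events))   # by channel "QoS:Alerts"
print(channels["QoS:Alerts"])
```

Only the first event is reported: the second is fast enough, and the third calls a different method.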

9 Outline
- The Monitoring Problem & Approach
- A language for specifying monitoring tasks: P2PML
- P2PMonitor System
- ActiveXML Stream Algebra
- Monitoring Plans
- Architecture of P2PMonitor
- Focus on Filtering
- Reusing running tasks
- Work in progress

10 ActiveXML Stream Algebra
The algebra is the support for monitoring plan representation and the basis for its optimization:
- Distribute the work among the peers
- Try to place computation close to data if possible
- Try to reduce redundancy

11 Scenario (figure labels: P2PML queries, XML streams)

12 Monitoring Plans

13 Architecture of P2PMonitor (1)
- Subscription Manager
- Alerters (WS Alerter, Database Alerter, RSS Alerter)
- Stream Processors
  - Without « storage »: Filter, Restructure, Union
  - With « storage »: Join, Group-By, Duplicate Removal
- Publishers
  - Email, WebPage, RSS
  - Channel Publisher: a user or another peer may subscribe to it
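
The distinction between processors with and without « storage » can be sketched with Python generators. This is an illustration, not the P2PMonitor implementation: the function names are hypothetical, events are plain dictionaries, and the union simply drains its inputs one after another, whereas a real stream union would interleave events as they arrive.

```python
from typing import Callable, Hashable, Iterable, Iterator


def filter_op(stream: Iterable[dict], predicate: Callable[[dict], bool]) -> Iterator[dict]:
    """Without storage: each event is judged on its own."""
    for event in stream:
        if predicate(event):
            yield event


def union_op(*streams: Iterable[dict]) -> Iterator[dict]:
    """Without storage: forward every event of every input (naively, in sequence)."""
    for stream in streams:
        yield from stream


def duplicate_removal_op(stream: Iterable[dict], key: Callable[[dict], Hashable]) -> Iterator[dict]:
    """With storage: must remember the keys of all events seen so far."""
    seen = set()
    for event in stream:
        k = key(event)
        if k not in seen:
            seen.add(k)
            yield event


a = [{"id": 1, "src": "p1"}, {"id": 2, "src": "p1"}]
b = [{"id": 2, "src": "p2"}, {"id": 3, "src": "p2"}]
result = list(duplicate_removal_op(union_op(a, b), key=lambda e: e["id"]))
print([e["id"] for e in result])
```

The `seen` set grows with the stream, which is the kind of cost that stateful operators such as Join, Group-By and Duplicate Removal have and that Filter, Restructure and Union avoid.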

14 Architecture of P2PMonitor (2)

15 Outline
- The Monitoring Problem & Approach
- A language for specifying monitoring tasks: P2PML
- P2PMonitor System
- ActiveXML Stream Algebra
- Monitoring Plans
- Architecture of P2PMonitor
- Focus on Filtering
- Reusing running tasks
- Work in progress

16 Focus on Filtering
Filtering is a crucial operator in stream processing! E.g., many users might be interested in events coming from the same source / alerter: bottleneck hazard.
Our approach to the problem: two-stage filtering
Reasons:
- Attributes of the XML document's root reflect the most important properties of an event
- The event's details can be given intensionally (ActiveXML style)

17 Two-stage filtering
A subscription is viewed as a conjunction of simple conditions (e.g., « attribute » = « value ») and « more difficult » XPath queries.
- The 1st data structure groups the (ordered) simple conditions of all the subscriptions by commonalities (Atomic Event Set structure, AES)
- The 2nd data structure groups the XPath queries of all the subscriptions (path-based indexing, YFilter style, using an NFA)
On an XML document:
- 1st stage: read the root, evaluate the AES, detect the « difficult » XPath queries that remain to be evaluated
- 2nd stage (if needed): adapt the second structure and evaluate the set of XPath queries on the body of the XML document (if necessary, execute Web service calls)
The output is the set of subscriptions « hit » by the XML document.
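
The two stages can be sketched in Python as follows. This is a deliberately naive stand-in: it scans the subscriptions linearly instead of using an Atomic Event Set structure, and it evaluates path expressions with the limited XPath subset of `xml.etree.ElementTree` instead of a YFilter-style NFA. The class `Subscription`, the function `match` and the sample document are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class Subscription:
    name: str
    simple: dict = field(default_factory=dict)   # root attribute -> required value
    paths: list = field(default_factory=list)    # ElementTree path expressions on the body


def match(doc: ET.Element, subscriptions: list) -> set:
    """Return the names of the subscriptions hit by `doc`."""
    # Stage 1: look only at the root attributes.
    survivors = [s for s in subscriptions
                 if all(doc.get(attr) == value for attr, value in s.simple.items())]
    hits = {s.name for s in survivors if not s.paths}

    # Stage 2: only for survivors that still have path queries to check.
    cache = {}

    def holds(path: str) -> bool:
        # Evaluate each distinct path at most once per document.
        if path not in cache:
            cache[path] = doc.find(path) is not None
        return cache[path]

    for s in survivors:
        if s.paths and all(holds(p) for p in s.paths):
            hits.add(s.name)
    return hits


doc = ET.fromstring(
    '<event type="ws-call" peer="p1"><call method="GetTemp"/><error code="500"/></event>'
)
subs = [
    Subscription("all-p1", simple={"peer": "p1"}),
    Subscription("p1-errors", simple={"peer": "p1"}, paths=[".//error"]),
    Subscription("p2-errors", simple={"peer": "p2"}, paths=[".//error"]),
    Subscription("settemp", simple={"type": "ws-call"}, paths=["./call[@method='SetTemp']"]),
]
hits = match(doc, subs)
print(sorted(hits))
```

Here `p2-errors` is discarded in the first stage without touching the body, which is the saving the two-stage design aims for; `settemp` survives the first stage but fails in the second.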

18 Outline
- The Monitoring Problem & Approach
- A language for specifying monitoring tasks: P2PML
- P2PMonitor System
- ActiveXML Stream Algebra
- Monitoring Plans
- Architecture of P2PMonitor
- Focus on Filtering
- Reusing streams / running tasks
- Work in progress

19 Reusing running tasks
- Optimization by trying to avoid redundancy
- Before building new operators (and streams), try to discover useful existing ones
- Stream representation in XML:
- Stream Definition Database: description of available streams
- Distributed, not centralized (avoid bottlenecks)
- Implemented using KadoP: index and repository system over a DHT

20 Stream replication and equivalence (1)
- Streams can be replicated between peers
- With two similar operators on two replicas of the same stream, we obtain two equivalent streams
- Replication can be represented in the Stream Definition Database
- Stream equivalence is difficult to detect

21 Algorithm for discovering useful streams
- It uses XPath queries on the Stream Definition Database
  E.g. for identifying the output stream of alerter inCOM : 1, P 1 )
- It goes bottom-up on the query tree
  E.g., Join P (σ F 1 ), 2 ) (S 5, P 1 )
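
The bottom-up traversal can be illustrated with a small Python sketch. It swaps in a plain dictionary for the Stream Definition Database and an exact-match lookup for the XPath queries the slide mentions, so it neither models the distributed KadoP index nor detects replicated or equivalent streams. The class `Op`, the function `resolve`, the key format and the stream identifiers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    kind: str            # e.g. "alerter", "filter", "join"
    param: str           # alerter name, filter condition, join predicate, ...
    inputs: tuple = ()   # child Op nodes


def resolve(node: Op, stream_db: dict, created: list) -> str:
    """Return a stream id for `node`, reusing an already defined stream when one matches."""
    # Resolve the inputs first (bottom-up), so the lookup key is built from stream ids.
    input_ids = tuple(resolve(child, stream_db, created) for child in node.inputs)
    key = (node.kind, node.param, input_ids)
    if key not in stream_db:
        # Nothing reusable: define a new stream and record that it had to be built.
        stream_db[key] = f"S{len(stream_db) + 1}"
        created.append(key)
    return stream_db[key]


# Streams assumed to be running already.
stream_db = {
    ("alerter", "inCOM@P1", ()): "S1",
    ("filter", "F1", ("S1",)): "S2",
}
plan = Op("join", "P", (
    Op("filter", "F1", (Op("alerter", "inCOM@P1"),)),
    Op("alerter", "outCOM@P2"),
))
created = []
root = resolve(plan, stream_db, created)
print(root, created)
```

In this run the alerter on `inCOM@P1` and the filter on top of it are reused, so only the second alerter and the join have to be deployed.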

22 Outline
- The Monitoring Problem & Approach
- A language for specifying monitoring tasks: P2PML
- P2PMonitor System
- ActiveXML Stream Algebra
- Monitoring Plans
- Architecture of P2PMonitor
- Focus on Filtering
- Reusing streams / running tasks
- Work in progress

23 Work in progress (1)
- Link with Incremental View Maintenance
  Defining a monitoring task by a tree-pattern query on an active document with streams: a powerful way of expressing complex monitoring tasks (difficult to express directly in P2PML)

24 Work in progress (2)
- Introducing « time » explicitly in P2PML: possible impact on P2PMonitor performance (reactivity) and resource consumption (needed storage)
  E.g. for $e1 on P1:inCOM, $e2 on P2:outCOM where $e1.timeEvent > $e2.timeEvent + 25 …
- Queries on traces obtained by P2PMonitor: diagnosis, detecting patterns of evolution for the monitored system
  E.g. Trace = I1, I2 … In (instances of a document). For each new order detected in instance Ik, there is a payment present in one of the following instances
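
The trace property in the last example can be checked with a few lines of Python. The sketch assumes a trace is a list of instances and that each instance exposes the sets of order and payment identifiers it contains; this representation and the function name `every_order_is_paid` are invented for the illustration and say nothing about how P2PMonitor stores traces.

```python
def every_order_is_paid(trace: list) -> bool:
    """Check that each newly appearing order has a payment in some later instance."""
    seen_orders = set()
    for k, instance in enumerate(trace):
        new_orders = instance["orders"] - seen_orders   # orders first detected in I_k
        seen_orders |= instance["orders"]
        for order in new_orders:
            # Look for a matching payment in the instances following I_k.
            if not any(order in later["payments"] for later in trace[k + 1:]):
                return False
    return True


ok_trace = [
    {"orders": {"o1"}, "payments": set()},
    {"orders": {"o1", "o2"}, "payments": {"o1"}},
    {"orders": {"o1", "o2"}, "payments": {"o1", "o2"}},
]
bad_trace = [
    {"orders": {"o1"}, "payments": set()},
    {"orders": {"o1"}, "payments": set()},
]
print(every_order_is_paid(ok_trace), every_order_is_paid(bad_trace))
```

This version needs the whole trace in memory; evaluating such a property incrementally over a stream is a separate question.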

25 Thank you very much!