The Active XML project: an overview Serge Abiteboul · Omar Benjelloun · Tova Milo Lazy Query Evaluation for Active XML Abiteboul, Benjelloun, Cautis, Manolescu,


The Active XML project: an overview
Serge Abiteboul · Omar Benjelloun · Tova Milo

Lazy Query Evaluation for Active XML
Abiteboul, Benjelloun, Cautis, Manolescu, Milo, Preda

Presented by: Irene Genitsaridh, Univ. of Crete, hy561, April 28, 2009

The problem addressed is Web data management.
Web characteristics:
- High heterogeneity of data sources.
- Autonomy of data sources.
- The scale of the Web.
Result:
- The Web revolution is setting up new standards.

Active XML (AXML) is a language based on XML, Web services and XQuery, for complex data management tasks. XML is a suitable model for Web data exchange. XQuery is a query language for XML promoted by the W3C (the SQL of the Web). Web services are network-accessible programs that take XML parameters and return XML results.

Embedding calls to Web services inside XML documents.

Materialization: the service is invoked using the SOAP protocol, and the result of the invocation is used to enrich the document. Tree representation: the same document will have different semantics at different times.
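
A minimal sketch of a single materialization step. It assumes a hypothetical `sc` element name for embedded service calls, and a local Python function stands in for the SOAP invocation (no network call is made). The document, the service name and the returned value are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical AXML-like document: <sc> marks an embedded service call.
DOC = """
<newspaper>
  <title>The Sun</title>
  <temp><sc service="getTemp"><city>Paris</city></sc></temp>
</newspaper>
"""

def get_temp(call):
    # Stand-in for a SOAP invocation; the value is invented sample data.
    result = ET.Element("temperature", {"city": call.findtext("city")})
    result.text = "16"
    return [result]

SERVICES = {"getTemp": get_temp}

def materialize_once(root):
    """Replace each service call currently in the tree by its result."""
    # Collect the calls first so the tree is not mutated while it is walked.
    pending = [(parent, child)
               for parent in root.iter()
               for child in list(parent)
               if child.tag == "sc"]
    for parent, call in pending:
        index = list(parent).index(call)
        parent.remove(call)
        results = SERVICES[call.get("service")](call)
        for offset, elem in enumerate(results):
            parent.insert(index + offset, elem)
    return len(pending)

root = ET.fromstring(DOC)
invoked = materialize_once(root)
```

After the call, the `temp` element holds a `temperature` element where the `sc` element used to be, which is the sense in which the invocation "enriches" the document.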

AXML services are Web services that accept AXML documents as input parameters and return AXML documents as results. Materialization becomes a recursive process, since calling an AXML service may return data that contains new service calls. After invoking

The data exchanged by Web services is controlled by schemas for their input and output, specified within a WSDL description. Similarly, schemas are used to control AXML data exchange (DTD-like syntax). The schema distinguishes between accepting a concrete type, e.g., a temperature element, and accepting a service call returning data of this particular type. The actual syntax in the system is an extension of XML Schema.

A site about a city’s nightlife (restaurants, movies).
Query: /goingOut/movies//show[title= "The Hours"]/schedule
- No point in materializing calls below the path /goingOut/restaurants.
- Avoid materializing a call found below /goingOut/movies
Solution (naive approach): materialize all the calls in the document recursively until a fixpoint is reached, and finally run the query over the resulting document.
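
The naive approach can be sketched as follows. This is an illustration under assumptions, not the system's implementation: the `sc` element name, the three service names, and all returned data are hypothetical, and Python functions stand in for Web service invocations.

```python
import xml.etree.ElementTree as ET

DOC = """
<goingOut>
  <restaurants><sc service="getRestaurants"/></restaurants>
  <movies><sc service="getMovies"/></movies>
</goingOut>
"""

def get_restaurants(call):
    return [ET.fromstring("<restaurant><name>Resto A</name></restaurant>")]

def get_movies(call):
    # The result contains a new call, so materialization has to recurse.
    return [ET.fromstring(
        "<show><title>The Hours</title><sc service='getSchedule'/></show>")]

def get_schedule(call):
    return [ET.fromstring("<schedule>20:30</schedule>")]

SERVICES = {"getRestaurants": get_restaurants,
            "getMovies": get_movies,
            "getSchedule": get_schedule}

def find_calls(root):
    return [(p, c) for p in root.iter() for c in list(p) if c.tag == "sc"]

def materialize_all(root, max_rounds=10):
    """Invoke every call, round after round, until none remain (fixpoint)."""
    invoked = []
    rounds = 0
    pending = find_calls(root)
    while pending:
        if rounds == max_rounds:
            raise RuntimeError("no fixpoint reached within max_rounds")
        for parent, call in pending:
            index = list(parent).index(call)
            parent.remove(call)
            name = call.get("service")
            invoked.append(name)
            for offset, elem in enumerate(SERVICES[name](call)):
                parent.insert(index + offset, elem)
        rounds += 1
        pending = find_calls(root)
    return invoked

root = ET.fromstring(DOC)
invoked = materialize_all(root)
# The query is evaluated only once the document is fully materialized.
schedules = root.findall("./movies//show[title='The Hours']/schedule")
```

In this run `getRestaurants` is invoked even though nothing below /goingOut/restaurants can contribute to the query, which is the waste that lazy evaluation aims to avoid.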

Evaluation approach: lazy evaluation. Identify in advance a tight superset of the service calls that should actually be invoked to answer a query.
General problem: service calls may appear anywhere in the data, and dynamically in the results of previously materialized calls.
Solution: enforce sufficient conditions for termination, or halt the computation if a full state is not reached within some time limit.

Sample Active XML document.

A sample query.
- Queries are modeled by tree patterns.
- The relevant functions in the above AXML document are 1, 3, 4 and 10.

Computing the set of relevant service calls: given a query, generate a set of auxiliary queries that, when evaluated on a document, retrieve all service calls that are relevant to the query.
Advantage: in contrast to the naive approach, only functions that may contribute to the query result are invoked.
Disadvantage: there is a tradeoff between accuracy and efficiency. It is expensive to detect exactly which calls are relevant and which are not. The challenge is thus to find the right balance between the effort spent on ruling out irrelevant calls and the actual time saved by avoiding their invocation.

Pruning via typing: the return types of services are used to rule out more irrelevant service calls.
Pushing queries:
- For instance, getNearbyRestos may return many restaurants, but we are only interested in five-star ones and, more precisely, only in their names and addresses.
- Push to the function call a precise subquery, specifying that it has to apply the five-star rating selection and return only the relevant names and addresses.
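
A small sketch of the effect of pushing a subquery. The service below is a hypothetical local stand-in for getNearbyRestos, its `select` and `project` parameters are assumptions made for illustration (a real peer would ship a query, not a Python function), and the restaurant data is invented.

```python
# Invented sample data standing in for what getNearbyRestos could return.
RESTAURANTS = [
    {"name": "Resto A", "address": "1 First St", "rating": "*****"},
    {"name": "Resto B", "address": "2 Second St", "rating": "***"},
    {"name": "Resto C", "address": "3 Third St", "rating": "*****"},
]

def get_nearby_restos(select=None, project=None):
    """Hypothetical service that can apply a pushed selection and projection."""
    rows = RESTAURANTS
    if select is not None:
        rows = [r for r in rows if select(r)]
    if project is not None:
        rows = [{key: r[key] for key in project} for r in rows]
    return rows

# Without pushing: every restaurant is shipped, then filtered by the caller.
shipped_all = get_nearby_restos()
local = [{"name": r["name"], "address": r["address"]}
         for r in shipped_all if r["rating"] == "*****"]

# With pushing: the service applies the subquery and ships only the answer.
pushed = get_nearby_restos(select=lambda r: r["rating"] == "*****",
                           project=("name", "address"))
```

Both variants produce the same answer; the difference is how much data crosses the network before the five-star selection and the name/address projection are applied.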

Algorithms to find a complete relevant rewriting:
- Linear path queries (LPQ)
  1. /*()
  2. /nyHotels/*()
  3. /nyHotels/hotel/*()
  4. /nyHotels/hotel/name/*()
  5. /nyHotels/hotel/rating/*()
  6. /nyHotels/hotel/nearby/*()
  7. /nyHotels/hotel/nearby//*()
  8. /nyHotels/hotel/nearby//restaurant/*()
  9. /nyHotels/hotel/nearby//restaurant/name/*()
  10. /nyHotels/hotel/nearby//restaurant/address/*()
  11. /nyHotels/hotel/nearby//restaurant/rating/*()

LPQs are correct but usually inaccurate. Constructing one linear path query per node ignores filtering conditions on the path from the root or in other branches that could make some of the functions irrelevant (e.g., a getNearbyRestos() function node under a hotel cannot be relevant if the hotel rating is not “*****”).
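
The eleven LPQs listed above can be reproduced mechanically from a tree pattern. The sketch below uses a generation rule inferred from that list (one `/*()` query at the root, one per query node, and an extra `//*()` query for each descendant edge); it is an assumption, not necessarily the authors' exact algorithm. The `PatternNode` class and the shape of the sample pattern are likewise assumptions, and filtering conditions such as the rating value are deliberately omitted, since LPQs ignore them.

```python
from dataclasses import dataclass, field

@dataclass
class PatternNode:
    label: str
    edge: str = "/"   # edge from the parent: "/" (child) or "//" (descendant)
    children: list = field(default_factory=list)

def linear_path_queries(root):
    """Return the LPQs of a tree pattern, in document order of its nodes."""
    queries = ["/*()"]

    def visit(node, prefix):
        if node.edge == "//":
            # A call anywhere below the parent may produce this node.
            queries.append(prefix + "//*()")
        path = prefix + node.edge + node.label
        queries.append(path + "/*()")
        for child in node.children:
            visit(child, path)

    visit(root, "")
    return queries

# Assumed tree pattern behind the hotel example (conditions left out).
pattern = PatternNode("nyHotels", children=[
    PatternNode("hotel", children=[
        PatternNode("name"),
        PatternNode("rating"),
        PatternNode("nearby", children=[
            PatternNode("restaurant", edge="//", children=[
                PatternNode("name"),
                PatternNode("address"),
                PatternNode("rating"),
            ]),
        ]),
    ]),
])

lpqs = linear_path_queries(pattern)
```

Because the rating conditions never appear in the generated paths, every hotel's getNearbyRestos() call is retrieved regardless of the hotel's rating, which is the inaccuracy described above.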

- Node-focused queries (NFQ). Instead of constructing one linear path query per node in the query, an algorithm called NFQ is used that includes the filtering conditions from the original query. In contrast with linear path queries, the function nodes that are relevant for a query q are now precisely the ones retrieved by the NFQs of q.

Service call sequencing: the relationships among the calls are analyzed to derive an efficient sequence of call invocations appropriate to answer the query.
- An algorithm based on NFQ, called NFQA, is used to compute a (possibly infinite) relevant rewriting. If it terminates, the obtained document is complete for the query q.

F-guide: A specialized access structure in the style of data-guides is used to speed up the search for relevant calls. The structure acts as an index, summarizing concisely the occurrences of functions (service calls) in the documents (hence its name, F-guide). The F-guide also holds the path extents: for each path we keep pointers to the corresponding function call nodes in the document.
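
A rough sketch of such an index, assuming a hypothetical `sc` element name for service calls and an invented hotel document. It maps each label path under which at least one call occurs to the list of call nodes found there (the path extent). The actual F-guide is a tree-shaped summary; a flat dictionary keyed by path is used here only to keep the example short.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

DOC = """
<nyHotels>
  <hotel>
    <name>Hotel A</name>
    <rating><sc service="getRating"/></rating>
    <nearby><sc service="getNearbyRestos"/></nearby>
  </hotel>
  <hotel>
    <name>Hotel B</name>
    <rating>***</rating>
    <nearby><sc service="getNearbyRestos"/></nearby>
  </hotel>
  <sc service="getHotels"/>
</nyHotels>
"""

def build_f_guide(root, call_tag="sc"):
    """Map each label path to the service call nodes directly below it."""
    guide = defaultdict(list)

    def visit(elem, path):
        for child in elem:
            if child.tag == call_tag:
                guide[path].append(child)   # pointer into the document
            else:
                visit(child, path + "/" + child.tag)

    visit(root, "/" + root.tag)
    return guide

root = ET.fromstring(DOC)
guide = build_f_guide(root)

# Looking up a path returns its extent without scanning the document again.
nearby_calls = guide["/nyHotels/hotel/nearby"]
```

A query such as /nyHotels/hotel/nearby/*() can then be answered from the index alone, and the returned nodes are the actual call elements in the document, ready to be invoked.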

A distributed architecture based on the peer-to-peer paradigm is adopted to support the AXML language. Each participant may act both as a client and as a server. AXML peers have essentially three facets:
- Repository.
- Server (may provide Web services for other peers to use).
- Client (may invoke the Web services that other peers provide).

Enforce the following policy: temperature information is refreshed daily. Simple constructs in the language specify when service calls are invoked, so the language can express this policy. In this situation:
- Service calls should generally be kept inside AXML documents for future reuse.
- Materialization no longer replaces service calls by their results, but appends the results of each call next to it.
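
A sketch of how a daily refresh policy could behave when calls are kept in the document. The attribute names (`id`, `validFor`, `lastRun`, `origin`) and the `sc` element name are hypothetical and are not the AXML syntax; a Python function with an invented return value stands in for the temperature service, and time is passed in as plain seconds so the example is deterministic.

```python
import xml.etree.ElementTree as ET

DAY = 24 * 60 * 60  # validity period in seconds

def get_temp(call):
    # Stand-in for the real Web service; the value is invented.
    result = ET.Element("temperature")
    result.text = "16"
    return [result]

def refresh(root, services, now):
    """Invoke stale calls; keep each call and place its results right after it."""
    invoked = 0
    for parent in list(root.iter()):
        for call in [c for c in parent if c.tag == "sc"]:
            last = call.get("lastRun")
            valid_for = float(call.get("validFor", "0"))
            if last is not None and now - float(last) < valid_for:
                continue  # previous results are still valid
            call_id = call.get("id")
            # Drop the results of the previous invocation of this call.
            for old in [c for c in parent if c.get("origin") == call_id]:
                parent.remove(old)
            position = list(parent).index(call) + 1
            for offset, elem in enumerate(services[call.get("service")](call)):
                elem.set("origin", call_id)
                parent.insert(position + offset, elem)
            call.set("lastRun", str(now))
            invoked += 1
    return invoked

root = ET.fromstring(
    '<weather><sc id="c1" service="getTemp" validFor="86400"/></weather>')
services = {"getTemp": get_temp}

first = refresh(root, services, now=0)        # never run: invoked
second = refresh(root, services, now=3600)    # one hour later: still valid
third = refresh(root, services, now=DAY)      # one day later: refreshed
```

Keeping the call element in place is what makes the later refresh possible: the document still knows which service produced the `temperature` element and when.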

Important features of AXML service calls:
- Where to get the arguments of a call? The arguments of a service call are specified as children of the call element. In the simplest case, an argument is plain XML. More generally, arguments can be AXML data, and therefore may themselves contain service calls.
- When to activate a call? This is specified by a special attribute of the call element.
- How long does data returned by a service call remain valid? This is also specified by a special attribute of the call element.

In an AXML peer, AXML services can be defined as parameterized queries or updates over the peer’s AXML documents. Sample Service

- The AXML peer is implemented in Java.
- The AXML peer relies on the Apache Xerces XML parser to parse and manipulate documents. It also uses the Apache Xalan processor for XPath queries and XSLT transformations.
- The Tomcat servlet engine is used because the AXML peer needs to act as a Web server.
- AXML documents can be turned into a Web application through Java Server Pages.
- Axis is a Java toolkit that enables Web services functionality on both the server side and the client side.
- The AXML peer relies on the X-OQL engine to execute complex queries on XML documents.

Some of the applications that we developed using AXML peers:
- Peer-to-peer auctions: the main goal of this application is to illustrate the flexible discovery mechanism of new peers and auctions.
- Electronic patient record management: the goal of this application is to show that AXML can seamlessly manage distributed data and the privacy of this data. This is done by combining the AXML language with the GUPster framework (access control).
Academic and industrial collaborations:
- Distribution of Mandriva Linux: aims at better management of the production and distribution of open source software.