1
Serge Abiteboul – Singapore 2002
Web services and data integration
S. Abiteboul (INRIA and Xyleme), Omar Benjelloun (INRIA), Tova Milo (INRIA and Tel Aviv)
Serge.Abiteboul@inria.fr
Singapore, December 2002
2
Organization
The context
Accessing information on the Web
Web services
  – SOAP
  – WSDL
  – UDDI
Active XML
  – AXML documents
  – AXML services
Architecture and implementation
Applications
Conclusion
3
The context
The Web and XML are dramatically changing the management of distributed information
4
Distributed data management
Warehousing
Mediation
Management of data in cooperative work
Management of data in distributed scientific applications
Mobile data management
Document management
Web sites
Portals, etc.
Information used to live in islands and this is changing
5
The Web of yesterday
Protocol: HTTP
Documents: HTML
Millions of independent Web sites and billions of documents
Browsing and full-text indexing
Publication of databases using forms
Data management with the Web
  – HTML is primarily to be read by humans
  – Data management applications over Web data
      Based on hand-made wrappers
      Expensive, incomplete, short-lived, not adapted to the Web's constant change
No real support for distributed data management!
6
Information used to live in islands but it is changing
Different formats: relational, metadata, documents, text, DXF
  – A Web standard for data exchange, XML, is fixing it
  – XML captures all kinds of information over a wide spectrum
  – XML comes with a family of emerging standards: XML Schema, XSL/T, XQuery, domain-specific schemas…
Different computers, platforms, languages, applications
  – A standard for Web services, SOAP, is fixing it
  – SOAP allows ubiquitous computing on the Internet
  – SOAP comes with a family of emerging standards: WSDL, UDDI
This provides uniform access to information…
…the dream for distributed data management
7
The information spectrum
[Figure: a spectrum running from "Structured data" to "Minimal structure" (other labels: "Meta data", "Hierarchy +"). Items placed along it: books, contracts, catalogs, bank accounts, emails, financial reports, insurance policies, economic analysis, derivatives, inventory, political analysis, insurance claims, financial news, sports news, resumes]
Semi-structured data and XML
8
What can be captured with XML?
Very structured information such as databases and knowledge bases
  – Most DBMS now export in XML
Semi-structured data such as data exchange formats (ASN.1, SGML), e.g., technical documentation
Less structured data: documents
  – Metadata: author, date, status
  – Existing structure in them: chapter, section, table of contents and index
  – Possibly tagging of elements in it (citation, lists)
  – Links to other documents
Plain text
Metadata for unstructured data such as images and sound
9
A standard for information: XML
Labeled ordered trees where leaves are text
Marriage of document and database worlds
Marriage of full-text indexing and structure indexing
Is it the ultimate data model? No
  – Purely syntax: more semantics needed
Is it OK for now? Definitely yes (because it is a standard)
10
The main asset of XML: typing
Applications need typing and XML data can be typed if needed (DTD and XML Schema)
Trees
Logical granularity: neither page nor document level, but the piece of information that is needed
Semantics and structure are in tags and paths
  – product-table/product/reference
  – product-table/product/price
[Figure: tree with root product-table, child product, and leaves designation, description, price, reference]
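The idea that meaning lives in tags and paths can be sketched with Python's standard xml.etree.ElementTree. This is only an illustration: the element names come from the slide, while the sample values and the `select` helper are assumptions made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog: element names follow the slide, values are invented.
CATALOG = """
<product-table>
  <product>
    <reference>X23</reference>
    <designation>camera</designation>
    <price>120</price>
    <description>compact camera</description>
  </product>
  <product>
    <reference>R2D2</reference>
    <designation>robot</designation>
    <price>35</price>
    <description>toy robot</description>
  </product>
</product-table>
"""


def select(root, path):
    """Return the text of every node reached by a slide-style path."""
    # The first step names the root element itself: check it, then drop it.
    head, _, rest = path.partition("/")
    if root.tag != head:
        return []
    return [node.text for node in root.findall(rest)]


root = ET.fromstring(CATALOG)
references = select(root, "product-table/product/reference")
prices = select(root, "product-table/product/price")
print(references, prices)
```

The query names a piece of information (a reference, a price) rather than a page or a whole document, which is the granularity the slide argues for.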
11
A standard for distributed computing: Web services
Possibility to activate a method on some remote Web server
Exchange information in XML: input and result are in XML
Ubiquitous XML distributed computing infrastructure
2 main applications
  – E-commerce
  – Access to remote data
With XML and Web services, it is possible
  – To get information from virtually anywhere
  – To provide information to virtually anywhere
12
The basic picture
[Figure: a Web client sends a query as a SOAP message (XML) over the Internet to a SOAP service, a black box exposing a method m( ), and receives the answer as a SOAP message]
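The round trip in this picture can be sketched without any network, using only the Python standard library. The envelope namespace is the SOAP 1.1 one; the service namespace, the `Temp` method, the `black_box` stand-in and the returned value are all assumptions for illustration, not part of the talk.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope
SVC_NS = "urn:example:service"  # hypothetical service namespace


def build_request(method, params):
    """Wrap a call m(params) in a SOAP envelope and return it as bytes."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{SVC_NS}}}{method}")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope)


def read_body(message):
    """Return the first element inside the SOAP Body of a message."""
    envelope = ET.fromstring(message)
    return envelope.find(f"{{{SOAP_NS}}}Body")[0]


def black_box(message):
    """Stand-in for the remote service: answers with a made-up value."""
    call = read_body(message)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    answer = ET.SubElement(body, f"{{{SVC_NS}}}TempResponse")
    ET.SubElement(answer, "city").text = call.find("city").text
    ET.SubElement(answer, "value").text = "15"  # invented result
    return ET.tostring(envelope)


request = build_request("Temp", {"city": "Paris"})
answer = read_body(black_box(request))
print(answer.find("city").text, answer.find("value").text)
```

In a real deployment the bytes returned by `build_request` would be posted over HTTP to the service endpoint; here the call to `black_box` replaces that hop.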
13
Accessing and integrating information
14
Accessing remote information
[Figure: an application using gene banks queries some data services that provide candidate genes (gene banks) and uses some processing services (processing)]
Multiple formats + multiple protocols
15
Same with Web services
[Figure: the application using gene banks reaches the same data services (gene banks) and processing services (processing) through the Web]
16
The big picture: peer2peer
[Figure: PCs, PDAs, cell phones… send queries over the Web to Web services placed in front of data warehouses, databases (DB) and Web pages; Web services also send queries to one another]
17
The main roles
[Figure: three roles. The service provider publishes to the service registry; the client looks up the registry, then binds to the service provider]
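The three roles can be sketched in a few lines of Python. This is an in-memory toy, not UDDI: the `Registry` class, the keyword scheme and the `quote_service` provider are hypothetical names introduced for the example.

```python
class Registry:
    """Toy service registry: maps a keyword to the services published under it."""

    def __init__(self):
        self._entries = {}  # keyword -> list of (provider name, callable)

    def publish(self, keyword, name, service):
        self._entries.setdefault(keyword, []).append((name, service))

    def look_up(self, keyword):
        return list(self._entries.get(keyword, []))


def quote_service(item):
    """Hypothetical provider-side service."""
    return f"quote for {item}"


registry = Registry()
registry.publish("quotes", "provider-1", quote_service)  # provider publishes
matches = registry.look_up("quotes")                     # client looks up
name, service = matches[0]                               # client binds
result = service("gismo")                                # client invokes
print(name, result)
```

After the lookup the client talks to the provider directly; the registry is no longer involved.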
18
Simple view: Looking for information about Gismos
1. Query some yellow pages: Who knows about Gismos?
2. Negotiate with Gismo specialists
   Nature of the service
   Quality, cost
3. Get the information
   Order, payment, delivery
   Integration in my information system
4. Eventually publish information
5. … and all this automatically…
19
Data integration – Logical view
[Figure: a mediator or warehouse sits above wrapper1, wrapper2, wrapper3 over source1, source2, source3. It gets service descriptions from service directories and finds ontologies to build wrappers]
20
The Web service solution
[Figure: standards layered over the Web. Data and service repository: UDDI. Data and service semantics: RDF. Data and service description: WSDL. Workflow: WSFL. Exchange: XML+SOAP]
21
Mediation with Web services
[Figure: a mediator above wrapper1, wrapper2, wrapper3 over source1, source2, source3, connected through the Web, alongside service directories and service descriptions]
Web services:
  Service directories
  Service descriptions
  Wrappers
  Sources
  Mediators/warehouses
22
Advantages for data integration
A universal model for data integration = XML
  – Solves the heterogeneity issue
A universal protocol for distribution = SOAP
  – Simple Object Access Protocol (something like CORBA)
A language for describing the interface of data sources = WSDL
  – Web Services Description Language (something like IDL)
  – Solves the interoperability issue
A standard for publication and discovery of information = UDDI
  – Universal Description, Discovery and Integration
A standard for describing the semantics of sources = RDF
  – Resource Description Framework
23
Advantages – continued – the goal
The system can
  – Find a new source of information using UDDI
  – Understand its syntax using WSDL
  – Understand its semantics using RDF
  – Get it using SOAP
The information is in XML, so it can be restructured and integrated automatically
Not yet… But soon?
24
Jargon
XML, XHTML, RDF, .NET, RosettaNet, WSFL, DTD, Xschema, XSL, XSLT, XSL-FO, ebXML, namespace, HTTPS, OASIS, HTTP, SOAP, OAGIS, WSDL, ICE, RSS, UDDI, MIME
Help!
25
Active XML
Joint work with: Bernd Amann, Jérôme Baumgarten, Angela Bonifati, Ioana Manolescu, Frederic Ngoc and others
26
AXML = XML + embedded SOAP calls
[Figure: an AXML peer is both client and server. As a client it sends SOAP messages over the Internet to a Web server exposing m( ) and gets the answer. As a server it answers queries q1($1,$2), Q2, Q3… (XPath, XQuery) sent over the Internet by Web clients. AXML is exchanged between AXML peers]
27
Active XML
Peer-to-peer architecture
Each Active XML peer
  – Repository: manages Active XML data with embedded Web service calls
  – Web client: activates calls in the documents
  – Web server: provides Web services defined as (parameterized) queries over the repository
[Figure: AXML peers connected by SOAP]
28
Build on existing standards
Tree data: XML
  – Internal data representation and
  – Data exchange
Web services: SOAP, WSDL
Query languages: XQuery/XPath
[Figure labels: AXML, XML]
29
AXML peer: repository of AXML documents
[Figure: an AXML document with embedded service calls]
  toy.xyz.com/GetToyPersonel()
  dvd2000.com/GetDVDPersonnel()
Service calls
May contain calls
  – to any SOAP Web service: e-bay.net, google.com, etc.
  – to any AXML Web service
30
AXML peer: Web client
[Figure: an AXML document, its service calls, and the result of activating them]
  01…
  toy.xyz.com/GetPDA(../../@pname)
  toy.xyz.com/GetToyPersonel()
  dvd2000.com/GetDVDPersonnel()
Result
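What the Web client does with embedded calls can be sketched as follows. The slide's document markup did not survive in this transcript, so the `<sc>` element name, the sample member data and the local stand-in functions are assumptions for illustration only; just the two call strings come from the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical AXML-like document: <sc> marks an embedded service call.
DOC = """
<directory>
  <dep name="toys">
    <sc>toy.xyz.com/GetToyPersonel()</sc>
  </dep>
  <dep name="dvd">
    <sc>dvd2000.com/GetDVDPersonnel()</sc>
  </dep>
</directory>
"""

# Local stand-ins for the remote services; the returned data is invented.
SERVICES = {
    "toy.xyz.com/GetToyPersonel()":
        lambda: ["<member><name>Ann</name></member>"],
    "dvd2000.com/GetDVDPersonnel()":
        lambda: ["<member><name>Bob</name></member>",
                 "<member><name>Eve</name></member>"],
}


def activate_calls(root):
    """Invoke every <sc> call and insert its results right after the call."""
    for parent in list(root.iter()):      # snapshot: the tree is modified below
        for child in list(parent):
            if child.tag != "sc":
                continue
            position = list(parent).index(child) + 1
            fragments = SERVICES[child.text.strip()]()
            for offset, fragment in enumerate(fragments):
                parent.insert(position + offset, ET.fromstring(fragment))


root = ET.fromstring(DOC)
activate_calls(root)
names = [n.text for n in root.iter("name")]
print(names)
```

The calls stay in the document next to their results, so they can be activated again later; whether results replace or accumulate is a design choice this sketch does not settle.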
31
Controlling the evaluation
Activation of calls and data lifespan are controlled
  – frequency: when is the service called? ("call each day")
  – validity: how long is the retrieved data valid?
  – mode: immediate or lazy?
32
Example: control attributes
  toy.xyz.com/GetToyPersonel()
  dvd2000.com/GetDVDPersonnel()
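The attribute syntax of this example is missing from the transcript, so the sketch below only illustrates one plausible reading of the three controls (frequency, validity, mode). The `ControlledCall` class, its field names and the use of seconds are assumptions, not the AXML syntax.

```python
from dataclasses import dataclass
from typing import Optional

DAY = 86400  # seconds


@dataclass
class ControlledCall:
    service: str
    frequency: float                  # seconds between two activations
    validity: float                   # seconds a retrieved result stays usable
    mode: str = "immediate"           # "immediate" or "lazy"
    last_called: Optional[float] = None

    def result_is_valid(self, now):
        """True if a result was retrieved and has not expired yet."""
        return self.last_called is not None and now - self.last_called < self.validity

    def should_activate(self, now, result_requested=False):
        if self.mode == "lazy":
            # Lazy: call only when the data is needed and no valid copy exists.
            return result_requested and not self.result_is_valid(now)
        if self.last_called is None:
            return True
        return now - self.last_called >= self.frequency

    def mark_called(self, now):
        self.last_called = now


call = ControlledCall("toy.xyz.com/GetToyPersonel()", frequency=DAY, validity=DAY)
first = call.should_activate(now=0)
call.mark_called(now=0)
again_soon = call.should_activate(now=3600)
next_day = call.should_activate(now=DAY)

lazy = ControlledCall("dvd2000.com/GetDVDPersonnel()", DAY, DAY, mode="lazy")
lazy_idle = lazy.should_activate(now=0)
lazy_needed = lazy.should_activate(now=0, result_requested=True)
print(first, again_soon, next_day, lazy_idle, lazy_needed)
```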
33
AXML peer: Web server
AXML Web services: defined using XQuery over AXML documents
  let service Get-Toy-Personnel( ) be
    for $a in document("toy.xyz.com/members.axml")/member,
        $b in $a//name,
        $c in $a//phone,
        $d in $a//pda
    return { $c } { $d }
34
The crux: the exchange of AXML data
Arguments & results of calls are AXML
Data is thus intensional & dynamic
Distributed computing: by sending data containing service calls, one can delegate some work to other peers
Partial computations: by returning data containing service calls, one can give the receiver control of these calls
All this can be controlled
35
Example: Tourist guide
The guide contains: … yahoo.com/Temp("Paris") …
I need to evaluate the temperature of Paris
1. I call Yahoo: meteoF.com/t("Paris")
2. I call meteoF: 0
I am asked what is the temperature of Paris; possible answers:
  … 0 …
  … meteoF.com/t("Paris") …
  … yahoo.com/Temp("Paris") …
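The chain of calls in this example can be sketched as a small evaluation loop. The tuple encoding of a call, the local stand-ins for the two services and the `max_hops` guard are assumptions for illustration; the service names and the value 0 are the ones on the slide.

```python
def is_call(data):
    """A call is encoded here as ("call", service, argument)."""
    return isinstance(data, tuple) and len(data) == 3 and data[0] == "call"


# Local stand-ins for the two services named on the slide.
SERVICES = {
    # Yahoo answers intensionally: it returns another call, not a value.
    "yahoo.com/Temp": lambda city: ("call", "meteoF.com/t", city),
    # meteoF answers extensionally with the value shown on the slide.
    "meteoF.com/t": lambda city: 0,
}


def evaluate(data, trace, max_hops=10):
    """Follow service calls until a plain value is reached."""
    hops = 0
    while is_call(data):
        if hops == max_hops:
            raise RuntimeError("too many chained calls")
        _, service, argument = data
        trace.append(service)
        data = SERVICES[service](argument)
        hops += 1
    return data


trace = []
temperature = evaluate(("call", "yahoo.com/Temp", "Paris"), trace)
print(temperature, trace)
```

A peer asked for the temperature may stop at any point in this loop and return what it has: the original call, the meteoF call, or the value 0. The hop limit is one crude answer to the termination question a chain of calls raises.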
36
Continuous services
Inside the tourist guide: new events
Pull mode: standard SOAP query
  – Ask once a week
Push mode: subscription to a continuous service
  – When new events are announced, they are pushed to the AXML document
Possibility to define AXML continuous services
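The difference between the two modes can be sketched with an in-memory event source. The `EventService` class and the event names are hypothetical; a real continuous service would deliver SOAP messages rather than call a Python function.

```python
class EventService:
    """Toy event source supporting both pull (query) and push (subscribe)."""

    def __init__(self):
        self.events = []
        self.subscribers = []

    def query(self):
        """Pull mode: return a snapshot of the events known right now."""
        return list(self.events)

    def subscribe(self, callback):
        """Push mode: register a callback invoked for each new event."""
        self.subscribers.append(callback)

    def announce(self, event):
        self.events.append(event)
        for callback in self.subscribers:
            callback(event)


service = EventService()

pushed_doc = []                       # document fed by a subscription
service.subscribe(pushed_doc.append)

service.announce("jazz concert")
pulled_doc = service.query()          # the weekly pull happens here
service.announce("wine fair")         # announced after the pull

print(pulled_doc, pushed_doc)
```

The pulled copy misses the second event until the next weekly query, while the subscribed document receives it as soon as it is announced.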
37
Architecture and implementation
38
Global architecture
[Figure: AXML peer S1 with its components: AXML document store, XQuery processor, evaluator, service descriptions, and a SOAP wrapper (SOAP service, SOAP client). Arrow labels: query, read, update, consults, service call, service result. S1 exchanges AXML or XML over SOAP with AXML peers S2 and S3]
39
Implementation
Sun's Java SDK 1.4 (includes XML parser, XPath processor, XSLT engine)
Apache Tomcat 4.0 servlet engine
Apache Axis SOAP toolkit 1.0 beta 3
X-OQL query processor, persistent DOM repository
JSP-based user interface, using JSTL 1.0 standard tag library
First prototype
  – No lazy evaluation
  – No continuous services
Ongoing work on typing, security, replication…
Demo for VLDB'02
  – P2P auctioning system
40
Illustration: 3 applications
41
Application 1: Warehousing
Construction of warehouses with Web data
Monitoring of changes on the Web
Kind of services that are used
  – Google search engine
  – wget
  – Classification
  – XML Diff and site changes
  – Page monitoring system
  – etc.
42
Application 2: Mobile data
AXML peers as mobile entities
Active data store with query capabilities
  – Metadata and object profiles
Issues
  – Storage services for mobile objects
  – Processing services for mobile objects
  – Use proxies for that
European Project DBGlobe
43
Application 2: Mobile data
Light-weight AXML peers
  – PDA, cellular phone, laptop…
  – Limited storage, network bandwidth
  – Sometimes disconnected
Limited functionalities
  – E.g., support for continuous services based on a mail server and SMTP
44
Application 2: context awareness
Where am I? (geographical position)
Where is the "nearest" AXML proxy? (network position)
Active use of this information
  – For providing context-dependent data (e.g., time, temperature, nearest restaurants, etc.)
  – For selecting services (e.g., choose a nearby proxy for caching)
45
Application 3: P2P Auction
Each peer proposes some auctions
  – The document records the peer's items and the bids
Each peer knows about some auctions of other peers
Each peer can bid on any auction
  – The peer remembers the bids it has placed
When an auction closes, the winner is notified
No centralization
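The rules on this slide can be sketched with plain Python objects standing in for peers. The `AuctionPeer` class, its method names and the highest-bid-wins rule are assumptions for illustration; in the demo the peers would exchange AXML over SOAP rather than call each other directly.

```python
class AuctionPeer:
    """Toy peer: holds its own auctions and remembers the bids it placed."""

    def __init__(self, name):
        self.name = name
        self.auctions = {}   # item -> {"bids": [(amount, bidder)], "open": bool}
        self.my_bids = []    # (seller name, item, amount)
        self.inbox = []      # notifications received

    def open_auction(self, item):
        self.auctions[item] = {"bids": [], "open": True}

    def receive_bid(self, item, amount, bidder):
        auction = self.auctions[item]
        if not auction["open"]:
            return False
        auction["bids"].append((amount, bidder))
        return True

    def bid(self, seller, item, amount):
        # The bid goes straight to the selling peer: no central server.
        if seller.receive_bid(item, amount, self):
            self.my_bids.append((seller.name, item, amount))

    def close_auction(self, item):
        auction = self.auctions[item]
        auction["open"] = False
        if not auction["bids"]:
            return None
        amount, winner = max(auction["bids"], key=lambda b: b[0])
        winner.inbox.append(f"You won {item} from {self.name} for {amount}")
        return winner


alice, bob, carol = AuctionPeer("alice"), AuctionPeer("bob"), AuctionPeer("carol")
alice.open_auction("lamp")
bob.bid(alice, "lamp", 10)
carol.bid(alice, "lamp", 12)
winner = alice.close_auction("lamp")
bob.bid(alice, "lamp", 50)  # too late: the auction is closed
print(winner.name, carol.inbox)
```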
46
Conclusion and on-going work
47
AXML services
A simple, declarative way to create Web services compatible with current standards for Web services invocation
AXML services are powerful tools for data integration
They allow for new, powerful features
  – Intensional parameters and results: AXML documents (containing service calls) that are exchanged
  – Continuous services send back a stream of answers (SOAP messages) to the caller
48
Many issues
Security
Typing of parameters
Lazy evaluation and optimization
Replication
Mobility: DBGlobe project
Termination
Implementation
Foundations
And more
49
Security
Peers exchange AXML documents containing service calls
A server (resp. client) might ask the client (resp. server) to do something "bad":
  qod.com/QuoteOfDay
  My heart was bumping Tskitishvili, picked 5th in the NBA draft by the Denver Nuggets
  buy.com/BuyCar("BMW Z3")
50
Using type to control the use of services
[Figure: Peer1 holds data containing the calls f and g and sends it to Peer2. Peer2 accepts f, so Peer1 evaluates g before sending the data]
Peer1 tells which kind of data it exports and Peer2 which kind it accepts
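One way to read this exchange: the receiver declares which calls it is willing to find in incoming data, and the sender evaluates every other call before sending. The sketch below follows that reading; the `<sc>` markup, the allow-list representation of a "type" and the stand-in result for g are all assumptions, not the typing scheme of the talk.

```python
import xml.etree.ElementTree as ET


def calls_in(doc):
    """List the service calls still embedded in the document."""
    return [sc.text.strip() for sc in doc.iter("sc")]


def accepts(doc, allowed):
    """Receiver side: accept only documents whose calls are all allowed."""
    return all(call in allowed for call in calls_in(doc))


def prepare_for(doc, allowed, services):
    """Sender side: replace every call the receiver refuses by its result."""
    for parent in list(doc.iter()):       # snapshot: the tree is modified below
        for child in list(parent):
            if child.tag != "sc" or child.text.strip() in allowed:
                continue
            index = list(parent).index(child)
            result = ET.fromstring(services[child.text.strip()]())
            parent.remove(child)
            parent.insert(index, result)
    return doc


# Peer1's data contains two calls; Peer2 only accepts data that mentions f.
doc = ET.fromstring("<data><sc>f</sc><sc>g</sc></data>")
peer2_allowed = {"f"}
peer1_services = {"g": lambda: "<value>42</value>"}  # invented result for g

accepted_before = accepts(doc, peer2_allowed)
prepare_for(doc, peer2_allowed, peer1_services)
accepted_after = accepts(doc, peer2_allowed)
print(accepted_before, accepted_after, calls_in(doc))
```

The same check gives the receiver a defense against the buy.com/BuyCar kind of call: a document carrying a call outside the allow-list is simply refused.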
51
Distribution and replication
Motivated by mobile devices with limited resources
Allows distributing one XML document over several peers
Allows replicating an XML subtree on several peers
Query optimization
52
Thanks
More questions: Serge.Abiteboul@inria.fr