University of Maryland Scaling Heterogeneous Information Access for Wide area Environments Michael Franklin and Louiqa Raschid.


1 University of Maryland Scaling Heterogeneous Information Access for Wide area Environments Michael Franklin and Louiqa Raschid

2 Wide-Area Data Access
Problems
- Scalability of Wrapper-Mediator Systems
- Publishing and Discovery of Sources
- Dissemination of Relevant Information
Relevant Technologies
- Flexible Architectures
- Adaptive Systems
- Metadata Management

3 The Big Picture

4 The Little Picture
(Architecture diagram. Components shown: Predator O-R DBMS, remote wrapper interface, Planner, Scrambler, MDT, wrapper interface, Web sources.)

5 Querying Web Sources
- Generating wrappers for Web-accessible sources to provide an API for queries and structured answers.
- Obtaining and representing source capability and content descriptions to use in query planning.
- Estimating the response time for cost-based optimization.

6 Web application wrapper toolkit
- Define the capabilities of Web sources
- A wrapper interface to publish source capability
- A wrapper toolkit
  - Translation from query + bindings -> URL
  - Declarative language to specify extractors
    - Simple extractors: HTML or XMLData -> structured object
    - Complex extractors: customizable crawler utility for extraction of meta-information
- Generator for JDBC-compliant wrappers
- Metadata and query and answer interface
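
The two translation steps named on this slide (query + bindings to URL, and HTML to structured object) might look roughly like the sketch below. It is only an illustration: the endpoint, the parameter names, and the page layout are hypothetical, and a regular expression stands in for the toolkit's declarative extractor language, which the slides do not show.

```python
import re
from urllib.parse import urlencode

# Hypothetical endpoint; the slides do not give the real source URL.
BASE_URL = "http://weather.example.org/forecast"

def query_to_url(bindings):
    """Translate query bindings (attribute -> value) into a request URL."""
    return BASE_URL + "?" + urlencode(sorted(bindings.items()))

# Assumed page layout: one table row per city, e.g.
# <tr><td>Paris</td><td>21</td></tr>
ROW = re.compile(r"<tr><td>(?P<city>[^<]+)</td><td>(?P<temp>-?\d+)</td></tr>")

def extract(html):
    """Simple extractor: HTML text -> list of structured objects (dicts)."""
    return [{"city": m.group("city"), "temp": int(m.group("temp"))}
            for m in ROW.finditer(html)]

url = query_to_url({"city": "College Park", "date": "1998-08-14"})
rows = extract("<table><tr><td>Paris</td><td>21</td></tr>"
               "<tr><td>Oslo</td><td>-3</td></tr></table>")
```

A generated wrapper would presumably hide both steps behind the JDBC-style interface, so the mediator sees only queries and structured answers.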

7 Weather source

8 Results from the Weather source

10 Query Planning for Web sources
Objective: Generate safe optimal plans with possibly replicated sources
- Multiple heterogeneous sources
  - Limited capability (bindings)
  - Possible replication of contents
  - Complete / incomplete sources
- Use meta-information to construct lattices
- Generate safe plans with alternatives
- Mediator algebra and rules for optimization
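
One common reading of "safe" for sources with limited capability is that every source is called only after the attributes it requires have been bound, either by query constants or by the outputs of earlier sources. The sketch below orders sources under that assumption. The source names SA and SB and their capabilities are hypothetical, and this greedy ordering is not the planner's lattice-based algorithm, which the slides do not spell out.

```python
def safe_order(sources, bound):
    """Order sources so each one's required inputs are bound before it is called.

    sources: dict name -> (required_inputs, outputs), both sets of attributes.
    bound:   attributes bound by constants in the query.
    Returns a list of source names, or None if no safe order exists.
    """
    bound = set(bound)
    remaining = dict(sources)
    order = []
    while remaining:
        # Sources whose required bindings are all available right now.
        ready = sorted(n for n, (need, _) in remaining.items() if need <= bound)
        if not ready:
            return None  # some source can never obtain its bindings
        name = ready[0]
        order.append(name)
        bound |= remaining.pop(name)[1]  # its outputs become available bindings
    return order

# Hypothetical capabilities, for illustration only.
sources = {
    "SA": ({"Time", "Date"}, {"Channel", "Title"}),
    "SB": ({"Title"}, {"Category"}),
}
plan = safe_order(sources, {"Time", "Date"})
no_plan = safe_order(sources, {"Date"})
```

Because the set of bound attributes only grows, calling any ready source first never rules out a safe order that would otherwise exist.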

12 Content and Capability Descriptions
- Domain information
- Capability descriptions:
  - I/O relationships: Time, Date -> Channel, Title, Category
  - Content: Date:CurrentYear Time:{0, …,23} Channel:CNW
  - Completeness information. Complete: Source S3 provides a complete answer when Time and Date are bound and Channel=ppv and Category=Movies.
    - Explicitly provided by the source DBA.
    - Augmented by inference.
    - Augmented by learning based on query feedback.
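
The completeness statement for S3 can be encoded as a small predicate over a query's bindings. This is a minimal sketch assuming a query is represented as a dict from bound attributes to constants; the class and field names are hypothetical and are not taken from the system itself.

```python
from dataclasses import dataclass, field

@dataclass
class Completeness:
    """Condition under which a source returns a complete answer."""
    must_be_bound: frozenset                         # attributes that must be bound
    must_equal: dict = field(default_factory=dict)   # attribute -> required constant

    def holds_for(self, query):
        """query: dict mapping each bound attribute to its constant."""
        if not self.must_be_bound.issubset(query):
            return False
        return all(query.get(a) == v for a, v in self.must_equal.items())

# The S3 statement from the slide: complete when Time and Date are bound
# and Channel=ppv and Category=Movies.
s3 = Completeness(frozenset({"Time", "Date"}),
                  {"Channel": "ppv", "Category": "Movies"})

ppv_query = {"Time": "9:30", "Date": "1998-08-14",
             "Channel": "ppv", "Category": "Movies"}
other_query = {"Time": "9:30", "Date": "1998-08-14", "Channel": "CNW"}
```

A planner could use such a check to decide whether one source suffices for a query or whether further sources must be consulted.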

13 Sources in Lattices

14 Display pay-per-view movies shown on August 14th, 1998 at 9:30am.
Using Buckets (S1|S3) in AlternatePartition and (S5  S1) and (S5  S3) in SimilarPartition

15 Web Source Response Time Estimation Tool - MDT
Problem: Difficulty in determining evaluation costs
- Physical implementation details unknown
- Load on network and source unknown
Objective: Tool to estimate response time based on query feedback, and to estimate confidence. To be used in a combined cost model and to choose between alternate sources.
- MDT is a tool that estimates response time based on Day, Time, Quantity, etc.

16 Configuring and learning in the MDT
MDT is configured for some hierarchy of dimensions
- Calibration of each dimension
  - min / max / scale
  - Allowed deviation
  - Confidence window
- Learning algorithm
  - Cell splitting algorithm
  - Value correction algorithm
  - Estimate response time and confidence
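
To make the configuration terms concrete, here is a rough sketch of a feedback-driven estimator over a grid of cells. Everything in it is an assumption made for illustration: value correction is modeled as a running mean, confidence as the fraction of recent observations (within the confidence window) that fell inside the allowed relative deviation, and cell splitting is left out. The MDT's actual algorithms are not described on the slides.

```python
from collections import deque

class Dimension:
    """Calibration of one dimension: min, max, and scale (cell width)."""
    def __init__(self, lo, hi, scale):
        self.lo, self.hi, self.scale = lo, hi, scale

    def index(self, value):
        clamped = min(max(value, self.lo), self.hi)
        return int((clamped - self.lo) // self.scale)

class Cell:
    def __init__(self, window):
        self.mean, self.count = 0.0, 0
        self.hits = deque(maxlen=window)  # confidence window of True/False

class Estimator:
    def __init__(self, dims, allowed_deviation, window):
        self.dims, self.dev, self.window = dims, allowed_deviation, window
        self.cells = {}

    def _key(self, point):
        return tuple(d.index(v) for d, v in zip(self.dims, point))

    def estimate(self, point):
        """Return (estimated response time, confidence in [0, 1])."""
        cell = self.cells.get(self._key(point))
        if cell is None:
            return None, 0.0
        conf = sum(cell.hits) / len(cell.hits) if cell.hits else 0.0
        return cell.mean, conf

    def feedback(self, point, observed):
        """Record an observed response time (assumed value correction)."""
        cell = self.cells.setdefault(self._key(point), Cell(self.window))
        if cell.count:
            # Was the previous estimate within the allowed relative deviation?
            cell.hits.append(abs(observed - cell.mean) <= self.dev * cell.mean)
        cell.count += 1
        cell.mean += (observed - cell.mean) / cell.count  # running mean

# Dimensions: day of week (0-6) and hour of day (0-23), one cell per unit.
est = Estimator([Dimension(0, 6, 1), Dimension(0, 23, 1)],
                allowed_deviation=0.2, window=5)
for seconds in (2.0, 2.2, 4.0):
    est.feedback((1, 9.5), seconds)
mean, conf = est.estimate((1, 9.5))
```

An optimizer choosing between alternate sources could compare such estimates and discount those with low confidence.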

17 Correcting the confidence of estimated value

23 Conclusions
- Extend the Predator O-R DBMS with scalable mediator functionality
- Current implementation status
  - Scrambling-enabled optimizer
  - Mediator algebra and logical optimizer
  - Cost-based optimizer based on MDT estimation
- Toolkit for generating wrappers for Web sources

24 Still to come …
- Publishing source metadata
- Discovering sources
- Source selection using metadata
- User profiles
- Dissemination of relevant data

