Querying Heterogeneous Information Sources Using Source Descriptions Authors: Alon Y. Levy Anand Rajaraman Joann J. Ordille Presenter: Yihong Ding.

1 Querying Heterogeneous Information Sources Using Source Descriptions
Authors: Alon Y. Levy, Anand Rajaraman, Joann J. Ordille
Presenter: Yihong Ding

2 Challenges for Information Integration
– Interrelated data spread over multiple information sources
– A large number of sources
– A limited amount of data in many of the sources
– Widely varying details of interacting with each source

3 IM Architecture
(Figure: architecture diagram with components numbered 1, 2, 3; the bucket algorithm is labeled in the diagram)

4 IM World View
– Product( Model )
– Automobile( Model, Year, Category )
– Motorcycle( Model, Year )
– Car( Model, Year, Category )
– NewCar( Model, Year, Category )
– UsedCar( Model, Year, Category )
– CarForSale( Model, Year, Category, Price, SellerContact )
(Figure: class hierarchy diagram over Product, Automobile, Car, Motorcycle, NewCar, UsedCar, CarForSale; panel labels "Virtual Relations:" and "Classes:")

5 Source Descriptions
For each source:
– Content Record
– Capability Record
(Figure: Web Sources for Automobile Application)

6 Content Records of Auto Sources

7 Capability Records of Auto Sources
(Figure labels: desired input set, possible output set, capable selection set)
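The three sets named on this slide can be pictured as a small record type. The following is a minimal sketch only: the class name, the field names, the `min_inputs` parameter, and the rule "a source is callable once at least `min_inputs` of its input set are bound" are assumptions made for illustration, not something the slides state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CapabilityRecord:
    """Hypothetical container for one source's capability record."""
    inputs: frozenset       # desired input set: parameters the source accepts
    outputs: frozenset      # possible output set: attributes it can return
    selections: frozenset   # capable selection set: attributes it can filter on
    min_inputs: int = 0     # assumption: how many inputs must be bound

    def callable_with(self, bound):
        # Assumed rule: enough of the desired inputs are already bound.
        return len(self.inputs & frozenset(bound)) >= self.min_inputs

    def can_return(self, wanted):
        # Every wanted attribute must be in the possible output set.
        return frozenset(wanted) <= self.outputs


# Illustrative record for a review source that needs a model and a year.
review_source = CapabilityRecord(
    inputs=frozenset({"Model", "Year"}),
    outputs=frozenset({"Review"}),
    selections=frozenset(),
    min_inputs=2,
)
```

With this record, `review_source.callable_with({"Model", "Year"})` holds, while binding only `Model` is not enough.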

8 Query Reformulation
Containment instead of equivalence:
– Incomplete sources
– Useful subset
Utilizes the Plan Generator to:
– Prune irrelevant sources
– Split the query into subgoals
– Generate conjunctive query plans
– Find an executable ordering of subgoals

9 The Bucket Algorithm
Given: user query q, source descriptions {Vi}
1. Find relevant sources (fill buckets). For each relation g in query q:
– Find Vj that contains relation g
– Check that the constraints in Vj are compatible with q
2. Combine source relations {Vj}, one from each bucket, into a conjunctive query q' and check for containment (q' ⊆ q)
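The two steps above can be sketched in a few lines. This is an illustrative toy, not the authors' implementation: the source names, the `max_year` field used as the only constraint, and the function names are all hypothetical, and the containment check of step 2 is left out (each candidate plan would still need it).

```python
from itertools import product

# Hypothetical source descriptions: which relations each source covers, and
# an optional upper bound on Year satisfied by everything in the source.
SOURCES = {
    "V1": {"relations": {"CarForSale", "Category", "Year", "Model", "Price"},
           "max_year": None},
    "Vold": {"relations": {"CarForSale", "Year", "Model", "Price"},
             "max_year": 1985},
    "V5": {"relations": {"ProductReview"}, "max_year": None},
}


def fill_buckets(subgoals, sources, min_year):
    """Step 1: one bucket per query subgoal, holding the compatible sources."""
    buckets = {}
    for goal in subgoals:
        buckets[goal] = [
            name for name, desc in sources.items()
            if goal in desc["relations"]
            # Compatibility with the query constraint Year >= min_year.
            and (desc["max_year"] is None or desc["max_year"] >= min_year)
        ]
    return buckets


def candidate_plans(buckets):
    """Step 2 (first half): pick one source from each bucket."""
    goals = list(buckets)
    return [dict(zip(goals, combo))
            for combo in product(*(buckets[g] for g in goals))]


buckets = fill_buckets(["CarForSale", "Year", "ProductReview"], SOURCES, 1992)
plans = candidate_plans(buckets)
```

Here `Vold` is pruned because its contents cannot satisfy the year constraint, so a single candidate plan remains. If any bucket is empty, `candidate_plans` returns no plans at all.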

10 The Bucket Algorithm: Example
q(m,p,r) ← CarForSale(c), Category(c,sportscar), Year(c,y), y ≥ 1992, Model(c,m), Price(c,p), ProductReview(m,y,r)

11 1. Filling the Buckets
q(m,p,r) ← CarForSale(c), Category(c,sportscar), Year(c,y), y ≥ 1992, Model(c,m), Price(c,p), ProductReview(m,y,r)

Subgoal                         Bucket
CarForSale(c)                   V1(c1), V2(c2), V3(c3)
Category(c,t), t=sportscar      V1(c1,t1), V2(c2,t2), V3(c3,t3)
Year(c,y), y ≥ 1992             V1(c1,y1), V2(c2,y2), V3(c3,y3)
Model(c,m)                      V1(c1,m1), V2(c2,m2), V3(c3,m3)
Price(c,p)                      V1(c1,p1), V2(c2,p2), V3(c3,p3)
ProductReview(m,y,r)            V5(m5,y5,r5)

12 2. Checking Containment
User Query:
q(m,p,r) ← CarForSale(c), Category(c,sportscar), Year(c,y), y ≥ 1992, Model(c,m), Price(c,p), ProductReview(m,y,r)
Result Query:
q'(m,p,r) ← V1(c)({Category(c):sportscar}, {Price(c), Model(c), Year(c)}, {Year(c) ≥ 1992, Category(c)=sportscar}), V5(m,y,r)({m:Model(c), y:Year(c)}, {r}, {})
Containment check: q' ⊆ q ?
Expanded Query:
q'(m,p,r) ← CarForSale(c), UsedCar(c), Category(c,t), t=sportscar, Model(c,m), Year(c,y), Price(c,p), ProductReview(m,y,r), y ≥ 1992
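One way to picture the check q' ⊆ q is the classical containment-mapping test for conjunctive queries: q' is contained in q if the variables of q can be mapped onto terms of the expanded q' so that the head and every subgoal of q land on the head and some subgoal of q'. The sketch below assumes that simplified setting. It ignores the comparison y ≥ 1992 and the class hierarchy, folds t=sportscar into the Category atom, and uses a made-up encoding (atoms as tuples, variables prefixed with `?`); it is not claimed to be the test used in the system.

```python
def extend(mapping, src, dst):
    """Extend mapping so that term src maps to dst, or return None."""
    if not src.startswith("?"):              # constants must map to themselves
        return mapping if src == dst else None
    if src in mapping:                       # variable already mapped
        return mapping if mapping[src] == dst else None
    new = dict(mapping)
    new[src] = dst
    return new


def match_atom(mapping, atom, target):
    """Map atom onto target (same predicate and arity), or return None."""
    if atom[0] != target[0] or len(atom) != len(target):
        return None
    for src, dst in zip(atom[1:], target[1:]):
        mapping = extend(mapping, src, dst)
        if mapping is None:
            return None
    return mapping


def contained_in(sub, sup):
    """True if query sub is contained in query sup (no comparisons)."""
    (head_sub, body_sub), (head_sup, body_sup) = sub, sup
    start = match_atom({}, head_sup, head_sub)
    if start is None:
        return False

    def search(i, mapping):
        # Backtracking search over targets for the i-th subgoal of sup.
        if i == len(body_sup):
            return True
        for target in body_sub:
            m = match_atom(mapping, body_sup[i], target)
            if m is not None and search(i + 1, m):
                return True
        return False

    return search(0, start)


HEAD = ("q", "?m", "?p", "?r")
COMMON = [
    ("CarForSale", "?c"), ("Category", "?c", "sportscar"),
    ("Year", "?c", "?y"), ("Model", "?c", "?m"),
    ("Price", "?c", "?p"), ("ProductReview", "?m", "?y", "?r"),
]
user_query = (HEAD, COMMON)
expanded_plan = (HEAD, COMMON + [("UsedCar", "?c")])
```

In this encoding `contained_in(expanded_plan, user_query)` succeeds, since the extra `UsedCar(c)` subgoal only narrows the plan. The reverse direction fails because the user query has no `UsedCar` subgoal to map onto.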

13 Finding an Executable Ordering

Subgoal                         Chosen source
CarForSale(c)                   V1(c)
Category(c,t), t=sportscar      V1(c,t)
Year(c,y), y ≥ 1992             V1(c,y)
Model(c,m)                      V1(c,m)
Price(c,p)                      V1(c,p)
ProductReview(m,y,r)            V5(m,y,r)

BindAvail 1 = {CarForSale(c,sportscar), Model(c,m), Year(c,y), Price(c,p), SellerContact(c,s)}
BindAvail 1 = {CarForSale(c,sportscar), Model(c,m), Year(c,y), Price(c,p), SellerContact(c,s), ProductReview(m,y,r)}
BindAvail 1 = {CarForSale(c,sportscar), Model(c,m), Year(c,y), Price(c,p), SellerContact(c,s), ProductReview(m,y,r), y ≥ 1992}
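The growing BindAvail sets suggest a simple greedy procedure: repeatedly run any source whose required inputs are already bound, then add what it returns to the available bindings. The sketch below assumes exactly that simplified model, where each source lists a fixed set of required inputs. The function name and the tuple encoding are hypothetical, and the real capability records are richer than this.

```python
def executable_ordering(subgoals, initially_bound=frozenset()):
    """Return source names in an executable order, or None if stuck.

    subgoals: list of (name, required_inputs, outputs) tuples.
    """
    bound = set(initially_bound)
    remaining = list(subgoals)
    order = []
    while remaining:
        # Pick the first source whose required inputs are all bound.
        ready = next((s for s in remaining if set(s[1]) <= bound), None)
        if ready is None:
            return None              # no remaining source can be called
        order.append(ready[0])
        bound |= set(ready[1]) | set(ready[2])   # its outputs become available
        remaining.remove(ready)
    return order


# Illustrative plan: V5 needs a model and a year, which V1 supplies.
plan = [
    ("V5", {"m", "y"}, {"r"}),
    ("V1", set(), {"c", "m", "y", "p"}),
]
ordering = executable_ordering(plan)
```

Because the set of bound variables only grows, picking any ready source never blocks a later one, so in this model the greedy choice finds an ordering whenever one exists. For the plan above it runs `V1` before `V5`.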

14 Experimental Results
Query 1: Find titles and years of movies featuring Tom Hanks
Query 2: Find titles and reviews of movies featuring Tom Hanks
Query 3: Find telephone number(s) for Alaska Airlines

15 Conclusions
– Source descriptions as content record and capability record
– Bucket algorithm for query reformulation

