Presentation is loading. Please wait.

Presentation is loading. Please wait.

Chapter 21.2 Modes of Information Integration ID: 219 Name: Qun Yu Class: CS257 219 Spring 2009 Instructor: Dr. T.Y.Lin.

Similar presentations


Presentation on theme: "Chapter 21.2 Modes of Information Integration ID: 219 Name: Qun Yu Class: CS257 219 Spring 2009 Instructor: Dr. T.Y.Lin."— Presentation transcript:

1 Chapter 21.2 Modes of Information Integration ID: 219 Name: Qun Yu Class: CS257 219 Spring 2009 Instructor: Dr. T.Y.Lin

2 Content Index 21.2 Modes of Information Integration 21.2.1 Federated Database Systems 21.2.2 Data Warehouses 21.2.3 Mediators

3 Federations  The simplest architecture for integrating several DBs  One to one connections between all pairs of DBs  n DBs talk to each other, n(n-1) wrappers are needed  Good when communications between DBs are limited

4 Wrapper : a software translates incoming queries and outgoing answers. In a result, it allows information sources to conform to some shared schema. Wrapper

5 Federations Diagram DB2DB1 DB3DB4 2 Wrappers A federated collection of 4 DBs needs 12 components to translate queries from one to another.

6 Example Car dealers want to share their inventory. Each dealer queries the other’s DB to find the needed car. Dealer-1’s DB relation: NeededCars(model,color,autoTrans) Dealer-2’s DB relation: Auto(Serial, model, color) Options(serial,option) Dealer-1’s DB Dealer-2’s DB wrapper

7 Example… Dealer 1 queries Dealer 2 for needed cars For(each tuple(:m,:c,:a) in NeededCars){ if(:a=TRUE){/* automatic transmission wanted */ SELECT serial FROM Autos, Options WHERE Autos.serial = Options.serial AND Options.option = ‘autoTrans’ AND Autos.model = :m AND Autos.color =:c; } Else{/* automatic transmission not wanted */ SELECT serial FROM Auto WHERE Autos.model = :m AND Autos.color = :c AND NOT EXISTS( SELECT * FROM Options WHERE serial = Autos.serial AND option=‘autoTrans’); }

8 Data Warehouse  Sources are translated from their local schema to a global schema and copied to a central DB.  User transparent: user uses Data Warehouse just like an ordinary DB  User is not allowed to update Data Warehouse

9 Warehouse Diagram Warehouse Extractor Source 1Source 2 User query result Combiner

10 Example Construct a data warehouse from sources DB of 2 car dealers: Dealer-1’s schema: Cars(serialNo, model,color,autoTrans,cdPlayer,…) Dealer-2’s schema: Auto(serial,model,color) Options(serial,option) Warehouse’s schema: AutoWhse(serialNo,model,color,autoTrans,dealer) Extractor --- Query to extract data from Dealer-1’s data: INSERT INTO AutosWhse(serialNo, model, color, autoTans, dealer) SELECT serialNo,model,color,autoTrans,’dealer1’ From Cars;

11 Example Extractor --- Query to extract data from Dealer-2’s data: INSERT INTO AutosWhse(serialNo, model, color, autoTans, dealer) SELECT serialNo,model,color,’yes’,’dealer2’ FROM Autos,Options WHERE Autos.serial=Options.serial AND option=‘autoTrans’; INSERT INTO AutosWhse(serialNo, model, color, autoTans, dealer) SELECT serialNo,model,color,’no’,’dealer2’ FROM Autos WHERE NOT EXISTS ( SELECT * FROM serial =Autos.serial AND option = ‘autoTrans’);

12 Construct Data Warehouse 1)Periodically reconstructed from the current data in the sources, once a night or at even longer intervals. Advantages: simple algorithms. Disadvantages: 1) need to shut down the warehouse; 2) data can become out of date. There are mainly 3 ways to constructing the data in the warehouse:

13 Construct Data Warehouse 2)Updated periodically based on the changes(i.e. each night) of the sources. Advantages: involve smaller amounts of data. (important when warehouse is large and needs to be modified in a short period) Disadvantages: 1) the process to calculate changes to the warehouse is complex. 2) data can become out of date.

14 Construct Data Warehouse 3) Changed immediately, in response to each change or a small set of changes at one or more of the sources. Advantages: data won’t become out of date. Disadvantages: requires too much communication, therefore, it is generally too expensive. (practical for warehouses whose underlying sources changes slowly.)

15 Mediators  Virtual warehouse, which supports a virtual view or a collection of views, that integrates several sources.  Mediator doesn’t store any data.  Mediators’ tasks: 1)receive user’s query, 2)send queries to wrappers, 3)combine results from wrappers, 4)send the final result to user.

16 A Mediator diagram Mediator Wrapper Source 1Source 2 User query Query Result

17 Example Same data sources as the example of data warehouse, the mediator Integrates the same two dealers’ source into a view with schema: AutoMed(serialNo,model,color,autoTrans,dealer) When the user have a query: SELECT sericalNo, model FROM AkutoMed Where color=‘red’

18 Example In this simple case, the mediator forwards the same query to each Of the two wrappers. Wrapper1: Cars(serialNo, model, color, autoTrans, cdPlayer, …) SELECT serialNo,model FROM cars WHERE color = ‘red’; Wrapper2: Autos(serial,model,color); Options(serial,option) SELECT serial, model FROM Autos WHERE color=‘red’; The mediator needs to interprets serial into serialNo, and then returns the union of these sets of data to user.

19 Example There may be different options for the mediator to forward user query, for example, the user queries if there are a specific model&color car (i.e. “Gobi”, “blue”). The mediator decides 2 nd query is needed or not based on the result of 1 st query. That is, If dealer-1 has the specific car, the mediator doesn’t have to query dealer-2.

20 THANK YOU ! Reference: Database Systems– The complete Book 2 nd Edition, 21.2 notes from http://infolab.stanford.edu/~ullman/dscb.html


Download ppt "Chapter 21.2 Modes of Information Integration ID: 219 Name: Qun Yu Class: CS257 219 Spring 2009 Instructor: Dr. T.Y.Lin."

Similar presentations


Ads by Google