Presentation is loading. Please wait.

Presentation is loading. Please wait.

Distributed DBMSPage 5. 1 © 1998 M. Tamer Özsu & Patrick Valduriez Outline Introduction Background Distributed DBMS Architecture  Distributed Database.

Similar presentations


Presentation on theme: "Distributed DBMSPage 5. 1 © 1998 M. Tamer Özsu & Patrick Valduriez Outline Introduction Background Distributed DBMS Architecture  Distributed Database."— Presentation transcript:

1 Distributed DBMSPage 5. 1 © 1998 M. Tamer Özsu & Patrick Valduriez Outline Introduction Background Distributed DBMS Architecture  Distributed Database Design à Fragmentation à Data Location Distributed Query Processing (Briefly) Distributed Transaction Management (Extensive) n Building Distributed Database Systems (RAID) n Mobile Database Systems Privacy, Trust, and Authentication Peer to Peer Systems

2 Distributed DBMSPage 5. 2 © 1998 M. Tamer Özsu & Patrick Valduriez Useful References W. W. Chu, Optimal File Allocation in Multiple Computer System, IEEE Transaction on Computers, 885-889, October 1969.

3 Distributed DBMSPage 5. 3 © 1998 M. Tamer Özsu & Patrick Valduriez Allocation Alternatives Non-replicated à partitioned : each fragment resides at only one site Replicated à fully replicated : each fragment at each site à partially replicated : each fragment at some of the sites Rule of thumb: If replication is advantageous, otherwise replication may cause problems read - only queries update queries  1

4 Distributed DBMSPage 5. 4 © 1998 M. Tamer Özsu & Patrick Valduriez Comparison of Replication Alternatives Full-replicationPartial-replicationPartitioning QUERY PROCESSING Easy Same Difficulty DIRECTORY MANAGEMENT Easy or Non-existant CONCURRENCY CONTROL EasyDifficultModerate RELIABILITYVery highHighLow REALITY Possible application Realistic Possible application

5 Distributed DBMSPage 5. 5 © 1998 M. Tamer Özsu & Patrick Valduriez Four categories: à Database information à Application information à Communication network information à Computer system information Information Requirements

6 Distributed DBMSPage 5. 6 © 1998 M. Tamer Özsu & Patrick Valduriez Fragment Allocation Problem Statement Given F = { F 1, F 2, …, F n } fragments S ={ S 1, S 2, …, S m } network sites Q = { q 1, q 2,…, q q }applications Find the "optimal" distribution of F to S. Optimality à Minimal cost  Communication + storage + processing (read & update)  Cost in terms of time (usually) à Performance Response time and/or throughput à Constraints  Per site constraints (storage & processing)

7 Distributed DBMSPage 5. 7 © 1998 M. Tamer Özsu & Patrick Valduriez Information Requirements Database information à selectivity of fragments à size of a fragment Application information à access types and numbers à access localities Communication network information à unit cost of storing data at a site à unit cost of processing at a site Computer system information à bandwidth à latency à communication overhead

8 Distributed DBMSPage 5. 8 © 1998 M. Tamer Özsu & Patrick Valduriez File Allocation (FAP) vs Database Allocation (DAP): à Fragments are not individual files  relationships have to be maintained à Access to databases is more complicated  remote file access model not applicable  relationship between allocation and query processing à Cost of integrity enforcement should be considered à Cost of concurrency control should be considered Allocation

9 Distributed DBMSPage 5. 9 © 1998 M. Tamer Özsu & Patrick Valduriez Allocation – Information Requirements Database Information à selectivity of fragments à size of a fragment Application Information à number of read accesses of a query to a fragment à number of update accesses of query to a fragment à A matrix indicating which queries updates which fragments à A similar matrix for retrievals à originating site of each query Site Information à unit cost of storing data at a site à unit cost of processing at a site Network Information à communication cost/frame between two sites à frame size

10 Distributed DBMSPage 5. 10 © 1998 M. Tamer Özsu & Patrick Valduriez General Form min(Total Cost) subject to response time constraint storage constraint processing constraint Decision Variable Allocation Model x ij  1 if fragment F i is stored at site S j 0 otherwise   

11 Distributed DBMSPage 5. 11 © 1998 M. Tamer Özsu & Patrick Valduriez Total Cost Storage Cost (of fragment F j at S k ) Query Processing Cost (for one query) processing component + transmission component Allocation Model (unit storage cost at S k )  (size of F j )  x jk query processing cost  all queries  cost of storing a fragment at a site all fragments  all sites 

12 Distributed DBMSPage 5. 12 © 1998 M. Tamer Özsu & Patrick Valduriez Query Processing Cost Processing component access cost + integrity enforcement cost + concurrency control cost à Access cost à Integrity enforcement and concurrency control costs  Can be similarly calculated Allocation Model ( no. of update accesses+ no. of read accesses)  all fragments  all sites  x ij  local processing cost at a site

13 Distributed DBMSPage 5. 13 © 1998 M. Tamer Özsu & Patrick Valduriez Query Processing Cost Transmission component cost of processing updates + cost of processing retrievals à Cost of updates à Retrieval Cost Allocation Model update message cost  all fragments  all sites  acknowledgment cost all fragments  all sites  ( min all sites all fragments  cost of retrieval command  cost of sending back the result)

14 Distributed DBMSPage 5. 14 © 1998 M. Tamer Özsu & Patrick Valduriez Constraints à Response Time execution time of query ≤ max. allowable response time for that query à Storage Constraint (for a site) à Processing constraint (for a site) Allocation Model storage requirement of a fragment at that site  all fragments  storage capacity at that site processing load of a query at that site  all queries  processing capacity of that site

15 Distributed DBMSPage 5. 15 © 1998 M. Tamer Özsu & Patrick Valduriez Solution Methods à FAP is NP-complete à DAP also NP-complete Heuristics based on à single commodity warehouse location (for FAP) à knapsack problem à branch and bound techniques à network flow Allocation Model

16 Distributed DBMSPage 5. 16 © 1998 M. Tamer Özsu & Patrick Valduriez Attempts to reduce the solution space à assume all candidate partitionings known; select the “best” partitioning à ignore replication at first à sliding window on fragments Allocation Model


Download ppt "Distributed DBMSPage 5. 1 © 1998 M. Tamer Özsu & Patrick Valduriez Outline Introduction Background Distributed DBMS Architecture  Distributed Database."

Similar presentations


Ads by Google