Download presentation
Presentation is loading. Please wait.
1
Distributed DBMSPage 5. 1 © 1998 M. Tamer Özsu & Patrick Valduriez Outline Introduction Background Distributed DBMS Architecture Distributed Database Design à Fragmentation à Data Location Distributed Query Processing (Briefly) Distributed Transaction Management (Extensive) n Building Distributed Database Systems (RAID) n Mobile Database Systems Privacy, Trust, and Authentication Peer to Peer Systems
2
Distributed DBMSPage 5. 2 © 1998 M. Tamer Özsu & Patrick Valduriez Useful References W. W. Chu, Optimal File Allocation in Multiple Computer System, IEEE Transaction on Computers, 885-889, October 1969.
3
Distributed DBMSPage 5. 3 © 1998 M. Tamer Özsu & Patrick Valduriez Allocation Alternatives Non-replicated à partitioned : each fragment resides at only one site Replicated à fully replicated : each fragment at each site à partially replicated : each fragment at some of the sites Rule of thumb: If replication is advantageous, otherwise replication may cause problems read - only queries update queries 1
4
Distributed DBMSPage 5. 4 © 1998 M. Tamer Özsu & Patrick Valduriez Comparison of Replication Alternatives Full-replicationPartial-replicationPartitioning QUERY PROCESSING Easy Same Difficulty DIRECTORY MANAGEMENT Easy or Non-existant CONCURRENCY CONTROL EasyDifficultModerate RELIABILITYVery highHighLow REALITY Possible application Realistic Possible application
5
Distributed DBMSPage 5. 5 © 1998 M. Tamer Özsu & Patrick Valduriez Four categories: à Database information à Application information à Communication network information à Computer system information Information Requirements
6
Distributed DBMSPage 5. 6 © 1998 M. Tamer Özsu & Patrick Valduriez Fragment Allocation Problem Statement Given F = { F 1, F 2, …, F n } fragments S ={ S 1, S 2, …, S m } network sites Q = { q 1, q 2,…, q q }applications Find the "optimal" distribution of F to S. Optimality à Minimal cost Communication + storage + processing (read & update) Cost in terms of time (usually) à Performance Response time and/or throughput à Constraints Per site constraints (storage & processing)
7
Distributed DBMSPage 5. 7 © 1998 M. Tamer Özsu & Patrick Valduriez Information Requirements Database information à selectivity of fragments à size of a fragment Application information à access types and numbers à access localities Communication network information à unit cost of storing data at a site à unit cost of processing at a site Computer system information à bandwidth à latency à communication overhead
8
Distributed DBMSPage 5. 8 © 1998 M. Tamer Özsu & Patrick Valduriez File Allocation (FAP) vs Database Allocation (DAP): à Fragments are not individual files relationships have to be maintained à Access to databases is more complicated remote file access model not applicable relationship between allocation and query processing à Cost of integrity enforcement should be considered à Cost of concurrency control should be considered Allocation
9
Distributed DBMSPage 5. 9 © 1998 M. Tamer Özsu & Patrick Valduriez Allocation – Information Requirements Database Information à selectivity of fragments à size of a fragment Application Information à number of read accesses of a query to a fragment à number of update accesses of query to a fragment à A matrix indicating which queries updates which fragments à A similar matrix for retrievals à originating site of each query Site Information à unit cost of storing data at a site à unit cost of processing at a site Network Information à communication cost/frame between two sites à frame size
10
Distributed DBMSPage 5. 10 © 1998 M. Tamer Özsu & Patrick Valduriez General Form min(Total Cost) subject to response time constraint storage constraint processing constraint Decision Variable Allocation Model x ij 1 if fragment F i is stored at site S j 0 otherwise
11
Distributed DBMSPage 5. 11 © 1998 M. Tamer Özsu & Patrick Valduriez Total Cost Storage Cost (of fragment F j at S k ) Query Processing Cost (for one query) processing component + transmission component Allocation Model (unit storage cost at S k ) (size of F j ) x jk query processing cost all queries cost of storing a fragment at a site all fragments all sites
12
Distributed DBMSPage 5. 12 © 1998 M. Tamer Özsu & Patrick Valduriez Query Processing Cost Processing component access cost + integrity enforcement cost + concurrency control cost à Access cost à Integrity enforcement and concurrency control costs Can be similarly calculated Allocation Model ( no. of update accesses+ no. of read accesses) all fragments all sites x ij local processing cost at a site
13
Distributed DBMSPage 5. 13 © 1998 M. Tamer Özsu & Patrick Valduriez Query Processing Cost Transmission component cost of processing updates + cost of processing retrievals à Cost of updates à Retrieval Cost Allocation Model update message cost all fragments all sites acknowledgment cost all fragments all sites ( min all sites all fragments cost of retrieval command cost of sending back the result)
14
Distributed DBMSPage 5. 14 © 1998 M. Tamer Özsu & Patrick Valduriez Constraints à Response Time execution time of query ≤ max. allowable response time for that query à Storage Constraint (for a site) à Processing constraint (for a site) Allocation Model storage requirement of a fragment at that site all fragments storage capacity at that site processing load of a query at that site all queries processing capacity of that site
15
Distributed DBMSPage 5. 15 © 1998 M. Tamer Özsu & Patrick Valduriez Solution Methods à FAP is NP-complete à DAP also NP-complete Heuristics based on à single commodity warehouse location (for FAP) à knapsack problem à branch and bound techniques à network flow Allocation Model
16
Distributed DBMSPage 5. 16 © 1998 M. Tamer Özsu & Patrick Valduriez Attempts to reduce the solution space à assume all candidate partitionings known; select the “best” partitioning à ignore replication at first à sliding window on fragments Allocation Model
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.