Published by Chrystal Gilbert. Modified over 9 years ago.
1
Chapter 17: Additional Slides February 6, 2016
2
Outline
Physical Data Management
  Fragments
  Distributed Query Processing
  Transactions
Logical Data Management
  Transparency
Conceptual Data Management
3
Physical Data Management
Fragments
What is a fragment?
  Vertical subset (project operation)
  Horizontal subset (restrict operation)
  Mixed fragment (combination of project and restrict)
A fragment may be allocated to a single site or to multiple sites
Fragments may be replicated:
  The primary fragment is held at a single site
  Copies of the fragment (secondary) are placed at multiple sites
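The three kinds of fragment can be sketched in a few lines of Python. This is only an illustration: the `ORDERS` rows, the column names, and the helper functions are hypothetical and do not come from the textbook.

```python
# Hypothetical rows; the table and column names are made up for illustration.
ORDERS = [
    {"ord_no": 1, "cust_no": 10, "region": "East", "amount": 250.0},
    {"ord_no": 2, "cust_no": 11, "region": "West", "amount": 90.0},
    {"ord_no": 3, "cust_no": 10, "region": "East", "amount": 40.0},
]


def horizontal_fragment(rows, predicate):
    """Restrict: keep whole rows that satisfy the predicate."""
    return [dict(r) for r in rows if predicate(r)]


def vertical_fragment(rows, columns):
    """Project: keep only the named columns of every row.

    In practice the key column is kept in every vertical fragment
    so the original table can be rebuilt by a join.
    """
    return [{c: r[c] for c in columns} for r in rows]


def mixed_fragment(rows, predicate, columns):
    """Mixed: restrict first, then project."""
    return vertical_fragment(horizontal_fragment(rows, predicate), columns)


def is_east(row):
    return row["region"] == "East"


eastern = horizontal_fragment(ORDERS, is_east)
billing = vertical_fragment(ORDERS, ["ord_no", "amount"])
eastern_billing = mixed_fragment(ORDERS, is_east, ["ord_no", "amount"])
```

Each resulting list could then be allocated to one site or replicated to several.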
4
Physical Data Management
Distributed Query Processing
  Involves both local (intra-site) and global (inter-site) optimization.
  Multiple optimization objectives
  The weighting of communication costs versus local processing costs depends on network characteristics.
  There are many more possible access plans for a distributed query.
5
Physical Data Management
Distributed Query Processing cont’d
Local vs. global query processing
  Local: queries are performed at a central server (single site)
  Global:
    Must decide which sites to access for the fragments
    May need to move fragments from site to site
  Global processing requires multiple optimizations because of the multiple sites and access plans
  Global queries have many possible access plans, so choosing the best one may be difficult
6
Physical Data Management
Distributed Query Processing cont’d
Communication Costs
  Communication Time (CT)
    Fixed: Message Delay (MD)
    Variable: Transmission Time (TT)
  CT = MD + TT
  MD = Number of messages * Delay per message
  TT = Number of bits / Data rate
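The communication-time formula can be written directly as a small function. The sketch below assumes delay in seconds and data rate in bits per second; the parameter names and the sample numbers are illustrative, not taken from the text.

```python
def communication_time(num_messages, delay_per_message_s, num_bits, data_rate_bps):
    """CT = MD + TT, with units assumed to be seconds and bits per second."""
    message_delay = num_messages * delay_per_message_s   # MD (fixed part)
    transmission_time = num_bits / data_rate_bps          # TT (variable part)
    return message_delay + transmission_time


# Hypothetical inputs: 10 messages at 0.1 s each, 8,000,000 bits over a 1 Mbit/s link.
ct = communication_time(
    num_messages=10,
    delay_per_message_s=0.1,
    num_bits=8_000_000,
    data_rate_bps=1_000_000,
)
print(f"communication time: {ct:.1f} s")
```

With these inputs MD is 1 second and TT is 8 seconds, so the transfer is dominated by transmission time rather than message delay.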
7
Physical Data Management
Distributed Query Processing cont’d
Global Query Example (p. 632)
  List the order number, order date, product number, product name, product price, and order quantity for eastern orders with a specified customer number, date range, and product color.
  Four possible access plans
8
Physical Data Management
Distributed Query Processing cont’d
Access Plan 1
  Move the Product table to the Tulsa site, where the query is processed
9
Physical Data Management
Distributed Query Processing cont’d
Access Plan 2
  Restrict the Product table at the Denver site
  Then move the result to the Tulsa site to execute the remainder of the query
10
Physical Data Management
Distributed Query Processing cont’d
Access Plan 3
  Perform the join and restrictions of the Eastern-Orders and Eastern Order-lines fragments at the Tulsa site
  Then move the result to the Denver site to join with the Product table
11
Physical Data Management
Distributed Query Processing cont’d
Access Plan 4
  Restrict the Product table at the Denver site
  Move the product numbers to Tulsa and do the restrict/join
  Then move the result back to Denver to combine with the Product table to get product names
12
Physical Data Management
Distributed Query Processing cont’d
  Many different access plans can be used to answer the same query
  To determine which access plan is best, investigate the actual network costs and the local processing costs at each site
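One way to compare access plans is to add each plan's communication time to its local processing time and pick the smallest total. The sketch below does this for the four plans; every number in it (message delay, data rate, transfer sizes, local costs) is a made-up placeholder, so the resulting ranking says nothing about the textbook example.

```python
DELAY_S = 0.1              # hypothetical delay per message, in seconds
DATA_RATE_BPS = 1_000_000  # hypothetical data rate, in bits per second


def plan_cost(transfers, local_cost_s):
    """Total cost in seconds.

    transfers: list of (num_messages, num_bits), one entry per data movement.
    local_cost_s: assumed local processing time summed over all sites.
    """
    comm = sum(n * DELAY_S + bits / DATA_RATE_BPS for n, bits in transfers)
    return comm + local_cost_s


# Hypothetical sizes: plan 1 ships the whole Product table, plan 2 a restricted
# Product table, plan 3 the joined order rows, plan 4 product numbers and then
# the restricted/joined result.
PLANS = {
    "plan 1": {"transfers": [(1, 80_000_000)], "local_cost_s": 2.0},
    "plan 2": {"transfers": [(1, 4_000_000)], "local_cost_s": 2.5},
    "plan 3": {"transfers": [(1, 2_000_000)], "local_cost_s": 3.0},
    "plan 4": {"transfers": [(1, 200_000), (1, 1_000_000)], "local_cost_s": 3.5},
}

costs = {name: plan_cost(**spec) for name, spec in PLANS.items()}
best = min(costs, key=costs.get)

for name in sorted(costs):
    print(f"{name}: {costs[name]:.1f} s")
print("cheapest under these assumptions:", best)
```

Changing the data rate or the fragment sizes can change which plan wins, which is why the actual network and site costs have to be measured.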
13
Physical Data Management
Transactions – 2 Phase Commit Protocol
2 Phase Commit (2PC)
  Ensures that a distributed transaction is atomic: it commits at every site or at none
  One site is selected as the Coordinator, while the other sites are Participants
  Each Participant site executes a different part of the transaction
  Two phases: Voting Phase and Decision Phase
  Figure 17.18, page 634
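The two phases can be illustrated with a toy, in-memory sketch. It assumes no failures, timeouts, or logging, so it shows only the happy-path message flow. The class and function names are hypothetical and are not the notation of Figure 17.18.

```python
from enum import Enum


class Vote(Enum):
    READY = "ready"
    ABORT = "abort"


class Participant:
    """A site that executes one part of the distributed transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "active"

    def prepare(self):
        # Voting phase: a real participant would write a log record before voting.
        self.state = "prepared" if self.can_commit else "aborted"
        return Vote.READY if self.can_commit else Vote.ABORT

    def finish(self, decision):
        # Decision phase: apply the coordinator's global decision.
        self.state = decision


def two_phase_commit(participants):
    """Coordinator logic: commit only if every participant votes READY."""
    votes = [p.prepare() for p in participants]               # phase 1: voting
    decision = "committed" if all(v is Vote.READY for v in votes) else "aborted"
    for p in participants:                                     # phase 2: decision
        p.finish(decision)
    return decision


sites = [Participant("Tulsa"), Participant("Denver")]
outcome = two_phase_commit(sites)

failing = [Participant("Tulsa"), Participant("Denver", can_commit=False)]
failed_outcome = two_phase_commit(failing)
```

A single ABORT vote forces every site to abort, which is what makes the transaction all-or-nothing.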
14
Physical Data Management
Transactions – 2 Phase Commit Protocol
Several complications arise if:
  Failures occur during recovery, or timeouts occur
  Log records are lost
  The Coordinator fails
  Etc.
Several methods exist to resolve these, but they are out of scope for this class
15
Logical Data Management
Transparency
  Transparency is related to data independence.
  With transparency, users can write queries with no knowledge of the distribution, and distribution changes will not cause changes to existing queries and transactions.
  Without transparency, users must reference some distribution details in queries, and distribution changes can lead to changes in existing queries.
16
Logical Data Management
Fragmentation Transparency
  Fragmentation transparency provides the highest level of data independence.
  Users formulate queries and transactions without knowledge of fragments, locations, or local formats.
  If fragments change, queries and transactions are not affected.
  Table 17.6, p 626
17
Logical Data Management
Location Transparency
  Location transparency provides a lesser level of data independence than fragmentation transparency.
  Users need to reference fragments in formulating queries and transactions.
  However, knowledge of locations and local formats is not necessary.
  Table 17.7, p 627
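The difference between the two levels can be sketched as two lookup functions over a toy catalog. Everything here is an assumption made for illustration: the catalog dictionaries, the sample rows, and the `Western-Orders` fragment are hypothetical. Only the idea that an eastern orders fragment is held at Tulsa follows the query example.

```python
# Hypothetical data held at each site.
SITE_DATA = {
    "Tulsa": {"Eastern-Orders": [{"ord_no": 1}, {"ord_no": 3}]},
    "Denver": {"Western-Orders": [{"ord_no": 2}]},
}

# Hypothetical catalog: where each fragment lives, and which fragments form a table.
FRAGMENT_SITE = {"Eastern-Orders": "Tulsa", "Western-Orders": "Denver"}
TABLE_FRAGMENTS = {"Orders": ["Eastern-Orders", "Western-Orders"]}


def read_fragment(fragment):
    """Location transparency: the user names a fragment, never a site."""
    site = FRAGMENT_SITE[fragment]
    return SITE_DATA[site][fragment]


def read_table(table):
    """Fragmentation transparency: the user names only the table."""
    rows = []
    for fragment in TABLE_FRAGMENTS[table]:
        rows.extend(read_fragment(fragment))
    return rows


eastern_rows = read_fragment("Eastern-Orders")  # breaks if the table is re-fragmented
all_rows = read_table("Orders")                 # unaffected by re-fragmentation
```

If the table were re-fragmented, only `TABLE_FRAGMENTS` and `FRAGMENT_SITE` would change for callers of `read_table`, while callers of `read_fragment` would have to be rewritten.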
18
Conceptual Data Management
Schema Integration
  Multiple types of schemas may exist to describe the same dataset
  Integrate the multiple schemas into a single schema
  Best explained using an exercise
19
Conceptual Data Management
Schema Integration - Exercise
Database III – E/R Model
  Entities
    Engineer(Engineer No, name, title, salary)
    Project(PNo, project name, budget, location)
    Client(Client Name, Address)
  Relationships
    Engineer Works_In Project : (Responsibility, Duration)
    Project Contract_By Client : (Contract Date)
Engineering Database – Relational Schema
  E(eno, ename, title), p.k. = eno
  J(jno, jname, budget, loc, cname), p.k. = jno
  G(eno, jno, resp, dur), p.k. = eno, jno
  S(title, sal), p.k. = title
Employee Database – CODASYL Schema
  Department(dept-name, budget, manager)
  Employee(e#, name, address, title, salary)
  Department Employs Employee (1:N relationship)
First, find the common entities and relationships between schemas
20
Conceptual Data Management
Schema Integration - Exercise cont’d
(Same E/R, relational, and CODASYL schemas as on the previous slide)
Second, draw the conceptual diagram for the common entities and relationships
21
Conceptual Data Management
Schema Integration
  There is no single correct solution to this exercise; multiple valid integrations exist
  Schema integration is considered a very hard problem
  It is often hard to find the best synonyms, especially across a large set of schemas
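Finding synonyms can be sketched as mapping each schema's attribute names onto shared canonical names and intersecting the results. The mapping below is one guess at the correspondences for the engineer/employee entity in the exercise, not an answer key. In particular, it assumes that Engineer, E (joined with S on title), and Employee describe the same real-world entity. The canonical names are invented.

```python
# One possible synonym table (an assumption, not the official solution).
CANONICAL = {
    "Engineer No": "engineer_no", "eno": "engineer_no", "e#": "engineer_no",
    "name": "name", "ename": "name",
    "title": "title",
    "salary": "salary", "sal": "salary",
}

# Attribute lists copied from the three schemas in the exercise.
ER_ENGINEER = ["Engineer No", "name", "title", "salary"]
REL_E_WITH_S = ["eno", "ename", "title", "sal"]  # E joined with S on title
CODASYL_EMPLOYEE = ["e#", "name", "address", "title", "salary"]


def canonical_set(attributes):
    """Map attribute names to canonical names; unknown names map to themselves."""
    return {CANONICAL.get(a, a) for a in attributes}


common = (
    canonical_set(ER_ENGINEER)
    & canonical_set(REL_E_WITH_S)
    & canonical_set(CODASYL_EMPLOYEE)
)
only_in_codasyl = canonical_set(CODASYL_EMPLOYEE) - common
```

A different synonym table would give a different integrated schema, which is why the exercise has more than one acceptable answer.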
22
Weekly Exercise Questions 2, 4, and 5