1
Distributed Database Management Systems
Lecture 20
2
In the Previous Lecture
Continued with VF
Computed CA
Partitioning Algorithm
3
In this Lecture
Continue with VF
Hybrid Fragmentation
Allocation Problem
Replication
4
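[Figure: clustered affinity matrix CA with attribute order A1, A3, A2, A4; attribute usage matrix refj(qi) for queries q1 to q4 over attributes A1 to A4; access frequency matrix accj(qi) for queries q1 to q4 at sites S1, S2, S3]
Split evaluation: z1 = 0 − 45², z2 = 3311, z3 = …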
5
A1 = jNo
A2 = jName
A3 = budget
A4 = loc
V1 = {jNo, budget}
V2 = {jNo, jName, loc}
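The split into V1 and V2 can be reproduced with a small sketch. It assumes the usual partitioning measure z = CTQ * CBQ - COQ^2, where CTQ, CBQ and COQ are the total access counts of queries that touch only the top block, only the bottom block, or both. The usage sets and per-site frequencies below are assumptions, chosen to be consistent with the value z2 = 3311 shown on the slide; all function and variable names are illustrative.

```python
# Clustered order of the attributes in CA, as on the slide.
ORDER = ["A1", "A3", "A2", "A4"]

# Assumed attribute usage: which attributes each query references.
USAGE = {
    "q1": {"A1", "A3"},
    "q2": {"A2", "A3"},
    "q3": {"A2", "A4"},
    "q4": {"A3", "A4"},
}

# Assumed access frequencies of each query at sites S1, S2, S3.
FREQ = {
    "q1": [15, 20, 10],
    "q2": [5, 0, 0],
    "q3": [25, 25, 25],
    "q4": [3, 0, 0],
}


def split_quality(cut):
    """Return z = CTQ * CBQ - COQ**2 for a split after position `cut`."""
    top, bottom = set(ORDER[:cut]), set(ORDER[cut:])
    ctq = cbq = coq = 0
    for query, attrs in USAGE.items():
        accesses = sum(FREQ[query])
        if attrs <= top:
            ctq += accesses   # query touches only the top block
        elif attrs <= bottom:
            cbq += accesses   # query touches only the bottom block
        else:
            coq += accesses   # query touches both blocks
    return ctq * cbq - coq ** 2


# Evaluate every split point along the diagonal and keep the best one.
scores = {cut: split_quality(cut) for cut in range(1, len(ORDER))}
best_cut = max(scores, key=scores.get)
print(scores)
print(ORDER[:best_cut], ORDER[best_cut:])
```

With these assumed inputs the best split separates {A1, A3} from {A2, A4}; repeating the key jNo (A1) in the second fragment gives V1 and V2.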
6
VF: Two Problems
1- Clusters may form in the middle of CA rather than at its corners
2- m-way partitioning
7
VF Correctness
8
A relation R, defined over attribute set A and key K, generates the vertical partitioning FR = {R1, R2, …, Rr}
Completeness: the following should be true for A
A = ∪ Ri (the union of the attribute sets of all fragments Ri)
9
Reconstruction: can be achieved by
R = ⋈K Ri, ∀Ri ∈ FR
Disjointness: TIDs are not considered to be overlapping, since they are maintained by the system. The primary key is also an exception: it is repeated in every fragment.
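The three correctness rules can be checked mechanically. The sketch below is only an illustration: the PROJ rows are hypothetical, the helper names are made up, and the fragments use the attribute sets {jNo, budget} and {jNo, jName, loc} with jNo as the key K.

```python
def project(relation, attrs):
    """Project a list of row dicts onto attrs, dropping duplicate rows."""
    seen, out = set(), []
    for row in relation:
        values = tuple(row[a] for a in attrs)
        if values not in seen:
            seen.add(values)
            out.append(dict(zip(attrs, values)))
    return out


def join_on_key(left, right, key):
    """Join two fragments on the shared key attribute."""
    index = {row[key]: row for row in right}
    return [{**row, **index[row[key]]} for row in left if row[key] in index]


# Hypothetical PROJ relation; jNo is the key K.
PROJ = [
    {"jNo": "J1", "jName": "Instrumentation", "budget": 150000, "loc": "Lahore"},
    {"jNo": "J2", "jName": "Database Dev", "budget": 135000, "loc": "Multan"},
]
V1_ATTRS = ["jNo", "budget"]
V2_ATTRS = ["jNo", "jName", "loc"]

v1 = project(PROJ, V1_ATTRS)
v2 = project(PROJ, V2_ATTRS)

# Completeness: the fragments' attributes together cover A.
complete = set(V1_ATTRS) | set(V2_ATTRS) == set(PROJ[0])
# Disjointness: only the key may appear in more than one fragment.
overlap = set(V1_ATTRS) & set(V2_ATTRS)
# Reconstruction: joining the fragments on K gives back R.
rebuilt = join_on_key(v1, v2, "jNo")

print(complete, overlap, rebuilt == PROJ)
```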
10
Hybrid Fragmentation
11
In practice, applications often require horizontal and vertical fragmentation to be combined
12
Fragmentations are therefore nested, i.e., one is applied after the other, and the result is a tree-structured partitioning
13
Disjointness and completeness have to be assured at each step, and reconstruction is obtained by applying Join and Union in the reverse order of the fragmentation steps
14
CUST is fragmented into Beta, Delta1 and Delta2

CUST
A/C#    Name    Bal     Branch
AB101   Saeed   4535    MTN
AB202   Laeeq   …       LHR
AB203   Salma   …       LHR
AB109   Shaan   45.32   MTN

Beta = ΠA/C#, Bal (CUST)
Delta1 = σ Branch = “MTN” (ΠA/C#, Name, Branch (CUST))
Delta2 = σ Branch = “LHR” (ΠA/C#, Name, Branch (CUST))

Beta
A/C#    Bal
AB101   4535
AB202   …
AB203   …
AB109   45.32

Delta1
A/C#    Name    Branch
AB101   Saeed   MTN
AB109   Shaan   MTN

Delta2
A/C#    Name    Branch
AB202   Laeeq   LHR
AB203   Salma   LHR
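The CUST example can be traced in a few lines. This is a sketch under assumptions: the balances of AB202 and AB203 are not legible on the slide, so the values used here are placeholders, and the helper names are illustrative. Reconstruction applies Union first and Join second, the reverse of the fragmentation order.

```python
CUST = [
    {"A/C#": "AB101", "Name": "Saeed", "Bal": 4535, "Branch": "MTN"},
    {"A/C#": "AB202", "Name": "Laeeq", "Bal": 1200, "Branch": "LHR"},  # Bal assumed
    {"A/C#": "AB203", "Name": "Salma", "Bal": 860, "Branch": "LHR"},   # Bal assumed
    {"A/C#": "AB109", "Name": "Shaan", "Bal": 45.32, "Branch": "MTN"},
]


def project(rows, attrs):
    """Vertical step: keep only the listed attributes."""
    return [{a: row[a] for a in attrs} for row in rows]


def select(rows, attr, value):
    """Horizontal step: keep only the rows where attr equals value."""
    return [row for row in rows if row[attr] == value]


# Vertical fragmentation, then horizontal fragmentation of one branch.
beta = project(CUST, ["A/C#", "Bal"])
rest = project(CUST, ["A/C#", "Name", "Branch"])
delta1 = select(rest, "Branch", "MTN")
delta2 = select(rest, "Branch", "LHR")

# Reconstruction in reverse order: Union of Delta1 and Delta2, then Join with Beta.
union = delta1 + delta2
beta_by_key = {row["A/C#"]: row for row in beta}
rebuilt = [{**row, **beta_by_key[row["A/C#"]]} for row in union]


def by_account(row):
    return row["A/C#"]


same = sorted(rebuilt, key=by_account) == sorted(CUST, key=by_account)
print(same)
```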
15
Allocation
16
Allocation Problem
Given
F = {F1, F2, …, Fn} fragments
S = {S1, S2, …, Sm} network sites
Q = {q1, q2, …, qq} applications
Find the "optimal" distribution of F to S.
17
Optimality
Minimize the processing cost and maximize the system throughput at each site
18
Solving the general problem mathematically is complex. To keep things simple, consider the allocation of a single fragment Fk, with:
19
T = {t1, t2, …, tm}, where ti is the read-only query traffic on Fk generated at site Si
U = {u1, u2, …, um}, where ui is the update query traffic on Fk generated at site Si
20
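Communication Cost
C(T) = {c1,2, c1,3, …, c1,m, …, cm-1,m}, where ci,j is the unit cost of a retrieval between sites Si and Sj
C’(U) = {c’1,2, c’1,3, …, c’1,m, …, c’m-1,m}, where c’i,j is the unit cost of an update between sites Si and Sj
Storage Cost
D = {d1, d2, …, dm}, where dj is the cost of storing Fk at site Sj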
21
The allocation problem is to find the sites, out of the set of sites S, where a copy of Fk will be stored.
22
The specification of the allocation problem will be
min (total cost)
where
xj = 1 if the fragment Fk is assigned to site Sj
xj = 0 otherwise
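To make the single-fragment problem concrete, the sketch below searches every non-empty set of sites by brute force. The cost model is an assumption, since the objective function itself is not legible on the slide: each update from Si is sent to every copy, each read from Si goes to the cheapest copy, and every copy adds its storage cost. The function names and all numbers are illustrative.

```python
from itertools import combinations


def allocation_cost(placement, t, u, c, c_upd, d):
    """Total cost of storing copies of Fk at the sites in `placement`.

    Assumed cost model: updates go to every copy, reads go to the
    cheapest copy, and each copy pays its storage cost.
    """
    total = sum(d[j] for j in placement)                       # storage
    for i in range(len(t)):
        total += u[i] * sum(c_upd[i][j] for j in placement)    # updates
        total += t[i] * min(c[i][j] for j in placement)        # reads
    return total


def best_allocation(t, u, c, c_upd, d):
    """Try every non-empty subset of sites; return (placement, cost)."""
    sites = range(len(t))
    best = None
    for size in range(1, len(t) + 1):
        for placement in combinations(sites, size):
            cost = allocation_cost(placement, t, u, c, c_upd, d)
            if best is None or cost < best[1]:
                best = (placement, cost)
    return best


# Hypothetical two-site example; c[i][i] = 0 means local access is free.
t = [10, 10]                 # read traffic from S1, S2
u = [1, 1]                   # update traffic from S1, S2
c = [[0, 5], [5, 0]]         # unit retrieval cost between sites
c_upd = [[0, 5], [5, 0]]     # unit update cost between sites
d = [1, 1]                   # storage cost at each site

placement, cost = best_allocation(t, u, c, c_upd, d)
print(placement, cost)
```

Here reads dominate, so a copy at both sites is cheapest. Exhaustive search is only workable for a handful of sites, because the number of candidate placements grows as 2^m.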
23
That concludes our discussion on Fragmentation
Let's summarize it
24
Fragmentation is splitting a table into smaller tables
Alternatives
Horizontal
Vertical
Hybrid
25
Horizontal Fragmentation
26
Splits a table into horizontal subsets (row wise)
Primary and Derived Horizontal Fragmentation
27
We need the major simple predicates (Pr); the set should be complete and minimal
Pr is transformed into Pr’
Minterm predicates (M) are generated from Pr’
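Minterm predicates are all conjunctions in which each simple predicate appears either as is or negated. The sketch below generates them and uses them to fragment a few rows; the two predicates and the rows are hypothetical, and the names are illustrative.

```python
from itertools import product

# Hypothetical simple predicates Pr' over an account row.
PREDICATES = {
    'Branch = "MTN"': lambda row: row["Branch"] == "MTN",
    "Bal > 1000": lambda row: row["Bal"] > 1000,
}


def minterms(predicates):
    """Yield (label, test) for each conjunction of plain or negated predicates."""
    names = list(predicates)
    for signs in product([True, False], repeat=len(names)):
        label = " AND ".join(
            name if sign else f"NOT({name})" for name, sign in zip(names, signs)
        )

        def test(row, signs=signs):
            # A row satisfies the minterm when every predicate has the required truth value.
            return all(predicates[n](row) == s for n, s in zip(names, signs))

        yield label, test


ROWS = [
    {"A/C#": "AB101", "Bal": 4535, "Branch": "MTN"},
    {"A/C#": "AB109", "Bal": 45.32, "Branch": "MTN"},
    {"A/C#": "AB202", "Bal": 1200, "Branch": "LHR"},
]

# One horizontal fragment per minterm; some fragments may be empty.
fragments = {
    label: [row for row in ROWS if test(row)]
    for label, test in minterms(PREDICATES)
}

for label, rows in fragments.items():
    print(label, [row["A/C#"] for row in rows])
```

Because every row satisfies exactly one minterm, the resulting fragments are complete and disjoint by construction.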
28
Correctness of PHF depends on the Pr’
Derived Horizontal Fragmentation is based on Owner-member link
29
Vertical Fragmentation is more complicated due to more options
Based on attributes’ affinities
30
The attribute affinity matrix (AA) is calculated using usage data and access frequencies from different sites
AA is transformed into the clustered affinity matrix (CA) using the Bond Energy Algorithm (BEA)
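As a sketch of how affinities follow from usage data and access frequencies, the code below sums, for each pair of attributes, the accesses of every query that uses both. The usage sets and per-site frequencies are assumed inputs for illustration, and the names are made up. The clustering step (BEA) is not shown.

```python
ATTRS = ["A1", "A2", "A3", "A4"]

# Assumed attribute usage: which attributes each query references.
USAGE = {
    "q1": {"A1", "A3"},
    "q2": {"A2", "A3"},
    "q3": {"A2", "A4"},
    "q4": {"A3", "A4"},
}

# Assumed access frequencies of each query at sites S1, S2, S3.
FREQ = {
    "q1": [15, 20, 10],
    "q2": [5, 0, 0],
    "q3": [25, 25, 25],
    "q4": [3, 0, 0],
}


def affinity(a, b):
    """Sum the accesses, over all sites, of the queries that use both a and b."""
    return sum(
        sum(FREQ[query])
        for query, attrs in USAGE.items()
        if a in attrs and b in attrs
    )


# AA is symmetric; the diagonal holds each attribute's total accesses.
AA = [[affinity(a, b) for b in ATTRS] for a in ATTRS]

for name, row in zip(ATTRS, AA):
    print(name, row)
```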
31
CA establishes clusters of attributes that are split for efficient access
Hybrid Fragmentation combines HF and VF
That concludes Fragmentation
32
Thanks