CS 347Notes 021 CS 347: Parallel and Distributed Data Management Notes02: Distributed DB Design Hector Garcia-Molina
CS 347Notes 022 Distributed DB Design Top-down approach: - have DB… - how to split and allocate the sites Multi-DBs (or bottom-up): no design issues! Chapter 5 Ozsu & Valduriez
CS 347Notes 023 Two issues in DDB design: Fragmentation Allocation Note: issues not independent, but will cover separately
CS 347Notes 024 Example Employee relation E (#,name,loc,sal,…) 40% of queries: 40% of queries: Qa: select * Qb: select * from E where loc=Sa where loc=Sb and… and...
CS 347Notes 025 Example Employee relation E (#,name,loc,sal,…) 40% of queries: 40% of queries: Qa: select * Qb: select * from E where loc=Sa where loc=Sb and… and... Motivation: Two sites: Sa, Sb Qa Qb Sa Sb
CS 347Notes 026 It does not take a rocket scientist to figure out fragmentation...
CS 347Notes 027 # NM Loc Sal E Sa10 SallySb25 TomSa15 Joe # NM Loc Sal 5 8 Sa10 TomSa15 Joe7Sb25Sally.. F At Sa At Sb
CS 347Notes 028 F = { F 1, F 2 } F 1 = loc=Sa E F 2 = loc=Sb E
CS 347Notes 029 F = { F 1, F 2 } F 1 = loc=Sa E F 2 = loc=Sb E called primary horizontal fragmentation
CS 347Notes 0210 Fragmentation Horizontal Primary depends on local attributes RDerived depends on foreign relation Vertical R
CS 347Notes 0211 Fragmentation Horizontal Primary depends on local attributes RDerived depends on foreign relation Vertical R Fragmentation also called Sharding
CS 347Notes 0212 Three common horizontal partitioning techniques Round robin Hash partitioning Range partitioning
CS 347Notes 0213 Round robin RD 0 D 1 D 2t1t2t3t4...t5 Evenly distributes data Good for scanning full relation Not good for point or range queries
CS 347Notes 0214 Hash partitioning RD 0 D 1 D 2 t1 h(k 1 )=2t1 t2 h(k 2 )=0t2 t3 h(k 3 )=0t3 t4 h(k 4 )=1t4... Good for point queries on key; also for joins Not good for range queries; point queries not on key If hash function good, even distribution
CS 347Notes 0215 Range partitioning RD 0 D 1 D 2 t1: A=5t1 t2: A=8t2 t3: A=2t3 t4: A=3t4... Good for some range queries on A Need to select good vector: else unbalance data skew execution skew 47 partitionin g vector V 0 V 1
CS 347Notes 0216 Which are good fragmentations? Example: F = { F 1, F 2 } F 1 = sal 20 E
CS 347Notes 0217 Which are good fragmentations? Example: F = { F 1, F 2 } F 1 = sal 20 E Problem: Some tuples lost!
CS 347Notes 0218 Which are good fragmentations? Second example: F = { F 3, F 4 } F 3 = sal 5 E
CS 347Notes 0219 Which are good fragmentations? Second example: F = { F 3, F 4 } F 3 = sal 5 E Tuples with 5 < sal < 10 are duplicated...
CS 347Notes 0220 Prefer to deal with replication explicitly Example: F = { F 5, F 6, F 7 } F 5 = sal 5 E F 6 = 5< sal <10 E F 7 = sal 10 E Then replicate F 6 if convenient (part of allocation problem)
CS 347Notes 0221 Desired properties for horizontal fragmentation R F ={ F 1, F 2, … } (1) Completeness t R, F i F such that t F i
CS 347Notes 0222 (2) Disjointness t F i, F j such that t F j, i j, F i, F j F (3) Reconstruction - ignore
CS 347Notes 0223 How do we get completeness and disjointness? (1) Check it “manually”! e.g., F 1 = sal<10 E ; F 2 = sal 10 E
CS 347Notes 0224 How do we get completeness and disjointness? (2) “Automatically” generate fragments with these properties Desired simple predicates Fragments
CS 347Notes 0225 Example of generation Say queries use predicates: A 5, Loc = S A, Loc = S B Next: - generate “minterm” predicates - eliminate useless ones
CS 347Notes 0226 Minterm predicates (part I) (1) A 5 Loc=S A Loc=S B (2) A 5 Loc=S A ¬(Loc=S B ) (3) A 5 ¬(Loc=S A ) Loc=S B (4) A 5 ¬(Loc=S A ) ¬(Loc=S B ) (5) A 5) Loc=S A Loc=S B (6) A 5) Loc=S A ¬(Loc=S B ) (7) A 5) ¬(Loc=S A ) Loc=S B (8) A 5) ¬(Loc=S A ) ¬(Loc=S B )
CS 347Notes 0227 Minterm predicates (part I) (1) A 5 Loc=S A Loc=S B (2) A 5 Loc=S A ¬(Loc=S B ) (3) A 5 ¬(Loc=S A ) Loc=S B (4) A 5 ¬(Loc=S A ) ¬(Loc=S B ) (5) A 5) Loc=S A Loc=S B (6) A 5) Loc=S A ¬(Loc=S B ) (7) A 5) ¬(Loc=S A ) Loc=S B (8) A 5) ¬(Loc=S A ) ¬(Loc=S B )
CS 347Notes 0228 Minterm predicates (part I) (1) A 5 Loc=S A Loc=S B (2) A 5 Loc=S A ¬(Loc=S B ) (3) A 5 ¬(Loc=S A ) Loc=S B (4) A 5 ¬(Loc=S A ) ¬(Loc=S B ) (5) A 5) Loc=S A Loc=S B (6) A 5) Loc=S A ¬(Loc=S B ) (7) A 5) ¬(Loc=S A ) Loc=S B (8) A 5) ¬(Loc=S A ) ¬(Loc=S B ) A 5 5 < A < 10
CS 347Notes 0229 Minterm predicates (part II) (9) ¬(A 5 Loc=S A Loc=S B (10) ¬(A 5 Loc=S A ¬(Loc=S B ) (11) ¬(A 5 ¬(Loc=S A ) Loc=S B (12) ¬(A 5 ¬(Loc=S A ) ¬(Loc=S B ) (13) ¬(A 5) Loc=S A Loc=S B (14) ¬(A 5) Loc=S A ¬(Loc=S B ) (15) ¬(A 5) ¬(Loc=S A ) Loc=S B (16) ¬(A 5) ¬(Loc=S A ) ¬(Loc=S B )
CS 347Notes 0230 Minterm predicates (part II) (9) ¬(A 5 Loc=S A Loc=S B (10) ¬(A 5 Loc=S A ¬(Loc=S B ) (11) ¬(A 5 ¬(Loc=S A ) Loc=S B (12) ¬(A 5 ¬(Loc=S A ) ¬(Loc=S B ) (13) ¬(A 5) Loc=S A Loc=S B (14) ¬(A 5) Loc=S A ¬(Loc=S B ) (15) ¬(A 5) ¬(Loc=S A ) Loc=S B (16) ¬(A 5) ¬(Loc=S A ) ¬(Loc=S B ) A 10
CS 347Notes 0231 Final fragments: F 2: 5 < A < 10 Loc=S A F 3: 5 < A < 10 Loc=S B F 6: A 5 Loc=S A F 7: A 5 Loc=S B F 10: A 10 Loc=S A F 11: A 10 Loc=S B
CS 347Notes 0232 Note: elimination of useless fragments depends on application semantics: e.g.: if LOC could be S A, S B, we need to add fragments F 4: 5 <A <10 Loc S A Loc S B F 8: A 5 Loc S A Loc S B F 12: A 10 Loc S A Loc S B
CS 347Notes 0233 Why does this work? Predicates: p1 p2 p3 p4 p1 p2 p3 ¬ p4 ¬ p1 ¬ p2 ¬ p3 ¬ p4...
CS 347Notes 0234 (1) Completeness: Take t R p i (t) must be T or F! Say p 1 (t) =T p 2 (t) = T p 3 (t) =F p 4 (t) =F Then t is in fragment with predicate p1 p2 ¬ p3 ¬ p4
CS 347Notes 0235 (2) Disjointness Say t Fragment p 1 p 2 ¬ p 3 ¬ p 4 Then: p 1 (t) = T, p 2 (t) = T, p 3 (t) = F, p 4 (t)= F t cannot be in any other fragment!
CS 347Notes 0236 Summary Given simple predicates P r = { p 1, p 2,.. p m } minterm predicates are M={m | m = p k *, 1 k m } where p k * is p k or is ¬ p k pkPrpkPr Fragments m R for all m M are complete and disjoint
Another Desired Fragmentation Property: Match Access Patterns CS 347Notes 0237 data A data B data C frequently accessed together try to place in same fragment
CS 347Notes 0238 Return to example: E(#, NM, LOC, SAL,…) Common queries: Qa: select *Qb: select * from Efrom E where LOC=Sawhere LOC=Sb and …and...
CS 347Notes 0239 Three choices: (1) Pr = { } F 1 ={ E } (2) Pr = {LOC=Sa, LOC=Sb} F 2 ={ loc=Sa E, loc=Sb E } (3) Pr = {LOC=Sa, LOC=Sb, Sal<10} F 3 ={ loc=Sa sal<10 E, loc=Sa sal 10 E, loc=Sb sal<10 E, loc=Sb sal 10 E }
CS 347Notes 0240 In other words: Loc=Sa sal < 10 Loc=Sa sal 10 Loc=Sb sal < 10 Loc=Sb sal 10 F1F1 F3F3 F2F2 Q a : Select … loc = S a... Q b : Select … loc = S b...
CS 347Notes 0241 In other words: Loc=Sa sal < 10 Loc=Sa sal 10 Loc=Sb sal < 10 Loc=Sb sal 10 F1F1 F3F3 F2F2 Q a : Select … loc = S a... Q b : Select … loc = S b... F 2 is good… (not F 1, F 3 )
CS 347Notes 0242 Derived horizontal fragmentation Example: E(#, NM, SAL, LOC) F ={ E 1, E 2 } by LOC J(#, DES,…) Common query for project: [Given employee name, list projects (s)he works in]
CS 347Notes 0243 E1E1 (at S a ) (at S b ) E2E2 J
CS 347Notes 0244 E1E1 (at S a ) (at S b ) E2E2 J1J1 J2J2 J 1 = J E 1 J 2 = J E 2
CS 347Notes 0245 Derived horizontal fragmentation R, F = { F 1, F 2,... F n } S, D = {D 1, D 2, …D n } where D i =S F i Convention: R is owner S is member F could be primary or derived
CS 347Notes 0246 Checking completeness and disjointness of derived fragmentation But no #= 33 in E 1 nor in E 2 ! Example: Say J is: This J tuple will not be in J 1 nor J 2 Fragmentation not complete
CS 347Notes 0247 Need to enforce referential integrity constraint: join attr(#) of member relation joint attr(#) of owner relation To get completeness
CS 347Notes 0248 Example: E1E1 E2E2 J1J1 J J2J2 Fragmentation is not disjoint!
CS 347Notes 0249 Join attribute(#) should be key of owner relation To get disjointness
CS 347Notes 0250 Summary: horizontal fragmentation Type: primary, derived Properties: completeness, disjointness
CS 347Notes 0251 Vertical fragmentation E1E1 E E2E2 Example:
CS 347Notes 0252 R[T] R 1 [T 1 ] T i T R n [Tn] Just like normalization of relations...
CS 347Notes 0253 Properties: R[T] R i [T i ] (1) Completeness U Ti = T all i
CS 347Notes 0254 (2) Disjointness Ti Tj = for all i,j i j E(#,LOC,SAL) E 1 (#,LOC) E 2 (SAL)
CS 347Notes 0255 (2) Disjointness Ti Tj = for all i,j i j E(#,LOC,SAL) E 1 (#,LOC) E 2 (SAL) Not a desirable property!! (could not reconstruct R!)
CS 347Notes 0256 (3) Lossless join R i = R all i One way to achieve lossless join: Repeat key in all fragments, i.e., Key T i for all i
CS 347Notes 0257 How do we decide what attributes are grouped with which? E 1 (#,NM,LOC) E 2 (#,SAL) Example: E(#,NM,LOC,SAL)E 1 (#,NM) E 2 (#,LOC) E 3 (#,SAL) ?
CS 347Notes 0258 A 1 A 2 A 3 A 4 A 5 A A A A A Attribute affinity matrix
CS 347Notes 0259 A 1 A 2 A 3 A 4 A 5 A A A A A Attribute affinity matrix R 1 [K,A 1, A 2, A 3 ] R 2 [K,A 4, A 5 ]
CS 347Notes 0260 Textbook (Ozsu & Valduriez) discusses –How to build affinity matrix –How to identify attribute clusters –How to partition relation You are not responsible for –Clustering and partitioning algorithms (i.e., Skip pages )
CS 347Notes 0261 Allocation Example: E(#,NM,LOC,SAL) F 1 = loc=Sa E ; F 2 = loc=Sb E Qa: select … where loc=Sa... Qb: select … where loc=Sb… Site a Site b Where do F 1,F 2 go? ?
CS 347Notes 0262 Issues Where do queries originate What is communication cost? and size of answers, relations,… What is storage capacity, cost at sites? and size of fragments? What is processing power at sites?
CS 347Notes 0263 What is query processing strategy? –How are joins done? –Where are answers collected? More Issues
CS 347Notes 0264 Cost of updating copies? Writes and concurrency control?... Do we replicate fragments?
CS 347Notes 0265 Optimization problem: What is best placement of fragments and/or best number of copies to: –minimize query response time –maximize throughput –minimize “some cost” –... Subject to constraints? –Available storage –Available bandwidth, power,… –Keep 90% of response time below X –...
CS 347Notes 0266 Optimization problem: What is best placement of fragments and/or best number of copies to: –minimize query response time –maximize throughput –minimize “some cost” –... Subject to constraints? –Available storage –Available bandwidth, power,… –Keep 90% of response time below X –... This is an incredibly hard problem
CS 347Notes 0267 Example: Single fragment F Read cost: [t i MIN C ij ] i:Originating site of request t i :Read traffic at S i C ij :Retrieval cost Accessing fragment F at Sj from Si ji=1 m
CS 347Notes 0268 Scenario - Read cost i C=inf c i,3 c i,1 c i,2 Stream of read requests for F t i REQ/SEC C=inf F F F
CS 347Notes 0269 Write cost X j u i C’ ij i: Originating site of request j: Site being updated X j : 0 if F not stored at S j 1 if F stored at S j u i : Write traffic at S i C’ ij : Write cost Updating F at S j from S i i=1j=1 mm
CS 347Notes 0270 Scenario - write cost Updates u i updates/sec..... i F F F
CS 347Notes 0271 Storage cost: X i d i X i :0 if F not stored at S i 1 if F stored at S i d i : storage cost at S i i=1 m
CS 347Notes 0272 Target function: min [t i MIN C ij + X j u i C’ ij ] + X i d i j i=1 j=1 i=1 m m m
CS 347Notes 0273 Can add more complications: Examples: - Multiple fragments - Fragment sizes - Concurrency control cost
Case Study: PNUTS Where in the World is My Data? Sudarshan Kadambi, Jianjun Chen, Brian F. Cooper, David Lomax, Raghu Ramakrishnan, Adam Silberstein, Erwin Tam, Hector Garcia-Molina; VLDB 2011 Distributed object/tuple store for Yahoo! CS 347Notes 0274
Case Study: PNUTS Issue: Where to locate data Issue: What and where to replicate CS 347Notes 0275
PNUTS Discussion Dynamic vs Static fragment placement Caching vs Replication CS 347Notes 0276
Policy Constraints MIN_COPIES: The minimum number of full replicas of the record that must exist. INCL_LIST: An inclusion list -- the locations where a full replica of the record must exist. EXCL_LIST: An exclusion list -- the locations where a full replica of the record cannot exist. CS 347Notes 0277
Example Rule Rule 1: IF TABLE_NAME = "Users“ THEN SET 'MIN_COPIES' = 2 CONSTRAINT_PRI = 0 CS 347Notes 0278
Another Example Rule Rule 2: IF TABLE_NAME = "Users" AND FIELD STR('home location') = 'France‘ THEN SET 'MIN_COPIES' = 3 AND SET 'EXCL LIST' = 'USWest, USEast‘ CONSTRAINT PRI = 1 CS 347Notes 0279
CS 347Notes 0280 Summary Description of fragmentation Good fragmentations Design of fragmentation Allocation