Replication and Distribution CSE 444 Spring 2012 University of Washington.


1 Replication and Distribution CSE 444 Spring 2012 University of Washington

2 HASH MAPS

3 Hash Maps Precursors to Bloom filters. Used to reduce communication while joining. S = Set to transmit. – S = {x_1, x_2, …, x_n} H = Hash Map. – An array of m bits.

4 Operation To insert x in H: – Compute the hash of x to get a bit position j. – Set bit j to 1. To send S, insert all of its elements in H. Two distinct elements can hash to the same position. – This creates false positives.
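
The insert and lookup operations above can be sketched as follows. This is a minimal illustration, not the course's implementation: the class name `BitHashMap`, the helper `build_hash_map`, and the choice of SHA-256 reduced modulo m as the hash function are all assumptions made for the example.

```python
import hashlib


class BitHashMap:
    """An m-bit array summarizing a set, using a single hash function."""

    def __init__(self, m: int) -> None:
        self.m = m
        self.bits = bytearray((m + 7) // 8)  # m bits, packed 8 per byte

    def _position(self, x: str) -> int:
        # Assumed hash: SHA-256 of the element, reduced modulo m.
        digest = hashlib.sha256(x.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % self.m

    def insert(self, x: str) -> None:
        j = self._position(x)
        self.bits[j // 8] |= 1 << (j % 8)  # set bit j to 1

    def might_contain(self, x: str) -> bool:
        # True for every inserted element; may also be True for an
        # element that merely shares a bit position (false positive).
        j = self._position(x)
        return bool(self.bits[j // 8] & (1 << (j % 8)))


def build_hash_map(elements, m: int) -> BitHashMap:
    """Insert every element of the set to transmit into a fresh map."""
    h = BitHashMap(m)
    for x in elements:
        h.insert(x)
    return h
```

The receiver only needs `bits` (m/8 bytes) to test its own elements with `might_contain`; a lookup can never miss an inserted element, but it can report an element that was never inserted.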

5 Question Data supplier R has N = 1 million documents. Data supplier S also has N = 1 million documents. Each document is 1KB. They have 50 documents in common and they want to compute these. They will proceed as follows: 1. R computes a hash map M with cN bits, where c=8 and sends it to S. 2. S checks its items in M and sends all matches to R. 3. R computes the result and sends the matching 50 documents to S. Q: Indicate the total number of bytes transferred over the network in each step.

6 Analysis Recall |H| = m. Insert one element into H. Probability that bit j remains 0? p = 1 – 1/m. Example bit array after one insert: 00001000000000000000

7 Analysis Recall |H| = m. Insert all n elements into H. Probability that bit j remains 0? p = (1 – 1/m)^n ≈ e^{-n/m} (for large m). Example bit array: 01001101000011000100

8 Probability of False Positives Take a random element y, and check if its hash position is set to 1 in H. Probability of FP = probability that the bit is 1. Probability that bit j is 1? p = 1 – (1 – 1/m)^n ≈ 1 – e^{-n/m} (for large m). Example bit array: 01001101000011000100
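
To see how close the exponential approximation is, the two expressions can be compared numerically. The function names `fp_exact` and `fp_approx` are illustrative, not from the slides.

```python
import math


def fp_exact(n: int, m: int) -> float:
    """Probability that a given bit is 1 after n inserts into m bits."""
    return 1.0 - (1.0 - 1.0 / m) ** n


def fp_approx(n: int, m: int) -> float:
    """Large-m approximation: 1 - e^(-n/m)."""
    return 1.0 - math.exp(-n / m)


if __name__ == "__main__":
    # Keep n/m fixed at 1/8 and grow m to watch the two values converge.
    for m in (8, 80, 8_000, 8_000_000):
        n = m // 8
        print(m, fp_exact(n, m), fp_approx(n, m))
```

For a fixed ratio n/m the approximation does not change, while the exact value approaches it as m grows.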

9 Question Data supplier R has N = 1 million documents. Data supplier S also has N = 1 million documents. Each document is 1KB. They have 50 documents in common and they want to compute these. They will proceed as follows: 1. R computes a hash map M with cN bits, where c=8 and sends it to S. 2. S checks its items in M and sends all matches to R. 3. R computes the result and sends the matching 50 documents to S. Indicate the total number of bytes transferred over the network in each step.

10 Solution Step 1: Send the hash map. cN bits = 8 million bits = 1 million bytes = 1 MB. Step 2: Send the matched documents (including false positives). FP rate = 1 – e^{-n/m} = 1 – e^{-1/8} ≈ 11.75%, taken here as roughly 11%. That gives about 110,000 false-positive documents, or 110,050 documents in total (including the 50 common ones) = 110.05 MB. Step 3: Send the 50 common documents = 50 KB. Total: about 111.1 MB. The naïve solution without hash maps transfers 1 GB (1 million documents of 1 KB each).
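
The arithmetic in the solution can be reproduced with a short script. This is a sketch under stated assumptions: 1 KB is taken as 1000 bytes, the false-positive rate is applied to all N of S's documents as the solution does, and the function name `transfer_bytes` is made up for the example.

```python
import math

N = 1_000_000      # documents held by each supplier
DOC_BYTES = 1_000  # assumption: 1 KB = 1000 bytes
C = 8              # hash map bits per document
COMMON = 50        # documents the two suppliers share


def transfer_bytes(fp_rate: float):
    """Return the bytes sent in steps 1, 2 and 3 for a given FP rate."""
    step1 = C * N // 8                            # hash map: cN bits
    false_positives = round(fp_rate * N)          # rate applied to all N
    step2 = (false_positives + COMMON) * DOC_BYTES
    step3 = COMMON * DOC_BYTES
    return step1, step2, step3


if __name__ == "__main__":
    rounded = transfer_bytes(0.11)                     # the 11% figure
    unrounded = transfer_bytes(1 - math.exp(-1 / C))   # 1 - e^(-n/m)
    print("11% rate:", rounded, "total", sum(rounded))
    print("unrounded rate:", unrounded, "total", sum(unrounded))
    print("naive transfer:", N * DOC_BYTES)
```

Using the unrounded rate raises the step 2 figure by a few megabytes, but either way the hash map cuts the transfer to roughly a ninth of the naive 1 GB.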

11 DISTRIBUTED LOCKING

12 Setup Five sites: one with 50% read only, 2% writes; four with 10% read only, 2% writes each (percentages of the total workload, summing to 100%). Each site can communicate with every other site.

13 Read-locks-one, write-locks-all What is the average number of inter-site messages exchanged? All reads are local, so reads need no inter-site lock messages. Each write must also lock the copies at the 4 other sites.

14 Majority locking What is the average number of inter-site messages? Both reads and writes need locks from 2 other sites. What if you could broadcast across sites with 1 message? A lock request or a lock release is then 1 message for all sites. Lock grants still take 1 message per site.

15 Primary-copy locking What is the average number of inter-site messages? Operations issued at non-primary sites must request their locks from the primary site. With the busiest site as the primary, 48% of the operations need inter-site lock requests.
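
The three locking schemes can be compared with a small calculation. This is only a sketch, and it rests on assumptions the slides do not state: the setup is read as five sites (one issuing 50% reads and 2% writes, four issuing 10% reads and 2% writes each), every remote lock is assumed to cost 3 messages (request, grant, release), with broadcast only the grants that are actually needed are counted, and the primary copy is assumed to live at the busiest site. All names are illustrative.

```python
import math

# Assumed workload: fraction of all operations issued at each site.
READS = [0.50, 0.10, 0.10, 0.10, 0.10]
WRITES = [0.02, 0.02, 0.02, 0.02, 0.02]
N_SITES = len(READS)
MSGS_PER_REMOTE_LOCK = 3  # assumption: request + grant + release


def read_one_write_all() -> float:
    """Reads are local; each write locks the copies at all other sites."""
    return sum(WRITES) * (N_SITES - 1) * MSGS_PER_REMOTE_LOCK


def majority(broadcast: bool = False) -> float:
    """Every operation locks a majority: the local copy plus n // 2 others."""
    remote = N_SITES // 2
    if broadcast:
        # One broadcast request, one grant per remote lock, one broadcast release.
        per_operation = 1 + remote + 1
    else:
        per_operation = remote * MSGS_PER_REMOTE_LOCK
    return (sum(READS) + sum(WRITES)) * per_operation


def primary_copy(primary: int = 0) -> float:
    """Operations issued away from the primary need one remote lock there."""
    remote_share = sum(
        r + w for i, (r, w) in enumerate(zip(READS, WRITES)) if i != primary
    )
    return remote_share * MSGS_PER_REMOTE_LOCK


if __name__ == "__main__":
    print("read-locks-one, write-locks-all:", read_one_write_all())
    print("majority:", majority())
    print("majority with broadcast:", majority(broadcast=True))
    print("primary copy:", primary_copy())
```

Changing `MSGS_PER_REMOTE_LOCK` or the `primary` argument shows how sensitive each average is to the message-cost model and to where the primary copy is placed.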

16 TWO PHASE COMMIT

17 Two-Phase Commit Coordinator : 0 Three subordinates : {1, 2, 3} Messages – P (Prepare) – C (Commit) – A (Abort) – Y (Yes vote) – N (No vote) – Ignore acks.

18 2PC What messages are exchanged for a successful commit? – (0,1,P), (0,2,P), (0,3,P), (1,0,Y), (2,0,Y), (3,0,Y), (0,1,C), (0,2,C), (0,3,C) When exactly does the commit occur? – When the coordinator force-writes the commit record to its log.
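
The message sequence above can be generated by a small simulation that uses the same (sender, receiver, message) triples. This is a simplified sketch, not a full protocol: it assumes the coordinator collects every vote before deciding and sends its decision to all subordinates, and it ignores logging, timeouts, failures and acks. The function name `two_phase_commit` is illustrative.

```python
def two_phase_commit(votes):
    """Simulate one failure-free 2PC round.

    votes maps each subordinate id to "Y" or "N".
    Returns (messages, decision) with messages as (sender, receiver, type).
    """
    coordinator = 0
    subordinates = sorted(votes)

    # Phase 1: the coordinator sends Prepare, each subordinate votes.
    messages = [(coordinator, s, "P") for s in subordinates]
    messages += [(s, coordinator, votes[s]) for s in subordinates]

    # Commit only if every vote is Yes; otherwise abort.
    decision = "C" if all(votes[s] == "Y" for s in subordinates) else "A"

    # A real coordinator force-writes its decision record to the log here,
    # before sending anything; for "C" that write is the commit point.

    # Phase 2: the decision goes to every subordinate (a simplification).
    messages += [(coordinator, s, decision) for s in subordinates]
    return messages, decision


if __name__ == "__main__":
    trace, outcome = two_phase_commit({1: "Y", 2: "Y", 3: "Y"})
    print(outcome, trace)
```

Passing a vote map that contains an "N" yields the abort trace with the same message shape.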

19 2PC (continued) – If the coordinator has sent all the prepare messages but has not yet received a vote from site 1, can it abort the transaction at this point and send abort messages to the subordinates? – If the coordinator has sent all the prepare messages and received a No vote from site 1, but has not yet received the votes of sites 2 and 3, should it wait for the two missing votes, or should it proceed to abort? – If site 1 has received a prepare message and voted Yes, but has not received any commit or abort message, and site 1 contacts all other subordinates and discovers that they have all voted Yes, can site 1 commit the transaction?

