Presentation is loading. Please wait.

Presentation is loading. Please wait.

P2p, Fall 06 1 Topics in Database Systems: Data Management in Peer-to-Peer Systems.

Similar presentations


Presentation on theme: "P2p, Fall 06 1 Topics in Database Systems: Data Management in Peer-to-Peer Systems."— Presentation transcript:

1 p2p, Fall 06 1 Topics in Database Systems: Data Management in Peer-to-Peer Systems

2 p2p, Fall 06 2 Agenda για σήμερα 1. Σύντομη περίληψη για replication σε αδόμητα p2p συστήματα 2. Επιδημικοί Αλγόριθμοι για Ενημερώσεις Αντιγράφων (Demers et al paper + μια εφαρμογή) 3. Τρία παραδείγματα αδόμητων συστημάτων α. GIA b. KAZAA c. Bittorent Την επόμενη Πέμπτη: Freenet, Pastry, eDonkey + Ένα παράδειγμα μια p2p database (PIER)

3 p2p, Fall 06 3 Reasons for Replication  Performance load balancing locality: place copies close to the requestor geographic locality (more choices for the next step in search) reduce number of hops  Availability In case of failures Peer departures

4 p2p, Fall 06 4 Replication Theory: Replica Allocation Policies in Unstructured P2P Systems E. Cohen and S. Shenker, “Replication Strategies in Unstructured Peer-to-Peer Networks”. SIGCOMM 2002 Q. Lv et al, “Search and Replication in Unstructured Peer-to-Peer Networks”, ICS’02 – Replication Part και τα δυο αναφέρονται σε “performance”

5 p2p, Fall 06 5 Question: how to use replication to improve search efficiency in unstructured networks? Replication: Allocation Scheme How many copies of each object so that the search overhead for the object is minimized, assuming that the total amount of storage for objects in the network is fixed

6 p2p, Fall 06 6 Replication Theory - Model Assume m objects and n nodes Each node capacity ρ, total capacity R = n ρ How to allocate R among the m objects? Determine r i number of copies (distinct nodes) that hold a copy of i Σ i=1, m r i = R (R total capacity) Also, p i = r i /R – Fraction of total capacity allocated to i Allocation represented by the vector (p 1, p 2, …. p m ) = (r 1 /R, r 2 /R, r m /R)

7 p2p, Fall 06 7 Replication Theory - Model Assume that object i is requested with relative rates q i, we normalize it by setting Σ i=1, m q i = 1 For convenience, assume 1 << r i  n and that q 1  q 2  …  q m Map the query distribution q to an allocation vector p Bounds for p i At least one copy, r i  1, Lower value l = 1/R At most n copies, r i  n, Upper value, u = n/R

8 p2p, Fall 06 8 Replication Theory Assume that searches go on until a copy is found We want to determine r i that minimizes the average search size (number of nodes probed) to locate an item i Need to compute average search size per item Searches consist of randomly probing sites until the desired object is found: search at each step draws a node uniformly at random and asks whether it has a copy

9 p2p, Fall 06 9 Replication Theory A i : Expectation (average search size) for object i is the inverse of the fraction of sites that have replicas of the object A i = n/r i The average search size A of all the objects (average number of nodes probed per object query) A = Σ i q i A i = n Σ i q i /r i Minimize: A = n Σ i q i /r i

10 p2p, Fall 06 10 Replication Theory Minimize: Σ i q i /p i Subject to Σp i = 1 and l  p i  u Monotonicity Since q 1  q 2  …  q m, we must have p 1  p 2  …  p m More copies to more popular, but how many?

11 p2p, Fall 06 11 Uniform Replication Create the same number of replicas for each object r i = R/m Average search size for uniform replication A i = n/r i = m/ρ A uniform = Σ i q i m/ρ = m/ρ (m n/R) Which is independent of the query distribution

12 p2p, Fall 06 12 Proportional Replication Create a number of replicas for each object proportional to the query rate r i = R q i

13 p2p, Fall 06 13 Proportional Replication Number of replicas for each object: r i = R q i Average search size for uniform replication A i = n/r i = n/R q i A proportioanl = Σ i q i n/R q i = m/ρ = A uniform again independent of the query distribution Why? Objects whose query rate are greater than average (>1/m) do better with proportional, and the other do better with uniform The weighted average balances out to be the same Create a number of replicas for each object proportional to the query rate r i = R q i

14 p2p, Fall 06 14 Uniform and Proportional Replication Summary: Uniform Allocation: p i = 1/m Simple, resources are divided equally Proportional Allocation: p i = q i “Fair”, resources per item proportional to demand Reflects current P2P practices Example: 3 items, q 1 =1/2, q 2 =1/3, q 3 =1/6 UniformProportional

15 p2p, Fall 06 15 Space of Possible Allocations Definition: Allocation p 1, p 2, p 3,…, p m is “in-between” Uniform and Proportional if for 1< i <m, q i+1 /q i < p i+1 /p i < 1 (=1 for uniform, = for proportial, we want to favor popular but not too much) Theorem1: All (strictly) in-between strategies are (strictly) better than Uniform and Proportional Theorem2: p is worse than Uniform/Proportional if for all i, p i+1 /p i > 1 (popular gets less) OR for all i, q i+1 /q i > p i+1 /p i (less popular gets less than “fair share”) Proportional and Uniform are the worst “reasonable” strategies

16 p2p, Fall 06 16 Square-Root Replication Find r i that minimizes A, A = Σ i q i A i = n Σ i q i /r i This is done for r i = λ √q i where λ = R/ Σ i √ q i Then the average search size is A optimal = 1/ρ ( Σ i √q i ) 2

17 p2p, Fall 06 17 How much can we gain by using SR ? Zipf-like query rates A uniform /A SR

18 p2p, Fall 06 18 Other Metrics: Discussion Utilization rate, the rate of requests that a replica of an object i receives U i = R q i /r i  For uniform replication, all objects have the same average search size, but replicas have utilization rates proportional to their query rates  Proportional replication achieves perfect load balancing with all replicas having the same utilization rate, but average search sizes vary with more popular objects having smaller average search sizes than less popular ones

19 p2p, Fall 06 19 Replication: Summary

20 p2p, Fall 06 20 What is the search size of a query?  Soluble queries: number of probes until answer is found.  Insoluble queries: maximum search size  Query is soluble if there are sufficiently many copies of the item.  Query is insoluble if item is rare or non existent. Assumption that there is at least one copy per object

21 p2p, Fall 06 21 SR is best for soluble queries Uniform minimizes cost of insoluble queries OPT is a hybrid of Uniform and SR Tuned to balance cost of soluble and insoluble queries: uniformly allocate a minimum number of copies per item, use SR for the rest What is the optimal strategy?

22 p2p, Fall 06 22 We now know what we need. How do we get there?

23 p2p, Fall 06 23 Replication Algorithms Fully distributed where peers communicate through random probes; minimal bookkeeping; and no more communication than what is needed for search. Converge to/obtain SR allocation when query rates remain steady. Uniform and Proportional are “easy” – Uniform: When item is created, replicate its key in a fixed number of hosts. – Proportional: for each query, replicate the key in a fixed number of hosts (need to know or estimate the query rate) Desired properties of algorithm:

24 p2p, Fall 06 24 Replication Algorithms Uniform and Proportional are “easy” – Uniform: When item is created, replicate its key in a fixed number of hosts. – Proportional: for each query, replicate the key in a fixed number of hosts (need to know or estimate the query rate)

25 p2p, Fall 06 25 Replication Algorithms Fully distributed where peers communicate through random probes; minimal bookkeeping; and no more communication than what is needed for search. Converge to/obtain SR allocation when query rates remain steady. Desired properties of algorithm:

26 p2p, Fall 06 26 Achieving Square-Root Replication How can we achieve square-root replication in practice?  Assume that each query keeps track of the search size  Each time a query is finished the object is copied to a number of sites proportional to the number of probes On average object i will be replicated on c n/r i times each time a query is issued (for some constant c) It can be shown that this gives square root

27 p2p, Fall 06 27 Achieving Square-Root Replication What about replica deletion? Steady state: creation time equal with the deletion time The lifetime of replicas must be independent of object identity or query rate FIFO or random deletions is ok LRU or LFU no

28 p2p, Fall 06 28 Replication Thus, for Square-root replication an object should be replicated at a number of nodes that is proportional to the number of probes that the search required

29 p2p, Fall 06 29 Replication - Implementation Two strategies are popular Owner Replication When a search is successful, the object is stored at the requestor node only (used in Gnutella) Path Replication When a search succeeds, the object is stored at all nodes along the path from the requestor node to the provider node (used in Freenet) Following the reverse path back to the requestor

30 p2p, Fall 06 30 Replication - Implementation If a p2p system uses k-walkers, the number of nodes between the requestor and the provider node is 1/k of the total nodes visited (number of probes) Then, path replication should result in square-root replication Problem: Tends to replicate nodes that are topologically along the same path

31 p2p, Fall 06 31 Replication - Implementation Random Replication When a search succeeds, we count the number of nodes on the path between the requestor and the provider Say p Then, randomly pick p of the nodes that the k walkers visited to replicate the object Harder to implement

32 p2p, Fall 06 32 Experimental Evaluation Both path and random replication generates replication ratios quite close to square-root of query rates Path replication and random replication reduces the overall message traffic by a factor of 3 to 4 respectively Much of the traffic reduction comes from reducing the number of hops Path and random, better than owner For example, queries that finish with 4 hops, 71% owner, 86% path, 91% random

33 p2p, Fall 06 33 Replication & Unstructured P2P epidemic algorithms

34 p2p, Fall 06 34 Reasons for Replication Besides storage, cost associated with replication: Consistency Maintenance

35 p2p, Fall 06 35 Methods for spreading updates: Push: originate from the site where the update appeared To reach the sites that hold copies Pull: the sites holding copies contact the master site Epidemics for spreading updates

36 p2p, Fall 06 36 Update at a single site Randomized algorithms for distributing updates and driving replicas towards consistency Ensure that the effect of every update is eventually reflected to all replicas: Sites become fully consistent only when all updating activity has stopped and the system has become quiescent Analogous to epidemics A. Demers et al, Epidemic Algorithms for Replicated Database Maintenance, SOSP 87

37 p2p, Fall 06 37 Methods for spreading updates: Direct mail (server-initiated) : each new update is immediately mailed from its originating site to all other sites (+) Timely & reasonably efficient (-) Not all sites know all other sites (stateless) (-) Mails may be lost Anti-entropy : every site regularly chooses another site at random and by exchanging content resolves any differences between them (+) Extremely reliable but requires exchanging content and resolving updates (-) Propagates updates much more slowly than direct mail

38 p2p, Fall 06 38 Methods for spreading updates: Rumor mongering :  Sites are initially “ignorant”; when a site receives a new update it becomes a “hot rumor”  While a site holds a hot rumor, it periodically chooses another site at random and ensures that the other site has seen the update  When a site has tried to share a hot rumor with too many sites that have already seen it, the site stops treating the rumor as hot and retains the update without propagating it further Rumor cycles can be more frequent that anti-entropy cycles, because they require fewer resources at each site, but there is a chance that an update will not reach all sites

39 p2p, Fall 06 39 Anti-entropy and rumor spreading are examples of epidemic algorithms Three types of sites:  Infective: A site that holds an update that is willing to share is hold  Susceptible: A site that has not yet received an update  Removed: A site that has received an update but is no longer willing to share Anti-entropy: simple epidemic where all sites are always either infective or susceptible

40 p2p, Fall 06 40 A set S of n sites, each storing a copy of a database The database copy at site s  S is a time varying partial function s.ValueOf: K  {u:V x t :T} set of keys set of values set of timestamps (totally ordered by < V contains the element NIL s.ValueOf[k] = {NIL, t}: item with k has been deleted from the database Assume, just one item s.ValueOf  {u:V x t:T} thus, an ordered pair consisting of a value and a timestamp The first component may be NIL indicating that the item was deleted by the time indicated by the second component Το paper αναφέρεται σε ανταλλαγή όλου του περιεχομένου των κόμβων

41 p2p, Fall 06 41 The goal of the update distribution process is to drive the system towards  s, s’  S: s.ValueOf = s’.ValueOf Operation invoked to update the database Update[u:V] s.ValueOf {r, Now{})

42 p2p, Fall 06 42 Direct Mail At the site s where an update occurs: For each s’  S PostMail[to:s’, msg(“Update”, s.ValueOf) Each site s’ receiving the update message: (“Update”, (u, t)) If s’.ValueOf.t < t s’.ValueOf  (u, t)  The complete set S must be known to s (stateful server)  PostMail messages are queued so that the server is not delayed (asynchronous), but may fail when queues overflow or their destination are inaccessible for a long time  n (number of sites) messages per update  traffic proportional to n and the average distance between sites s originator of the update s’ receiver of the update

43 p2p, Fall 06 43 Anti-Entropy At each site s periodically execute: For some s’  S ResolveDifference[s, s’] Three ways to execute ResolveDifference : Push (sender (server) - driven) If s.Valueof.t > s’.Valueof.t s’.ValueOf  s.ValueOf Pull (receiver (client) – driven) If s.Valueof.t < s’.Valueof.t s.ValueOf  s’.ValueOf Push-Pull s.Valueof.t > s’.Valueof.t  s’.ValueOf  s.ValueOf s.Valueof.t < s’.Valueof.t  s.ValueOf  s’.ValueOf s  s’ s pushes its value to s’ s pulls s’ and gets s’ value

44 p2p, Fall 06 44 Anti-Entropy Assume that  Site s’ is chosen uniformly at random from the set S  Each site executes the anti-entropy algorithm once per period Αποδεικνύεται ότι,  An update will eventually infect the entire population  ξεκινώντας από έναν μολυσμένο (infected) κόμβο, αυτό επιτυγχάνεται σε χρόνο ανάλογο to the log of the population size η σταθερά της αναλογίας εξαρτάται από το αν θα χρησιμοποιηθεί push ή pull

45 p2p, Fall 06 45 Anti-Entropy Let p i be the probability of a site remaining susceptible (has not received the update) after the i cycle of anti-entropy (θέλουμε να τείνει στο 0 όσο το δυνατόν πιο γρήγορα) For pull, A site remains susceptible after the i+1 cycle, if (a) it was susceptible after the i cycle and (b) it contacted a susceptible site in the i+1 cycle p i+1 = (p i ) 2 For push, A site remains susceptible after the i+1 cycle, if (a) it was susceptible after the i cycle and (b) no infectious site choose to contact in the i+1 cycle p i+1 = p i (1 – 1/n) n(1-p i ) p i+1 = p i e -1 1 – 1/n (site is not contacted by a node) n(1-p i ) number of infectious nodes at cycle i Pull is preferable than push

46 p2p, Fall 06 46 Anti-Entropy Θεωρεί ότι οι κόμβοι ανταλλάσσουν όλο το περιεχόμενο τους – οπότε υπάρχει το θέμα τι στέλνουμε στο δίκτυο και πως συγκρίνουμε τα στιγμιότυπα  Use checksums Ok, αν τα checksums συνήθως συμφωνούν  + A list of recent updates for which (now – timestamp) < threshold τ Compare fist recent updates, update databases and the ckecksums and then compare the updated checksums, choice of τ  Maintain an inverted list of updates ordered by timestamp Perform anti-entropy by exchanging timestamps at reverse timestamp order until their checksums agree

47 p2p, Fall 06 47 Complex Epidemics: Rumor Spreading  Initial State: n individuals initially inactive (susceptible) Rumor planting&spreading:  We plant a rumor with one person who becomes active (infective), phoning other people at random and sharing the rumor  Every person bearing the rumor also becomes active and likewise shares the rumor  When an active individual makes an unnecessary phone call (the recipient already knows the rumor), then with probability 1/k the active individual loses interest in sharing the rumor (becomes removed) We would like to know:  How fast the system converges to an inactive state (no one is infective)  The percentage of people that know the rumor when the inactive state is reached

48 p2p, Fall 06 48 Complex Epidemics: Rumor Spreading Let s, i, r be the fraction of individuals that are susceptible, infective and removed s + i + r = 1 ds/dt = - s i di/dt = s i – 1/k(1-s) i s = e –(k+1)(1-s) An exponential decrease of s with k For k = 1, 20% miss the rumor For k = 2, only 6% miss it Unnecessary phone calls

49 p2p, Fall 06 49 Residue The value of s when i is zero: the remaining susceptible when the epidemic finishes Traffic m = Total update traffic / Number of sites Delay  Average delay (t avg ): difference between the time of the initial injection of an update and the arrival of the update at a given site averaged over all sites  The delay until (t last ) the reception by the last site that will receive the update during an epidemic Criteria to characterize epidemics

50 p2p, Fall 06 50 Blind vs. Feedback Feedback variation: a sender loses interest only if the recipient knows the rumor Blind variation: a sender loses interest with probability 1/k regardless of the recipient Counter vs. Coin Instead of losing interest with probability 1/k, use a counter so that we loose interest only after k unnecessary contacts s = e -m There are nm updates sent The probability that a single site misses all these updates is (1 – 1/n) nm Simple variations of rumor spreading m is the traffic Counters and feedback improve the delay, with counters playing a more significant role Όλες την ίδια σχέση μεταξύ traffic και residue

51 p2p, Fall 06 51 Push vs. Pull Pull converges faster If there are numerous independent updates, a pull request is likely to find a source with a non-empty rumor list If the database is quiescent, the push phase ceases to introduce traffic overhead, while the pull continues to inject useless requests for updates Simple variations of rumor spreading Counter, feedback and pull work better

52 p2p, Fall 06 52 Minimization Use a push and pull together, if both sites know the update, only the site with the smaller counter is incremented Connection Limit A site can be the recipient of more than one push in a cycle, while for pull, a site can service an unlimited number of requests What if we set a limit:  Push gets better (reduce traffic, since the spread grows exponentially, most traffic occurs at the end)  Pull gets worst

53 p2p, Fall 06 53 Hunting If a connection is rejected, then the choosing site can “hunt” for alternate sites push and pull similar – if connection limit 1 and infinite hunt

54 p2p, Fall 06 54 Complex Epidemic and Anti-entropy Anti-entropy can be run infrequently to back-up a complex epidemic, so that every update eventually reaches (or is suspended at) every site What happens when an update is discovered during anti- entropy: use rumor mongering (e.g., make it a hot rumor) or direct mail

55 p2p, Fall 06 55 Deletion and Death Certificates Replace deleted items with death certificates which carry timestamps and spread like ordinary data When old copies of deleted items meet death certificates, the old items are removed. But when to delete death certificates?

56 p2p, Fall 06 56 Dormant Death Certificates Define some threshold (but some items may be resurrected “re- appear”) If the death certificate is older than the expected time required to propagate it to all sites, then the existence of an obsolete copy of the corresponding data item is unlikely Delete very old certificates at most sites, retaining “dormant” copies at only a few sites (like antibodies) Use two thresholds, t1 and t2 + a list of r retention sites names with each death certificate (chosen at random when the death certificate is created) Once t1 is reached, all servers but the servers in the retention list delete the death certificate Dormant death certificates are deleted when t1 + t2 is reached

57 p2p, Fall 06 57 Anti-Entropy with Dormant Death Certificates Whenever a dormant death certificate encounters an obsolete data item, it must be “activated”

58 p2p, Fall 06 58 How to choose partners Consider spatial distributions in which the choice tends to favor nearby servers Spatial Distribution

59 p2p, Fall 06 59 Spatial Distribution The cost of sending an update to a nearby site is much lower that the cost of sending the update to a distant site Favor nearby neighbors Trade off between: Average traffic per link and Convergence times Example: linear network, only nearest neighbor: O(1) and O(n) vs uniform random connections: O(n) and O(log n) Determine the probability of connecting to a site at distance d For spreading updates on a line, d -2 distribution: the probability of connecting to a site at distance d is proportional to d -2 In general, each site s independently choose connections according to a distribution that is a function of Q s (d), where Q s (d) is the cumulative number of sites at distance d or less from s

60 p2p, Fall 06 60 Spatial Distribution and Anti-Entropy Extensive simulation on the actual topology with a number of different spatial distributions A different class of distributions less sensitive to sudden increases of Q s (d) Let each site s build a list of the other sites sorted by their distances from s Select anti-entropy exchange partners from the sorted list according to a function f(i), where i is its position on the list (averaging the probabilities of selecting equidistant sites) Non-uniform distribution induce less overload on critical links

61 p2p, Fall 06 61 Spatial Distribution and Rumors Anti-entropy converges with probability 1 for a spatial distribution such that for every pair (s’, s) of sites there is a nonzero probability that s will choose to exchange data with s’ However, rumor mongering is less robust against changes in spatial distributions and network topology As the spatial distribution is made less uniform, we can increase the value of k to compensate


Download ppt "P2p, Fall 06 1 Topics in Database Systems: Data Management in Peer-to-Peer Systems."

Similar presentations


Ads by Google