Published by Rebekah Hensell. Modified over 9 years ago.
1
Data Currency in Replicated DHTs
Reza Akbarinia, Esther Pacitti and Patrick Valduriez
University of Nantes, France, INRIA
ACM SIGMOD 2007
Presenter: Jerry Wu
2
Motivation
P2P data sharing systems
–Enable a large number of users to share a massive number of files
–Steps shown: Query, Reply, Send request, Download
Message forwarding on these systems
–Flooding: KaZaA, Gnutella
–DHT: CAN, Chord, Pastry, etc.
3
Distributed Hash Table (DHT)
Use hash functions to locate files
–h(metadata) = k (for identification)
–g(k) = k_1 (for routing)
[Figure: peers A, B, C, D, E, F and a user U; the metadata "FreeLoop.mp3" is hashed and g(k) = k_1 routes to peer A]
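The two hash functions on this slide can be sketched as follows. This is a minimal illustration, not the routing scheme of CAN, Chord or Pastry: the use of SHA-1, the peer list, and the modulo placement in `g` are all assumptions made for the example, and `g` returns the responsible peer directly instead of an intermediate routing key k_1.

```python
import hashlib

# Hypothetical peer names, taken from the figure labels on the slide.
PEERS = ["A", "B", "C", "D", "E", "F"]

def h(metadata: str) -> int:
    """Identification hash: map file metadata to a key k (SHA-1 is an assumption)."""
    return int.from_bytes(hashlib.sha1(metadata.encode()).digest(), "big")

def g(k: int, peers: list) -> str:
    """Routing hash, simplified: pick the peer responsible for key k.

    Real DHTs route through an overlay; modulo placement is only a stand-in.
    """
    return peers[k % len(peers)]

k = h("FreeLoop.mp3")
responsible = g(k, PEERS)
print(f"FreeLoop.mp3 is stored at peer {responsible}")
```

The lookup is deterministic, so any peer that knows the metadata can recompute `k` and find the same responsible peer.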
4
Data Replication
What if node A fails?
Duplicate several copies
–g(h(FreeLoop.mp3)) = k_1 (A)
–g_2(h(FreeLoop.mp3)) = k_2 (D)
–g_3(h(FreeLoop.mp3)) = k_3 (E)
[Figure: peers A, B, C, D, E, F and a user U; replicas of "FreeLoop.mp3" are stored under k_1, k_2 and k_3 at peers A, D and E]
5
Basic Operations
put(meta key k, File D)
–Insert a file into the DHT
get(meta key k)
–Retrieve the file from the DHT
H_r = { g | g is used as a hash function }
|H_r|: the replication level of the system
Each file will be stored at |H_r| peers
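A rough sketch of replicated put and get, assuming an in-memory dictionary per peer and a set of salted SHA-1 hash functions, named `H_r` here. The peer names, `make_hash`, and the salting scheme are hypothetical; with so few peers, two hash functions may pick the same peer, so the number of distinct copies can be lower than the replication level.

```python
import hashlib

PEERS = ["A", "B", "C", "D", "E", "F"]      # hypothetical peers
store = {p: {} for p in PEERS}              # local storage of each peer

def make_hash(salt: int):
    """Build one replication hash function g (salted SHA-1 is an assumption)."""
    def g(k: str) -> str:
        digest = hashlib.sha1(f"{salt}:{k}".encode()).digest()
        return PEERS[int.from_bytes(digest, "big") % len(PEERS)]
    return g

# Set of replication hash functions; its size is the replication level.
H_r = [make_hash(i) for i in range(3)]

def put(g, k: str, data: bytes) -> None:
    """Store data under key k at the peer chosen by hash function g."""
    store[g(k)][k] = data

def get(g, k: str):
    """Fetch the copy of key k held by the peer chosen by g, or None."""
    return store[g(k)].get(k)

# Replicate the file once per hash function in H_r.
for g in H_r:
    put(g, "FreeLoop.mp3", b"audio bytes")
```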
6
Additional Problems
If the owner can modify the data …
The nature of P2P systems
–Peers can join and leave dynamically
What happens to an update if some peers depart and rejoin later?
What about concurrent updates?
7
Solution
What if we have a timestamp for each update/insert transaction?
–The currency of the file is judged by its timestamp
–FileX = File + timestamp
–Put (k, FileX) instead of (k, File) into the DHT
Then we know the freshness of the file
Only the latest update can succeed
8
How Can We Get a Timestamp?
KTS (Key-based Timestamp Service)
–Issues timestamps for each transaction
–gen_ts(key k): generate a timestamp w.r.t. key k
–last_ts(key k): return the last timestamp issued for key k
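The two KTS operations can be illustrated with one counter per key. This is a single-process stand-in written for illustration; the class name `KTS` and the choice of integer counters starting at 1 are assumptions, not the presented implementation.

```python
class KTS:
    """Toy key-based timestamp service: one monotonically increasing counter per key."""

    def __init__(self) -> None:
        self._counters = {}

    def gen_ts(self, k: str) -> int:
        """Issue a new timestamp for key k, larger than any issued before for k."""
        self._counters[k] = self._counters.get(k, 0) + 1
        return self._counters[k]

    def last_ts(self, k: str) -> int:
        """Return the last timestamp issued for key k (0 if none was issued yet)."""
        return self._counters.get(k, 0)

kts = KTS()
first = kts.gen_ts("k")    # timestamp for the first insert of key "k"
second = kts.gen_ts("k")   # timestamp for a later update of key "k"
```

Because timestamps are issued per key, updates to different keys do not contend for a shared counter.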
9
The New DHT Functions
Based on the KTS service
Insert(key k, FileX D, hash function set H_r)
–Insert or update a file with identity key k in the DHT
Retrieve(k, H_r)
–Retrieve the latest copy of the file with identity key k
10
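Insert a File
–h(P.avi) = k
–KTS timestamp service: gen_ts(k) = t_A
–g(k) = k_1 (A): put_g(k, (t_A, P.avi))
–g_2(k) = k_2 (C): put_g2(k, (t_A, P.avi))
[Figure: peers A to H, a user U and the KTS timestamp service; U inserts P.avi, which is stored with timestamp t_A at peers A and C]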
11
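Retrieve a File
–h(P.avi) = k
–KTS timestamp service: last_ts(k) = t_A
–g(k) = k_1 (A): get_g(k)
–g_2(k) = k_2 (C): get_g2(k)
–Replicas returned: (t_0, P.avi) and (t_A, P.avi); the one stamped t_A matches last_ts(k), so it is the current copy
[Figure: peers A to H, a user U and the KTS timestamp service; U requests P.avi from peers A and C]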
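The insert and retrieve walkthroughs above can be sketched together. Everything here is a simplified, single-process assumption: the peer list, the salted hash functions in `H_r`, and the dictionary-backed timestamp counters are hypothetical stand-ins. Returning the newest replica found when no current one is reachable is also an assumption of this sketch.

```python
import hashlib

PEERS = ["A", "B", "C", "D", "E", "F", "G", "H"]   # hypothetical peers
store = {p: {} for p in PEERS}                     # local storage of each peer
counters = {}                                      # stand-in for the KTS service

def gen_ts(k: str) -> int:
    counters[k] = counters.get(k, 0) + 1
    return counters[k]

def last_ts(k: str) -> int:
    return counters.get(k, 0)

def make_hash(salt: int):
    def g(k: str) -> str:
        digest = hashlib.sha1(f"{salt}:{k}".encode()).digest()
        return PEERS[int.from_bytes(digest, "big") % len(PEERS)]
    return g

H_r = [make_hash(i) for i in range(2)]             # replication hash functions

def insert(k: str, data: str, hashes) -> int:
    """Stamp the data once, then write (timestamp, data) through every hash function."""
    ts = gen_ts(k)
    for g in hashes:
        store[g(k)][k] = (ts, data)
    return ts

def retrieve(k: str, hashes):
    """Probe replicas one by one; stop at the first whose timestamp equals last_ts(k)."""
    latest = last_ts(k)
    best = None
    for g in hashes:
        replica = store[g(k)].get(k)
        if replica is None:
            continue
        if replica[0] == latest:
            return replica                         # current copy found
        if best is None or replica[0] > best[0]:
            best = replica                         # remember the freshest stale copy
    return best

insert("P.avi", "version 1", H_r)
insert("P.avi", "version 2", H_r)
current = retrieve("P.avi", H_r)
```

The early return is what lets a reader stop before contacting every replica: one matching timestamp is enough to prove the copy is current.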
12
Update a File
put_g(k, (ts_x, File D))
If (ts_x > ts_0) then
–Update File D

Key    TS      File
k      ts_0    File D (P.avi)
k_1    ts_1    File D_1 (X.mp3)
k_2    ts_2    File D_2 (Y.m4v)
k_3    ts_3    File D_3 (Z.tar)
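The timestamp check a peer applies on put can be sketched like this. The function name `put_if_newer` and the dictionary used as the peer's local table are hypothetical; the comparison itself is the ts_x > ts_0 rule from the slide.

```python
def put_if_newer(table: dict, k: str, ts_x: int, data: str) -> bool:
    """Apply the write only if ts_x is newer than the stored timestamp ts_0.

    table maps key -> (timestamp, data). Returns True if the write was applied.
    """
    current = table.get(k)
    if current is None or ts_x > current[0]:
        table[k] = (ts_x, data)
        return True
    return False

# Local table of one peer, holding key "k" with timestamp 5.
table = {"k": (5, "P.avi at ts 5")}

applied_new = put_if_newer(table, "k", 7, "P.avi at ts 7")   # newer: accepted
applied_old = put_if_newer(table, "k", 6, "P.avi at ts 6")   # late, older: rejected
```

Rejecting older timestamps makes the outcome independent of message arrival order, which is why only the latest update can succeed even when writes are concurrent.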
13
Retrieval Cost Analysis
$C = C_{kts} + N \cdot C_{ret}$
$C_{kts} = C_{ret} = O(\log n)$, where n = number of peers
N: number of retries to get the latest copy
Let X be the random variable of N
$p_t$: the probability of finding a fresh copy
$\mathrm{Prob}(X = i) = p_t (1 - p_t)^{i-1}$
$|H_r|$ = number of replicas in the system
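As a worked illustration of the cost formula, the sketch below computes the expected number of probes and the resulting cost. It assumes the retrieval stops after at most |H_r| probes, so the geometric distribution is truncated there; that truncation, and the function names, are assumptions of this example and not taken from the slide.

```python
def expected_probes(p_t: float, replicas: int) -> float:
    """Expected number of replicas probed, stopping after `replicas` attempts.

    Prob(X = i) = p_t * (1 - p_t)**(i - 1) for a success on probe i; if no
    probe finds a fresh copy, all `replicas` probes have been spent.
    """
    total = 0.0
    for i in range(1, replicas + 1):
        total += i * p_t * (1 - p_t) ** (i - 1)
    total += replicas * (1 - p_t) ** replicas
    return total

def retrieval_cost(p_t: float, replicas: int, c_kts: float, c_ret: float) -> float:
    """C = C_kts + N * C_ret, with N replaced by its expected value."""
    return c_kts + expected_probes(p_t, replicas) * c_ret

probes = expected_probes(0.3, 13)
cost = retrieval_cost(0.3, 13, c_kts=1.0, c_ret=1.0)
```

The truncated sum has the closed form (1 - (1 - p_t)^|H_r|) / p_t, which is always below 1 / p_t, so the expected number of probes depends mainly on p_t and not on the replication level.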
14
Retrieval Cost Analysis
Then, how can we get a timestamp?
–Key-based Timestamp Service (KTS)
15
The KTS Service
Use the same DHT but with a different hash function h_ts
[Figure, steps 1 to 4: a timestamp request for key k is routed through the hash tables as Req(k, h_ts); Req(k, h_ts) = p, the peer that issues timestamps for k]
16
The KTS Service
How can node p generate timestamps w.r.t. key k?
–Direct initialization
  Receive the counters from a leaving peer
  The DHT system distributes the load of the leaving peer to its neighbors
–Indirect initialization
  Send a file request w.r.t. key k to obtain the latest timestamp
  Takes place if the leaving peer fails
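The two initialization paths can be contrasted in a short sketch. Both function names and the data shapes (a dict of counters, a list of `(timestamp, data)` replicas with `None` for unreachable peers) are assumptions made for illustration.

```python
def direct_init(handed_over: dict) -> dict:
    """Direct initialization: adopt the counters handed over by a leaving peer."""
    return dict(handed_over)

def indirect_init(replicas) -> int:
    """Indirect initialization: derive the counter for one key from its replicas.

    replicas is an iterable of (timestamp, data) tuples, with None for peers
    that could not be reached. The counter restarts from the largest timestamp seen.
    """
    found = [replica[0] for replica in replicas if replica is not None]
    return max(found, default=0)

# The previous timestamp holder left normally and handed over its counters.
counters = direct_init({"k": 4})

# The previous holder failed: rebuild the counter for "k" from reachable replicas.
rebuilt = indirect_init([(3, "old copy"), None, (5, "newer copy")])
```

The indirect path is only as accurate as the freshest replica it reaches, so it can restart a counter too low if every current replica is unavailable.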
17
The KTS Service
Indirect initialization
–The probability of failure is $p_f$
–$p_f = (1 - p_t)^{|H_r|}$
–If $p_t$ = 30% and $|H_r|$ = 13, then $p_f$ < 1%
After initialization, increase the timestamp on every timestamp request
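The numeric example on this slide can be checked directly. The helper names below are hypothetical; the formula is the one given above, with the exponent read as the number of replicas.

```python
def failure_probability(p_t: float, replicas: int) -> float:
    """p_f = (1 - p_t) ** replicas: no probed replica turns out to be fresh."""
    return (1 - p_t) ** replicas

def min_replicas(p_t: float, target: float) -> int:
    """Smallest replication level whose failure probability is below target."""
    replicas = 1
    while failure_probability(p_t, replicas) >= target:
        replicas += 1
    return replicas

p_f = failure_probability(0.30, 13)
needed = min_replicas(0.30, 0.01)
print(f"p_f = {p_f:.4%}, replicas needed for p_f < 1%: {needed}")
```

With p_t = 30%, twelve replicas still leave a failure probability above 1%, so 13 is the smallest replication level that meets the bound quoted on the slide.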
18
Experiments and Simulations
Environments
–64-node cluster
–10,000 nodes simulated on the SimJava platform
Metrics
–Response time: time to return a current replica in response to a query
–Communication cost: number of messages sent to answer a query
19
The Competitor: BRICKS
Uses a function to map key k to multiple keys (k1, k2, k3, k4, …)
Each replica has a version number
–Concurrent update problems
–Must retrieve all replicas to find the newest one
20
Response Time VS DHT Size
21
Communication Cost VS DHT Size
22
Response Time VS Number of Replicas
23
Failure Rate VS Response Time
24
Conclusion
Pros
–Using the DHT itself to provide the timestamp service is smart!
–Considers the concurrent update problem
–Easy to apply to existing DHTs
Cons
–The KTS service can add communication overhead
25
Thank You