Data Indexing in Peer-to-Peer DHT Networks. L. Garces-Erice, P.A. Felber, E.W. Biersack, G. Urvoy-Keller, K.W. Ross. ICDCS 2004.

1 Data Indexing in Peer-to-Peer DHT Networks. Garces-Erice, P.A. Felber, E.W. Biersack, G. Urvoy-Keller, K.W. Ross. ICDCS 2004

2 DHT Structure
A P2P Distributed Hash Table (DHT) provides a mapping between a file identifier and the file's location.
Example:
- Search for the file "Starwars.divx"
- Convert "Starwars.divx" to a key, say "123456789"
- Look up "123456789" in the DHT to find the file location
- Download the file
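The lookup steps on this slide can be sketched in a few lines of Python. This is a single-process toy, not the DHT used in the paper: the SHA-1 hash, the ring-style node assignment, the `ToyDHT` class, the node names, and the file location string are all assumptions made for illustration.

```python
import hashlib
from bisect import bisect_left


def key_for(name: str) -> int:
    """Hash a name to a numeric key (SHA-1 is an assumption, not from the slides)."""
    return int.from_bytes(hashlib.sha1(name.encode("utf-8")).digest(), "big")


class ToyDHT:
    """Hypothetical in-memory stand-in for a DHT.

    A key belongs to the first node whose id is >= the key, wrapping
    around to the smallest id when the key is larger than every node id.
    """

    def __init__(self, node_names):
        self.ring = sorted((key_for(n), n) for n in node_names)
        self.ids = [node_id for node_id, _ in self.ring]
        self.storage = {n: {} for n in node_names}

    def responsible_node(self, key: int) -> str:
        i = bisect_left(self.ids, key) % len(self.ring)
        return self.ring[i][1]

    def put(self, name: str, location: str) -> None:
        key = key_for(name)
        self.storage[self.responsible_node(key)][key] = location

    def get(self, name: str):
        key = key_for(name)
        return self.storage[self.responsible_node(key)].get(key)


dht = ToyDHT(["node-a", "node-b", "node-c"])
dht.put("Starwars.divx", "peer-17:/shared/Starwars.divx")  # made-up location
location = dht.get("Starwars.divx")  # file name -> key -> node -> location
```

In a real DHT the call to `responsible_node` would be a multi-hop routing step across the network rather than a local binary search.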

3 Indexing
Indexes do not contain key-to-data mappings. They provide a key-to-key service, or more precisely a query-to-query service.
Example:
- Looking up a query q returns a list of more specific queries covered by q.
- The user selects one of these queries and looks it up in turn.
- If q is the most specific query of a file, the lookup returns the file.
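To make "covered by q" concrete, here is a minimal sketch of one possible covering test. It assumes a query is just a set of field constraints, so a query covers another when all of its constraints also appear in the other; this representation, the `query` and `covers` helpers, and the sample field values are hypothetical and not taken from the slides.

```python
def query(**fields):
    """Build a query as a frozenset of (field, value) constraints (assumed representation)."""
    return frozenset(fields.items())


def covers(general, specific):
    """Return True if every constraint of `general` also appears in `specific`."""
    return general <= specific


by_author = query(author="Smith")
by_author_title = query(author="Smith", title="TCP")
by_other_author = query(author="Jones", title="TCP")

smith_covers_pair = covers(by_author, by_author_title)   # broader query covers narrower one
pair_covers_smith = covers(by_author_title, by_author)   # the reverse does not hold
smith_covers_jones = covers(by_author, by_other_author)  # different author, no covering
```

Under this assumption, the most specific query of a file is the one that constrains every field of its descriptor.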

5 Maintaining the index
To maintain an index that consists of query-to-query mappings, each node provides two functions:
- Insert(q, q_i), with q covering q_i: adds a mapping (q; q_i) to the index of the node responsible for key q.
- Lookup(q), with q not being the most specific query of a file: returns a list of all the queries q_i such that there is a mapping (q; q_i) in the index of the node responsible for key q.
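The two functions above can be sketched as follows. This is a minimal single-process model under stated assumptions: queries are plain strings, the node responsible for a query is chosen by hashing it modulo the number of nodes, and the `QueryIndex` class, its `store` and `search` helpers, and the sample bibliographic entry are hypothetical. The caller is trusted to pass only pairs where q covers q_i; the sketch does not check it.

```python
import hashlib
from collections import defaultdict


class QueryIndex:
    """Hypothetical model of a distributed query-to-query index."""

    def __init__(self, num_nodes=4):
        # One index per node: query -> set of more specific queries.
        self.nodes = [defaultdict(set) for _ in range(num_nodes)]
        # Most specific query -> file (stand-in for the DHT's data storage).
        self.files = {}

    def _node_for(self, q):
        digest = hashlib.sha1(q.encode("utf-8")).digest()
        return self.nodes[int.from_bytes(digest, "big") % len(self.nodes)]

    def insert(self, q, qi):
        """Add the mapping (q; qi) on the node responsible for key q."""
        self._node_for(q)[q].add(qi)

    def lookup(self, q):
        """Return all qi such that (q; qi) is stored on the node responsible for q."""
        return sorted(self._node_for(q).get(q, ()))

    def store(self, most_specific_query, file):
        self.files[most_specific_query] = file

    def search(self, q):
        """Follow mappings from q until most specific queries are reached."""
        found, stack, seen = [], [q], set()
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            if current in self.files:
                found.append(self.files[current])
            else:
                stack.extend(self.lookup(current))
        return sorted(found)


# Made-up bibliographic entry, for illustration only.
msq = "author=Smith/title=TCP/conf=SIGCOMM/year=1989"
index = QueryIndex()
index.store(msq, "smith-tcp.pdf")
index.insert("author=Smith", "author=Smith/title=TCP")
index.insert("author=Smith/title=TCP", msq)
results = index.search("author=Smith")
```

Note that `search` explores every refinement automatically, whereas the slides describe a user selecting one of the returned queries at each step.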

6 Example: bibliographic database (figure panels: query-to-key, query-to-query)

9 Discussion
Some interesting properties of this indexing technique:
- Space efficiency
- Scalability
- Loose coupling between data and indexes
- Versatility
- Adaptability
- Decentralized architecture
- Resilience to arbitrary linking

10 System point of view
- The search process should be simple.
- The amount of network traffic should be minimized.
- The storage space dedicated to the indexing metadata should remain within reasonable limits.

11 Evaluation: Distributed Bibliographic Database
Bibliographic database sites:
- BibFinder: http://kilimanjaro.eas.asu.edu
- NetBib: http://edas.info/S.cgi?search=1

12 Indexing scheme (figure panels: simple indexing scheme, flat indexing scheme)

13 Indexing scheme (figure: complex indexing scheme)

14 Indexing scheme
- Simple: a query for an author or a title returns a set of author and title pairs. This is the most space-efficient of the three schemes, requiring 152 MB of extra storage in the system.
- Flat: the index query length is always 2. Requires 37% more space.
- Complex: some queries in the simple scheme are split into more specific queries. Requires 25% more space.

15 Probability vs. Ranking

16 Caching
- Multi-cache: shortcuts are created on each node along the lookup path. Cache size is unbounded.
- Single-cache: shortcuts are created only on the first node that was contacted. Cache size is unbounded.
- LRU (least recently used): only a limited number of shortcuts can be stored on each node.
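The three policies can be illustrated with a small sketch. It assumes a shortcut maps the query a node handled directly to the final target, and that a lookup path is a list of (cache, query) pairs in contact order; the `ShortcutCache` class, the `record_shortcuts` helper, and the policy names as strings are hypothetical, not the paper's implementation.

```python
from collections import OrderedDict


class ShortcutCache:
    """Per-node shortcut cache; capacity=None means unbounded, otherwise LRU."""

    def __init__(self, capacity=None):
        self.capacity = capacity
        self.entries = OrderedDict()

    def get(self, query):
        if query not in self.entries:
            return None
        self.entries.move_to_end(query)  # mark as most recently used
        return self.entries[query]

    def put(self, query, target):
        self.entries[query] = target
        self.entries.move_to_end(query)
        if self.capacity is not None and len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used entry


def record_shortcuts(path, target, policy):
    """Create shortcuts after a successful lookup.

    path: list of (cache, query) pairs in the order nodes were contacted.
    policy: "multi" caches on every node of the path, "single" only on the first.
    """
    hops = path if policy == "multi" else path[:1]
    for cache, query in hops:
        cache.put(query, target)


# LRU with room for two shortcuts: "b" is evicted because "a" was used more recently.
lru = ShortcutCache(capacity=2)
lru.put("a", "file-1")
lru.put("b", "file-2")
lru.get("a")
lru.put("c", "file-3")

# Single-cache versus multi-cache on a two-hop lookup path.
first, second = ShortcutCache(), ShortcutCache()
record_shortcuts([(first, "q1"), (second, "q2")], "file-9", "single")
multi_first, multi_second = ShortcutCache(), ShortcutCache()
record_shortcuts([(multi_first, "q1"), (multi_second, "q2")], "file-9", "multi")
```

An unbounded cache trades storage for hit ratio, while the LRU variant keeps per-node storage fixed at the cost of evicting shortcuts that have not been used recently.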

17 Average number of interactions required to find data.

18 Average network traffic (bytes) generated per query.

19 Cache efficiency: distributed hit ratio.

20 Conclusion
- The technique indexes the data stored in the peer-to-peer network.
- Indexes are distributed across the nodes of the network and contain key-to-key (or query-to-query) mappings.
- Given a broad query, a user can look up the more specific queries that match the original query.

