Routing Indices For Peer-to-Peer Systems
Arturo Crespo, Hector Garcia-Molina (Stanford)
ICDCS 2002
Motivation and ideas
- Content-based search for (text) documents with a specific keyword (category) in a P2P network
- Users are only interested in the top 20 results
- Each peer stores statistics of
  - documents shared by itself
  - documents shared by its neighbors
- Route the query to a "good" peer
- Sequential search vs. parallel search
Proposed Methods
- Compound RI (naive)
- Hop-count RI (improved)
- Exponential RI (best)
What are routing indices?
- For A, there are 100 documents available through B (and its descendants)
  - 20 belong to the Database category
  - 10 belong to the Theory category
  - 30 belong to the Languages category
- "Goodness" of a neighbor
Computing goodness
- For a query on documents about "databases" and "languages"
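The goodness estimate for the example above can be sketched as follows. This is a minimal sketch, assuming the estimator treats the query categories as independent, so the expected number of matching documents is the neighbor's document total multiplied by the per-category fractions. The function name `cri_goodness` and the dictionary layout are illustrative and do not come from the slides.

```python
def cri_goodness(total_docs, category_counts, query_categories):
    """Estimate how many documents reachable through one neighbor match the query.

    Assumption of this sketch: categories are independent, so the
    per-category fractions can simply be multiplied together.
    """
    if total_docs == 0:
        return 0.0
    estimate = float(total_docs)
    for category in query_categories:
        estimate *= category_counts.get(category, 0) / total_docs
    return estimate


# Row for neighbor B as seen from A (counts taken from the slide).
b_row = {"databases": 20, "theory": 10, "languages": 30}
goodness_b = cri_goodness(100, b_row, ["databases", "languages"])
```

Under these assumptions B's goodness works out to 100 x 20/100 x 30/100, about 6 documents; A would compare this value across all of its neighbors and forward the query to the highest-scoring one first.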
Updating routing indices
- New connection
- RI propagation
- Aggregated RIs shown in the figure: D+A+J, D+A+I
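The aggregation behind labels such as D+A+J can be sketched like this. It is a hedged illustration, assuming a node builds the vector for one neighbor by adding its own local counts to every RI row except that neighbor's own row. The helper `aggregate_for` and all counts below are hypothetical.

```python
def aggregate_for(recipient, local_counts, ri_rows):
    """Build the aggregated vector a node sends to `recipient`."""
    aggregate = dict(local_counts)
    for neighbor, row in ri_rows.items():
        if neighbor == recipient:
            continue  # do not echo the recipient's own documents back to it
        for key, count in row.items():
            aggregate[key] = aggregate.get(key, 0) + count
    return aggregate


# Hypothetical counts for node D with neighbors A, I and J.
d_local = {"total": 100, "databases": 20}
d_rows = {
    "A": {"total": 300, "databases": 60},
    "I": {"total": 50, "databases": 5},
    "J": {"total": 80, "databases": 10},
}
to_i = aggregate_for("I", d_local, d_rows)  # corresponds to D+A+J
to_j = aggregate_for("J", d_local, d_rows)  # corresponds to D+A+I
```

Leaving out the recipient's row keeps a node from counting its own documents as reachable through the sender.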
Proposed Methods
- Compound RI (naive)
- Hop-count RI (improved)
- Exponential RI (best)
Problems and improvements
- Improved cost model: takes into account the number of query messages generated
- Lower update cost: the RI propagates only through a limited number of hops (the horizon)
- Robust against cycles
- Figure labels: 300 items, 250 items
Hop-count RI
- From W, going through Y, a query can reach
  - 30 documents 1 hop away (Y itself has 30 documents)
  - 50 documents 2 hops away (Y1 and Y2 together have 50 documents)
Goodness measure in Hop-count RI Goodness of
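The formula on this slide did not survive extraction. One way to weigh the per-hop entries is sketched below, assuming a regular-tree cost model with fanout F in which documents j hops away cost roughly F^(j-1) messages to reach, so each hop's goodness is divided by that factor. This discounting rule, the function name `hop_count_goodness`, and the per-hop values are assumptions of the sketch, not a transcription of the slide.

```python
def hop_count_goodness(per_hop_goodness, fanout):
    """Combine per-hop goodness values into a single score.

    per_hop_goodness[0] is the estimate for documents 1 hop away,
    per_hop_goodness[1] for documents 2 hops away, and so on.
    """
    return sum(g / fanout ** j for j, g in enumerate(per_hop_goodness))


# Hypothetical per-hop estimates for two neighbors, with fanout F = 3.
x_score = hop_count_goodness([13, 10], fanout=3)
y_score = hop_count_goodness([0, 31], fanout=3)
```

With these illustrative numbers the first neighbor scores higher even though the second one leads to more matching documents in total, because its documents are closer and therefore cheaper to reach.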
Proposed Methods
- Compound RI (naive)
- Hop-count RI (improved)
- Exponential RI (best)
Improvements
- Hop-count RI exhibits
  - high storage cost
  - high RI update cost
- Compress the RI entries for different hops together
- Similar to compound RI, but the RI update method differs
RI Update
1. I changes its entry to 70, 30, 10, 20, 50
2. D updates I's row to 70, 30, 10, 20,
3. D sends J an update: 590, 86.67, 130, 70,
4. D sends A an update: 140, 75, 3.3, 75, 100
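An update step of this kind can be sketched as follows. This is a minimal sketch, assuming the exponential RI discounts everything reachable behind a node by the fanout F: the vector sent to a neighbor is the node's local counts plus the sum of its other RI rows divided by F. The function `eri_vector_for`, the fanout of 3, and all numbers are hypothetical and are not the values shown on the slides.

```python
def eri_vector_for(recipient, local_counts, eri_rows, fanout):
    """Build the exponentially aggregated vector sent to `recipient`."""
    vector = []
    for i, local in enumerate(local_counts):
        # Sum column i over every neighbor row except the recipient's.
        others = sum(row[i] for name, row in eri_rows.items() if name != recipient)
        vector.append(local + others / fanout)
    return vector


# Hypothetical state at node D: [total documents, "databases" documents].
d_local = [30, 6]
d_rows = {"A": [90, 30], "I": [60, 9], "J": [15, 3]}
update_for_a = eri_vector_for("A", d_local, d_rows, fanout=3)
```

Because each forwarding step divides by F again, contributions from distant nodes shrink geometrically, which is one way to get hop sensitivity while storing only a single row per neighbor.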
Experiment
Query messages generated
- Why do CRI, HRI, and ERI perform much better under the uniform distribution?
Effect of index compression
Effect of cycles
Query messages in different network topologies
Update cost in different network topologies
END