Published by Kathleen Wilkinson. Modified over 9 years ago.
1
Tim Benke
Supervisors: Josiane Xavier Parreira, Sebastian Michel
Bachelor thesis
2
Why P2P-networks?
- Decentralisation
- No single point of failure
- No content control
- Distribution of content, computing power, and bandwidth
3
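Querying in P2P-networks
[Figure: the query "Dirk Nowitzki" spreads through the network with a TTL; peers are labelled by hop distance from the querying peer (0 hops to 4 hops).]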
4
Idea: Semantic Overlay Networks
- Querying in unstructured P2P-networks
  - message flooding with a time-to-live (TTL)
  - many redundant messages
- Group peers according to their content
- Querying in a Semantic Overlay Network (SON)
  - ask only the peers that belong to the query's content field
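To make the cost of TTL-limited flooding concrete, here is a minimal sketch. The topology, the peer names, and the function `flood_query` are hypothetical and not taken from the thesis; the sketch assumes a peer forwards a query to every neighbour except the one it received it from, and forwards it only the first time it sees it.

```python
from collections import deque

def flood_query(graph, start, ttl):
    """Flood a query from `start` and return (peers_reached, messages_sent).

    Every forward costs one message, even if the receiver has already
    seen the query. Such forwards are the redundant messages.
    """
    reached = {start}
    messages = 0
    queue = deque([(start, ttl, None)])  # (peer, remaining TTL, sender)
    while queue:
        peer, remaining, sender = queue.popleft()
        if remaining == 0:
            continue  # TTL exhausted: do not forward any further
        for neighbour in graph[peer]:
            if neighbour == sender:
                continue  # never send the query straight back
            messages += 1
            if neighbour not in reached:
                reached.add(neighbour)
                queue.append((neighbour, remaining - 1, peer))
    return reached, messages

# Hypothetical 5-peer topology (undirected, stored as adjacency lists).
graph = {
    "A": ["B", "C"],
    "B": ["A", "C", "D"],
    "C": ["A", "B", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}
reached, messages = flood_query(graph, "A", ttl=2)
```

In this toy network, six messages are spent to reach three new peers, and peer E is never reached because the TTL runs out first.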
5
Querying in a SON
[Figure: the query "Dirk Nowitzki" in a SON whose peers are grouped by topic: basketball, flowers, geology, computer science.]
6
How to build a SON
- Contact another peer P
- If isFriend(P):
  - add P to the list of friends
  - add P's friends to the list of candidates
- isFriend(P) is judged by:
  - How high is the similarity?
  - How small is the overlap?
  - How well did P cooperate?
Dating
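The contact step above could be sketched as follows. The `Peer` class, the `contact` function, and the way the `is_friend` predicate is passed in are assumptions made for illustration; the slides do not specify these data structures.

```python
from dataclasses import dataclass, field

@dataclass
class Peer:
    """Hypothetical peer state: a name plus friend and candidate lists."""
    name: str
    friends: list = field(default_factory=list)
    candidates: list = field(default_factory=list)

def contact(me, other, is_friend):
    """One dating step: befriend `other` if it qualifies, and remember
    its friends as candidates for later contacts."""
    if not is_friend(me, other):
        return False
    if other.name not in me.friends:
        me.friends.append(other.name)
    for name in other.friends:
        already_known = name in me.friends or name in me.candidates
        if name != me.name and not already_known:
            me.candidates.append(name)
    return True

# Usage with a placeholder predicate that accepts everyone.
me = Peer("me")
p = Peer("P", friends=["Q", "R", "me"])
accepted = contact(me, p, is_friend=lambda a, b: True)
```

A real `is_friend` would combine similarity, overlap, and cooperation; here it is left as a parameter so that the list handling stays independent of the scoring strategy.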
7
Process of P2P-dating
- The peer to contact is chosen from 3 lists: friends, candidates, random
- Send a check-alive message to friends
- Send a contact message to candidates and random peers
- Receive synopses of collections and compute scores
- Friend and candidate lists have fixed lengths
  - add peers until a list is full, then drop the worst peers
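The fixed-length lists can be sketched as below. This assumes that a higher score means a better peer and that a list holds `(score, peer)` pairs; the function name `add_scored_peer` is hypothetical.

```python
def add_scored_peer(peer_list, peer, score, max_len):
    """Return a new list containing (score, peer), truncated to the
    `max_len` best-scoring entries. An older entry for the same peer
    is replaced."""
    entries = [entry for entry in peer_list if entry[1] != peer]
    entries.append((score, peer))
    entries.sort(key=lambda entry: entry[0], reverse=True)
    return entries[:max_len]  # the worst peers fall off the end

# Usage: a friend list limited to two entries.
friends = []
friends = add_scored_peer(friends, "a", 0.9, max_len=2)
friends = add_scored_peer(friends, "b", 0.2, max_len=2)
friends = add_scored_peer(friends, "c", 0.5, max_len=2)
```

After the third call the list is over capacity, so the lowest-scoring peer "b" is dropped.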
8
Search in SON
- Peer P sends queries to peers with a similar interest profile, i.e. all its friends
- Each peer sends back only its top-k results
- When all answers have arrived, P merges the results, removes duplicates, and delivers the top-k results
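The merge step at the querying peer might look like the sketch below. It assumes that each answer is a list of `(doc_id, score)` pairs, that scores from different peers are comparable, and that duplicates are recognised by an identical `doc_id` (for example a URL). None of these details are stated on the slide.

```python
def merge_top_k(answers, k):
    """Merge per-peer result lists, drop duplicates, return the k best.

    For a document returned by several peers, the highest score is kept.
    Ties are broken by doc_id so the output is deterministic.
    """
    best = {}
    for results in answers:
        for doc_id, score in results:
            if doc_id not in best or score > best[doc_id]:
                best[doc_id] = score
    ranked = sorted(best.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:k]

# Usage: two friends answer, and "u2" is returned by both.
answers = [
    [("u1", 0.9), ("u2", 0.4)],
    [("u2", 0.7), ("u3", 0.6)],
]
top = merge_top_k(answers, k=2)
```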
9
Strategies for scores
- Similarity Only:
- Overlap Only:
- Weighted Sum:
- Random: no score computed
Similarity(A,B): 0 = the same; values > 0 up to ∞ = increasingly different
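The formulas for the individual strategies did not survive in this transcript. As a rough illustration of what a weighted-sum strategy can look like, the sketch below combines two inputs with a weight `alpha`. It assumes both inputs are already normalised to [0, 1] with higher meaning better (so a divergence-style similarity would have to be converted first), and it assumes `alpha` weights the similarity term; which term the thesis actually weights with alpha is not stated here.

```python
def weighted_sum_score(similarity, novelty, alpha=0.8):
    """Hypothetical weighted-sum score.

    similarity: how similar the other peer's collection is, in [0, 1]
    novelty:    how little its collection overlaps with ours, in [0, 1]
    alpha:      weight of the similarity term, in [0, 1]
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * similarity + (1.0 - alpha) * novelty

score = weighted_sum_score(similarity=1.0, novelty=0.0)
```

Under these assumptions, `alpha=1.0` reduces to a similarity-only strategy and `alpha=0.0` to an overlap-only strategy.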
10
Overlap Measure
- Min-wise Independent Permutations
- measure the overlap with the formula:
  = hashes of documents
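The formula itself is missing from the transcript, so the following is only a generic sketch of how min-wise hashing estimates the resemblance of two document sets. Salted SHA-1 hashes stand in for random permutations, and the function names and the number of permutations are assumptions, not the thesis implementation.

```python
import hashlib

def _hash(doc_id, seed):
    """Salted 64-bit hash; each seed simulates one random permutation."""
    digest = hashlib.sha1(f"{seed}:{doc_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")

def minhash_signature(doc_ids, num_perms=64):
    """Synopsis of a non-empty collection: the minimum hash per permutation."""
    return [min(_hash(d, seed) for d in doc_ids) for seed in range(num_perms)]

def estimate_resemblance(sig_a, sig_b):
    """Fraction of positions with equal minima, an estimate of
    |A ∩ B| / |A ∪ B|."""
    matches = sum(1 for a, b in zip(sig_a, sig_b) if a == b)
    return matches / len(sig_a)

# Usage: peers exchange only the short signatures, not the collections.
docs_a = {"url1", "url2", "url3", "url4"}
docs_b = {"url3", "url4", "url5", "url6"}
estimate = estimate_resemblance(minhash_signature(docs_a),
                                minhash_signature(docs_b))
```

The signature has a fixed size regardless of the collection size, which is what makes it suitable as a synopsis to send in a contact message.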
11
Similarity Measure
- Kullback-Leibler Divergence / Relative Entropy
- Similarity(A,B): 0 = the same; values > 0 up to ∞ = increasingly different
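A minimal sketch of the Kullback-Leibler divergence between the term distributions of two collections is given below. The add-one smoothing and the helper `term_distribution` are assumptions made so that the logarithm is always defined; the slides do not say how zero probabilities were handled.

```python
import math
from collections import Counter

def term_distribution(terms, vocabulary, smoothing=1.0):
    """Smoothed term probabilities over a shared vocabulary."""
    counts = Counter(terms)
    total = sum(counts[t] for t in vocabulary) + smoothing * len(vocabulary)
    return {t: (counts[t] + smoothing) / total for t in vocabulary}

def kl_divergence(p, q):
    """D(P || Q) = sum over t of P(t) * log(P(t) / Q(t)).

    0 means the distributions are identical; larger values mean they differ
    more. The measure is not symmetric: D(P || Q) != D(Q || P) in general.
    """
    return sum(p[t] * math.log(p[t] / q[t]) for t in p if p[t] > 0)

# Usage with two tiny hypothetical collections.
terms_a = "nba playoffs nba dirk".split()
terms_b = "rock mineral rock earth".split()
vocabulary = set(terms_a) | set(terms_b)
p = term_distribution(terms_a, vocabulary)
q = term_distribution(terms_b, vocabulary)
divergence = kl_divergence(p, q)
```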
12
PASTRY: network infrastructure
- Distributed Hash Table: maps keys to the peers currently responsible for each key
- MINERVA uses PASTRY
- O(log(N)) hops for any message to reach any destination
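The key-to-peer mapping of a distributed hash table can be illustrated with the simplified sketch below. It only shows the idea that a key is owned by the peer whose identifier is numerically closest on a circular identifier space; it does not model Pastry's prefix-based routing tables, which are what give the O(log(N)) hop bound, and it is not the FreePastry API. The use of MD5 to derive identifiers is an assumption.

```python
import hashlib

ID_BITS = 128
ID_SPACE = 1 << ID_BITS  # identifiers live on a ring of this size

def to_id(text):
    """Derive a 128-bit identifier from a string (assumed scheme)."""
    return int.from_bytes(hashlib.md5(text.encode("utf-8")).digest(), "big")

def ring_distance(a, b):
    """Shortest distance between two identifiers on the ring."""
    d = abs(a - b)
    return min(d, ID_SPACE - d)

def responsible_peer(key, peer_ids):
    """Identifier of the peer currently responsible for `key`."""
    key_id = to_id(key)
    return min(peer_ids, key=lambda pid: (ring_distance(pid, key_id), pid))

# Usage: twelve hypothetical peers; any key maps to exactly one of them.
peer_ids = [to_id(f"peer-{i}") for i in range(1, 13)]
owner = responsible_peer("playoffs", peer_ids)
```

When peers join or leave, only the keys closest to the affected identifiers change owner, which is why the mapping is described in terms of the peer "currently" responsible.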
13
Local Collections
- Index file saved on hard disk
- The LUCENE index is an inverted index over the terms occurring in websites obtained by the user
  - by surfing (e.g. via a plugin)
  - by a crawler run on bookmarks
- Allows additions and deletions
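To show what an inverted index with additions and deletions does, here is a toy in-memory version. It is a plain Python illustration, not the Lucene API; the class name and the whitespace tokenisation are assumptions.

```python
from collections import defaultdict

class InvertedIndex:
    """Maps each term to the set of documents that contain it."""

    def __init__(self):
        self._postings = defaultdict(set)  # term -> set of doc ids
        self._docs = {}                    # doc id -> set of its terms

    def add(self, doc_id, text):
        """Index a document; re-adding a doc id replaces the old version."""
        self.delete(doc_id)
        terms = set(text.lower().split())
        self._docs[doc_id] = terms
        for term in terms:
            self._postings[term].add(doc_id)

    def delete(self, doc_id):
        """Remove a document and clean up posting lists that become empty."""
        for term in self._docs.pop(doc_id, set()):
            self._postings[term].discard(doc_id)
            if not self._postings[term]:
                del self._postings[term]

    def search(self, term):
        """Return the ids of all documents containing `term`."""
        return set(self._postings.get(term.lower(), set()))

# Usage
index = InvertedIndex()
index.add("d1", "Dirk Nowitzki playoffs")
index.add("d2", "playoffs schedule")
hits_before = index.search("playoffs")
index.delete("d1")
hits_after = index.search("playoffs")
```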
14
Experimental Setup
- NUTCH was used as crawler
- Seeds: 14-16 start URLs on a certain topic from del.icio.us and dmoz.org
- Depth: 2
- Each peer: ~400 pages
- Peers 1-4: Basketball
- Peers 5-7: Computer Science
- Peers 8-10: Flowers
- Peers 11-12: Geology
- Queries for peer 1: "playoffs", "Dirk Nowitzki"
- Queries for peer 7: "thesis"
- Queries for peer 12: "earth science"
15
Chart 1
Comparison over 75 iterations between
- 5 random peers
- p2pdating with 5 friends, using the weighted sum strategy, alpha=0.8
y-axis: recall
x-axis: iterations in steps of 5
16
Chart 1
17
Chart 2
Comparison over 50 iterations between
- random peers asked
- p2pdating with x friends, using the weighted sum strategy, alpha=0.8
y-axis: recall
x-axis: number of peers asked
18
Chart 2
19
Conclusion
- Use of PASTRY as underlying routing/networking infrastructure
- Implementation of the details of the peer-to-peer network and the p2pdating algorithm
- Message handling
  - several message types
  - protocol for sending and receiving messages
- Adaptation of NUTCH for crawling
- Use of LUCENE to query indexes
- Experiments show the benefit of the P2PDating algorithm
20
Future Work
- Further experiments: real-world data from bookmark lists of active del.icio.us users
- Firefox or proxy plugin for on-the-fly indexing, querying and display of results
- Further applications: adaptation to MINERVA P2P Web Search
21
Thank you for your interest
14.05.2015 Tim Benke PLAGIA
22
FreePastry
- Free open-source version of Pastry under a BSD license, called FreePastry
- FreePastry provides an application-level interface to the underlying P2P network
- API for Java 1.5
- Version used: 2.0 Beta
23
Overview
- Basics of P2P-networks
- Querying in P2P-networks
- Overlap and Similarity Computation
- Process of P2P-dating
- Application examples: Firefox plugin, del.icio.us
24
Chart 2
Comparison over 50 iterations between
- random peers asked
- p2pdating with x-1 friends and 1 stranger, using the weighted sum strategy, alpha=0.8
- only K-L divergence
y-axis: recall
x-axis: number of peers asked
25
Chart 1
Comparison over 75 iterations between
- 5 random peers
- p2pdating with 4 friends and 1 stranger, using the weighted sum strategy, alpha=0.8
- only K-L divergence
y-axis: recall
x-axis: iterations in steps of 5
26
O:P2P-Dating Project
- Internet crawls performed with the APACHE project NUTCH provide the collections
- Collections are indexed by NUTCH and a LUCENE index is produced
- 1 similarity measure and 1 overlap measure are used to determine whether a node is a friend
27
Process of P2P-dating
[Figure: example with the topic "Michael Jordan" and a peer's friend list.]