Presentation is loading. Please wait.

Presentation is loading. Please wait.

Semantic Overlay Networks for P2P Systems Arturo Crespo and Hector Garcia-Molina.

Similar presentations


Presentation on theme: "Semantic Overlay Networks for P2P Systems Arturo Crespo and Hector Garcia-Molina."— Presentation transcript:

1 Semantic Overlay Networks for P2P Systems Arturo Crespo and Hector Garcia-Molina

2 Motivation and Related Work Partition the P2P network into several thematic networks Queries for a content will not reach nodes without such content Flooding in smaller networks with smaller TTL (or more results with same) Edutella, Hypercup: peers with similar content connect to the same SuperPeer

3 Node partitioning When does a node belong to SON A? When it contains a piece of type A When it contains more than x pieces of type A Less nodes per SON=>more results sooner Less SONs per node=>less connections As in DBFS, coverage problem

4 SON Classification Classification must provide: Load-balance Each category has similar number of nodes Each node belongs to a small number of categories Easy and accurate way to classify a document

5 Statistics Classification from “All Music Guide” based on music style (26 styles without and 255 styles with substyles) Traces from 1800 Napster nodes Classifications on decade and tone provide worse results

6 Statistics of styles A node belongs to a style if it has 1 document 24% of the nodes belong to one style 90% nodes belong to up to eight 14 styles with 200 to 400 nodes 2 styles with 1000 to 1200 nodes 1 style with 1600 to 1800 nodes

7 Statistics of substyles 18% of the nodes belong to one substyle 90% nodes belong to up to 30 styles 87% of substyles with less than 400 nodes

8 Statistics of document classification Classification based on the “All Music Guide” database Reasons of failure: Filename format All documents not songs Misspellings Database not complete 25% of documents classified incorrectly 4% of nodes classified incorrectly

9 Layered SONs Styles and substyles Minimum required percentage (or number) of documents of type A to join SON A May not be able to join a substyle but may join a style Nodes belonging to a substyle, do not join the parent style

10 Searching A document is classified to belong to (sub)style A Search all substyles of style A SONs Search (sub)style A Search a higher level SON Until we get enough results (How do we locate each SON?)

11 Results Acyclic graphs, to measure effect of msgs to nodes without such content To get half the documents that match a query: Layered SONs: 461 msgs Gnutella: 1731 msgs


Download ppt "Semantic Overlay Networks for P2P Systems Arturo Crespo and Hector Garcia-Molina."

Similar presentations


Ads by Google