Semantic Overlay Networks for P2P Systems Arturo Crespo and Hector Garcia-Molina.


Motivation and Related Work
- Partition the P2P network into several thematic networks
- Queries for a given kind of content will not reach nodes without such content
- Flooding in smaller networks with a smaller TTL (or more results with the same TTL)
- Edutella, HyperCuP: peers with similar content connect to the same SuperPeer

Node partitioning
- When does a node belong to SON A?
  - When it contains a piece of type A
  - When it contains more than x pieces of type A
- Fewer nodes per SON => more results sooner
- Fewer SONs per node => fewer connections
- As in DBFS, coverage problem
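The two membership rules above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `sons_for_node`, the string labels, and the sample data are hypothetical and do not come from the slides. Setting `x = 0` gives the first rule (any piece of type A), a larger `x` gives the second.

```python
from collections import Counter


def sons_for_node(doc_types, x=0):
    """Return the SONs a node joins: every type with more than x documents.

    doc_types: one type label per document stored on the node.
    x:         hypothetical threshold; x=0 means "joins if it holds any piece".
    """
    counts = Counter(doc_types)
    return {doc_type for doc_type, n in counts.items() if n > x}


# Hypothetical node holding five classified documents.
node_docs = ["rock", "rock", "rock", "jazz", "pop"]
print(sorted(sons_for_node(node_docs, x=0)))  # ['jazz', 'pop', 'rock']
print(sorted(sons_for_node(node_docs, x=2)))  # ['rock']
```

Raising `x` leaves the node in fewer SONs (fewer connections), but documents of the types it no longer advertises become unreachable through those SONs, which is one way to read the coverage problem mentioned above.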

SON Classification
- Classification must provide:
  - Load balance: each category has a similar number of nodes
  - Each node belongs to a small number of categories
  - An easy and accurate way to classify a document

Statistics
- Classification from “All Music Guide” based on music style (26 styles without substyles, 255 styles with substyles)
- Traces from 1800 Napster nodes
- Classifications by decade and by tone give worse results

Statistics of styles
- A node belongs to a style if it has at least 1 document of that style
- 24% of the nodes belong to one style
- 90% of the nodes belong to up to eight styles
- 14 styles with 200 to 400 nodes
- 2 styles with 1000 to 1200 nodes
- 1 style with 1600 to 1800 nodes

Statistics of substyles
- 18% of the nodes belong to one substyle
- 90% of the nodes belong to up to 30 substyles
- 87% of substyles have fewer than 400 nodes

Statistics of document classification
- Classification based on the “All Music Guide” database
- Reasons for failure:
  - Filename format
  - Not all documents are songs
  - Misspellings
  - Database not complete
- 25% of documents classified incorrectly
- 4% of nodes classified incorrectly

Layered SONs
- Styles and substyles
- Minimum required percentage (or number) of documents of type A to join SON A
- A node may not be able to join a substyle but may still join a style
- Nodes belonging to a substyle do not join the parent style
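A minimal sketch of the layered join rule, assuming a percentage threshold. The name `layered_sons`, the `style/substyle` naming of SONs, the value of `min_fraction`, and the sample collection are all hypothetical. The sketch reads the last rule as: a node that joined any substyle of a style skips that style's SON.

```python
from collections import Counter


def layered_sons(docs, min_fraction=0.3):
    """Return the SONs a node joins in a two-level (style/substyle) hierarchy.

    docs:         list of (style, substyle) pairs, one per document.
    min_fraction: hypothetical minimum share of the node's documents
                  required to join a SON.
    """
    total = len(docs)
    if total == 0:
        return set()

    substyle_counts = Counter(docs)
    style_counts = Counter(style for style, _ in docs)

    joined = set()
    styles_covered = set()  # styles for which a substyle SON was joined

    # First try the most specific level: substyle SONs.
    for (style, substyle), n in substyle_counts.items():
        if n / total >= min_fraction:
            joined.add(f"{style}/{substyle}")
            styles_covered.add(style)

    # Fall back to the style SON only if no substyle of it was joined.
    for style, n in style_counts.items():
        if style not in styles_covered and n / total >= min_fraction:
            joined.add(style)

    return joined


# Hypothetical collection of ten documents.
docs = (
    [("rock", "punk")] * 5
    + [("jazz", "bebop")] * 2
    + [("jazz", "swing")] * 2
    + [("pop", "dance")] * 1
)
print(sorted(layered_sons(docs)))  # ['jazz', 'rock/punk']
```

In this example neither jazz substyle reaches the threshold on its own, but together they let the node join the jazz style SON, which is the "may not be able to join a substyle but may join a style" case.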

Searching
- A document is classified as belonging to (sub)style A
- Search all substyle SONs of style A
- Search (sub)style A
- Search a higher-level SON
- Until we get enough results
- (How do we locate each SON?)
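One possible reading of this procedure is: start in the SON for the classified (sub)style and climb to higher-level SONs until enough results are found. The sketch below follows that reading only. Locating a SON is left open on the slide, so here each SON is simply a dictionary entry; `layered_search`, `parent_of`, `son_index`, and the sample data are hypothetical, and the substring match stands in for flooding the query inside a SON.

```python
def layered_search(query, start_son, parent_of, son_index, wanted):
    """Search start_son, then each ancestor SON, until `wanted` results exist.

    parent_of: maps a SON name to its parent SON name (None at the top).
    son_index: maps a SON name to the document names reachable in that SON.
    Returns (results, visited_sons).
    """
    results = []
    visited = []
    son = start_son
    while son is not None and len(results) < wanted:
        visited.append(son)
        for doc in son_index.get(son, []):
            if query in doc and doc not in results:
                results.append(doc)
        son = parent_of.get(son)  # move one level up the hierarchy
    return results[:wanted], visited


# Hypothetical three-level hierarchy.
parent_of = {"rock/punk": "rock", "rock": "music", "music": None}
son_index = {
    "rock/punk": ["punk anthem"],
    "rock": ["rock anthem", "rock ballad"],
    "music": ["national anthem"],
}
results, visited = layered_search("anthem", "rock/punk", parent_of, son_index, wanted=2)
print(results)  # ['punk anthem', 'rock anthem']
print(visited)  # ['rock/punk', 'rock']
```

The top-level SON is never contacted in this run because the two lower levels already supply the requested number of results.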

Results
- Acyclic graphs, to measure the effect of messages sent to nodes without such content
- To get half the documents that match a query:
  - Layered SONs: 461 msgs
  - Gnutella: 1731 msgs
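For a sense of scale, the two message counts reported above can be compared directly. The variable names are illustrative; only the two counts come from the slide.

```python
layered_son_msgs = 461   # messages with Layered SONs (from the slide)
gnutella_msgs = 1731     # messages with Gnutella (from the slide)

ratio = gnutella_msgs / layered_son_msgs
saving_pct = 100 * (1 - layered_son_msgs / gnutella_msgs)

print(round(ratio, 2))       # 3.75
print(round(saving_pct, 1))  # 73.4
```

In other words, for this experiment Gnutella needs about 3.75 times as many messages, or equivalently Layered SONs use roughly 73% fewer.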