Semantic Overlay Networks in P2P systems A. Crespo, H. Garcia-Molina Speaker: Pavel Serdyukov Tutor: Jens Graupmann.



Talk Outline
- Motivation
- SONs
  - Conceptual idea
  - Formal definition
  - Hierarchical definition
- Generating SONs
  - Classification criteria
  - Sources of classification errors
  - Peer assignment strategies
- Layered SONs
  - Overview
  - Searching through layers
- Experiments
- Later contributions

Motivation
- Broadcasting every query to every information source does not scale
- Hash-based lookups scale, but cannot express complex queries
  - no approximate queries
  - no range queries
  - no text queries
- Resulting intuitions:
  - It is better to route queries only to the peers that are more likely to have answers
  - Shared content often has a pronounced ontological structure (music, movies, scientific papers, etc.)

Conceptual idea
- Peers are clustered
- Clusters overlap
- A query is distributed only to the relevant cluster(s)
- The query is routed only within each relevant cluster
- Irrelevant clusters are not bothered with the query

Semantic Overlay Network (SON)
- A virtual, abstract, independent layer of selected peers
- Advantages
  - Introduces semantic “views” of the physical network
  - Mediation and integration (correspondences, query rewriting)
  - Reduces flooding of the network

Formal definition of SON
- A Semantic Overlay Network (SON) is a set of triples (links): { (n_i, n_j, L) }
  - n_i, n_j: linked peers
  - L: string (name of the category)
- Each SON L implements the functions:
  - Join(n_i)
  - Search(q)
  - Leave(n_i)
- Example SONs (figure labels): jazz, country, rock
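
The definition above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the slide fixes the triple (n_i, n_j, L) and the three function names, while the `docs` index, the "link to one existing member" join policy, and the substring matching in `search` are assumptions made here to keep the example runnable.

```python
from collections import defaultdict


class SON:
    """One semantic overlay network: a set of links (n_i, n_j, L) for a fixed label L."""

    def __init__(self, label):
        self.label = label
        self.links = set()             # triples (n_i, n_j, label)
        self.docs = defaultdict(set)   # hypothetical index: peer -> document titles

    def peers(self):
        """Every peer that appears in a link or shares documents in this SON."""
        linked = {n for (a, b, _) in self.links for n in (a, b)}
        return linked | set(self.docs)

    def join(self, peer, documents=()):
        # Assumption: a newcomer is linked to a single existing member.
        members = sorted(self.peers())
        if members and peer not in members:
            self.links.add((peer, members[0], self.label))
        self.docs[peer].update(documents)

    def leave(self, peer):
        # Drop every link that touches the departing peer (no re-linking here).
        self.links = {l for l in self.links if peer not in l[:2]}
        self.docs.pop(peer, None)

    def search(self, query):
        # Assumption: a query matches a document by case-insensitive substring.
        q = query.lower()
        return {(p, d) for p, ds in self.docs.items() for d in ds if q in d.lower()}
```

A peer that shares jazz and rock files would call `join` on both the "jazz" and the "rock" SON, and a query classified as "jazz" would only be passed to `search` on the "jazz" instance.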

SON: Hierarchical definition
- A SON is an overlay network associated with a concept of a classification hierarchy
- For example, there are 9 SONs for classifying music by style, or 4 SONs for classifying music by tone
- A peer's documents must be assigned to concepts so that the peer can be assigned to the corresponding SONs
- Example hierarchies (figure labels): Styles, Substyles, Tones

Semantic Overlay Networks in action

Criteria of a good classification hierarchy
A classification hierarchy is good if:
- Documents in each category belong to a small number of peers (high granularity + equal popularity)
- Peers have documents in a small number of categories (sensible, moderate granularity)
- The classification algorithm is fast and error-free

Sources of errors
- The format of the files may not follow the expected standard
- The classification ontology may be incompatible with the files
- Users misspell file names
- As a result, 25% of music files were classified incorrectly
- But a peer can still be classified correctly even if some of its documents are misclassified
- As a result, only 4% of peers were classified incorrectly

Peer assignment strategies
- Conservative strategy: place a peer in SON c if it has any document classified in concept c
  - produces too many links
- Less conservative strategy: place a peer in SON c if it has a “significant” number of documents classified in concept c
  - prevents finding all documents
- Final solution: use Layered SONs
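
The two strategies differ only in the membership test, which a short Python sketch makes concrete. The function names are hypothetical, and reading “significant” as a fraction of the peer's documents is an assumption; the default of 0.15 is borrowed from the “≥ 15 %” annotation on the Layered SONs overview slide.

```python
from collections import Counter


def conservative(doc_concepts):
    """Join the SON of every concept that has at least one document."""
    return set(doc_concepts)


def less_conservative(doc_concepts, threshold=0.15):
    """Join only SONs whose concept covers at least `threshold` of the peer's documents.

    Assumption: "significant" is measured as a share of the peer's own documents.
    """
    total = len(doc_concepts)
    if total == 0:
        return set()
    counts = Counter(doc_concepts)
    return {c for c, n in counts.items() if n / total >= threshold}
```

For a peer with nine rock files and one jazz file, `conservative` puts it in both SONs, while `less_conservative` keeps it out of the jazz SON, so that single jazz file can no longer be found there.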

Layered SONs: Overview
- Uses a hierarchy of concepts
I. Apply the less conservative strategy with a threshold parameter
II. Consider the combination of “non-assigned” concepts and try to join the peer to the “upper-level” SON
- Figure annotation: threshold ≥ 15 %
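
Steps I and II can be read as a bottom-up pass over the concept hierarchy. The sketch below is one possible reading, not the authors' algorithm: `layered_assignment`, the `parent` map (child concept -> parent concept) and the rule that leftovers reaching the root always join the root SON are assumptions introduced for illustration.

```python
from collections import Counter


def depth(concept, parent):
    """Number of edges between a concept and the root of the hierarchy."""
    d = 0
    while concept in parent:
        concept = parent[concept]
        d += 1
    return d


def layered_assignment(doc_concepts, parent, threshold=0.15):
    """Return the set of concepts whose SONs the peer joins."""
    total = len(doc_concepts)
    pending = Counter(doc_concepts)   # concept -> documents not yet assigned
    joined = set()
    while pending:
        # Handle the deepest concept first so children are settled before parents.
        concept = max(pending, key=lambda c: depth(c, parent))
        n = pending.pop(concept)
        if n / total >= threshold or concept not in parent:
            joined.add(concept)            # step I; the root is an assumed catch-all
        else:
            pending[parent[concept]] += n  # step II: combine one level up
    return joined
```

With 15 punk, 2 bebop, 2 swing and 1 metal document, the peer joins "punk" directly, joins "jazz" because bebop and swing together reach 20 %, and falls back to the root SON for the single metal file.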

Layered SONs: Searching
- A query can be assigned to:
  - a leaf concept, i.e. precisely classified (figure a)
  - a non-leaf concept, i.e. imprecisely classified (figures b, c)
- Imprecise classification leads to additional overhead
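
The overhead of an imprecise classification can be seen by listing the SONs a query may have to visit. The following sketch is an assumption-laden illustration: the helper `sons_to_search`, the `parent` map, and the visiting order (the concept itself, then its descendants, then its ancestors) are not taken from the slides. The reasoning is that matching documents may sit in a more specific SON below the query concept, or with peers that were lifted to an upper-level SON.

```python
def sons_to_search(concept, parent):
    """List the SONs that may hold answers for a query classified as `concept`.

    `parent` maps each child concept to its parent; the root has no entry.
    """
    children = {}
    for child, par in parent.items():
        children.setdefault(par, []).append(child)

    order = [concept]

    # Descendants: only present when the query concept is not a leaf.
    stack = list(children.get(concept, []))
    while stack:
        c = stack.pop()
        order.append(c)
        stack.extend(children.get(c, []))

    # Ancestors: peers with too few documents were lifted to upper-level SONs.
    c = concept
    while c in parent:
        c = parent[c]
        order.append(c)
    return order
```

In a three-level hierarchy, a query classified at a leaf touches only that leaf and its ancestors, while a query classified at an inner concept also has to visit every SON beneath it.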

Layered SONs: Experiments
- 1800 peers / 16 SONs
- The layered approach helps peers belong to fewer SONs (left figure)
- The layered approach helps reduce the number of peers per SON (right figure)

Comparative experiments
- Layered SONs obtained the same level of matches with significantly fewer messages than the Gnutella-like system
- Layered SONs could not achieve 100% recall because of classification errors; the average maximum is 93%

Later contributions: exploiting super-peer network
- Global SP-SP network for broadcasting messages
- Local SP-P network for registering peers (client-server-like)
- Matching, distribution and routing tables at super-peers
- A. Löser, F. Naumann, W. Siberski, W. Nejdl, U. Thaden. Semantic Overlay Clusters within Super-Peer Networks. August 2003

Thank you for your attention
Queries?