Data Indexing in Peer- to-Peer DHT Networks Garces-Erice, P.A.Felber, E.W.Biersack, G.Urvoy-Keller, K.W.Ross ICDCS 2004.

Slides:



Advertisements
Similar presentations
Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, Hari Balakrishnan MIT and Berkeley presented by Daniel Figueiredo Chord: A Scalable Peer-to-peer.
Advertisements

Peer to Peer and Distributed Hash Tables
Evaluation of a Scalable P2P Lookup Protocol for Internet Applications
Pastry Peter Druschel, Rice University Antony Rowstron, Microsoft Research UK Some slides are borrowed from the original presentation by the authors.
P2PIR'06: "Distributed Cache Table (DCT)" Gleb Skobeltsyn, Karl Aberer D istributed T able: Efficient Query-Driven Processing of Multi-Term Queries in.
Digital Library Service – An overview Introduction System Architecture Components and their functionalities Experimental Results.
Scalable Content-Addressable Network Lintao Liu
Peer-to-Peer (P2P) Distributed Storage 1Dennis Kafura – CS5204 – Operating Systems.
GridVine: Building Internet-Scale Semantic Overlay Networks By Lan Tian.
Building a Distributed Full-Text Index for the Web S. Melnik, S. Raghavan, B.Yang, H. Garcia-Molina.
Chord: A scalable peer-to- peer lookup service for Internet applications Ion Stoica, Robert Morris, David Karger, M. Frans Kaashock, Hari Balakrishnan.
Search and Replication in Unstructured Peer-to-Peer Networks Pei Cao, Christine Lv., Edith Cohen, Kai Li and Scott Shenker ICS 2002.
Chord: A Scalable Peer-to-Peer Lookup Protocol for Internet Applications Stoica et al. Presented by Tam Chantem March 30, 2007.
P2P: Advanced Topics Filesystems over DHTs and P2P research Vyas Sekar.
Efficient Content Location Using Interest-based Locality in Peer-to-Peer Systems Presented by: Lin Wing Kai.
Integrating Semantics-Based Access Mechanisms with P2P File Systems Yingwu Zhu, Honghao Wang and Yiming Hu.
Object Naming & Content based Object Search 2/3/2003.
Chord-over-Chord Overlay Sudhindra Rao Ph.D Qualifier Exam Department of ECECS.
Freenet A Distributed Anonymous Information Storage and Retrieval System I Clarke O Sandberg I Clarke O Sandberg B WileyT W Hong.
1 CS 194: Distributed Systems Distributed Hash Tables Scott Shenker and Ion Stoica Computer Science Division Department of Electrical Engineering and Computer.
ICDE A Peer-to-peer Framework for Caching Range Queries Ozgur D. Sahin Abhishek Gupta Divyakant Agrawal Amr El Abbadi Department of Computer Science.
Peer-to-peer file-sharing over mobile ad hoc networks Gang Ding and Bharat Bhargava Department of Computer Sciences Purdue University Pervasive Computing.
1CS 6401 Peer-to-Peer Networks Outline Overview Gnutella Structured Overlays BitTorrent.
Storage management and caching in PAST PRESENTED BY BASKAR RETHINASABAPATHI 1.
Introduction to Peer-to-Peer Networks. What is a P2P network Uses the vast resource of the machines at the edge of the Internet to build a network that.
INTRODUCTION TO PEER TO PEER NETWORKS Z.M. Joseph CSE 6392 – DB Exploration Spring 2006 CSE, UT Arlington.
Roger ZimmermannCOMPSAC 2004, September 30 Spatial Data Query Support in Peer-to-Peer Systems Roger Zimmermann, Wei-Shinn Ku, and Haojun Wang Computer.
1 Route Table Partitioning and Load Balancing for Parallel Searching with TCAMs Department of Computer Science and Information Engineering National Cheng.
Introduction to Peer-to-Peer Networks. What is a P2P network A P2P network is a large distributed system. It uses the vast resource of PCs distributed.
Thesis Proposal Data Consistency in DHTs. Background Peer-to-peer systems have become increasingly popular Lots of P2P applications around us –File sharing,
The Anatomy of a Large-Scale Hypertextual Web Search Engine Presented By: Sibin G. Peter Instructor: Dr. R.M.Verma.
Using the Small-World Model to Improve Freenet Performance Hui Zhang Ashish Goel Ramesh Govindan USC.
Chord: A Scalable Peer-to-peer Lookup Protocol for Internet Applications Xiaozhou Li COS 461: Computer Networks (precept 04/06/12) Princeton University.
Routing Indices For P-to-P Systems ICDCS Introduction Search in a P2P system –Mechanisms without an index –Mechanisms with specialized index nodes.
Network Computing Laboratory Scalable File Sharing System Using Distributed Hash Table Idea Proposal April 14, 2005 Presentation by Jaesun Han.
G063 - Distributed Databases. Learning Objectives: By the end of this topic you should be able to: explain how databases may be stored in more than one.
The Anatomy of a Large-Scale Hyper textual Web Search Engine S. Brin, L. Page Presenter :- Abhishek Taneja.
AlvisP2P : Scalable Peer-to-Peer Text Retrieval in a Structured P2P Network Toan Luu, Gleb Skobeltsyn, Fabius Klemm, Maroje Puh, Ivana Podnar Zarko, Martin.
Freelib: A Self-sustainable Digital Library for Education Community Ashraf Amrou, Kurt Maly, Mohammad Zubair Computer Science Dept., Old Dominion University.
An IP Address Based Caching Scheme for Peer-to-Peer Networks Ronaldo Alves Ferreira Joint work with Ananth Grama and Suresh Jagannathan Department of Computer.
Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications.
1 Biometric Databases. 2 Overview Problems associated with Biometric databases Some practical solutions Some existing DBMS.
1 Peer-to-Peer Technologies Seminar by: Kunal Goswami (05IT6006) School of Information Technology Guided by: Prof. C.R.Mandal, School of Information Technology.
Scalable Content- Addressable Networks Prepared by Kuhan Paramsothy March 5, 2007.
Paper Survey of DHT Distributed Hash Table. Usages Directory service  Very little amount of information, such as URI, metadata, … Storage  Data, such.
Peer to Peer A Survey and comparison of peer-to-peer overlay network schemes And so on… Chulhyun Park
Freenet “…an adaptive peer-to-peer network application that permits the publication, replication, and retrieval of data while protecting the anonymity.
1 Secure Peer-to-Peer File Sharing Frans Kaashoek, David Karger, Robert Morris, Ion Stoica, Hari Balakrishnan MIT Laboratory.
1 Distributed Hash Table CS780-3 Lecture Notes In courtesy of Heng Yin.
Computer Networking P2P. Why P2P? Scaling: system scales with number of clients, by definition Eliminate centralization: Eliminate single point.
Plethora: Infrastructure and System Design. Introduction Peer-to-Peer (P2P) networks: –Self-organizing distributed systems –Nodes receive and provide.
1. Efficient Peer-to-Peer Lookup Based on a Distributed Trie 2. Complex Queries in DHT-based Peer-to-Peer Networks Lintao Liu 5/21/2002.
Peer to Peer Network Design Discovery and Routing algorithms
CS Spring 2014 CS 414 – Multimedia Systems Design Lecture 37 – Introduction to P2P (Part 1) Klara Nahrstedt.
CS 347Notes081 CS 347: Parallel and Distributed Data Management Notes 08: P2P Systems.
P2P Search COP P2P Search Techniques Centralized P2P systems  e.g. Napster, Decentralized & unstructured P2P systems  e.g. Gnutella.
Challenge: Peers on Wheels – A Road to New Traffic Information Systems Jedrzej Rybicki, Björn Scheuermann, Wolfgang Kiess Christian Lochert, Pezhman Fallahi,
Large Scale Sharing Marco F. Duarte COMP 520: Distributed Systems September 19, 2004.
1 Secure Peer-to-Peer File Sharing Frans Kaashoek, David Karger, Robert Morris, Ion Stoica, Hari Balakrishnan MIT Laboratory.
Author: Akiyoshi Matonoy, Toshiyuki Amagasay, Masatoshi Yoshikawaz, Shunsuke Uemuray.
The Anatomy of a Large-Scale Hypertextual Web Search Engine S. Brin and L. Page, Computer Networks and ISDN Systems, Vol. 30, No. 1-7, pages , April.
Peer-to-Peer File Sharing Systems Group Meeting Speaker: Dr. Xiaowen Chu April 2, 2004 Centre for E-transformation Research Department of Computer Science.
Distributed Caching and Adaptive Search in Multilayer P2P Networks Chen Wang, Li Xiao, Yunhao Liu, Pei Zheng The 24th International Conference on Distributed.
Chord: A Scalable Peer-to-Peer Lookup Service for Internet Applications * CS587x Lecture Department of Computer Science Iowa State University *I. Stoica,
Integrating Semantics-Based Access Mechanisms with P2P File Systems
CHAPTER 3 Architectures for Distributed Systems
Plethora: Infrastructure and System Design
DHT Routing Geometries and Chord
Prof. Leonardo Mostarda University of Camerino
A Semantic Peer-to-Peer Overlay for Web Services Discovery
Presentation transcript:

Data Indexing in Peer- to-Peer DHT Networks Garces-Erice, P.A.Felber, E.W.Biersack, G.Urvoy-Keller, K.W.Ross ICDCS 2004

DHT Structure P2P Distributed Hash Table mapping between the file identifier and location Ex:  Search for file "Starwars.divx“  Convert "Starwars.divx" to a key, say " “  Lookup " " in the DHT, find out the file location  Download the file

Indexing Indexes don’t contain key-to-data mapping Indexes provide a key-to-key service, or more precisely a query-to-query service Ex: Query q A list of more specific queries, covered by q Select a query q If q is the most specific query of a file, returns the file

Maintain In order to consists of query-to-query mappings, each node:  Insert( q, q i ) function, with q 包含所有的 q i adds a mapping( q ; q i ) to the index of the node responsible for key q  Lookup( q ) function, with q not being the most specific query of a file, returns a list of all the queries qi such there is a mapping(q;qi) in the index of the node responsible for key q

Example: bibliographic database Query-to-key Query-to-Query

Discussion Some interesting properties of this indexing techniques:  Space efficient  Scalability  Loose coupling between data and indexes  Versatility  Adaptability  Decentralized architecture  Resilient to arbitrary linking

System point of view  Search process should be simple  Amount of network traffic should be minimized  Storage space dedicated to the indexing metadata should remain within reasonable limits.

Evaluation Distributed Bibliographic Database  Bibliographic database sites: BibFinder NetBib

Indexing scheme Simple indexing schemeFlat indexing scheme

Indexing scheme Complex indexing scheme

Indexing scheme Simple: A query for an author or a title returns a set of author and title pairs. The most space-efficient of the three, requiring 152MB of extra storage in the system. Flat: index query length is always 2. require 37% increase more space. Complex: some queries in the simple scheme are split into more specific queries. Require 25% increase more space.

Probability vs. Ranking

Caching Multi-cache: shortcuts are created on each node along the lookup path. Cache size is unbounded. Single-cache: shortcuts are created only on the first node that was contacted. Cache size is unbounded. LRU (least-recently used) : only a limited number of shortcuts can be stored on each node.

Average number of interactions required to find data.

Average network traffic (bytes) generated per query.

Cache efficiency: distributed hit ratio.

Conclusion Indexing the data stored in the peer-to- peer network. Indexes are distributed across the nodes of the network and contain key-to-key (or query-to-query) mappings. Given a broad query, a user can look up the more specific queries that match its original query.