Design of a Robust Search Algorithm for P2P Networks

Presentation transcript:

Design of a Robust Search Algorithm for P2P Networks. Niloy Ganguly, Geoffrey Canright, Andreas Deutsch. Indian Institute of Social Welfare and Business Management, Kolkata; Telenor Research and Development, Norway; Center for High Performance Computing, Technical University Dresden, Germany.

Talk Overview: Problem Definition; Design Overview; Experimental Results.

Talk Overview: Problem Definition (Search in P2P Network, Immune Inspiration); Cellular Automata Design; Experimental Results; Theoretical Explanation.

Unstructured Peer-to-Peer Networks. Each network consists of peers (a, b, c, ...). Peers host data (1, 2, 3, ...). [Figure: two example networks of peers a to g hosting data items 1 to 7, with panels labeled "Structured Network" and "Unstructured Network".]

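Unstructured Networks. Searching in unstructured networks relies on non-deterministic algorithms: flooding, random walk. [Figure: unstructured network of peers a to g hosting data items 1 to 7; the query "6?" is passed from peer to peer until the peer hosting item 6 answers "6!!!".]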

Solution: our ImmuneSearch algorithm. 1. Packet movement is guided by the immune-system-inspired concepts of packet proliferation and mutation. 2. Topology evolution gives the network some structure (semi-structure), which speeds up the search process. The speed-up grows as more and more search operations are conducted (the network develops memory!).

Immune Inspiration. Immune search algorithm = message proliferation/mutation + topology evolution (memory formation). The interaction between message and searched item is modeled as similarity(message, searched item). Analogy: P2P network ~ human body; query message ~ antibody; searched item ~ antigen.

Talk Overview: Problem Definition; Design Overview (representing the network by a 2-dimensional grid, data and query distribution, algorithms); Experimental Results.

Mapping an unstructured network to a 2-dimensional grid. Network = (peers, neighborhood). Peers host data. [Figure: the unstructured network of peers a to g with data items 1 to 7, and the same peers laid out on a 2-dimensional grid.]

Query and Data Distribution. Queries and data are 10-bit strings, giving 1024 unique queries/data (tokens). The distribution follows Zipf's law (a power law): the frequency of occurrence of a token T ∝ 1/r, where r is the rank of the token. E.g. most popular word = 1000 times, 2nd most popular word = 500 times, 3rd most popular word = 333 times. Each node hosts one data item (information profile) and one query item (search profile). [Figure: grid of peers a to g; a data item 1001001001 and a query 1001001001?.]
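The token distribution described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the names zipf_weights and sample_tokens are made up here, and mapping the token of rank r to the integer r - 1 is an arbitrary choice, since the slides do not say which bit string has which rank.

```python
import random

NUM_BITS = 10                 # token length stated on the slide
NUM_TOKENS = 2 ** NUM_BITS    # 1024 unique tokens


def zipf_weights(n):
    """Unnormalized Zipf weights: the token of rank r gets weight 1/r."""
    return [1.0 / rank for rank in range(1, n + 1)]


def sample_tokens(count, rng):
    """Draw `count` tokens as 10-bit strings.

    Assumption: the token of rank r is the integer r - 1 written in binary.
    """
    weights = zipf_weights(NUM_TOKENS)
    ranks = rng.choices(range(NUM_TOKENS), weights=weights, k=count)
    return [format(r, "010b") for r in ranks]


rng = random.Random(0)
tokens = sample_tokens(5, rng)   # five 10-bit strings, popular tokens more likely
```

Scaling the first three weights by 1000 reproduces the 1000 / 500 / 333 example on the slide.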

Algorithm. Query Initiation – start a search by flooding k query message packets to the neighborhood. Query Processing – compare the query message with the data; report a match if hamming distance(message, data) ≤ 1. Query Forwarding – forward the message to the neighbors. Topology Evolution – change the neighborhoods of the peer. [Figure: grid of peers a to g with data items 1 to 7, showing query initiation ("6?") and query processing ("6!").]
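The query-processing rule can be written down directly. A minimal sketch, assuming messages and data are equal-length bit strings as on the previous slide; the function names are illustrative, not taken from the presentation.

```python
def hamming_distance(a, b):
    """Number of positions at which two equal-length bit strings differ."""
    if len(a) != len(b):
        raise ValueError("bit strings must have the same length")
    return sum(x != y for x, y in zip(a, b))


def is_match(message, data, max_distance=1):
    """Report a match if hamming distance(message, data) <= 1, as on the slide."""
    return hamming_distance(message, data) <= max_distance
```

For example, is_match("1010110011", "1010010011") holds because the two strings differ in a single bit.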

Proliferation/Mutation (query forwarding). Produce N message copies of the single message (mutate one bit with prob. β). Spread the messages to the neighboring nodes. Example: original 1010110011, mutated 1010010011. N = 8 · S, where S = sim(PI, M)/d and S ≥ Threshold. [Figure: grid of peers a to g with data items 1 to 7; N = 3 copies spread to neighboring nodes.]
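A rough sketch of the proliferation/mutation step follows. Several details are not given on the slide and are assumptions here: sim(PI, M) is taken to be the number of agreeing bits, d is taken to be the message length, N is rounded to the nearest integer, and a message whose S falls below the threshold is simply forwarded as a single unchanged copy. All function names are hypothetical.

```python
import random


def similarity(profile, message):
    """Assumed sim(PI, M): number of bit positions where the strings agree."""
    return sum(p == m for p, m in zip(profile, message))


def mutate(message, beta, rng):
    """With probability beta, flip one randomly chosen bit of the message."""
    if rng.random() >= beta:
        return message
    i = rng.randrange(len(message))
    flipped = "1" if message[i] == "0" else "0"
    return message[:i] + flipped + message[i + 1:]


def proliferate(profile, message, beta, threshold, rng):
    """Return the message copies to spread, using N = 8 * S with S = sim / d."""
    d = len(message)                       # assumption: d is the message length
    s = similarity(profile, message) / d
    if s < threshold:
        return [message]                   # assumption: no proliferation below threshold
    n = max(1, round(8 * s))               # assumption: N rounded to an integer
    return [mutate(message, beta, rng) for _ in range(n)]
```

With a perfectly matching profile (S = 1) this yields 8 copies, each of which differs from the original in at most one bit.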

Topology Evolution. Aim: cluster similar nodes (similar in information and search profile). Movement depends on the distance from the user node, the amount of matching, and age. [Figure labels: initiator node, visited node.]

Talk Overview: Problem Definition; Design Overview; Experimental Results (Experiment, Search Processes, Metrics & Fairness Criteria, Stable Condition, Transient Condition).

Experiment. Search: calculate the number of search items found after 50 time steps from the initiation of a search. Average the result over 100 searches (a generation). The grid has 100 x 100 nodes.

Processes. 1. Immune Search Algorithm. Immune Search Algorithm without topology evolution: 2. Proliferation1 – threshold (d – 1); 3. Proliferation2 – threshold (d – 2). 4. Random Walk. 5. Flooding.

Metrics. 1. Search efficiency: number of search items found within 50 time steps from the initiation of a search. 2. Cost per item: number of message packets needed to find one item. Clustering: amount of clustering of similar peers. Time step: the period within which all the nodes operate once in a random sequence.
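The time-step definition and the cost-per-item metric can be made concrete with a small sketch. The names run_time_step and cost_per_item, and the callback operate standing in for a node's behavior, are assumptions for illustration; returning None when nothing was found is likewise a choice made here, not something stated in the slides.

```python
import random


def run_time_step(nodes, operate, rng):
    """One time step: every node operates exactly once, in a random sequence."""
    order = list(nodes)
    rng.shuffle(order)
    for node in order:
        operate(node)
    return order


def cost_per_item(packets_used, items_found):
    """Message packets needed per search item found; None if nothing was found."""
    if items_found == 0:
        return None
    return packets_used / items_found
```

Search efficiency would then be the count of items found over 50 consecutive calls to run_time_step after a search is initiated.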

Fairness Criteria. The processes (Proliferation1, random walk, flooding) work with the same average number of packets. Since flooding produces a lot of packets, it is stopped once it has produced the same average number of packets as Proliferation1. ImmuneSearch and Proliferation1 have the same threshold level, but ImmuneSearch produces more packets due to topology evolution. ImmuneSearch and Proliferation2 produce roughly the same average number of packets.

Stable Condition: Search Efficiency and Cost Regulation.

Stable Condition: Search Efficiency and Cost Regulation. Excellent cost regulation: the number of messages required by Proliferation is virtually constant in spite of varying search output.

Stable Condition: Clustering. [Figure: grid snapshots at Generation 0, Generation 3 and Generation 24 for the most frequent information; search profile in yellow, information profile in blue.]

Transient Condition: Search Efficiency. [Figure legend: without replacement; 0.5% replacement; 5% replacement; 50% replacement; Proliferation1.]

Transient Condition Search Efficiency

Summary. The ImmuneSearch algorithm produces 2.5 times more search output than random walk. The ImmuneSearch algorithm has a distinct learning phase. The algorithm is stable even when peers constantly leave the system. Simple proliferation/mutation is also better than random walk. Proliferation/mutation has a built-in cost-regulating function. A higher proliferation rate does not necessarily mean higher search output.

Limitation. The work is done on a grid; it should be tested on other types of networks (power-law graph, random graph). The profiles (data) are too simplistic; they should be made more realistic.

Thank you