Quasar: A Probabilistic Publish-Subscribe System for Social Networks over a P2P Kademlia Network
David Arinzon
Supervisor: Gil Einziger
April 2012


Quasar
Quasar is a publish-subscribe mechanism whose routing is based on Bloom filters.

Bloom Filter
A Bloom filter is “a space-efficient probabilistic data structure that is used to test whether an element is a member of a set” (Wikipedia entry based on Donald Knuth’s “The Art of Computer Programming”, 1970). In this structure, false positives are possible, but false negatives are not. When an element is added, its value is passed to k hash functions, which produce k array positions in the bitmap; those bits are set to 1. Querying for an element applies the same process: if at least one of the resulting bits is 0, the element is definitely not in the structure; if all of them are 1, the element is probably present.
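
The add and query steps described above can be sketched in a few lines of Python. This is only an illustration, not the filter used in the project: the class name, the bitmap size, the number of hash functions, and the use of salted SHA-256 as the k hash functions are all assumptions made for the example.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter sketch (sizes and hashing scheme are assumptions)."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as the bitmap

    def _positions(self, item):
        # Derive k array positions by salting one hash function with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item):
        # Set the k bits that correspond to the item.
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item):
        # A single 0 bit proves absence; all 1 bits only suggest presence.
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


bf = BloomFilter()
bf.add("publisher-42")
print("publisher-42" in bf)  # an added element is always reported as present
```

An element that was added is always found, which is the "no false negatives" property; an element that was never added may still be reported as present if other elements happened to set all of its bits.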

Bloom Filters in Quasar
An entry in a Bloom filter is the ID of a subscription (in our case, the publisher ID). Each node holds an “enhanced” routing table. The radius defines how much each node knows about the subscription interests of the surrounding nodes. For a radius of k, each node holds a set of k attenuated Bloom filters, one for each level of “closeness” (0 to k-1).
– The Bloom filter on level n contains subscription information about nodes n+1 hops away.
– The information is stored in the attenuated Bloom filter of the immediate neighbor through which those nodes are reached.
– Each entry also stores the set of nodes reachable through it.
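
One possible in-memory layout for such an enhanced routing table is sketched below. It is an assumption about the structure, not the project's code: the names `Level`, `RoutingTable`, and `record` are hypothetical, and a plain Python `set` stands in for each Bloom filter to keep the example short.

```python
from dataclasses import dataclass, field


@dataclass
class Level:
    # A plain set stands in for the Bloom filter of this level (assumption).
    subscriptions: set = field(default_factory=set)
    # Nodes reachable by routing through this entry.
    reachable: set = field(default_factory=set)


class RoutingTable:
    """Per-neighbor attenuated filter: one Level for each hop distance."""

    def __init__(self, radius):
        self.radius = radius
        self.entries = {}  # immediate neighbor id -> list of `radius` Levels

    def record(self, neighbor, level, subscriber, publisher_ids):
        # Level n describes nodes that are n+1 hops away via `neighbor`.
        if not 0 <= level < self.radius:
            raise ValueError("level must be between 0 and radius-1")
        levels = self.entries.setdefault(
            neighbor, [Level() for _ in range(self.radius)]
        )
        levels[level].subscriptions.update(publisher_ids)
        levels[level].reachable.add(subscriber)


table = RoutingTable(radius=2)
table.record(neighbor=2, level=0, subscriber=2, publisher_ids={"pubA"})
table.record(neighbor=2, level=1, subscriber=4, publisher_ids={"pubB"})
```

Here node 4, which is two hops away through neighbor 2, lands in the level-1 entry of neighbor 2, while neighbor 2's own subscriptions sit on level 0.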

Bloom Filters in Quasar (example)
For a radius of 2, nodes 2 and 3 are immediate neighbors (1 hop), nodes 4 and 5 are level-2 neighbors (2 hops), and nodes 6 and 7 are outside the “recognition radius”.

Level  Neighbor  Bloom Filter  Reachable nodes
0      2                       2, 4
       3                       3, 5

The subscription information of node 4 is set in the Bloom filter of node 2, but on the 1st level. The same holds for node 5, in the Bloom filter of node 3.

Subscription mechanism
Each node periodically sends its subscription list to its neighbors (how often depends on whether the network is static), and they propagate it further as far as the allowed TTL permits. A receiving node updates the matching routing table entry (attenuated Bloom filter) according to the information and its direction: which node originally sent it, and from which immediate neighbor it was received.
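
The TTL-limited propagation can be illustrated with a small breadth-first simulation. This is a sketch under assumptions: the topology `GRAPH` is a guess at the seven-node example (the original figure is not available), the function name is hypothetical, and each node keeps only the first copy of the subscription list that reaches it.

```python
from collections import deque

# Assumed topology for illustration: 1-2, 1-3, 2-4, 3-5, 4-6, 5-7.
GRAPH = {1: [2, 3], 2: [1, 4], 3: [1, 5], 4: [2, 6], 5: [3, 7], 6: [4], 7: [5]}


def propagate_subscription(graph, origin, ttl):
    """Return {node: (via_neighbor, level)} for every node that hears about origin.

    `via_neighbor` is the immediate neighbor the information arrived from and
    `level` is the routing table level to update (hop distance minus one).
    """
    learned = {}
    seen = {origin}
    queue = deque([(origin, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops == ttl:
            continue  # TTL exhausted, stop forwarding from this node
        for neighbor in graph[node]:
            if neighbor in seen:
                continue
            seen.add(neighbor)
            learned[neighbor] = (node, hops)
            queue.append((neighbor, hops + 1))
    return learned


print(propagate_subscription(GRAPH, origin=4, ttl=2))
# {2: (4, 0), 6: (4, 0), 1: (2, 1)}
```

With a TTL of 2, node 1 learns about node 4 through its immediate neighbor 2 and files it on level 1, while node 3 (three hops away) never hears about it.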

Subscription mechanism (Alternative)
During our simulations we found that the mechanism described above is very costly in terms of time and traffic. A different mechanism was therefore used to achieve the same goal.
– The simulation ran on UDP over the Kademlia-based KeyBasedRouting network, so each node can reach any other node directly, regardless of the radius defined for Quasar.
– Instead of the Quasar subscription mechanism, two steps were applied. First, each node requests information about its neighbors at each radius level. Then, once a node has built a picture of its radius neighborhood, it sends its own subscription information to each of those nodes directly.
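
The two steps can be sketched as follows. The function names and the adjacency-dict representation are assumptions for illustration, and the sketch leaves out how a receiver maps the sender back to an immediate neighbor, which the slides do not specify.

```python
def discover_neighborhood(graph, node, radius):
    """Step 1: return a list where index n holds the nodes n+1 hops from `node`."""
    levels = []
    seen = {node}
    frontier = {node}
    for _ in range(radius):
        frontier = {n for f in frontier for n in graph[f]} - seen
        seen |= frontier
        levels.append(frontier)
    return levels


def push_subscriptions(graph, subscriptions, radius):
    """Step 2: every node sends its subscriptions directly to its neighborhood.

    Returns {receiver: {sender: (level, sender_subscriptions)}}.
    """
    inbox = {node: {} for node in graph}
    for sender in graph:
        for level, members in enumerate(discover_neighborhood(graph, sender, radius)):
            for receiver in members:
                inbox[receiver][sender] = (level, subscriptions[sender])
    return inbox


# Assumed example topology: 1-2, 1-3, 2-4, 3-5, 4-6, 5-7.
graph = {1: [2, 3], 2: [1, 4], 3: [1, 5], 4: [2, 6], 5: [3, 7], 6: [4], 7: [5]}
subs = {node: {f"pub{node}"} for node in graph}
inbox = push_subscriptions(graph, subs, radius=2)
```

Because the direct send bypasses the intermediate hops, the number of messages per node is bounded by the size of its radius neighborhood rather than by repeated flooding.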

Publication mechanism
When a node decides to publish a topic (the publisher node), it replicates the message alpha times and sends the copies to a random set of neighbors. A node that receives a publication acts in one of the following ways:
– If it is the publisher (the message was routed back), it acts as a “middle” node and routes the message randomly.
– If the node is subscribed to the topic, it renews the TTL and sends the message to alpha random neighbors (as if it had published it).
– Otherwise, the node searches the routing table level by level for the first entry whose Bloom filter contains this subscription, and routes the message accordingly.
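
The three cases above can be sketched as a single decision function. Several details are assumptions rather than part of the slides: the message is a dict with hypothetical keys `publisher`, `ttl`, and `initial_ttl`; the TTL is decremented on every non-renewing hop; plain sets stand in for the Bloom filters; and a random neighbor is used as a fallback when no routing table entry matches.

```python
import random


def next_hops(node_id, message, neighbors, subscriptions, routing_table, alpha, rng):
    """Return (list_of_next_hops, ttl_to_send_with) for a received publication.

    `routing_table` is a list of levels; each level maps an immediate
    neighbor to the set of publisher IDs known through it (a stand-in
    for the attenuated Bloom filter).
    """
    topic = message["publisher"]
    if node_id == topic:
        # Message routed back to its publisher: act as a middle node.
        return [rng.choice(neighbors)], message["ttl"] - 1
    if topic in subscriptions:
        # Subscriber: renew the TTL and fan out to alpha random neighbors.
        fanout = rng.sample(neighbors, min(alpha, len(neighbors)))
        return fanout, message["initial_ttl"]
    for level in routing_table:  # closest level first
        for neighbor, known in level.items():
            if topic in known:
                return [neighbor], message["ttl"] - 1
    # Assumed fallback: no entry matched, forward randomly.
    return [rng.choice(neighbors)], message["ttl"] - 1


rng = random.Random(0)
msg = {"publisher": 9, "ttl": 3, "initial_ttl": 5}
table = [{2: set(), 3: set()}, {3: {9}}]
print(next_hops(1, msg, [2, 3], set(), table, alpha=2, rng=rng))
# ([3], 2)
```

Searching level 0 before level 1 means a message prefers a subscriber that is one hop away over one that is two hops away.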

Publication mechanism (Continued)
The publication method may prevent a message from “leaving a gravity well”: a case in which nodes within a small radius of the publishing node are subscribed to it and keep routing the message among themselves. A set of countermeasures (negative information) was applied:
– Each message carries a list of the subscribers that have already received it. In addition, each node stores information about the publications it has already received.
– When routing, if a candidate entry is found (the publication ID exists in the Bloom filter), the entry is not used if one of the already-received subscribers is in its list of reachable nodes.
– A subscriber that receives a publication more than once routes it randomly without duplicating it.
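
The second rule, skipping entries that lead back to subscribers who already have the message, might look like the sketch below. The function name and the table layout (each entry is a pair of a filter stand-in and a reachable-node set) are assumptions for illustration.

```python
def pick_entry(routing_table, topic, already_received):
    """Return the neighbor of the first usable entry, or None if there is none.

    `routing_table` is a list of levels; each level maps a neighbor to a
    pair (known_topics, reachable_nodes). An entry is skipped when any
    node it reaches has already received the publication.
    """
    for level in routing_table:
        for neighbor, (known_topics, reachable) in level.items():
            if topic in known_topics and not (reachable & already_received):
                return neighbor
    return None


table = [
    {2: ({"p"}, {2}), 3: ({"p"}, {3})},  # level 0: immediate neighbors
    {2: ({"p"}, {4})},                   # level 1: node 4, reached via neighbor 2
]
print(pick_entry(table, "p", already_received={2}))     # 3
print(pick_entry(table, "p", already_received={2, 3}))  # 2
```

Once subscribers 2 and 3 are marked as served, the only remaining candidate is the level-1 entry that reaches node 4, so the message is pushed outward instead of circulating among the nearby subscribers.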

Simulation, scenarios and comparison
As mentioned before, the simulation was executed over the Kademlia-based KeyBasedRouting network, developed in the CS faculty. The main focus of the comparison was the behavior of the attenuated Bloom filters when routing publications. As the competing baseline, messages are propagated to a random neighbor instead of using the routing table.

Simulation, scenarios and comparison
Three scenarios were tested:
– Scenario 1: Each node is randomly assigned ten subscriptions. Afterwards, each node in turn publishes once.
– Scenario 2: A subset of publishers (10% of all the nodes) is selected. (They also act as subscribers, but not to their own publications.) Each node is randomly assigned a set of publications to subscribe to (a random number between 1 and half the number of publishers). Afterwards, every 5 seconds, three publishers are chosen at random to publish.
– Scenario 3: A publisher node is chosen at random. 10% of all the nodes are chosen to be subscribers of that publisher. Afterwards, the publisher node publishes once.
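
As an illustration of how such a setup could be generated, the sketch below builds the subscription assignment for Scenario 2. It is not the simulation code: the function name is hypothetical, and rounding 10% and "half the number of publishers" down with integer division is an assumption.

```python
import random


def build_scenario_2(node_ids, rng):
    """Pick 10% of the nodes as publishers and assign random subscriptions."""
    num_publishers = max(1, len(node_ids) // 10)
    publishers = rng.sample(node_ids, num_publishers)
    upper = max(1, num_publishers // 2)  # half the number of publishers
    subscriptions = {}
    for node in node_ids:
        # A publisher never subscribes to its own publications.
        candidates = [p for p in publishers if p != node]
        count = min(rng.randint(1, upper), len(candidates))
        subscriptions[node] = set(rng.sample(candidates, count))
    return publishers, subscriptions


publishers, subscriptions = build_scenario_2(list(range(100)), random.Random(0))
# Each period, three publishers would then be drawn with rng.sample(publishers, 3).
```

Passing a seeded `random.Random` instance keeps a run reproducible, which helps when comparing Quasar routing against random routing on the same assignment.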


Scenario conclusions
In the first scenario, the advantage of the routing-table Bloom filters used in Quasar can be observed. Using the routing table, which holds information about the surrounding subscriptions, messages were routed properly, resulting in a high hit rate. Note that the high hit rate also produced much higher traffic, because on each “first successful hit” the message is duplicated alpha times.


Scenario conclusions
The second scenario is meant to represent a “general” state of the network, in which a set of publishers periodically publishes to the entire network. Although Quasar appears to reduce network traffic by a relatively large amount, its hit rate is considerably lower (at least 15% lower). One possible explanation is a limitation of the Bloom filter: one caveat of attenuated Bloom filters is that false positives may appear, which in our case can result in messages being routed incorrectly.
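
To get a feel for how false positives grow as a filter fills up, the standard approximation for a Bloom filter with m bits, k hash functions, and n inserted elements, (1 - e^(-kn/m))^k, can be evaluated directly. The filter sizes below are arbitrary illustration values, not the parameters used in the simulation.

```python
import math


def false_positive_rate(m_bits, k_hashes, n_items):
    """Standard approximation of a Bloom filter's false positive probability."""
    return (1 - math.exp(-k_hashes * n_items / m_bits)) ** k_hashes


# Arbitrary example: a 1024-bit filter with 3 hash functions.
for n in (10, 100, 500):
    print(n, false_positive_rate(1024, 3, n))
```

The rate rises quickly once the number of stored subscription IDs approaches the bitmap size, which is consistent with the explanation above that a crowded filter can send messages toward neighbors that have no matching subscriber.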


Scenario conclusions
In the third scenario, unlike the 1st scenario, the difference is much smaller. Still, looking at a single publication, it can again be observed that Quasar’s routing policy, which is based on information from the neighbors and the attenuated Bloom filters, provides better routing for publish-subscribe. Note that random routing showed very high variance: in some runs the delivery rate was 1, in others 0 or 0.5. The Quasar runs provided a much more stable rate.