
Large-Scale Monitoring of DHT Traffic
Ghulam Memon – University of Oregon
Reza Rejaie – University of Oregon
Yang Guo – Corporate Research, Thomson
Daniel Stutzbach – Stutzbach Enterprises
International Workshop on Peer-to-Peer Systems (IPTPS) 2009, Boston, MA

3/5/2016 IPTPS 2009 Boston, MA.

Introduction
- Distributed Hash Tables (DHTs) provide a scalable approach to distributed content management, e.g. file sharing.
- DHTs have been an active area of research since 2001.
- DHTs have recently been deployed in real-world applications, e.g. Kad, Azureus, Mojito.
- Characterizing traffic in widely deployed DHTs allows us to:
  - Identify opportunities for performance improvement.
  - Detect anomalous behavior.
- Accurately capturing traffic in a widely deployed DHT is challenging.

Challenges in Capturing DHT Traffic
- The common approach for capturing DHT traffic is to use instrumented peers as monitors.
- A small number of monitors cannot capture an accurate view of traffic.
- A large number of monitors is expensive and may change and/or disrupt the DHT.
  - e.g. 8 monitors per peer [Steiner:DBISP2P 2007]
- Goal: capture a representative view of DHT traffic efficiently, without changing and/or disrupting the system.

Classifying DHT Traffic
- Two types of messages are observed by peer p:
  - Routing Traffic: messages that are routed by, but not destined to, peer p.
    - Depends on DHT geometry and peer visibility.
  - Destination Traffic: messages that are destined to peer p.
    - Demonstrates DHT usage.
- We focus on capturing destination traffic.
[Diagram: DHT traffic among peers P_r, P_i, and P_t]

This paper presents:
- Montra, a new approach to efficiently and accurately capture DHT traffic without disrupting the system.
  - Montra should be applicable to most DHTs.
- Validation of Montra over a deployed DHT, Kad.
- Preliminary characterization of Kad traffic.

Montra: Key Idea
- Real-world DHTs add redundancy to cope with churn:
  - Each file is published at multiple peers.
  - A search operation identifies multiple peers.
- If monitor peer P_m is the closest peer to the target peer P_t, P_m will observe all the destination traffic of P_t.

Montra: Key Idea
[Diagram: ID space with node IDs 0x0, 0x1, 0x8, 0x9, 0xe (P_t), and 0xf (P_m); routing table with entries 0x8, 0xe, 0xf]
- The request originator (P_r) searches for the destination of content ID 0xe.
- Node 0xe (P_t) is closest to the requested ID 0xe.
- Monitor 0xf (P_m) captures the request.
- Placing one monitor per peer will provide an accurate view of traffic.
- How can the impact on the system be avoided or minimized?
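The 4-bit example on this slide can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Montra's implementation: it assumes the XOR distance metric that Kademlia-style DHTs such as Kad use, and the helper names (`xor_distance`, `closest_peer`, `monitor_id_for`) are hypothetical. The slides do not say how a monitor's ID is chosen; flipping the lowest bit of the target ID is just one choice that reproduces the 0xe/0xf pairing shown in the diagram.

```python
def xor_distance(a: int, b: int) -> int:
    """Distance between two IDs under the XOR metric (assumed, Kademlia-style)."""
    return a ^ b


def closest_peer(peers: list[int], content_id: int) -> int:
    """Return the peer whose ID is closest to content_id."""
    return min(peers, key=lambda p: xor_distance(p, content_id))


def monitor_id_for(target_id: int) -> int:
    """Hypothetical monitor ID: flip the lowest bit so the XOR distance to the target is 1."""
    return target_id ^ 1


# Node IDs taken from the slide's 4-bit example.
peers = [0x1, 0x8, 0x9, 0xE]

target = closest_peer(peers, 0xE)      # destination peer for content ID 0xe
monitor = monitor_id_for(target)       # monitor placed next to the target

print(hex(target), hex(monitor))       # 0xe 0xf
```

With these IDs, no other peer is closer to the target than the monitor, which is the condition the slides give for the monitor to observe the target's destination traffic.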

Montra: Minimally Visible Monitors (MVMs)
- To minimize disruption to the system, we use Minimally Visible Monitors (MVMs).
  - MVMs are visible only to (i.e. exchange messages only with) their target peer.
- Deploying a large number of MVMs causes minimal or no disruption in the system.
  - Each MVM slightly changes the routing table of the target peer.
[Diagram: ID space with requesting peers P_r, target peer P_t, and monitor P_m, showing request and response messages]

Montra – MVMs: Identifying Destination Peers
- In the presence of churn and packet loss, a single peer (or MVM) cannot reliably identify its destination traffic.
  - Closer peers may exist.
  - Requires a regional view of traffic.
- We monitor all peers in a continuous zone of the ID space, e.g. the 4-bit zone 0xa.
- Periodically crawl the zone to detect all of its peers.
- All captured requests within a zone have a destination in that zone.
- Destination peers are identified during post-processing: for a given captured request, find the closest monitored peer.
[Diagram: zone 0xa with IDs 0xa8, 0xa9, 0xac, 0xad, 0xae, 0xaf, several of them labeled as monitors P_m]
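The post-processing step described above can be sketched as follows. This is a simplified illustration, not the authors' tool: it assumes 8-bit IDs, an XOR distance metric, and a crawl result given as a plain list of peer IDs. The function names and the sample request IDs are hypothetical; the zone peers reuse the IDs that appear in the slide's diagram.

```python
ID_BITS = 8  # toy ID width for the example; real Kad IDs are much longer


def in_zone(node_id: int, zone_prefix: int, prefix_bits: int, id_bits: int = ID_BITS) -> bool:
    """True if the top prefix_bits of node_id equal zone_prefix."""
    return (node_id >> (id_bits - prefix_bits)) == zone_prefix


def assign_destinations(requests: list[int], monitored_peers: list[int]) -> list[tuple[int, int]]:
    """Pair each captured request ID with the closest monitored peer (XOR metric assumed)."""
    return [(r, min(monitored_peers, key=lambda p: p ^ r)) for r in requests]


# Hypothetical crawl output; only peers inside the 4-bit zone 0xa are monitored.
crawled = [0x3C, 0xA8, 0xA9, 0xAC, 0xAD, 0xAE, 0xAF, 0xB1]
zone_peers = [p for p in crawled if in_zone(p, 0xA, 4)]

# Hypothetical request IDs captured by the monitors in that zone.
captured = [0xAB, 0xAE, 0xA1]
destinations = assign_destinations(captured, zone_peers)

for request_id, peer_id in destinations:
    print(hex(request_id), "->", hex(peer_id))
# 0xab -> 0xa9
# 0xae -> 0xae
# 0xa1 -> 0xa9
```

Doing this offline over the whole zone, rather than at each monitor, is what lets the assignment account for closer peers that a single monitor could not see.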

Validation
- We quantify the accuracy of Montra from two different angles, using the Kad network:
  - Content Accuracy: what fraction of destination traffic per zone is captured?
  - Peer Accuracy: how accurately does Montra determine destination peers?
- Validation methodology:
  - Instrumented Source
  - Instrumented Destination

Validation: Instrumented Source Validation
- Use an instrumented Kad client to send requests for random IDs in a zone (Instrumented Source).
  - Log all requests and their destinations.
- Monitor the same zone using Montra.
- Compare the source and monitor logs to determine content and peer accuracy.
- Uses a synthetic workload, but the requests are distributed over the entire zone.
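One way to read the log comparison above is sketched below. The slides do not give exact formulas, so both definitions are assumptions: content accuracy is taken as the fraction of sent requests that also appear in the monitor log, and peer accuracy as the fraction of those captured requests for which both logs name the same destination peer. The log format (a dict from request ID to destination peer ID) and all sample values are hypothetical.

```python
def content_accuracy(source_log: dict[int, int], monitor_log: dict[int, int]) -> float:
    """Fraction of requests sent by the instrumented source that the monitors captured."""
    if not source_log:
        raise ValueError("source log is empty")
    captured = source_log.keys() & monitor_log.keys()
    return len(captured) / len(source_log)


def peer_accuracy(source_log: dict[int, int], monitor_log: dict[int, int]) -> float:
    """Among captured requests, fraction whose inferred destination matches the source log."""
    captured = source_log.keys() & monitor_log.keys()
    if not captured:
        raise ValueError("no request appears in both logs")
    correct = sum(1 for r in captured if source_log[r] == monitor_log[r])
    return correct / len(captured)


# Hypothetical logs: request ID -> destination peer ID.
source_log = {1: 0xA8, 2: 0xA9, 3: 0xAC, 4: 0xAD}
monitor_log = {1: 0xA8, 2: 0xA9, 3: 0xAE}

print(content_accuracy(source_log, monitor_log))  # 0.75
print(peer_accuracy(source_log, monitor_log))
```

In this toy run, three of the four sent requests were captured, and two of those three were attributed to the same peer the source recorded.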

Validation: Instrumented Destination Validation
- Use an instrumented Kad client to passively observe and log requests (Instrumented Destination).
- Monitor the same zone simultaneously.
- Compare the destination and monitor logs.
  - Using some heuristics.
- Uses a real-world workload, but the requests are localized to the instrumented destination.

Validation: Results
[Figures: Content Accuracy; Peer Accuracy]
- Zone size decreases with zone prefix length.
- Both figures show similar results.
- Instrumented Source: increasing the zone size beyond a 6-bit zone degrades accuracy.
  - The time taken to crawl zones of 5 bits or fewer hinders prompt addition of MVMs.
- Instrumented Destination: zone size has minimal impact on accuracy.
  - MVMs are promptly added around the instrumented destination.

Characterization (Kad): Publish Request Rate
[Figures: Keywords; Files]
- How does the request rate vary across different zones?
- The heavily skewed behavior is consistent across different zones.
- Each zone has some hot keywords and files.
- The publish rate for keywords is higher than for files.
  - A lot of common names occur in filenames.
- See the paper for more results.

Characterization (Kad): Relation Between Published and Searched Content
[Figures: Keywords; Files]
- What is the balance between supply and demand for a file?
  - Balance = Pub. / (Sear. + Pub.), where Pub. and Sear. abbreviate published and searched.
- 15% of files are searched but never published.
  - Newly popular files that are not yet widely available.
- 60% of files are published but never searched.
  - Popular files from the past that are highly available.
- 95% of keywords are published but never searched.
  - A very small pool of keywords is actually used.
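The balance metric can be worked through with a small Python sketch. It assumes that Pub. and Sear. are per-item counts of publish and search requests, which the slide does not state explicitly; the function name and the sample counts are hypothetical.

```python
def balance(publishes: int, searches: int) -> float:
    """Balance = Pub / (Search + Pub), for one file or keyword."""
    total = publishes + searches
    if total == 0:
        raise ValueError("item has neither publish nor search requests")
    return publishes / total


print(balance(0, 5))  # 0.0  -> searched but never published
print(balance(7, 0))  # 1.0  -> published but never searched
print(balance(3, 1))  # 0.75 -> supply outweighs demand
```

Under this reading, the "searched but never published" files sit at balance 0 and the "published but never searched" files and keywords sit at balance 1.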

Conclusion
- Montra is a new technique for capturing DHT traffic accurately and efficiently without disrupting the system.
- Montra's accuracy was validated over the Kad network.
- Presented an initial characterization of traffic in Kad.
- Ongoing work:
  - Further evaluation of Montra over other DHTs, e.g. Azureus, Mojito.
  - Further analysis of captured traffic in Kad and other DHTs.
  - Exploring other uses of Montra, e.g. detecting botnet C&C.

Characterization (Kad): Search Request Rate
[Figures: Keywords; Files]
- Search-file and search-keyword requests have the lowest range of request rates.
  - Demonstrates user behavior.
- User behavior for search keywords differs across zones.
  - Some zones have more popular keywords.
- User behavior for search files is consistent across zones.