A DoS-Resilient Information System for Dynamic Data Management. Baumgart, M., Scheideler, C., and Schmid, S. In SPAA 2009. Presented by Mahdi Asadpour.

2 Outline
- Denial of Service Attacks
- Chameleon: System Description
- Chameleon: Operational Details
- Conclusion

Denial of Service Attacks

4 DoS attack
- (Distributed) Denial of Service (DoS) attacks are one of the biggest problems in today's open distributed systems.
- Botnet: a set of compromised networked computers controlled through the attacker's program (the "bot").
- Image credit: Network Security course, Thomas Dübendorfer, ETH Zürich.

5 Examples
- DoS attack against the root servers of the DNS system: roots, top-level domains, ...
- TCP SYN flood attack
  - Prevention: SYN cookies
- Image credit: Root Domain Name Servers.

6 DoS prevention
- Redundancy: information is replicated on multiple machines.
- But storing and maintaining multiple copies incurs a large overhead in storage and update costs, so full replication is not feasible in large information systems.
- In order to preserve scalability, the burden on the servers should be minimized; here, replication is limited to a logarithmic factor.
- Challenge: how to be robust against DoS attacks?

7 Therefore, a dilemma
- Scalability: minimize the replication of information.
- Robustness: maximize the resources needed by the attacker.

8 Related work
- Many scalable information systems: Chord, CAN, Pastry, Tapestry, ...
  - Not robust against flash crowds.
- Caching strategies against flash crowds: CoopNet, Backslash, PROOFS, ...
  - Not robust against adaptive lookup attacks.

9 Related work, cont.
- Systems robust against DoS attacks: SOS, WebSOS, Mayday, III, ...
  - Basic strategy: hiding the original location of the data.
  - Does not work against past insiders.
- Awerbuch and Scheideler (DISC 2007): a DoS-resistant information system that can only handle get requests under a DoS attack.

Chameleon: System Description

11 Model
- Chameleon: a distributed information system that is provably robust against large-scale DoS attacks.
- There are n fixed nodes in the system, all honest and reliable.
- The system supports these operations:
  - Put(d): inserts/updates data item d in the system.
  - Get(name): returns the data item d with Name(d) = name, if any.
- Time is assumed to proceed in steps that are synchronized among the nodes.

12 Past insider attack
- The attacker knows everything about the system up to some phase t_0 that may not be known to the system. A fired employee, for example. (Image credit: Bio Job Blog.)
- It can block any ε-fraction of the servers.
- It can generate any set of put/get requests, one per server.

13 Goals
- Scalability: every server spends at most polylogarithmic time and work on put and get requests.
- Robustness: every get request for a data item inserted or updated after t_0 is served correctly.
- Correctness: every get request for a data item is served correctly if the system is not under a DoS attack.
- The paper does not seek to prevent DoS attacks; rather, it focuses on how to maintain good availability and performance during an attack.

14 Also, the load should be distributed evenly among all nodes.

15 Basic strategy
- Choose suitable hash functions h_1, ..., h_c: D → V (D: name space of the data, V: set of servers).
- Store a copy of item d, for every i and j, at a random set of servers of size 2^j that contains h_i(d) (sketched below).
- (Diagram: copies placed close to h_i(d) are easy to find but easy to block; copies spread over large regions are difficult to block but also difficult to find.)
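To make the placement concrete, here is a minimal Python sketch of the idea. The SHA-256-based hash functions, the power-of-two server count, and the alignment of the size-2^j regions are illustrative assumptions, not the paper's exact construction:

```python
import hashlib
import random

def h(i, name):
    """The i-th fixed hash function, mapping a data name to [0, 1)."""
    digest = hashlib.sha256(f"{i}:{name}".encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64

def region(i, j, name, n):
    """The aligned block of 2^j consecutive servers that contains the
    server owning h_i(name) (servers are the points 0/n, ..., (n-1)/n)."""
    pos = int(h(i, name) * n)        # index of the server owning h_i(name)
    start = (pos // 2**j) * 2**j     # align the block to a multiple of 2^j
    return list(range(start, start + 2**j))

# For every i and j, pick a few random servers of the region to hold copies.
n, c = 1024, 3
placement = {
    (i, j): random.sample(region(i, j, "d42", n), k=min(2, 2**j))
    for i in range(1, c + 1)
    for j in range(11)               # region sizes 2^0, ..., 2^10 = n
}
```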

16 Put and Get Requests
- Most get requests can access close-by copies; only a few get requests have to find distant copies.
- The work for each server is altogether just polylog(n) for any set of n get requests, one per server.
- All areas must have up-to-date copies, so put requests may fail under a DoS attack.

17 Distributed Hash Table (DHT)
- Chameleon employs the idea of a DHT.
- DHTs are decentralized distributed systems that provide a lookup service over (key, value) pairs: any participating node can efficiently retrieve the value associated with a given key.

18 Data stores
- The data management of Chameleon relies on two stores:
  - p-store: a static DHT, in which the positions of the data items are fixed unless they are updated.
  - t-store: a classic dynamic DHT that constantly refreshes its topology and the positions of its items (both are not known to a past insider).

19 P-store
- The nodes are completely interconnected and mapped to [0, 1). A node i is responsible for the interval [i/n, (i+1)/n) and is represented by log n bits, i.e., the point ∑_i x_i / 2^i (sketched below).
- The data items are also mapped to [0, 1), based on fixed hash functions h_1, ..., h_c: U → [0, 1) that are known by everybody.
- For each data item d, the lowest level i = 0 gives fixed storage locations h_1(d), ..., h_c(d) for d, of which O(log n) are picked at random to store up-to-date copies of d.
- Replicas are placed along prefix paths in the p-store.
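A small Python sketch of this mapping; representing node labels as bit tuples is an assumption made for illustration:

```python
def node_point(bits):
    """Map a node's log(n)-bit label (x_1, ..., x_log n) to the point
    sum_i x_i / 2^i in [0, 1)."""
    return sum(x / 2 ** (i + 1) for i, x in enumerate(bits))

def responsible_node(point, n):
    """Node i is responsible for the interval [i/n, (i+1)/n)."""
    return int(point * n)

# n = 8 nodes: the label (1, 0, 1) is the point 1/2 + 0/4 + 1/8 = 0.625,
# which lies in node 5's interval [5/8, 6/8).
assert node_point((1, 0, 1)) == 0.625
assert responsible_node(node_point((1, 0, 1)), 8) == 5
```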

20 P-store, prefix path

21 T-store
- In order to correctly store the copies of a data item d, Ω(log n) roots must be reached, which may not always be possible due to a past-insider attack. The t-store is used to store such data temporarily.
- Its topology is a de Bruijn-like network with logarithmic node degree, and it is constructed from scratch in every phase.
- de Bruijn graphs are useful here as they have a logarithmic diameter and a high expansion.

22 T-store, de Bruijn graph
- The [0, 1)-space is partitioned into intervals of size β log n / n.
- In every phase, every non-blocked node chooses a random position x.
- It then tries to establish connections to all other nodes that selected the positions x, x⁻, x⁺ (the neighboring intervals), x/2, and (x+1)/2 (sketched below).
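A hedged Python sketch of the neighbour computation; interpreting x⁻ and x⁺ as the positions one interval width away (mod 1) and using base-2 logarithms are assumptions:

```python
import math
import random

def de_bruijn_neighbours(x, step):
    """Positions a node at x connects to: the adjacent intervals x- and
    x+ (taken here as x -/+ one interval width, mod 1) and the de Bruijn
    edges x/2 and (x+1)/2."""
    return [(x - step) % 1.0, (x + step) % 1.0, x / 2, (x + 1) / 2]

n, beta = 1024, 2.0
step = beta * math.log2(n) / n      # interval width beta*log(n)/n
print(de_bruijn_neighbours(random.random(), step))
```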

23 New T-store
- Once the t-store has been established, the nodes at position 0 select a random hash function h: U → [0, 1) (by leader election) and broadcast it to all nodes in the t-store.
- h is not known to a past insider after t_0.
- h determines the locations of the data items in the new t-store: an item d in the old t-store is stored in the cluster responsible for h(d) (sketched below).
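A tiny sketch of how such an h could map an item to its cluster; deriving h from a shared random seed via SHA-256 and using β log n / n as the cluster width are assumptions for illustration:

```python
import hashlib
import math
import random

def cluster_of(name, h_seed, n, beta=2.0):
    """Index of the cluster interval responsible for h(d)."""
    digest = hashlib.sha256(f"{h_seed}:{name}".encode()).digest()
    hd = int.from_bytes(digest[:8], "big") / 2**64   # h(d) in [0, 1)
    width = beta * math.log2(n) / n                  # cluster interval width
    return int(hd / width)

h_seed = random.getrandbits(64)   # the shared randomness chosen at position 0
print(cluster_of("d42", h_seed, 1024))
```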

Chameleon: Operational Details

25 Overall procedure in a phase
1. The adversary blocks servers and initiates put and get requests.
2. Build the new t-store and transfer data from the old to the new t-store.
3. Process all put requests in the t-store.
4. Process all get requests in the t-store and the p-store.
5. Try to transfer data items from the t-store to the p-store.

26 Stages
- Stage 1: Building a New t-Store
- Stage 2: Put Requests in t-Store
- Stage 3: Processing Get Requests
- Stage 4: Transferring Items

27 Stage 1: Building a New t-Store
- Join protocol: forms the de Bruijn network.
  - Every non-blocked node chooses a new random location in the de Bruijn network.
  - It searches for its neighbors in the p-store using the join(x) operation.
- The nodes in the graph agree on a set of log n random hash functions g_1, ..., g_c': [0, 1) → [0, 1) via randomized leader election.
  - Randomized leader election: each node guesses a random bit string; the one with the lowest bit string wins and proposes the hash functions. This takes O(log n) rounds (sketched below).
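A minimal simulation of the election step in Python; drawing 64-bit strings and electing in a single shot are simplifications of the O(log n)-round distributed protocol:

```python
import random

def leader_election(nodes, bits=64):
    """Every node draws a random bit string; the node with the lowest
    draw wins and gets to propose the hash functions g_1, ..., g_c'."""
    draws = {v: random.getrandbits(bits) for v in nodes}
    return min(draws, key=draws.get)

winner = leader_election(range(100))
print(f"node {winner} proposes the hash functions")
```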

28 Stage 1: Building a New t-Store, cont.
- Insert protocol: transfers the data items from the old t-store to the new t-store.
  - For every cluster in the old t-store with currently non-blocked nodes, one of its nodes issues an insert(d) request for each of the data items d stored in it.
  - Each of these requests is sent to the nodes owning g_1(x), ..., g_c'(x) in the p-store, where x = ⌊h(d)⌋_((δ log n)/n), i.e., h(d) rounded down to the nearest multiple of (δ log n)/n.
  - Each non-blocked node collects all data items d sent to its point x and forwards them to the nodes that contacted it in the join protocol.
  - Altogether O(n) items, w.h.p.

29 Stage 2: Put Requests in t-Store
- New put requests are served in the t-store: for each put(d) request, a t-put(d) request is executed.
- Each t-put(d) request aims at storing d in the cluster responsible for h(d).
- The t-put requests are sent to their destination clusters along de Bruijn paths, e.g., from X to Y: (x_1, ..., x_{log n}) → (y_{log n}, x_1, ..., x_{log n - 1}) → ... → (y_1, ..., y_{log n}) (sketched below).
- Filtering mechanism: of identical t-put requests, only one survives.
- Routing rule: only ρ log^2 n requests may pass through a node.
- O(log n) time, O(log^2 n) congestion.
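The hop sequence is easy to compute; a short Python sketch under the assumption that node labels are bit tuples of length log n:

```python
def de_bruijn_path(src, dst):
    """De Bruijn routing: each hop shifts one destination bit into the
    front of the current label, (x_1, ..., x_k) -> (y_k, x_1, ..., x_{k-1}),
    until the label equals dst."""
    cur, path = tuple(src), [tuple(src)]
    for bit in reversed(dst):        # inject y_k first, ..., y_1 last
        cur = (bit,) + cur[:-1]
        path.append(cur)
    return path

# log n = 4: routing from (0,1,1,0) to (1,0,0,1) takes log n hops.
hops = de_bruijn_path((0, 1, 1, 0), (1, 0, 0, 1))
assert hops[-1] == (1, 0, 0, 1) and len(hops) == 5
```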

30 Stage 3: Processing Get Requests
- First: in the t-store, using the t-get protocol.
  - de Bruijn routing with combining to look up data in the t-store.
  - O(log n) time and O(log^2 n) congestion.
- Second: requests that cannot be served in the t-store are served in the p-store using the p-get protocol.
  - Three stages: preprocessing, contraction, and expansion.
  - Filtering: almost identical to t-put.

31 Stage 3: Processing p-Get Requests, Preprocessing
- The p-get preprocessing determines the blocked areas via sampling (sketched below).
- Every non-blocked node v checks the state of α log n random nodes in T_i(v), for every 0 ≤ i ≤ log n.
- If at least 1/4 of the sampled nodes are blocked, v declares T_i(v) blocked.
- O(1) time, since the checks can be done in parallel; O(log^2 n) congestion.
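A small Python sketch of the sampling test; the concrete α, the region representation, and the blocked-node oracle are assumptions:

```python
import math
import random

def region_blocked(region_nodes, is_blocked, n, alpha=4):
    """Check alpha*log(n) random nodes of a region T_i(v) and declare
    the region blocked if at least a quarter of the samples are."""
    k = min(len(region_nodes), int(alpha * math.log2(n)))
    sample = random.sample(region_nodes, k)
    return sum(map(is_blocked, sample)) >= k / 4

# Example: a 64-node region in which roughly 40% of the nodes are blocked.
n = 1024
blocked = set(random.sample(range(64), 26))
print(region_blocked(list(range(64)), lambda v: v in blocked, n))
```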

32 Stage 3: Processing p-Get Requests, Contraction
- Each p-get(d) request issued by some node v starts at a random node and aims at reaching the node responsible for h_i(d), for each i ∈ {1, ..., c}, in at most ξ log n hops (sketched below).
- Stop: if T_ℓ(h_i(d)) is blocked or more than ξ log n hops have been made, index i is deactivated.
- O(log n) time, w.h.p.
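A heavily hedged Python sketch of this stage; the level-by-level walk, the halving step, and the level_blocked oracle are simplifying assumptions, not the paper's exact routing:

```python
import math
import random

def contract(target, n, level_blocked, xi=4):
    """Move from a random start position level by level toward `target`,
    halving the distance per hop; give up (deactivate this hash index)
    if an enclosing region is blocked or the xi*log(n) hop budget runs
    out. `level_blocked(l)` stands in for the preprocessing test."""
    pos, level = random.random(), int(math.log2(n))
    for _ in range(int(xi * math.log2(n))):
        if level_blocked(level):
            return None                    # T_level(h_i(d)) is blocked
        if level == 0:
            return target                  # reached the node owning h_i(d)
        pos, level = (pos + target) / 2, level - 1
    return None                            # hop budget exhausted

print(contract(0.3, 1024, lambda l: False))   # unblocked run reaches 0.3
```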

33 Stage 3: Processing p-Get Requests, Expansion
- The expansion looks for copies in successively wider areas.
- Every unfinished p-get(d) request sends (d, r, i, −) to a non-blocked node v that was successfully contacted before.
- v maintains a copy b of d in (d, r, i, b) and executes O(log n) rounds of:
  - sending (d, r, i, b) to a random node at the same level, and
  - replacing b with the most current copy of d, if any.
- O(log^2 n).

34 Stage 4: Transferring Items
- Transfers all items stored in the t-store to the p-store using the p-put protocol.
- Afterwards, the corresponding data items are removed from the t-store.
- The p-put protocol has three stages: preprocessing, contraction, and permanent storage.

35 Stage 4: Transferring Items, p-put Preprocessing
- The p-put preprocessing works like that of the p-get protocol.
- It determines the blocked areas and the average load in the p-store via sampling.
- O(1) time, O(log^2 n) congestion.

36 Stage 4: Transferring Items, p-put Contraction
- The p-put contraction is identical to the contraction stage of the p-get protocol.
- It tries to reach sufficiently many hash-based positions in the p-store.
- O(log n) time.

37 Stage 4: Transferring Items, p-put Permanent storage
1. Permanent storage: for each successful data item, store new copies and delete as many old ones as possible.
2. The node responsible for h_i(d) (d's root node) stores information about the nodes holding a copy of d.
3. This information is used to remove all out-of-date copies of d.

38 Stage 4: Transferring Items, p-put Permanent storage, cont.
4. If this is not possible (due to blocking), references to the out-of-date copies are left at the roots (to be deleted later on).
5. Select a random non-blocked node in each T_ℓ(h_i(d)) with ℓ ∈ {0, ..., log n} (sketched below).
6. Store an up-to-date copy of d at these nodes, and store references to these nodes at h_i(d).
7. O(log n) time. The number of copies of d remains O(log^2 n).
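A brief Python sketch of steps 5 and 6; the non_blocked_in helper that enumerates a region's non-blocked nodes is an assumed stand-in for the real lookup:

```python
import math
import random

def place_copies(n, non_blocked_in):
    """For each level l in {0, ..., log n}, pick one random non-blocked
    node of the region T_l(h_i(d)) and store an up-to-date copy there;
    d's root keeps references to these nodes so that out-of-date copies
    can be purged later."""
    refs = []
    for level in range(int(math.log2(n)) + 1):
        candidates = non_blocked_in(level)
        if candidates:                     # skip fully blocked regions
            refs.append((level, random.choice(candidates)))
    return refs                            # stored at d's root node h_i(d)

# Toy usage: level l contains nodes 0, ..., 2^l - 1 and none are blocked.
print(place_copies(1024, lambda l: list(range(2 ** l))))
```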

Conclusion

40 Main theorem
- Theorem: Under any ε-bounded past-insider attack (for some constant ε > 0), the Chameleon system can serve any set of requests (one per server) in O(log^2 n) time such that every get request for a data item inserted or updated after t_0 is served correctly, w.h.p.
- No degradation over time:
  - O(log^2 n) copies per data item.
  - Fair distribution of the data among the servers.

41 Summary
- The paper shows how to build a scalable dynamic information system that is robust against a past insider.
- It uses two distributed hash tables for data management: a temporary one and a permanent one, the t-store and the p-store, respectively.
- The authors define many constants (ξ, β, ρ, ...) but do not optimize them, e.g., the replication factors.
- As the authors themselves propose, it would be interesting to study whether the runtime of a phase can be reduced to O(log n).
- There is no experimental evaluation.

42 References
- Some of the slides are taken from the authors, with permission.
- Main references:
  1. B. Awerbuch and C. Scheideler. A Denial-of-Service Resistant DHT. DISC 2007.
  2. B. Awerbuch and C. Scheideler. Towards a Scalable and Robust DHT. SPAA 2006.
  3. D. Karger et al. Consistent Hashing and Random Trees: Distributed Caching Protocols for Relieving Hot Spots on the World Wide Web. STOC 1997.

Thanks for your attention. Any questions?