Secure Routing for Structured Peer-to-Peer Overlay Networks


Secure Routing for Structured Peer-to-Peer Overlay Networks
M. Castro, P. Druschel, A. Ganesh, A. Rowstron and D. S. Wallach
Proc. of the 5th USENIX Symposium on Operating Systems Design and Implementation, Boston, MA, Dec. 2002
Sai Rama Krishna Kona, Bhavik Mehta

Outline
- Background of P2P overlay networks
- Routing overlay model
- System model
- Secure routing
  - Secure nodeId assignment
  - Secure routing table maintenance
  - Secure message forwarding
- Conclusion

Structured P2P Overlays
- Provide a powerful platform for decentralized services: network storage, content distribution, and application-level multicast
- Examples: CAN, Chord, Pastry and Tapestry
[Figure labels: key's root, replica roots, nodeId]

Routing overlay model
An abstract model of a structured p2p routing overlay:
- nodeIds: uniform random identifiers assigned to nodes from a large id space
- keys: unique identifiers selected from the same id space that is used for assigning nodeIds
- root: each key is mapped to a unique live node, called the key's root
- routing table: maintained by each node
- neighbor set: maintained by each node; because nodeIds are random, it is a random sample of the nodes
- replica function: maps a key to a set of replica keys

Pastry - Node State
- NodeIds, keys: sequences of digits in base 2^b
- A node's routing table has 128/b rows and 2^b columns
- Each node maintains a neighbor set ("leaf set")
  - It contains the l nodes with nodeIds numerically closest to the present node's nodeId: l/2 with larger and l/2 with smaller nodeIds than the current node's id
  - l is constant for all nodes; a typical value is 8*log_{2^b} N
  - Ensures reliable message delivery
  - Used to store replicas

Routing table of a Pastry node with nodeId 65a1x, b=4. Digits are in base 16; x represents an arbitrary suffix:
0x 1x 2x 3x 4x 5x 7x 8x 9x ax bx cx dx ex fx
60x 61x 62x 63x 64x 66x 67x 68x 69x 6ax 6bx 6cx 6dx 6ex 6fx
650x 651x 652x 653x 654x 655x 656x 657x 658x 659x 65bx 65cx 65dx 65ex 65fx
65a0x 65a2x 65a3x 65a4x 65a5x 65a6x 65a7x 65a8x 65a9x 65aax 65abx 65acx 65adx 65aex 65afx

Message routing in Pastry
[Figure: Routing a message from node 65a1fc with key d46a1c. The dots depict live nodes in Pastry's circular namespace.]

Tapestry, Chord, CAN
Tapestry
- Neighboring nodes are not aware of each other
- Uses surrogate routing
- Expected number of routing hops: log_2 N
Chord
- Forwards messages only in the clockwise direction
- Expected number of routing hops: (1/2) log_2 N
CAN
- Routes messages in a d-dimensional space
- Routing table has O(d) entries and does not grow with the network size
- Average number of routing hops: (d/4) N^(1/d)
CAN and Chord: proximity routing is harder, but this protects from certain attacks

System model
Assumptions
- N: size of the overlay network
- f: fraction of faulty nodes, 0 <= f < 1
- Constrained-collusion Byzantine failure model for faults
  - c: bound on coalition size as a fraction of N, 1/N <= c <= f; the most damaging case is c = f
- Each node has a static IP address
- Network level and overlay level
- A message is delivered within time D with probability P_D (with no faulty nodes)

Secure Routing
- The routing primitives implemented by Pastry etc. are not suitable for developing secure applications
- Secure routing primitive: ensures that when a non-faulty node sends a message to a key k, the message reaches all non-faulty members of the set of replica roots R_k with very high probability
- Implementation requires:
  - Securely assigning nodeIds to nodes
  - Securely maintaining the routing tables
  - Securely forwarding messages

Secure nodeId assignment
Goal
- Ensure that an attacker cannot control the uniform random distribution of nodeIds
Attacks
- By carefully choosing nodeIds, attack a victim node's routing table
- Control access to target objects by choosing the nodeIds closest to all replica keys
- Obtain a large number of legitimate nodeIds (Sybil attack)
Solution: centralized, certified nodeIds
- A set of trusted certification authorities (CAs) assigns nodeIds and signs nodeId certificates
- The nodeId certificate binds a random nodeId to the node's public key and IP address
- Nodes with valid certificates can join the overlay network
- CAs are not involved in the overlay network

Secure nodeId assignment (cont'd)
Measures to counter Sybil attacks
- Charge a fee
- Identity-based CA
Decentralized
- Require a prospective node to solve a crypto puzzle to gain a nodeId. The cost of solving the puzzle must be acceptable to legitimate nodes but high enough to slow down attackers.
Simple approach using a crypto puzzle
- Each node generates a key pair (public key and private key) such that SHA-1(I, K) has its first p bits zero
  - I: initialization vector or MD5
  - K: public key
- The expected number of operations required to generate such a key pair is 2^p
- NodeId = SHA-1(I, K)
- Periodically invalidate nodeIds
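The crypto puzzle can be sketched in a few lines of Python. This is only an illustration under stated assumptions: random bytes stand in for a real public key, the initialization vector is simply prepended to the key before hashing, and the names `leading_zero_bits`, `solve_puzzle` and `verify` are hypothetical, not taken from the paper.

```python
import hashlib
import os


def leading_zero_bits(digest: bytes) -> int:
    """Count the leading zero bits of a byte string."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def solve_puzzle(iv: bytes, p: int, keygen=lambda: os.urandom(32)):
    """Draw candidate keys until SHA-1(iv || key) starts with p zero bits.

    Assumption: keygen() stands in for real public-key generation.
    Returns the key, the resulting nodeId (hex) and the number of attempts.
    """
    attempts = 0
    while True:
        attempts += 1
        key = keygen()
        digest = hashlib.sha1(iv + key).digest()
        if leading_zero_bits(digest) >= p:
            return key, digest.hex(), attempts


def verify(iv: bytes, key: bytes, p: int) -> bool:
    """Anyone can check a claimed solution with a single hash."""
    return leading_zero_bits(hashlib.sha1(iv + key).digest()) >= p


if __name__ == "__main__":
    key, node_id, attempts = solve_puzzle(b"epoch-1", p=12)
    print(node_id, attempts)  # attempts is about 2**12 on average
```

Solving costs about 2^p hash evaluations on average while verifying costs one, which is the asymmetry the slide relies on. Changing the initialization vector is one way to invalidate old nodeIds.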

Secure routing table maintenance
Goal
- Ensure that the fraction of faulty nodes that appear in the routing tables of correct nodes does not exceed f. (Damage control, anyone?)
Attacks
- Attackers fake proximity to increase the fraction of bad routing table entries
- A correct node p sends a probe to estimate the delay to a faulty node. An attacker intercepts the probe and has the faulty node closest to p reply to it.

OK, even after all this, some fraction f of the nodes will still be faulty. It is time for some damage control. They are already there. You don't know which ones they are, but you can make sure that their effect is minimal. Our goal now is to ensure that this fraction is not exceeded in the routing tables. Attackers have several ways to increase this fraction. Routing tables count on the honesty of other nodes. What happens if they are not honest? They can simply fake proximity to increase the number of bad routing table entries. A node sends a probe, which can be intercepted, and the faulty node closest to p replies. Let me show you in a diagram.

Good Old Honest Routing
[Figure labels: source node, closest node]
This is the good old honest routing. The source node wants to find out which node is closest to d46a1c, so it sends data and, as you can see, it is able to reach that node.

Honesty is not always the best policy!
[Figure labels: source node, closest node]
Now take the same diagram and assume there is a fake node on this path. It gets the probe and flat out lies, saying "me, me, me". The probe has been intercepted and there is a false closest node.

Secure Routing Table - Attacks
- Point to faulty or nonexistent nodes
- Lie about the next hop
- Supply incorrect routing updates while nodes join the overlay network, which causes the fraction of bad routing table entries to increase
- Probability that a routing table entry is faulty after an update: (1-f)*f + f*1 = 2f - f^2 > f (for 0 < f < 1)

There are other types of attacks too. Attackers can point to faulty or nonexistent nodes, lie about the next hop, and supply incorrect updates. In these systems, attackers can easily supply routing updates that always point to faulty nodes. This simple attack causes the fraction of bad routing table entries to increase toward one as the bad routing updates are propagated. More precisely, routing updates from correct nodes point to a faulty node with probability at least f, whereas this probability can be as high as one for routing updates from faulty nodes. Correct nodes receive updates from other correct nodes with probability at most 1-f and from faulty nodes with probability at least f. Therefore, the probability that a routing table entry is faulty after an update is greater than f. This effect cascades with each subsequent update, causing the fraction of faulty entries to tend towards one.
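The cascade can be illustrated with a small calculation. The one-step formula is the slide's; repeating it as a recurrence, where every update is received from a correct node with probability 1-f and from a faulty node with probability f, is a simplifying assumption made here, and `faulty_fraction_after_updates` is a hypothetical helper name.

```python
def faulty_fraction_after_updates(f: float, updates: int) -> list[float]:
    """Fraction of faulty routing table entries after each update round.

    Assumed model: an update from a correct node (probability 1 - f)
    carries the current faulty fraction, while an update from a faulty
    node (probability f) always points to a faulty node.
    """
    fractions = [f]  # before any malicious update, entries are faulty w.p. f
    for _ in range(updates):
        current = fractions[-1]
        fractions.append((1 - f) * current + f * 1.0)
    return fractions


if __name__ == "__main__":
    for step, value in enumerate(faulty_fraction_after_updates(0.2, 10)):
        print(step, round(value, 4))
```

Under this model the first step reproduces 2f - f^2, and the sequence keeps growing toward one, which matches the qualitative claim in the notes.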

Secure routing table maintenance
Solution: constrained routing table
- One routing table maintains network proximity information for efficient routing (as in Pastry and Tapestry)
- The other routing table constrains its entries as in Chord: each entry must be the closest nodeId to some point in the id space, which brings the probability of a faulty entry down to f
- In normal operation the first table is used for efficiency. If routing fails, use the second table.

Now what's the solution? Make sure that compromised nodes cannot fake proximity. How do we do that? Make sure that there are some constraints. If attackers cannot choose the nodeIds they control (thanks to secure nodeId assignment), the probability that an attacker controls the nodeId closest to a point in the id space is f. This will not stop the attack, but the probability goes back to f. So for normal operation use the locality-aware table, and otherwise use the constrained routing table.

Secure Routing tables
Use two routing tables: Pastry + Chord
- First: normal locality-aware Pastry routing table
  - Slot (l,d): shares the first l digits with the node's own nodeId and has value d in digit l+1
- Second: constrained Pastry routing table
  - Slot (l,d): closest nodeId to a point p
  - p: shares the first l digits, has value d in digit l+1, and has the same remaining digits as the node's own nodeId
- The first is efficient, the second is for backup

So the routing tables we use are a combination of Chord and Pastry. The first table is locality-aware: for slot (l,d) you share the first l digits and have value d in digit l+1. For the other, the point also has the same remaining digits as the node's own nodeId after digit l+1.

Secure routing table maintenance (cont'd)
[Figure: Constrained routing table of a Pastry node with nodeId 65a1x, b=4. Digits are in base 16; x represents an arbitrary suffix. Labeled entries: 64a1x, 6501x.]

We modified Pastry to use this solution. We use the normal locality-aware Pastry routing table and an additional constrained Pastry routing table. In the locality-aware routing table of a node with identifier i, the slot at level l and domain d can contain any nodeId that shares the first l digits with i and has the value d in the l+1st digit. In the constrained routing table, the entry is further constrained to point to the closest nodeId to a point p in the domain. We define p as follows: it shares the first l digits with i, it has the value d in the l+1st digit, and it has the same remaining digits as i.
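The definition of the point p can be made concrete with a short sketch. It assumes nodeIds are written as fixed-length hexadecimal strings (b=4), and the function names `constrained_point` and `closest_node` are hypothetical. For brevity, `closest_node` measures plain numerical distance and ignores wraparound in the circular id space.

```python
def constrained_point(node_id: str, level: int, digit: str) -> str:
    """Point p for slot (level, digit) of the constrained routing table.

    p shares the first `level` digits with node_id, has `digit` in the
    next position, and keeps the remaining digits of node_id.
    """
    if not 0 <= level < len(node_id):
        raise ValueError("level must index a digit of node_id")
    if len(digit) != 1:
        raise ValueError("digit must be a single hexadecimal character")
    return node_id[:level] + digit + node_id[level + 1:]


def closest_node(point: str, live_ids: list[str]) -> str:
    """Live nodeId numerically closest to point (wraparound ignored)."""
    target = int(point, 16)
    return min(live_ids, key=lambda nid: abs(int(nid, 16) - target))


if __name__ == "__main__":
    me = "65a1fc"
    print(constrained_point(me, 1, "4"))  # 64a1fc
    print(constrained_point(me, 2, "0"))  # 6501fc
```

For the node 65a1x in the figure, slot (1, 4) yields a point of the form 64a1x and slot (2, 0) yields 6501x. An attacker can only occupy such a slot by holding the nodeId closest to that point, which it cannot choose if nodeId assignment is secure.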

Secure routing table initialization
Bootstrap nodes
- Use a diverse set of bootstrap nodes
- The set must be big enough to ensure that at least one is correct
Procedure
- Pick a set of bootstrap nodes and ask them to route using the new node's nodeId as the key
- Non-faulty bootstrap nodes use secure forwarding techniques
- Collect the proposed neighbor sets from each of the bootstrap nodes and pick the "closest" nodeIds as the neighbor set
- For each slot of the locality-aware routing table, pick the candidate entry with minimal delay
- Initialize each entry of the constrained routing table to the live nodeId closest to the desired point p in the id space (secure forwarding)
Alternative way to initialize the constrained routing table
- Using secure forwarding to get the live nodeId for each point p in n's constrained routing table is too expensive
- Instead, n requests the constrained routing tables of the nodes in its neighbor set
- Side effect: the neighbors also learn about the new arrival

How do you initialize the tables? You pick bootstrap nodes and ask them to route using the nodeId as the key. If you pick only one and it is faulty, you are in trouble. The difference is that there are several routes; n picks the entry with minimal network delay from the set of candidates it receives for each routing table slot. For the constrained routing table, secure forwarding is too expensive, so n requests the tables of its neighbor set. The neighbors also learn about your arrival.

Secure message forwarding (1)
Goal: ensure that at least one copy of a message sent to a key reaches each correct replica root for the key with high probability.
Attacks: faulty nodes can
- drop the message
- route the message to the wrong place
- pretend to be the key's root
The root node itself may be faulty.
The probability of routing successfully to a correct replica root is (1-f)^h, where h is the average number of routing hops. [Figure parameter: b = 4]

Even though you assigned secure nodeIds, the routing table still has a fraction f of faulty entries. A message is sent, and it can still be dropped.
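A quick calculation shows how fast this probability drops. The sketch assumes h equals Pastry's expected hop count log_{2^b} N with b = 4; the function names are hypothetical.

```python
import math


def expected_hops(n: int, b: int = 4) -> float:
    """Assumed hop count: log base 2^b of the overlay size N."""
    return math.log(n, 2 ** b)


def route_success_probability(f: float, n: int, b: int = 4) -> float:
    """Probability (1 - f)^h that no hop on the route is faulty."""
    return (1 - f) ** expected_hops(n, b)


if __name__ == "__main__":
    for f in (0.1, 0.2, 0.3):
        print(f, round(route_success_probability(f, n=100_000), 3))
```

With N = 16^4 nodes and f = 0.2, for example, h is 4 and a single route succeeds with probability 0.8^4, roughly 0.41, which is why plain forwarding is not enough.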

Secure message forwarding (2)
Solution: detect faults and use redundant routes
- Route a message to the key's root using the locality-aware routing table
- Collect the prospective set of replica roots from the prospective root node
- Apply a failure test to the set of replica roots
- If the test is negative, accept the prospective replica roots as the correct ones
- Otherwise, send message copies over diverse routes toward the various replica roots

Now what do you do? You take the optimistic scenario. You route a message to the key's root using the locality-aware routing table. You collect the prospective set of replica roots from that node. You then apply a failure test to figure out whether they are correct or compromised. If it is negative, accept them; otherwise send message copies over diverse routes toward the various replica roots.

Secure message forwarding (3)
Routing failure test (based on the observation that the average density of nodeIds per unit of "volume" in the id space is greater than the average density of faulty nodeIds)
- Input: a key x and a set of prospective replica roots for x: rn = id_0, ..., id_{l+1}
- Output: negative or positive
- p calculates the average numerical distance μ_p between consecutive nodeIds in its own neighbor set
- p checks that:
  - all nodeIds in rn have a valid nodeId certificate, the closest nodeId to the key is the middle one, and the nodeIds satisfy the definition of a neighbor set
  - the average numerical distance μ_rn between consecutive nodeIds in rn satisfies μ_rn < μ_p * γ

I talked about the routing failure test. What is it? It is based on the observation that the average density of nodeIds per unit of volume in the id space is greater than the average density of faulty nodeIds. Since nodeIds are assumed to be uniformly distributed, if faulty nodes try to suppress the existence of some correct nodes, the density of nodeIds in that part of the id space will be much lower than the average. The test works by comparing the density of nodeIds in the neighbor set of the sender with the density of nodeIds close to the replica roots of the destination key. If the density is suspiciously low, the secure routing is repeated with a different set of routes. Specifically, let d be the average distance between consecutive nodeIds around the responding node and a the same average in the local neighbor set. We accept the responding node only if d < aγ, where γ is a parameter chosen to minimize false positives and false negatives.
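The density comparison at the heart of the test can be sketched as follows. This is a simplified illustration: nodeIds are plain integers, the certificate and neighbor-set checks are omitted, wraparound in the circular id space is ignored, and the value of γ used in the example is arbitrary. The function names are hypothetical.

```python
def mean_spacing(node_ids: list[int]) -> float:
    """Average numerical distance between consecutive nodeIds."""
    ordered = sorted(node_ids)
    if len(ordered) < 2:
        raise ValueError("need at least two nodeIds")
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    return sum(gaps) / len(gaps)


def routing_failure_test(own_neighbor_ids: list[int],
                         prospective_root_ids: list[int],
                         gamma: float) -> bool:
    """Return True (positive) when the prospective set looks too sparse.

    The test is negative (False) only if the average spacing in the
    prospective replica root set is below gamma times the average
    spacing in the sender's own neighbor set.
    """
    local = mean_spacing(own_neighbor_ids)
    remote = mean_spacing(prospective_root_ids)
    return not (remote < local * gamma)


if __name__ == "__main__":
    own = [100, 110, 120, 130, 140]
    honest = [500, 512, 521, 530, 541]
    sparse = [500, 540, 580, 620, 660]  # e.g. only an attacker's few nodes
    print(routing_failure_test(own, honest, gamma=1.5))
    print(routing_failure_test(own, sparse, gamma=1.5))
```

A set made up only of an attacker's nodes is sparser than the honest population, so its average spacing exceeds the threshold and the test comes back positive.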

Secure message forwarding (4)
Other attacks
- Collect old nodeId certificates
- Include both nodeIds of nodes the attacker controls and nodeIds of correct nodes in a prospective root neighbor set
- NodeId suppression attack
  - Suppress nodeIds close to the sender, which increases β (false negatives)
  - Suppress nodeIds in the root neighbor set, which increases α (false positives)

This test does not always work. An attacker can collect old nodeId certificates and include both kinds of nodeIds. Suppressing nodeIds close to the sender increases beta, giving false negatives. Suppressing nodeIds in the root neighbor set increases alpha, giving false positives.

Redundant Routing
- When the failure test is positive, send the message to each replica root via multiple routes
- In Pastry, the message is sent from the source node to all of its neighbors in the p2p overlay
- Because nodeIds are random, the neighbors should represent a random, geographically diverse sampling of the nodes in the p2p overlay
- From there, each neighbor node forwards the message toward the target node
- If at least one of the neighbors achieves a successful route, the message is considered successfully delivered

This concept is called redundant routing. When the failure test is positive, send the message to each replica root via multiple routes. If one of them gets it, you are happy.

Redundant route
Neighbor set anycast:
1) p sends r messages, each carrying a nonce, to the destination key x.
2) Any correct node that receives the message and has x's root in its neighbor set returns its nodeId certificate and the nonce, signed with its private key.
3) p collects in a set N the l/2+1 nodeId certificates closest to x on the left and the l/2+1 nodeId certificates closest to x on the right, and marks them pending.
4) After a timeout, or once r replies have been received, p sends a list of the nodeIds in N to each node in N and marks those nodes done.
5) Any correct node that receives the list forwards p's original message to the nodes in its neighbor set that are not in the list, or returns a confirmation if no such nodes exist.
6) Once p has received r confirmations, or step 4 has been executed three times, it computes the set of replica roots for x from N.

This is just one more method. A node with x's root in its neighbor set signs the nonce and sends it back. After a timeout, p sends a list of the nodeIds in N to everyone in N. Anyone that receives it forwards the message to everybody else it knows about. p receives the confirmations and it is happy.

Simulation results Model and simulation results for the probability of reaching all correct replica roots using redundant routing with neighbor set anycast.

Secure Routing
- Prevents attacks at join time: secure nodeId assignment and bootstrapping
- Ensures that when a correct node sends a message for a particular key, the message reaches all correct replica roots for the key with very high probability
- For data, we need other mechanisms, for example self-certifying data

Now we have used secure routing to ensure that all the nodeIds are proper and that the right keys get the right messages. For data, use self-certifying data to avoid the overhead.

Self-Certifying Data
- The client can check the data and only needs to rely on secure routing when the certification check fails
- Reduces the reliance on the redundant, secure routing primitive (you still need secure forwarding, otherwise there is no data to verify in the first place)
- Uses concepts like proactive signature sharing or group keys/signatures
- Self-certifying data can eliminate the overhead of secure routing in common cases

You all know how this works.

Related Work
- Dingledine's and Douceur's work addresses spoofing attacks. The goal is to prevent malicious nodes using reputation or micro-cash.
- Bellovin works with Gnutella and Napster
- Sit and Morris discuss security attacks, covering node lookups, routing table maintenance, network partitioning, denial-of-service attacks, file storage, etc.

The first one, I have seen it a lot.

Summary
1. NodeId assignment
   Attacks: control a victim's access to the overlay, control access to objects, Sybil attacks
   Solutions: nodeId certificates assigned by trusted CAs, crypto puzzles
2. Secure routing tables
   Attacks: fake proximity, send false updates, lie about the next hop, etc.
   Solutions: two tables (constrained and locality-aware); use diverse nodes for initialization
3. Message forwarding
   Attacks: drop the message, route it to the wrong place, root node itself is faulty, etc.
   Solutions: apply the failure test, redundant routing over diverse routes, self-certifying data

Even if you have been sleeping for the entire lecture, this will help you out.

Conclusion
- Presented the design and analysis of techniques for secure node joining, routing table maintenance and message forwarding in p2p overlays
- Based on modeling and corroborated by simulations, the authors found that secure routing can succeed with 99.9% probability as long as f <= 30%

Questions?
