Design considerations for P2P
Stefan Saroiu, Krishna Gummadi, Steve Gribble
Department of Computer Science and Engineering
University of Washington

IPTPS 2002, Saroiu, Gummadi, Gribble

What this talk is about
• Peer-to-Peer systems have arrived
  – but are they here to stay?
  – many things work well, but many things are still missing
• Goals of this talk:
  – to argue that current P2P designs are missing crucial properties
  – to show how specific architectural decisions have led to the loss of these properties
  – to encourage this community to recapture these properties in their designs, so that our systems are practical and successful

The Good and the Bad
• P2P systems have achieved several goals
  – decentralized and symmetric infrastructure
  – evenly sharing resources and responsibilities among participants
• But some things are still missing
  – we’ve lost a few “ilities”:
    • securability: more vulnerable to malicious attack
    • composability: can’t transparently compose functionality into the system
    • predictability and controllability: cannot easily engineer the system to provision for performance, reliability, workload, or value

How to show why this has happened
• Start with a well-understood distributed system that is known to have all of the “ilities”
  – the WWW
• Compare its architecture to a representative P2P system
  – Chord/CFS [but we could use any of CAN, Pastry, Tapestry, …]
• Figure out the differences, and draw implications
  – show how the differences lead to the loss of the properties
• Convince you that we need to recapture these “ilities”

The Architecture of the Web

          | Name     | Address  | Routing         | Lookup                | Topology
WWW       | URL      | IP       | Address-based   | DNS                   | Physical, arbitrary

Contrasting the Web with P2P

          | Name     | Address  | Routing         | Lookup                | Topology
WWW       | URL      | IP       | Address-based   | DNS                   | Physical, arbitrary
Chord/CFS | Chord ID            | Chord ID-based, implicit, deterministic | Logical, deterministic, random

Collapsing name and address space
• A name is an address
  – the ability to create an address implies control over certain names
  – lose controllability: cannot separately grant authority over names and addresses
    • must engineer other (higher level) mechanisms
• The name of content dictates the node that must manage it
  – inserting content into the system forces others to do work on my behalf, and others can force me to do work
  – lose securability: denial of service attacks are possible
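
To make "the name of content dictates the node that must manage it" concrete, here is a minimal sketch of a Chord-style mapping. It assumes a small 16-bit ring, truncated SHA-1 hashing, and made-up node addresses and content names; none of these values come from the talk, and real deployments use much larger identifiers.

```python
import hashlib
from bisect import bisect_left

M = 16  # identifier bits (assumption, chosen small for readability)

def chord_id(text):
    """Hash a string onto the identifier ring (SHA-1, truncated to M bits)."""
    digest = hashlib.sha1(text.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** M)

def successor(node_ids, key):
    """Return the first node ID at or after key, wrapping around the ring."""
    ring = sorted(node_ids)
    return ring[bisect_left(ring, key) % len(ring)]

# Hypothetical participants: node IDs are hashes of their addresses.
nodes = [chord_id("10.0.0.%d" % i) for i in range(1, 9)]

# The content name alone decides which node must store and serve it.
key = chord_id("my-document.txt")
owner = successor(nodes, key)
print("key", key, "is managed by node", owner)
```

Neither the publisher nor the chosen node has any say in this placement: whoever inserts a name picks, through the hash, the node that does the work.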

Collapsing routing and lookup
• Name-based routing
  – lack of explicit lookup system removes a level of indirection
    • lose composability: harder to make replication transparent to the client
• Servers must now be routers: roles are not separable
  – cannot have different levels of trust
    • Web: core routers typically more trustworthy than web servers
    • P2P: lose controllability, cannot engineer trust according to role
  – failure of routing and content serving is intertwined
    • Web: failure of server doesn’t affect routing to other servers
    • P2P: lose controllability, isolation of failures is lost
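
The level of indirection that an explicit lookup step provides can be sketched as a toy DNS-like table. The class name, the round-robin policy, and the example name and addresses below are all illustrative assumptions, not a description of how DNS is implemented.

```python
class LookupService:
    """Toy lookup table: one name may map to several server addresses."""

    def __init__(self):
        self._records = {}  # name -> list of addresses
        self._next = {}     # name -> index of the next address to hand out

    def add_record(self, name, address):
        self._records.setdefault(name, []).append(address)
        self._next.setdefault(name, 0)

    def resolve(self, name):
        """Return one address for name, rotating through the replicas."""
        addresses = self._records[name]
        i = self._next[name]
        self._next[name] = (i + 1) % len(addresses)
        return addresses[i]

lookup = LookupService()
lookup.add_record("www.example.org", "192.0.2.1")

# The operator adds a replica; the client code below does not change.
lookup.add_record("www.example.org", "192.0.2.2")

for _ in range(2):
    print("client connects to", lookup.resolve("www.example.org"))
```

With name-based routing there is no such table to edit: the name itself determines the responsible node, so replication has to be built into the routing layer or exposed to the client.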

Topology is deterministic
• Chord IDs of participants dictate overlay topology
  – lose securability: possible to attack system by choosing content/server address
    • can hijack content by choosing appropriate address
    • can surround and “monitor” a node in system by choosing address
  – lose controllability: harder to do policy-based routing
    • set of routes available is dictated by nodes that participate
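
A minimal sketch of the hijacking case, assuming a successor-based ownership rule and an attacker who can freely choose the ID it joins with. The node IDs and the key are made-up small integers; in practice an attacker would need an address that maps to a suitable ID.

```python
from bisect import bisect_left

def successor(node_ids, key):
    """Return the first node ID at or after key, wrapping around the ring."""
    ring = sorted(node_ids)
    return ring[bisect_left(ring, key) % len(ring)]

honest_nodes = [100, 900, 2500, 4000]  # hypothetical node IDs
key = 1234                             # hypothetical ID of the targeted content

owner_before = successor(honest_nodes, key)

# The attacker joins with an ID equal to the key and becomes its successor.
attacker_id = key
owner_after = successor(honest_nodes + [attacker_id], key)

print("owner before attack:", owner_before)  # 2500
print("owner after attack: ", owner_after)   # 1234
```

Because ownership follows mechanically from IDs, no other node has to misbehave for the takeover to succeed.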

Topology is randomized
• Hashing spreads mapping of overlay to physical links
  – lose controllability and predictability: can’t predict the nodes or physical links involved with content
    • cannot do local provisioning for hotspots, high-value content
    • content provider must trust strangers to provide high quality of service
  – randomness amplifies bad local properties of system, but not the good ones
    • a 1000-node Chord overlay with 20% modems has >80% slow paths
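
The talk does not say how the slow-path figure was obtained. The sketch below is only a back-of-the-envelope model, assuming each node on a path is independently a modem with probability 0.2 and that a single modem makes the whole path slow. The function name and the path lengths tried are illustrative.

```python
def slow_path_fraction(nodes_on_path, slow_fraction):
    """Probability that at least one of the nodes on a path is slow,
    assuming nodes are slow independently of each other."""
    return 1.0 - (1.0 - slow_fraction) ** nodes_on_path

for k in (1, 5, 8, 10):
    print(k, "nodes on path:", round(slow_path_fraction(k, 0.2), 3))
# 1 nodes on path: 0.2
# 5 nodes on path: 0.672
# 8 nodes on path: 0.832
# 10 nodes on path: 0.893
```

Under these assumptions the slow fraction passes 80% once a path touches eight nodes, and log2(1000) is about 10, so Chord paths of that length are plausible in a 1000-node overlay. The point of the model is the amplification: a 20% local property becomes a much larger global one.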

Summary
• Architectural decisions have several implications
  – helped to simplify systems and achieve many good properties
  – but have also lost several crucial “ilities”
    • composability: transparently compose functionality into system
    • securability: prevent certain classes of attacks
      – e.g., denial of service by forcing others to do work
    • predictability: anticipate which nodes/links are involved with content
      – randomness diffuses local properties globally, and amplifies the bad
    • controllability: ability to engineer roles, resources, and responsibilities
      – e.g., don’t want to have to depend on others for quality of my content

Moving Forward
• To mature P2P systems into a practical application infrastructure, we need to recapture the “ilities”
  – ability to enforce: who publishes, who participates
  – ability to engineer according to: specific load, value of content
  – ability to delegate and engineer responsibilities by: trustworthiness, capability of contributing resources
• Otherwise, our systems will fail for reasons that are incidental to the current design goals
  – why? because these are the pragmatic engineering properties that the world needs for a system to succeed in practice