Comet: An Active Distributed Key-Value Store Roxana Geambasu Amit Levy Yoshi Kohno Arvind Krishnamurthy Hank Levy University of Washington.

Slides:



Advertisements
Similar presentations
NetServ Dynamic in-network service deployment Henning Schulzrinne (Columbia University) Srinivasan Seetharaman (Georgia Tech) Volker Hilt (Bell Labs)
Advertisements

P2P data retrieval DHT (Distributed Hash Tables) Partially based on Hellerstein’s presentation at VLDB2004.
Pastry Peter Druschel, Rice University Antony Rowstron, Microsoft Research UK Some slides are borrowed from the original presentation by the authors.
Case Study - Amazon. Amazon r Amazon has many Data Centers r Hundreds of services r Thousands of commodity machines r Millions of customers at peak times.
TAP: A Novel Tunneling Approach for Anonymity in Structured P2P Systems Yingwu Zhu and Yiming Hu University of Cincinnati.
PROVENANCE FOR THE CLOUD (USENIX CONFERENCE ON FILE AND STORAGE TECHNOLOGIES(FAST `10)) Kiran-Kumar Muniswamy-Reddy, Peter Macko, and Margo Seltzer Harvard.
Clayton Sullivan PEER-TO-PEER NETWORKS. INTRODUCTION What is a Peer-To-Peer Network A Peer Application Overlay Network Network Architecture and System.
Kademlia: A Peer-to-peer Information System Based on the XOR Metric Petar Mayamounkov David Mazières A few slides are taken from the authors’ original.
Dynamo: Amazon's Highly Available Key-value Store Distributed Storage Systems CS presented by: Hussam Abu-Libdeh.
Click to edit Master title style Defeating Vanish with Low-Cost Sybil Attacks Against Large DHTs Scott Wolchok 1 Owen S. Hofmann 2 Nadia Heninger 3 Edward.
Vanish: Increasing Data Privacy with Self-Destructing Data Roxana Geambasu Yoshi Kohno Amit Levy Hank Levy University of Washington.
S EMINAR A SELF DESTRUCTING DATA SYSTEM BASED ON ACTIVE STORAGE FRAMEWORK ONON P RESENTED BY S HANKAR G ADHVE G UIDED BY P ROF.P RAFUL P ARDHI.
Extensible Networking Platform IWAN 2005 Extensible Network Configuration and Communication Framework Todd Sproull and John Lockwood
Small-Scale Peer-to-Peer Publish/Subscribe
Outline for today Structured overlay as infrastructures Survey of design solutions Analysis of designs.
WWW’04 – Towards Context-Aware Adaptable Web Services1 Towards Context-Aware Adaptable Web Services Markus Keidl Universität Passau Fakultät für Mathematik.
Small-world Overlay P2P Network
The Most Dangerous Code in the Browser Stefan Heule, Devon Rifkin, Alejandro Russo, Deian Stefan Stanford University, Chalmers University of Technology.
FeedTree: Sharing Web Micronews with Peer-to-Peer Event Notification D. Sandler, A. Mislove, A. Post, P. Druschel Presented by: Andrew Sutton.
Applications over P2P Structured Overlays Antonino Virgillito.
A BitTorrent Module for the OMNeT++ Simulator MASCOTS 2009, London, UK G. Xylomenos (with K. Katsaros, V.P. Kemerlis and C. Stais)
P2P: Advanced Topics Filesystems over DHTs and P2P research Vyas Sekar.
A Routing Control Platform for Managing IP Networks Jennifer Rexford Princeton University
On the Feasibility of Large-Scale Infections of iOS Devices
Nikolay Tomitov Technical Trainer SoftAcad.bg.  What are Amazon Web services (AWS) ?  What’s cool when developing with AWS ?  Architecture of AWS 
Building a Strong Foundation for a Future Internet Jennifer Rexford ’91 Computer Science Department (and Electrical Engineering and the Center for IT Policy)
Distributed Publish/Subscribe Network Presented by: Yu-Ling Chang.
Replay Debugging for Distributed Systems Dennis Geels, Gautam Altekar, Ion Stoica, Scott Shenker.
Platform as a Service (PaaS)
Introduction to Cyberspace
Active Network Applications Tom Anderson University of Washington.
Evaluating Centralized, Hierarchical, and Networked Architectures for Rule Systems Benjamin Craig University of New Brunswick Faculty of Computer Science.
Application Layer – Peer-to-peer UIUC CS438: Communication Networks Summer 2014 Fred Douglas Slides: Fred, Kurose&Ross (sometimes edited)
RUNNING PARALLEL APPLICATIONS BEYOND EP WORKLOADS IN DISTRIBUTED COMPUTING ENVIRONMENTS Zholudev Yury.
Distributed Systems Concepts and Design Chapter 10: Peer-to-Peer Systems Bruce Hammer, Steve Wallis, Raymond Ho.
HomeViews: P2P Middleware for Personal Data Sharing Applications Roxana Geambasu, Magdalena Balazinska, Steve Gribble, Hank Levy University of Washington.
Institute of Computer and Communication Network Engineering OFC/NFOEC, 6-10 March 2011, Los Angeles, CA Lessons Learned From Implementing a Path Computation.
Rapid Development and Flexible Deployment of Adaptive Wireless Sensor Network Applications Chien-Liang Fok, Gruia-Catalin Roman, Chenyang Lu
Peer to Peer Research survey TingYang Chang. Intro. Of P2P Computers of the system was known as peers which sharing data files with each other. Build.
Ahmad Al-Shishtawy 1,2,Tareq Jamal Khan 1, and Vladimir Vlassov KTH Royal Institute of Technology, Stockholm, Sweden {ahmadas, tareqjk,
Rapid Development and Flexible Deployment of Adaptive Wireless Sensor Network Applications Chien-Liang Fok, Gruia-Catalin Roman, Chenyang Lu
Chord: A Scalable Peer-to-peer Lookup Protocol for Internet Applications Xiaozhou Li COS 461: Computer Networks (precept 04/06/12) Princeton University.
1 Distributed Hash Tables (DHTs) Lars Jørgen Lillehovde Jo Grimstad Bang Distributed Hash Tables (DHTs)
Vanish: Increasing Data Privacy with Self-Destructing Data Roxana Geambasu, Tadayoshi Kohno, Amit Levy, et al. University of Washington USENIX Security.
1 Introduction to Middleware. 2 Outline What is middleware? Purpose and origin Why use it? What Middleware does? Technical details Middleware services.
Cryptographic Security Secret Sharing, Vanishing Data 1Dennis Kafura – CS5204 – Operating Systems.
CEPH: A SCALABLE, HIGH-PERFORMANCE DISTRIBUTED FILE SYSTEM S. A. Weil, S. A. Brandt, E. L. Miller D. D. E. Long, C. Maltzahn U. C. Santa Cruz OSDI 2006.
Vanish: Increasing Data Privacy with Self-Destructing Data Roxana Geambasu Tadayoshi Kohno Amit A. Levy Henry M. Levy University of Washington.
Vanish: Increasing Data Privacy with Self-Destructing Data Roxana Geambasu | Tadayoshi Kohno | Amit A. Levy | Henry M. Levy Presented by: Libert Tapia.
Paper Survey of DHT Distributed Hash Table. Usages Directory service  Very little amount of information, such as URI, metadata, … Storage  Data, such.
Multimedia & Mobile Communications Lab.
Rendezvous Regions: A Scalable Architecture for Service Location and Data-Centric Storage in Large-Scale Wireless Sensor Networks Karim Seada, Ahmed Helmy.
Paper by: Roxana Geambasu, Tadayoshi Kohno, Amit A. Levy, Henry M. Levy University of Washington Vanish: Increasing Data Privacy with Self-Destructing.
1 Secure Peer-to-Peer File Sharing Frans Kaashoek, David Karger, Robert Morris, Ion Stoica, Hari Balakrishnan MIT Laboratory.
1 Object Oriented Logic Programming as an Agent Building Infrastructure Oct 12, 2002 Copyright © 2002, Paul Tarau Paul Tarau University of North Texas.
Protocol Requirements draft-bryan-p2psip-requirements-00.txt D. Bryan/SIPeerior-editor S. Baset/Columbia University M. Matuszewski/Nokia H. Sinnreich/Adobe.
P2PSIP Security Analysis and evaluation draft-song-p2psip-security-eval-00 Song Yongchao Ben Y. Zhao
1 VLDB - Data Management in Grids B. Del-Fabbro, D. Laiymani, J.M. Nicod and L. Philippe Laboratoire d’Informatique de l’Université de Franche-Comté Séoul,
A Security Framework with Trust Management for Sensor Networks Zhiying Yao, Daeyoung Kim, Insun Lee Information and Communication University (ICU) Kiyoung.
INTERNET TECHNOLOGIES Week 10 Peer to Peer Paradigm 1.
COMP7330/7336 Advanced Parallel and Distributed Computing MapReduce - Introduction Dr. Xiao Qin Auburn University
Evaluating and Optimizing Indexing Schemes for a Cloud-based Elastic Key- Value Store Apeksha Shetty and Gagan Agrawal Ohio State University David Chiu.
Platform as a Service (PaaS)
Platform as a Service (PaaS)
Platform as a Service (PaaS)
Providing Secure Storage on the Internet
Distributed Publish/Subscribe Network
Vanish: Increasing Data Privacy with Self-Destructing Data
Kademlia: A Peer-to-peer Information System Based on the XOR Metric
Presentation transcript:

Comet: An Active Distributed Key-Value Store Roxana Geambasu Amit Levy Yoshi Kohno Arvind Krishnamurthy Hank Levy University of Washington

Distributed Key/Value Stores A simple put / get interface Great properties: scalability, availability, reliability Increasingly popular both within data centers and in P2P 2 Data center P2P Dynamo amazon.com Voldemort LinkedIn Cassandra Facebook Vuze DHT Vuze uTorrent DHT uTorrent

Increasingly, key/value stores are shared by many apps  Avoids per-app storage system deployment However, building apps atop today’s stores is challenging Distributed Key/Value Storage Services 3 Data centerP2P Amazon S3 Altexa Photo Bucket Jungle Disk Vuze App One- Swarm Vanish Vuze DHT

Challenge: Inflexible Key/Value Stores Applications have different (even conflicting) needs:  Availability, security, performance, functionality But today’s key/value stores are one-size-fits-all Motivating example: our Vanish experience 4 App 1App 2App 3 Key/value store

Vanish is a self-destructing data system built on Vuze Vuze problems for Vanish:  Fixed 8-hour data timeout  Overly aggressive replication, which hurts security Changes were simple, but deploying them was difficult:  Need Vuze engineer  Long deployment cycle  Hard to evaluate before deployment Motivating Example: Vanish [USENIX Security ‘09] Vuze App Vanish Vuze DHT Vuze App Vanish Vuze DHT 5 VuzeVanish Vuze DHT VuzeVanish Vuze DHT VuzeVanish Vuze DHT VuzeVanish Vuze DHT Future app Vuze App Vanish Future app Vuze DHT Question: How can a key/value store support many applications with different needs?

Extensible Key/Value Stores Allow apps to customize store’s functions  Different data lifetimes  Different numbers of replicas  Different replication intervals Allow apps to define new functions  Tracking popularity: data item counts the number of reads  Access logging: data item logs readers’ IPs  Adapting to context: data item returns different values to different requestors 6

Design Philosophy We want an extensible key/value store But we want to keep it simple!  Allow apps to inject tiny code fragments (10s of lines of code)  Adding even a tiny amount of programmability into key/value stores can be extremely powerful This paper shows how to build extensible P2P DHTs  We leverage our DHT experience to drive our design 7

Outline Motivation Architecture Applications Conclusions 8

Comet DHT that supports application-specific customizations Applications store active objects instead of passive values  Active objects contain small code snippets that control their behavior in the DHT 9 App 1App 2App 3 Comet Active objectComet node

Comet’s Goals Flexibility  Support a wide variety of small, lightweight customizations Isolation and safety  Limited knowledge, resource consumption, communication Lightweight  Low overhead for hosting nodes 10

Active Storage Objects (ASOs) The ASO consists of data and code  The data is the value  The code is a set of handlers that are called on put / get 11 App 1App 2App 3 Comet ASO data code function onGet() […] end

Each replica keeps track of number of gets on an object The effect is powerful:  Difficult to track object popularity in today’s DHTs  Trivial to do so in Comet without DHT modifications Simple ASO Example 12 ASO data code aso.value = “Hello world!” aso.getCount = 0 function onGet() self.getCount = self.getCount + 1 return {self.value, self.getCount} end

Local Store Comet Architecture 13 Routing Substrate K1K1 ASO 1 ASO 2 K2K2 DHT Node Traditional DHT Comet Active Runtime External Interaction Handler Invocation Sandbox Policies ASO 1 data code ASO Extension API

The ASO Extension API ApplicationsCustomizations Vanish Replication Timeout One-time values Adeona Password access Access logging P2P File Sharing Smart tracker Recursive gets P2P Twitter Publish / subscribe Hierarchical pub/sub Measurement Node lifetimes Replica monitoring

The ASO Extension API Small yet powerful API for a wide variety of applications  We built over a dozen application customizations We have explicitly chosen not to support:  Sending arbitrary messages on the Internet  Doing I/O operations  Customizing routing … 15 Intercept accesses Periodic Tasks Host Interaction DHT Interaction onPut ( caller ) onTimer () getSystemTime () get (key, nodes) onGet ( caller ) getNodeIP () put (key, data, nodes) onUpdate ( caller ) getNodeID () lookup (key) getASOKey () deleteSelf ()

The ASO Sandbox Limit ASO’s knowledge and access  Use a standard language-based sandbox  Make the sandbox as small as possible (<5,000 LOC) Start with tiny Lua language and remove unneeded functions 2. Limit ASO’s resource consumption  Limit per-handler bytecode instructions and memory  Rate-limit incoming and outgoing ASO requests 3. Restrict ASO’s DHT interaction  Prevent traffic amplification and DDoS attacks  ASOs can talk only to their neighbors, no recursive requests

Comet Prototype We built Comet on top of Vuze and Lua  We deployed experimental nodes on PlanetLab In the future, we hope to deploy at a large scale  Vuze engineer is particularly interested in Comet for debugging and experimentation purposes 17

Outline Motivation Architecture Applications Conclusions 18

ApplicationsCustomizationLines of Code Vanish Security-enhanced replication 41 Flexible timeout 15 One-time values 15 Adeona Password-based access 11 Access logging 22 P2P File Sharing Smart Bittorrent tracker 43 Recursive gets* 9 Publish/subscribe 14 P2P Twitter Hierarchical pub/sub* 20 Measurement DHT-internal node lifetimes 41 Replica monitoring 21 Comet Applications 19 * Require signed ASOs (see paper)

Three Examples 1. Application-specific DHT customization 2. Context-aware storage object 3. Self-monitoring DHT 20

Example: customize the replication scheme We have implemented the Vanish-specific replication  Code is 41 lines in Lua 1. Application-Specific DHT Customization function aso:selectReplicas(neighbors) [...] end function aso:onTimer() neighbors = comet.lookup() replicas = self.selectReplicas(neighbors) comet.put(self, replicas) end 21

2. Context-Aware Storage Object Traditional distributed trackers return a randomized subset of the nodes Comet: a proximity-based distributed tracker  Peers put their IPs and Vivaldi coordinates at torrentID  On get, the ASO computes and returns the set of closest peers to the requestor ASO has 37 lines of Lua code 22

Proximity-Based Distributed Tracker 23 Comet tracker Random tracker

Example: monitor a remote node’s neighbors  Put a monitoring ASO that “pings” its neighbors periodically Useful for internal measurements of DHTs  Provides additional visibility over external measurement (e.g., NAT/firewall traversal) 3. Self-Monitoring DHT 24 aso.neighbors = {} function aso:onTimer() neighbors = comet.lookup() self.neighbors[comet.systemTime()] = neighbors end

Example Measurement: Vuze Node Lifetimes 25 Vuze Node Lifetime (hours) External measurement Comet Internal measurement

Outline Motivation Architecture Evaluation Conclusions 26

Conclusions Extensibility allows a shared storage system to support applications with different needs Comet is an extensible DHT that allows per-application customizations  Limited interfaces, language sandboxing, and resource and communication limits  Opens DHTs to a new set of stronger applications Extensibility is likely useful in data centers (e.g., S3):  Assured delete  Logging and forensics 27  Storage location awareness  Popularity