Intel Research Petros Maniatis, IRB Declarative Overlays Petros Maniatis joint work with Tyson Condie, David Gay, Joseph M. Hellerstein, Boon Thau Loo,

Slides:



Advertisements
Similar presentations
P2: Implementing Declarative Overlays
Advertisements

Network Layer Delivery Forwarding and Routing
Zhongxing Telecom Pakistan (Pvt.) Ltd
Distributed Systems Architectures
Chapter 7 System Models.
© 2008 Pearson Addison Wesley. All rights reserved Chapter Seven Costs.
Copyright © 2003 Pearson Education, Inc. Slide 1 Computer Systems Organization & Architecture Chapters 8-12 John D. Carpinelli.
Cognitive Radio Communications and Networks: Principles and Practice By A. M. Wyglinski, M. Nekovee, Y. T. Hou (Elsevier, December 2009) 1 Chapter 12 Cross-Layer.
Copyright © 2011, Elsevier Inc. All rights reserved. Chapter 6 Author: Julia Richards and R. Scott Hawley.
1 Copyright © 2013 Elsevier Inc. All rights reserved. Chapter 3 CPUs.
Properties Use, share, or modify this drill on mathematic properties. There is too much material for a single class, so you’ll have to select for your.
OSPF 1.
Declarative Networking: Language, Execution and Optimization Boon Thau Loo 1, Tyson Condie 1, Minos Garofalakis 2, David E. Gay 2, Joseph M. Hellerstein.
Implementing Declarative Overlays Timothy Roscoe Joint work with Boon Thau Loo, Tyson Condie, Joseph M. Hellerstein, Petros Maniatis, Ion Stoica Intel.
Intel Research Timothy Roscoe P2: Implementing Declarative Overlays Timothy Roscoe Boon Thau Loo, Tyson Condie, Petros Maniatis, Ion Stoica, David Gay,
Implementing Declarative Overlays Boon Thau Loo 1 Tyson Condie 1, Joseph M. Hellerstein 1,2, Petros Maniatis 2, Timothy Roscoe 2, Ion Stoica 1 1 University.
Declarative Networking Mothy Joint work with Boon Thau Loo, Tyson Condie, Joseph M. Hellerstein, Petros Maniatis, Ion Stoica Intel Research and U.C. Berkeley.
UNITED NATIONS Shipment Details Report – January 2006.
RXQ Customer Enrollment Using a Registration Agent (RA) Process Flow Diagram (Move-In) Customer Supplier Customer authorizes Enrollment ( )
1 Hyades Command Routing Message flow and data translation.
Correctness of Gossip-Based Membership under Message Loss Maxim GurevichIdit Keidar Technion.
Communicating over the Network
1.
Internet Indirection Infrastructure (i3 ) Ion Stoica, Daniel Adkins, Shelley Zhuang, Scott Shenker, Sonesh Surana UC Berkeley SIGCOMM 2002 Presented by:
Week 2 The Object-Oriented Approach to Requirements
Chapter 11: Models of Computation
Turing Machines.
Table 12.1: Cash Flows to a Cash and Carry Trading Strategy.
Local Area Networks - Internetworking
The Platform as a Service Model for Networking Eric Keller, Jennifer Rexford Princeton University INM/WREN 2010.
Chapter 17 Linked Lists.
Abstract Data Types and Algorithms
Data Structures Using C++
TCP/IP Protocol Suite 1 Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display. Chapter 2 The OSI Model and the TCP/IP.
Microsoft Access.
EIS Bridge Tool and Staging Tables September 1, 2009 Instructor: Way Poteat Slide: 1.
Chapter 10: Virtual Memory
Developing a Project Plan CHAPTER SIX Copyright © 2011 by The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill/Irwin.
Chapter 6: Developing a Project Plan
Project Management 6e..
1 University of Utah – School of Computing Computer Science 1021 "Thinking Like a Computer"
Developing the Project Plan
CS 6143 COMPUTER ARCHITECTURE II SPRING 2014 ACM Principles and Practice of Parallel Programming, PPoPP, 2006 Panel Presentations Parallel Processing is.
IP Multicast Information management 2 Groep T Leuven – Information department 2/14 Agenda •Why IP Multicast ? •Multicast fundamentals •Intradomain.
Benchmark Series Microsoft Excel 2013 Level 2
 Copyright I/O International, 2013 Visit us at: A Feature Within from Item Class User Friendly Maintenance  Copyright.
Basel-ICU-Journal Challenge18/20/ Basel-ICU-Journal Challenge8/20/2014.
1..
CONTROL VISION Set-up. Step 1 Step 2 Step 3 Step 5 Step 4.
Executional Architecture
The Design and Implementation of Declarative Networks Boon Thau Loo University of Pennsylvania, University of California-Berkeley * *This dissertation.
Chapter 10: The Traditional Approach to Design
Systems Analysis and Design in a Changing World, Fifth Edition
Chapter 12 Working with Forms Principles of Web Design, 4 th Edition.
Essential Cell Biology
© Ericsson Interception Management Systems, 2000 CELLNET Drop Administering IMS Database Module Objectives To add a network elements to the database.
PSSA Preparation.
Essential Cell Biology
©Ian Sommerville 2004Software Engineering, 7th edition. Chapter 13 Slide 1 Application architectures.
Topic 16 Sorting Using ADTs to Implement Sorting Algorithms.
Energy Generation in Mitochondria and Chlorplasts
© Paradigm Publishing, Inc Access 2010 Level 2 Unit 2Advanced Reports, Access Tools, and Customizing Access Chapter 8Integrating Access Data.
Techniques for proving programs with pointers A. Tikhomirov.
User Defined Functions Lesson 1 CS1313 Fall User Defined Functions 1 Outline 1.User Defined Functions 1 Outline 2.Standard Library Not Enough #1.
1 Decidability continued…. 2 Theorem: For a recursively enumerable language it is undecidable to determine whether is finite Proof: We will reduce the.
Implementing declarative overlays Boom Thau Loo Tyson Condie Joseph M. Hellerstein Petros Maniatis Timothy Roscoe Ion Stoica.
Implementing Declarative Overlays From two talks by: Boon Thau Loo 1 Tyson Condie 1, Joseph M. Hellerstein 1,2, Petros Maniatis 2, Timothy Roscoe 2, Ion.
Building diagnosable distributed systems Petros Maniatis Intel Research Berkeley ICSI – Security Crystal Ball.
Using Queries for Distributed Monitoring and Forensics Atul Singh Rice University Peter Druschel Max Planck Institute for Software Systems Timothy Roscoe.
Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications.
Presentation transcript:

Intel Research Petros Maniatis, IRB Declarative Overlays Petros Maniatis joint work with Tyson Condie, David Gay, Joseph M. Hellerstein, Boon Thau Loo, Raghu Ramakrishnan, Sean Rhea, Timothy Roscoe, Atul Singh, Ion Stoica IRB, UC Berkeley, U Wisconsin, Rice

Petros Maniatis, IRB Intel Research 2 Overlays Everywhere… Overlay: the routing and message forwarding component of any self-organizing distributed system Internet Overlay

Petros Maniatis, IRB Intel Research 3 Overlays Everywhere… Many examples: Many examples: Internet Routing, multicast Internet Routing, multicast Content delivery, file sharing, DHTs, Google Content delivery, file sharing, DHTs, Google Microsoft Exchange (for load balancing) Microsoft Exchange (for load balancing) Tibco (technology bridging) Tibco (technology bridging) Overlays are a fundamental tool for repurposing communication infrastructures Overlays are a fundamental tool for repurposing communication infrastructures Get a bunch of friends together and build your own ISP (Internet evolvability) Get a bunch of friends together and build your own ISP (Internet evolvability) You dont like Internet Routing? Make up your own rules (RON) You dont like Internet Routing? Make up your own rules (RON) Paranoid? Run a VPN in the wide area Paranoid? Run a VPN in the wide area Intrusion detection with friends (FTN, Polygraph) Intrusion detection with friends (FTN, Polygraph) Have your assets discover each other (iAMT) Have your assets discover each other (iAMT) Internet Overlay Distributed systems innovation needs overlays

Petros Maniatis, IRB Intel Research 4 If only it werent so hard In theory In theory Figure out right properties Figure out right properties Get the algorithms and protocols Get the algorithms and protocols Implement them Implement them Tune them Tune them Test them Test them Debug them Debug them Repeat Repeat But in practice But in practice No global view No global view Wrong choice of algorithms Wrong choice of algorithms Incorrect implementation Incorrect implementation Psychotic timeouts Psychotic timeouts Partial failures Partial failures Impaired introspection Impaired introspection Homicidal boredom Homicidal boredom Its hard enough as it is Do I also need to reinvent the wheel every time?

Petros Maniatis, IRB Intel Research 5 Our Goal Make overlay development more accessible to developers of distributed applications Make overlay development more accessible to developers of distributed applications Specify overlay at a high-level Specify overlay at a high-level Automatically translate specification into executable Automatically translate specification into executable Hide everything they dont want to touch Hide everything they dont want to touch Enjoy performance that is good enough Enjoy performance that is good enough Do for networked systems what the relational revolution did for databases Do for networked systems what the relational revolution did for databases

Petros Maniatis, IRB Intel Research 6 Enter P2: Semantics Distributed state Distributed state Distributed soft state in relational tables, holding tuples of values Distributed soft state in relational tables, holding tuples of values route (S, D, H) route (S, D, H) Non-stored information passes around as event tuple streams Non-stored information passes around as event tuple streams message (X, D) message (X, D) Overlay specification in declarative logic language (OverLog) Overlay specification in declarative logic language (OverLog) :-,, …,. :-,, …,. Location place individual tuples at specific nodes Location place individual tuples at specific nodes D) :- D, H), D). D) :- D, H), D). (a, y, c) (a, z, r) (a, z, t) z) z) z)

Petros Maniatis, IRB Intel Research 7 Enter P2: Dataflow Specification automatically translated to a dataflow graph Specification automatically translated to a dataflow graph C++ dataflow elements (akin to Click elements) C++ dataflow elements (akin to Click elements) Implement Implement relational operators (joins, selections, projections) relational operators (joins, selections, projections) flow operators (multiplexers, demultiplexers, queues) flow operators (multiplexers, demultiplexers, queues) network operators (congestion control, retry, rate limitation) network operators (congestion control, retry, rate limitation) Interlinked via asynchronous push or pull typed flows Interlinked via asynchronous push or pull typed flows Pull carries a callback from the puller in case it fails Pull carries a callback from the puller in case it fails Push always succeeds, but halts subsequent pushes Push always succeeds, but halts subsequent pushes Execution engine runs the dataflow graph Execution engine runs the dataflow graph Simple FIFO event scheduler (a la libasync) for I/O, alarms, deferred execution, etc. Simple FIFO event scheduler (a la libasync) for I/O, alarms, deferred execution, etc. demux A distributed query processor to maintain overlays

Petros Maniatis, IRB Intel Research 8 Example: Ring Routing Every node has an address (e.g., IP address) and an identifier (large random) Every node has an address (e.g., IP address) and an identifier (large random) Every object has an identifier Every object has an identifier Order nodes and objects into a ring by their identifiers Order nodes and objects into a ring by their identifiers Objects served by their successor node Objects served by their successor node Every node knows its successor on the ring Every node knows its successor on the ring To find object K, walk around the ring until I locate Ks immediate successor node To find object K, walk around the ring until I locate Ks immediate successor node

Petros Maniatis, IRB Intel Research 9 Example: Ring Routing How do I find the responsible node for a given key k? n.lookup(k) if k in (n, n.successor) return n.successor else return n.successor. lookup(k)

Petros Maniatis, IRB Intel Research 10 Ring State n.lookup(k) if k in (n, n.successor) return n.successor else return n.successor. lookup(k) Node state tuples node(NAddr, N) successor(NAddr, Succ, SAddr) Transient event tuples lookup (NAddr, Req, K)

Petros Maniatis, IRB Intel Research 11 Pseudocode to OverLog n.lookup(k) if k in (n, n.successor] return n.successor else return n.successor. lookup(k) Node state tuples node(NAddr, N) successor(NAddr, Succ, SAddr) Transient event tuples lookup (NAddr, Req, K) R1 response (Req, K, SAddr) :- lookup (NAddr, Req, K), node (NAddr, N), succ (NAddr, Succ, SAddr), K in (N, Succ].

Petros Maniatis, IRB Intel Research 12 Pseudocode to OverLog n.lookup(k) if k in (n, n.successor] return n.successor else return n.successor. lookup(k) Node state tuples node(NAddr, N) successor(NAddr, Succ, SAddr) Transient event tuples lookup (NAddr, Req, K) R1 response (Req, K, SAddr) :- lookup (NAddr, Req, K), node (NAddr, N), succ (NAddr, Succ, SAddr), K in (N, Succ]. R2 lookup (SAddr, Req, K) :- lookup (NAddr, Req, K), node (NAddr, N), succ (NAddr, Succ, SAddr), K not in (N, Succ].

Petros Maniatis, IRB Intel Research 13 Location Specifiers n.lookup(k) if k in (n, n.successor] return n.successor else return n.successor. lookup(k) Node state tuples node(NAddr, N) successor(NAddr, Succ, SAddr) Transient event tuples lookup (NAddr, Req, K) R1 K, SAddr) :- Req, K), N), Succ, SAddr), K in (N, Succ]. R2 Req, K) :- Req, K), N), Succ, SAddr), K not in (N, Succ].

Petros Maniatis, IRB Intel Research 14 From OverLog to Dataflow R1 K, SI) : - R, K), N), S, SI), K in (N, S]. R2 R, K) :- R, K), N), S, SI), K not in (N, S].

Petros Maniatis, IRB Intel Research 15 From OverLog to Dataflow R1 K, SI) : - R, K), N), S, SI), K in (N, S]. R2 R, K) :- R, K), N), S, SI), K not in (N, S].

Petros Maniatis, IRB Intel Research 16 From OverLog to Dataflow R1 K, SI) : - R, K), N), S, SI), K in (N, S]. R2 R, K) :- R, K), N), S, SI), K not in (N, S].

Petros Maniatis, IRB Intel Research 17 From OverLog to Dataflow R1 K, SI) : - R, K), N), S, SI), K in (N, S]. R2 R, K) :- R, K), N), S, SI), K not in (N, S].

Petros Maniatis, IRB Intel Research 18 From OverLog to Dataflow R1 K, SI) : - R, K), N), S, SI), K in (N, S]. R2 R, K) :- R, K), N), S, SI), K not in (N, S].

Petros Maniatis, IRB Intel Research 19 From OverLog to Dataflow R1 K, SI) : - R, K), N), S, SI), K in (N, S]. R2 R, K) :- R, K), N), S, SI), K not in (N, S].

Petros Maniatis, IRB Intel Research 20 From OverLog to Dataflow R1 K, SI) : - R, K), N), S, SI), K in (N, S]. R2 R, K) :- R, K), N), S, SI), K not in (N, S].

Petros Maniatis, IRB Intel Research 21 From OverLog to Dataflow R1 K, SI) : - R, K), N), S, SI), K in (N, S]. R2 R, K) :- R, K), N), S, SI), K not in (N, S].

Petros Maniatis, IRB Intel Research 22 From OverLog to Dataflow One rule strand per OverLog rule Rule order is immaterial Rule strands could execute in parallel

Petros Maniatis, IRB Intel Research 23 Transport and App Logic

Petros Maniatis, IRB Intel Research 24 A Bit of Chord

Petros Maniatis, IRB Intel Research 25 Chord on P2 Full specification of ToN Chord Full specification of ToN Chord Multiple successors Multiple successors Stabilization Stabilization Failure recovery Failure recovery Optimized finger maintenance Optimized finger maintenance 46 OverLog rules 46 OverLog rules (1 USletter page, 10pt font) (1 USletter page, 10pt font) How do we know it works? How do we know it works? Same high-level properties Same high-level properties Logarithmic overlay diameter Logarithmic overlay diameter Logarithmic state size Logarithmic state size Consistent routing with churn Consistent routing with churn Comparable performance to hand- coded implementations Comparable performance to hand- coded implementations

Petros Maniatis, IRB Intel Research 26 Lookup length in hops (no churn)

Petros Maniatis, IRB Intel Research 27 Maintenance bandwidth (no churn)

Petros Maniatis, IRB Intel Research 28 Lookup Latency (no churn)

Petros Maniatis, IRB Intel Research 29 Lookup Latency (with churn)

Petros Maniatis, IRB Intel Research 30 Lookup Consistency (with churn) Consistent fraction: size fraction of largest result cluster Consistent fraction: size fraction of largest result cluster k lookups, different sources, same destination k lookups, different sources, same destination

Petros Maniatis, IRB Intel Research 31 Maintenance bandwidth (churn)

Petros Maniatis, IRB Intel Research 32 But Still a Research Prototype Bugs still creep up (algorithmic logic / P2 impl.) Bugs still creep up (algorithmic logic / P2 impl.) Multi-resolution system introspection Multi-resolution system introspection Application-specific network tuning, auto or otherwise still needed Application-specific network tuning, auto or otherwise still needed Component-based reconfigurable transports Component-based reconfigurable transports Logical duplications ripe for removal Logical duplications ripe for removal Factorizations and Cost-based optimizations Factorizations and Cost-based optimizations

Petros Maniatis, IRB Intel Research System Introspection Two unique opportunities Two unique opportunities Transparent execution tracing Transparent execution tracing A distributed query processor on all system state A distributed query processor on all system state

Petros Maniatis, IRB Intel Research 34 Execution Tracing and Logging Execution tracing/logging happens externally to system specification Execution tracing/logging happens externally to system specification At pseudo-code granularity: logical stepping At pseudo-code granularity: logical stepping Why did rule R7 trigger? Under what preconditions? Why did rule R7 trigger? Under what preconditions? Every rule execution (input and outputs) is exported as a table Every rule execution (input and outputs) is exported as a table –ruleExec(Rule, InTuple, OutTuple, OutNode, Time) At dataflow granularity: intermediate representation stepping At dataflow granularity: intermediate representation stepping Why did that tuple expire? What dropped from that queue? Why did that tuple expire? What dropped from that queue? Every dataflow element execution exported as a table, flows tapped and exported Every dataflow element execution exported as a table, flows tapped and exported –queueExec(…), roundRobinExec(…), … Transparent logging by the execution engine Transparent logging by the execution engine No need to insert printfs and hope for the best No need to insert printfs and hope for the best Can traverse execution graph for particular system events Can traverse execution graph for particular system events Its preconditions, and their preconditions, and so on across the net Its preconditions, and their preconditions, and so on across the net

Petros Maniatis, IRB Intel Research 35 Distributed Query Processing Once you have a distributed query processor, lots of things fall off the back of the truck Once you have a distributed query processor, lots of things fall off the back of the truck Overlay invariant monitoring: a distributed watchpoint Overlay invariant monitoring: a distributed watchpoint Whats the average path length? Whats the average path length? Is routing consistent? Is routing consistent? Pattern matching on distributed execution graph Pattern matching on distributed execution graph Is a routing entry gossiped in a cycle? Is a routing entry gossiped in a cycle? How many lookup failures were caused by stale routing state? How many lookup failures were caused by stale routing state? What are the nodes with best-successor in-degree > 1? What are the nodes with best-successor in-degree > 1? Which bits of state only occur when a lookup fails somewhere? Which bits of state only occur when a lookup fails somewhere? Monitoring disparate overlays / systems together Monitoring disparate overlays / systems together When overlay A does this, what is overlay B doing? When overlay A does this, what is overlay B doing? When overlay A does this, what is the network, average CPU, … doing? When overlay A does this, what is the network, average CPU, … doing?

Petros Maniatis, IRB Intel Research Reconfigurable Transport New lease on life of an old idea! New lease on life of an old idea! Dataflow paradigm thins out layer boundaries Dataflow paradigm thins out layer boundaries Mix and match transport facilities (retries, congestion control, rate limitation, buffering) Mix and match transport facilities (retries, congestion control, rate limitation, buffering) Spread bits of transport through the application to suit application requirements Spread bits of transport through the application to suit application requirements Move buffering before computation Move buffering before computation Move retries before route selection Move retries before route selection Use single congestion control across all destinations Use single congestion control across all destinations Express transport spec at high-level Express transport spec at high-level Packetize all msgs to same dest together, but send acks separately Packetize all msgs to same dest together, but send acks separately Packetize updates but not acks Packetize updates but not acks

Petros Maniatis, IRB Intel Research Automatic Optimization Optimize within rules Optimize within rules Selects before joins, join ordering Selects before joins, join ordering Optimize across rules & queries Optimize across rules & queries Common subexpression elimination Common subexpression elimination Optimize across nodes Optimize across nodes Send the smallest relation over the network Send the smallest relation over the network Caching of intermediate results Caching of intermediate results Optimize scheduling Optimize scheduling Prolific rules before deadbeats Prolific rules before deadbeats

Petros Maniatis, IRB Intel Research 38 What We Dont Know (Yet) The limits of first-order logic The limits of first-order logic Already pushing through to second-order, to do introspection Already pushing through to second-order, to do introspection Can be awkward to translate inherently imperative constructs, etc. if-then-else / loops Can be awkward to translate inherently imperative constructs, etc. if-then-else / loops The limits of the dataflow model The limits of the dataflow model Control vs. data flow Control vs. data flow Can we eliminate (most) queues? If not, whats the point? Can we eliminate (most) queues? If not, whats the point? Can we do concurrency control for parallel execution? Can we do concurrency control for parallel execution? The limits of automation The limits of automation Can we (ever) do better than hand-coded implementations? Does it matter? Can we (ever) do better than hand-coded implementations? Does it matter? How good is good enough? How good is good enough? Will designers settle for auto-generation? DBers did, but this is a different community Will designers settle for auto-generation? DBers did, but this is a different community The limits of static checking The limits of static checking Can we keep the semantics simple enough for existing checks (termination, safety, …) to still work automatically? Can we keep the semantics simple enough for existing checks (termination, safety, …) to still work automatically?

Petros Maniatis, IRB Intel Research 39 Related Work Early work on executable protocol specification Early work on executable protocol specification Esterel, Estelle, LOTOS (finite state machine specs) Esterel, Estelle, LOTOS (finite state machine specs) Morpheus, Prolac (domain-specific, OO) Morpheus, Prolac (domain-specific, OO) RTAG (grammar model) RTAG (grammar model) Click Click Dataflow approach for routing stacks Dataflow approach for routing stacks Larger elements, more straightforward scheduling Larger elements, more straightforward scheduling Deductive / active databases Deductive / active databases

Petros Maniatis, IRB Intel Research 40 Summary Overlays enable distributed system innovation Overlays enable distributed system innovation Wed better make them easier to build, reuse, understand Wed better make them easier to build, reuse, understand P2 enables P2 enables High-level overlay specification in OverLog High-level overlay specification in OverLog Automatic translation of specification into dataflow graph Automatic translation of specification into dataflow graph Execution of dataflow graph Execution of dataflow graph Explore and Embrace the trade-off between fine-tuning and ease of development Explore and Embrace the trade-off between fine-tuning and ease of development Get the full immersion treatment in our papers at SIGCOMM and SOSP 05 Get the full immersion treatment in our papers at SIGCOMM and SOSP 05

Petros Maniatis, IRB Intel Research 41 Questions (a few to get you started) Who cares about overlays? Who cares about overlays? Logic? You mean Prolog? Eeew! Logic? You mean Prolog? Eeew! This language is really ugly. Discuss. This language is really ugly. Discuss. But what about security? But what about security? Is anyone ever going to use this? Is anyone ever going to use this? Is this as revolutionary and inspired as it looks? Is this as revolutionary and inspired as it looks?