Ensemble and Beyond Presentation to David Tennenhouse, DARPA ITO Ken Birman Dept. of Computer Science Cornell University.

Presentation transcript:

Quick Timeline
Cornell has developed 3 generations of reliable group communication technology
–Isis Toolkit:
–Horus System:
–Ensemble System:
Today starting a major shift in emphasis
–Spinglass Project: 1999-

Questions to consider
Have these projects been successful?
What is the future of Ensemble if we move to a new and different focus?
Nature of the new opportunity we now perceive

Timeline: Isis, Horus, Ensemble
Introduced reliability into group computing
Virtual synchrony execution model
Fairly elaborate, monolithic, but adequate speed
Many transition successes
–New York, Swiss Stock Exchanges
–French Air Traffic Control console system
–Southwestern Bell Telephone network mgt.
–Hiper-D (next generation AEGIS)

Virtual Synchrony Model
(Diagram: timelines for processes p, q, r, s, t across successive group views)
G0 = {p,q}
r, s request to join
G1 = {p,q,r,s}: r, s added; state xfer
p fails (crash)
G2 = {q,r,s}
t requests to join
G3 = {q,r,s,t}: t added; state xfer
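The view sequence in the diagram can be replayed with a small sketch. This is only an illustration under stated assumptions: the `Group` class and its `join` and `fail` methods are hypothetical names, not part of Isis, Horus, or Ensemble, and the sketch tracks membership only. It does not model state transfer or the ordering of message delivery relative to view changes, which is the core of virtual synchrony.

```python
from dataclasses import dataclass, field


@dataclass
class Group:
    """Toy process group whose membership evolves as a sequence of views.

    Hypothetical illustration only; not the API of any Cornell toolkit.
    """
    members: frozenset
    views: list = field(default_factory=list)

    def __post_init__(self):
        # The initial membership is the first view, G0.
        self.views.append(self.members)

    def join(self, *procs):
        # Joining processes are added together in a single new view.
        self._install(self.members | set(procs))

    def fail(self, proc):
        # A crashed process is dropped in the next view.
        self._install(self.members - {proc})

    def _install(self, new_members):
        self.members = frozenset(new_members)
        self.views.append(self.members)


# Replay the events shown on the slide.
g = Group(frozenset({"p", "q"}))
g.join("r", "s")   # r, s request to join
g.fail("p")        # p crashes
g.join("t")        # t requests to join

for i, view in enumerate(g.views):
    print(f"G{i} = {sorted(view)}")
```

Every surviving member sees the same sequence of views, which is what lets an application reason about who received what.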

Why a “model”?
Models can be reduced to theory – we can prove the properties of the model, and can decide if a protocol achieves it
Enables rigorous application-level reasoning
Otherwise, the application must guess at possible misbehaviors and somehow overcome them

French ATC system (simplified)
(Diagram components: Controllers; Air Traffic Database (flight plans, etc); X.500 Directory; Radar; Onboard)

A center contains...
Perhaps 50 “teams” of 3-5 controllers each
Each team supported by workstation cluster
Cluster-style database server has flight plan information
Radar server distributes real-time updates
Connections to other control centers (40 or so in all of Europe, for example)

Process groups arise here:
Cluster of servers running critical database server programs
Cluster of controller workstations supports ATC by teams of controllers
Radar must send updates to the relevant group of control consoles
Flight plan updates must be distributed to the “downstream” control centers

Use of our model?
French government knows requirements for safety in ATC application
With our model, we can reduce their need to a formal set of statements
This lets us establish that our solution will really be safe in their setting
Contrast with usual ad-hoc methodologies...

Timeline: Isis, Horus, Ensemble
Simpler, faster group communication system
Uses a modular layered architecture. Layers are “compiled,” headers compressed for speed
Supports dynamic adaptation and real-time apps
Partitionable version of virtual synchrony
Transitioned primarily through Stratus Computer
–Phoenix system
–Basis of Stratus fault-tolerance proposal to OMG

Layered Microprotocols in Horus
Interface to Horus is extremely flexible
Horus manages group abstraction
Group semantics (membership, actions, events) defined by stack of modules
(Diagram modules: encrypt, vsync, filter, sign, ftol)
Ensemble stacks plug-and-play modules to give design flexibility to developer

Processes Communicate Through Identical Multicast Protocol Stacks
(Diagram: three processes, each running the stack encrypt, vsync, ftol)
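The idea of identical plug-and-play stacks at each process can be sketched as follows. This is a minimal illustration, not the Horus or Ensemble interface: the `Layer`, `XorEncrypt`, `Sign`, and `Stack` names are hypothetical, the XOR "encryption" is a toy stand-in with no security value, and only two of the layers named on the slides are imitated. A message passes down the sender's stack and up the receiver's stack in reverse order.

```python
import hashlib
import hmac


class Layer:
    """A microprotocol layer: transforms messages going down and coming up."""

    def down(self, msg: bytes) -> bytes:
        return msg

    def up(self, msg: bytes) -> bytes:
        return msg


class XorEncrypt(Layer):
    """Toy stand-in for an encrypt layer (NOT real encryption)."""

    def __init__(self, key: bytes):
        self.key = key

    def _xor(self, msg: bytes) -> bytes:
        return bytes(b ^ self.key[i % len(self.key)] for i, b in enumerate(msg))

    def down(self, msg: bytes) -> bytes:
        return self._xor(msg)

    def up(self, msg: bytes) -> bytes:
        return self._xor(msg)  # XOR is its own inverse


class Sign(Layer):
    """Prepends an HMAC-SHA256 tag on the way down, verifies it on the way up."""

    TAG_LEN = 32  # SHA-256 digest size in bytes

    def __init__(self, key: bytes):
        self.key = key

    def down(self, msg: bytes) -> bytes:
        return hmac.new(self.key, msg, hashlib.sha256).digest() + msg

    def up(self, msg: bytes) -> bytes:
        tag, body = msg[:self.TAG_LEN], msg[self.TAG_LEN:]
        expected = hmac.new(self.key, body, hashlib.sha256).digest()
        if not hmac.compare_digest(tag, expected):
            raise ValueError("bad signature")
        return body


class Stack:
    """Layers listed top (application side) to bottom (network side)."""

    def __init__(self, layers):
        self.layers = layers

    def send(self, msg: bytes) -> bytes:
        for layer in self.layers:            # top to bottom
            msg = layer.down(msg)
        return msg

    def receive(self, wire: bytes) -> bytes:
        for layer in reversed(self.layers):  # bottom to top
            wire = layer.up(wire)
        return wire


# Both processes build the same stack, as in the diagram.
def make_stack():
    return Stack([XorEncrypt(b"group-key"), Sign(b"auth-key")])


sender, receiver = make_stack(), make_stack()
wire = sender.send(b"radar update")
print(receiver.receive(wire))
```

Swapping, adding, or removing a layer changes the group's properties without touching the application, which is the design flexibility the slides describe.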

Superimposed Groups in Application With Multiple Subsystems
(Diagram: six protocol stacks, each encrypt, vsync, ftol)
Magenta group for video communication
Orange for control and coordination

Timeline: Isis, Horus, Ensemble
Horus-like stacking architecture, equally fast
Includes an innovative group-key mechanism for secure group multicast and key management
Uses a high-level language and can be formally proved correct, an unexpected and major success
Many early transition successes
–SC-21, Quorum via collaboration with BBN
–Nortel, STC: potential commercial users
–Discussions with MS (COM+), Sun (RMI.next): could be basis of standards

Proving Ensemble Correct
Unlike Isis and Horus, Ensemble is coded in a language with strong semantics (ML)
So we took a spec. of virtual synchrony from MIT’s IOA group (Nancy Lynch)
And are actually able to prove that our code implements the spec. and that the spec. captures the virtual synchrony property!

What Next?
Continue some work with Ensemble
–Keep it alive, support and extend it
–Play an active role in transition
–Assist standards efforts
But shift in focus to a completely new effort
–Emphasize adaptive behavior, extreme scalability, robustness against local disruption
–Fits “Intrinsically Survivable Systems” initiative

Throughput Stability: Achilles Heel of Group Multicast
When scaled to even modest environments, overheads of virtual synchrony become a problem
–One serious challenge involves management of group membership information
–But multicast throughput also becomes unstable with high data rates and large system size
Stability of protocols like SRM unknown

Stock Exchange Problem: Vsync. multicast is too “fragile”
Most members are healthy...
... but one is slow

Figure 1: Multicast throughput in an 8-member group perturbed by transient failures (curves: Ideal, Actual)

Bimodal Multicast in Spinglass
A new family of protocols with stable throughput, extremely scalable, fixed and low overhead per process and per message
Gives tunable probabilistic guarantees
Includes a membership protocol and a multicast protocol
Requires some very weak QoS assumptions

Start by using unreliable multicast to rapidly distribute the message. But some messages may not get through, and some processes may be faulty. So initial state involves partial distribution of multicast(s)

Periodically (e.g. every 100ms) each process sends a digest describing its state to some randomly selected group member. The digest identifies messages. It doesn’t include them.

Recipient checks the gossip digest against its own history and solicits a copy of any missing message from the process that sent the gossip

Processes respond to solicitations received during a round of gossip by retransmitting the requested message. The round lasts much longer than a typical RPC time.
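The steps above (unreliable multicast, periodic digests, solicitation, retransmission) can be sketched as a small in-memory simulation. It is a simplified illustration under stated assumptions, not the Spinglass implementation: `Process`, `unreliable_multicast`, and `gossip_round` are hypothetical names, message loss is modeled as an independent coin flip, and the digest, solicitation, and retransmission exchanges are collapsed into direct dictionary operations with no real network or timers.

```python
import random


class Process:
    """A group member holding the messages it has received so far."""

    def __init__(self, name):
        self.name = name
        self.messages = {}  # message id -> payload


def unreliable_multicast(sender, procs, msg_id, payload, loss, rng):
    """Step 1: best-effort distribution; each receiver may miss the message."""
    sender.messages[msg_id] = payload
    for p in procs:
        if p is not sender and rng.random() >= loss:
            p.messages[msg_id] = payload


def gossip_round(procs, rng):
    """Steps 2-4: digest, solicitation, retransmission for one round."""
    # Freeze each process's history at the start of the round.
    snapshot = [(p, dict(p.messages)) for p in procs]
    for sender, history in snapshot:
        # The digest (message ids only) goes to one randomly selected member.
        peer = rng.choice([p for p in procs if p is not sender])
        # The peer compares the digest with its own history ...
        missing = history.keys() - peer.messages.keys()
        # ... solicits what it lacks, and the sender retransmits it.
        for msg_id in missing:
            peer.messages[msg_id] = history[msg_id]


rng = random.Random(0)  # fixed seed so runs are repeatable
procs = [Process(f"p{i}") for i in range(8)]
for msg_id in range(5):
    unreliable_multicast(procs[0], procs, msg_id, f"update {msg_id}",
                         loss=0.5, rng=rng)

rounds = 0
while any(len(p.messages) < 5 for p in procs) and rounds < 1000:
    gossip_round(procs, rng)
    rounds += 1
print("rounds of gossip run:", rounds)
```

Each process does a fixed amount of work per round regardless of group size, which is the source of the fixed, low per-process overhead claimed for this family of protocols.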

Figure 5: Graphs of analytical results

Spinglass: Summary of objectives
Radically different approach yields stable, scalable protocols with steady throughput
Small footprint, tunable to match conditions
Completely asynchronous, hence demands new style of application development
But opens the door to a new lightweight reliability technology supporting large autonomous environments that adapt