The Horus and Ensemble Projects: Accomplishments and Limitations
Ken Birman, Robert Constable, Mark Hayden, Jason Hickey, Christoph Kreitz, Robbert van Renesse, Ohad Rodeh, Werner Vogels
Department of Computer Science, Cornell University
Presented at DISCEX, January 2000

Reliable Distributed Computing: Increasingly Urgent, Still Unsolved
 Distributed computing has swept the world
  - Impact has become revolutionary
  - Vast wave of applications migrating to networks
  - Already as critical a national infrastructure as water, electricity, or telephones
 Yet distributed systems remain
  - Unreliable, prone to inexplicable outages
  - Insecure, easily attacked
  - Difficult (and costly) to program, bug-prone

A National Imperative
 Potential for catastrophe cited by
  - DARPA ISAT study commissioned by Anita Jones (1995; I briefed the findings, which became the basis for refocusing much of ITO under Howard Frank)
  - PCCIP report, PITAC, NAS study of trust in cyberspace
 Need a quantum improvement in technologies, packaged in easily used, practical forms

Quick Timeline
 Cornell has developed 3 generations of reliable group communication technology
  - Isis Toolkit
  - Horus System
  - Ensemble System
 Today engaged in a new effort reflecting a major shift in emphasis
  - Spinglass Project: 1999-

Questions to Consider
 Have these projects been successful?
  - What did we do?
  - How can impact be quantified?
  - What limitations did we encounter?
 How is industry responding?
 What next?

Timeline: Isis
 Introduced reliability into group computing
  - Virtual synchrony execution model
  - Elaborate, monolithic, but adequate speed
 Many transition successes
  - New York and Swiss Stock Exchanges
  - French Air Traffic Control console system
  - Southwestern Bell Telephone network management
  - Hiper-D (next-generation AEGIS)

Virtual Synchrony Model
[Execution diagram: processes p, q, r, s, t move through views G0 = {p,q}, G1 = {p,q,r,s}, G2 = {q,r,s}, G3 = {q,r,s,t}. First r and s request to join and are added with a state transfer; then p fails; then t requests to join and is added with a state transfer.]
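The view sequence in this diagram can be captured concretely. Below is a minimal OCaml sketch (OCaml being the ML family Ensemble itself is written in), modeling views and membership events; the types are invented for illustration and are far simpler than the real Horus/Ensemble view records:

```ocaml
(* Hypothetical model of the view sequence shown above; real
   Horus/Ensemble view records carry much more (view id, rank, ...). *)
type process = string

type view = { id : int; members : process list }

type event = Join of process list | Fail of process list

(* Apply a membership event to the current view, yielding the next one. *)
let next_view v = function
  | Join ps -> { id = v.id + 1; members = v.members @ ps }
  | Fail ps ->
      { id = v.id + 1;
        members = List.filter (fun p -> not (List.mem p ps)) v.members }

let () =
  let g0 = { id = 0; members = ["p"; "q"] } in
  let g1 = next_view g0 (Join ["r"; "s"]) in  (* r, s join; state transfer *)
  let g2 = next_view g1 (Fail ["p"]) in       (* p fails *)
  let g3 = next_view g2 (Join ["t"]) in       (* t joins; state transfer *)
  List.iter
    (fun v -> Printf.printf "G%d = {%s}\n" v.id (String.concat "," v.members))
    [g0; g1; g2; g3]
```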

Why a “Model”?
 Models can be reduced to theory: we can prove the properties of the model, and can decide whether a protocol achieves it
 Enables rigorous application-level reasoning
 Otherwise, the application must guess at possible misbehaviors and somehow overcome them

Virtual Synchrony
 Became widely accepted: the basis of literally dozens of research systems and products worldwide
 Seems to be the only way to solve problems based on replication
 Very fast in small systems, but faces scaling limitations in large ones

How Do We Use the Model?
 Makes it easy to reason about the state of a distributed computation
 Allows us to replicate data or computation for fault-tolerance (or because multiple users share the same data), as sketched below
 Can also replicate security keys, do load-balancing, synchronization…
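To illustrate the replication point, here is a hedged OCaml sketch (hypothetical types and a simulated multicast, not the Isis/Horus toolkit API): because every member sees the same ordered updates, applying them locally keeps all replicas consistent.

```ocaml
(* Hypothetical sketch: virtual synchrony delivers the same ordered
   updates to every member of a view, so each replica can apply them
   to local state and stay consistent without extra coordination. *)
type update = Set of string * int

module StrMap = Map.Make (String)

type replica = { name : string; mutable state : int StrMap.t }

let apply r (Set (k, v)) = r.state <- StrMap.add k v r.state

(* Stand-in for an ordered group multicast: deliver to all members in order. *)
let multicast replicas u = List.iter (fun r -> apply r u) replicas

let () =
  let mk name = { name; state = StrMap.empty } in
  let group = [mk "p"; mk "q"; mk "r"] in
  multicast group (Set ("altitude", 31000));
  multicast group (Set ("heading", 270));
  List.iter
    (fun r ->
      Printf.printf "%s: altitude=%d heading=%d\n" r.name
        (StrMap.find "altitude" r.state) (StrMap.find "heading" r.state))
    group
```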

French ATC System (Simplified)
[Architecture diagram: controller workstations connected to an air traffic database (flight plans, etc.), an X.500 directory, radar feeds, and onboard systems.]

A Center Contains...
 Perhaps 50 “teams” of 3-5 controllers each
 Each team supported by a workstation cluster
 A cluster-style database server holds flight plan information
 A radar server distributes real-time updates
 Connections to other control centers (40 or so in all of Europe, for example)

Process Groups Arise Here
 A cluster of servers running critical database server programs
 Clusters of controller workstations supporting ATC by teams of controllers
 Radar must send updates to the relevant group of control consoles
 Flight plan updates must be distributed to the “downstream” control centers

A Role for Virtual Synchrony?
 The French government knows the requirements for safety in the ATC application
 With our model, we can reduce their needs to a formal set of statements
 This lets us establish that our solution will really be safe in their setting
 Contrast with the usual ad-hoc methodologies...

More Isis Users
 New York Stock Exchange
 Swiss Stock Exchange
 Many VLSI fabrication facilities
 Many telephony control applications
 Hiper-D, an AEGIS rebuild prototype
 Various NSA and military applications
 Architecture contributed to SC-21/DD-21

Timeline: Horus
 Simpler, faster group communication system
 Uses a modular layered architecture; layers are “compiled,” headers compressed for speed
 Supports dynamic adaptation and real-time applications
 Partitionable version of virtual synchrony
 Transitioned primarily through Stratus Computer
  - Phoenix product, for telecommunications

Layered Microprotocols in Horus
 Interface to Horus is extremely flexible
 Horus manages the group abstraction
 Group semantics (membership, actions, events) are defined by a stack of modules such as vsync, encrypt, filter, sign, and ftol
 Ensemble stacks plug-and-play modules to give design flexibility to the developer; a toy sketch of the idea follows
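The toy OCaml below (invented types; the real Horus layer interface is event-driven and far richer) treats each microprotocol as a message transformer, so that a protocol stack is simply the composition of the chosen layers:

```ocaml
(* Toy illustration of plug-and-play layering, not the real Horus
   interface: each microprotocol transforms a message on its way down
   the stack, and a stack is the composition of the chosen layers. *)
type layer = string -> string

let encrypt : layer = fun m -> "encrypt(" ^ m ^ ")"
let sign    : layer = fun m -> "sign(" ^ m ^ ")"
let ftol    : layer = fun m -> "ftol(" ^ m ^ ")"
let vsync   : layer = fun m -> "vsync(" ^ m ^ ")"

(* Sending pushes a message down through every layer in order. *)
let send (stack : layer list) (msg : string) =
  List.fold_left (fun m l -> l m) msg stack

let () =
  (* Two different stacks give two different property mixes. *)
  print_endline (send [vsync; encrypt; ftol] "m1");
  print_endline (send [vsync; sign] "m2")
```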

Group Members Use Identical Multicast Protocol Stacks
[Diagram: three group members, each running the same encrypt/vsync/ftol stack.]

With Multiple Stacks, Multiple Properties
[Diagram: group members run several encrypt/vsync/ftol stacks side by side; each stack provides a different mix of properties.]

Timeline: Ensemble
 Horus-like stacking architecture, equally fast
 Includes a group-key mechanism for secure group multicast and key management
 Uses a high-level language; can be formally proved, an unexpected and major success
 Many early transition successes
  - DD-21, Quorum via collaboration with BBN
  - Nortel, STC: commercial users
  - Discussions with MS (COM+): could be the basis of standards

Proving Ensemble Correct
 Unlike Isis and Horus, Ensemble is coded in a language with strong semantics (ML)
 So we took a specification of virtual synchrony from MIT's IOA group (Nancy Lynch)
 And are actually able to prove that our code implements the spec, and that the spec captures the virtual synchrony property!
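To give a flavor of what is being proved (a simplified paraphrase, not the IOA specification itself), one core virtual synchrony property states that any two processes that both survive from one view into the next must have delivered the same messages in the earlier view. A hedged OCaml rendering:

```ocaml
(* Simplified paraphrase of one virtual synchrony property, not the
   actual IOA spec: processes that transition together between the same
   two views must agree on the message set delivered in the earlier view. *)
module SSet = Set.Make (String)

type history = {
  proc : string;
  views : int list;                (* view ids this process installed *)
  delivered : (int * SSet.t) list; (* per view: set of delivered messages *)
}

let delivered_in h v =
  try List.assoc v h.delivered with Not_found -> SSet.empty

let agree_on_view h1 h2 v = SSet.equal (delivered_in h1 v) (delivered_in h2 v)

(* Check the property for every pair surviving from view v into v+1. *)
let virtually_synchronous procs =
  List.for_all
    (fun h1 ->
      List.for_all
        (fun h2 ->
          List.for_all
            (fun v ->
              not (List.mem (v + 1) h1.views && List.mem (v + 1) h2.views)
              || agree_on_view h1 h2 v)
            h1.views)
        procs)
    procs

let () =
  let p = { proc = "p"; views = [0; 1];
            delivered = [(0, SSet.of_list ["m1"; "m2"])] } in
  let q = { proc = "q"; views = [0; 1];
            delivered = [(0, SSet.of_list ["m1"; "m2"])] } in
  Printf.printf "virtually synchronous: %b\n" (virtually_synchronous [p; q])
```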

Why Is This Important?
 If we use Ensemble to secure keys, our proof is a proof of security of the group keying protocol…
 And the proof extends not just to the algorithm but also to the actual code implementing it
 These are the largest formal proofs ever undertaken!

Why Is This Feasible?
 Power of the NuPRL system: a fifth-generation theorem-proving technology
 Simplifications gained through modularity: compositional code inspires a style of compositional proof
 Ensemble itself is unusually elegant; its protocols are spare and clear

Other Accomplishments
 An automated optimization technology
  - Often, a simple modular protocol becomes complex when optimized for high performance
  - Our approach automates optimization: the basic protocol is coded only once, and we work with a single, simple, clear version
  - The optimizer works almost as well as hand-optimization and can be invoked at runtime

Optimization
[Diagram: an encrypt/vsync/ftol stack before and after optimization. The original code is simple but inefficient; the optimized version is provably the same, yet the inefficiencies are eliminated. A sketch of the idea follows.]
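A stylized OCaml illustration of why layer collapsing is sound (hypothetical transforms, far simpler than the NuPRL-based optimizer): the layered path composes small per-layer steps, and the fused path is a single function that is observably equal.

```ocaml
(* Stylized picture of layer optimization, not the NuPRL-based tool
   itself: the layered path makes one call and one allocation per layer;
   the fused path does the same work in a single pass. *)
let add_seq_header m = "seq:" ^ m
let add_crypt_header m = "crypt:" ^ m

(* Layered version: simple, modular, but with per-layer overhead. *)
let layered m = add_crypt_header (add_seq_header m)

(* Fused version: one pass, same observable result. *)
let fused m = "crypt:seq:" ^ m

let () =
  let m = "payload" in
  assert (layered m = fused m);  (* extensional equality on this input *)
  print_endline (fused m)
```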

Other Accomplishments
 Real-time fault-tolerant clusters
  - Problem originated in the AEGIS tracking server
  - Need a scalable, fault-tolerant parallel server with rapid real-time guarantees

AEGIS Problem
[Diagram: “emulate this… with this”: the AEGIS tracking server and its 100ms deadline, reimplemented on a cluster.]

Other Accomplishments
 Real-time fault-tolerant clusters (continued)
  - With Horus, we achieved 100ms response times even when nodes crash, scalability to 64 nodes or more, load balancing, and linear speedups
 Our approach emerged as one of the major themes in SC-21, which became DD-21

Other Accomplishments
 A flexible, object-oriented toolkit
  - Standardizes the sorts of things programmers do most often
  - Programmers are able to work with high-level abstractions rather than being forced to reimplement common tools, like replicated data, each time they are needed (see the sketch below)
  - Embedding into the NT COM architecture
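As an illustration of the kind of abstraction such a toolkit provides (invented names and a local simulation, not the actual toolkit API), a replicated variable can hide the multicast machinery behind a simple set/notify interface:

```ocaml
(* Hypothetical toolkit-style abstraction, with invented names: a
   replicated variable whose updates reach every group member; the
   multicast is simulated here by a list of subscriber callbacks. *)
type 'a replicated = {
  mutable value : 'a;
  mutable subscribers : ('a -> unit) list;
}

let create v = { value = v; subscribers = [] }

let on_update r f = r.subscribers <- f :: r.subscribers

(* In the real system this would be a virtually synchronous multicast;
   here we invoke every subscriber locally, in registration order. *)
let set r v =
  r.value <- v;
  List.iter (fun f -> f v) (List.rev r.subscribers)

let () =
  let track = create 0 in
  on_update track (Printf.printf "replica A sees %d\n");
  on_update track (Printf.printf "replica B sees %d\n");
  set track 42;
  Printf.printf "final value %d\n" track.value
```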

Security Architecture
 Group key management
 Fault-tolerant, partitionable
 Currently exploring a very large-scale configuration that would permit rapid key refresh and revocation even with millions of users
 All provably correct
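A schematic OCaml sketch of the rekey-on-membership-change idea (invented, much simpler than Ensemble's group-key protocols): every view change derives a fresh group key, so a departed member cannot read traffic sent in later views.

```ocaml
(* Schematic sketch of rekeying on membership change, not Ensemble's
   actual key-management protocol: each new view gets a fresh key, so a
   process removed from the group cannot decrypt later traffic. *)
type group = { members : string list; key : int }

let fresh_key =
  let counter = ref 0 in
  fun () -> incr counter; !counter  (* stand-in for real key generation *)

(* Any membership change produces a new view with a new key. *)
let remove g p = { members = List.filter (( <> ) p) g.members;
                   key = fresh_key () }

let add g p = { members = p :: g.members; key = fresh_key () }

let () =
  let g = { members = ["p"; "q"; "r"]; key = fresh_key () } in
  let g' = remove g "p" in
  let g'' = add g' "t" in
  Printf.printf "keys: %d -> %d -> %d (p only ever saw key %d)\n"
    g.key g'.key g''.key g.key
```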

Transition Paths?
 Through collaboration with BBN, delivered to the DARPA QUOIN effort
 Part of the DD-21 architecture
 Strong interest in industry; good prospects for “major vendor” offerings within a year or two
 A good success for Cornell and DARPA

What Next?
 Continue some work with Ensemble
  - Research focuses on proof of the replication stack
  - Otherwise, keep it alive, support and extend it
  - Play an active role in transition
  - Assist standards efforts
 But shift focus to a completely new effort
  - Emphasize adaptive behavior, extreme scalability, robustness against local disruption
  - Fits the “Intrinsically Survivable Systems” initiative

Throughput Stability: Achilles' Heel of Group Multicast
 When scaled to even modest environments, the overheads of virtual synchrony become a problem
  - One serious challenge involves management of group membership information
  - Multicast throughput also becomes unstable at high data rates and large system sizes
 A problem in every protocol we've studied, including other “scalable, reliable” protocols

Throughput Scenario
[Diagram: a multicast group in which most members are healthy… but one is slow.]

[Graph: throughput as one member of a multicast group is “perturbed” by forcing it to sleep for varying amounts of time. Virtually synchronous Ensemble multicast protocols; group throughput of the healthy members plotted against the degree of slowdown, for 32, 64, and 96 group members.]

Bimodal Multicast in Spinglass
 A new family of protocols with stable throughput, extremely scalable, with fixed and low overhead per process and per message
 Gives tunable probabilistic guarantees
 Includes a membership protocol and a multicast protocol
 An ideal match for small, nomadic devices
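A minimal OCaml gossip sketch in the spirit of Bimodal Multicast (a toy: the real protocol pairs an unreliable multicast with anti-entropy gossip rounds, retransmission, and garbage collection, none of which is modeled here). Each process repeatedly pushes what it has heard to a randomly chosen peer, so a message reaches everyone with high probability at a fixed per-round cost:

```ocaml
(* Toy push-gossip in the spirit of Bimodal Multicast; the real protocol
   is considerably richer. Message ids are integers. *)
module ISet = Set.Make (Int)

type proc = { id : int; mutable heard : ISet.t }

(* One gossip round: every process pushes what it has heard to one peer
   chosen uniformly at random (self-choices are skipped). *)
let round procs =
  Array.iter
    (fun p ->
      let peer = procs.(Random.int (Array.length procs)) in
      if peer.id <> p.id then peer.heard <- ISet.union peer.heard p.heard)
    procs

let () =
  Random.self_init ();
  let n = 64 in
  let procs = Array.init n (fun id -> { id; heard = ISet.empty }) in
  procs.(0).heard <- ISet.singleton 1;  (* message 1 enters at process 0 *)
  let rounds = ref 0 in
  while Array.exists (fun p -> not (ISet.mem 1 p.heard)) procs do
    round procs;
    incr rounds
  done;
  Printf.printf "message reached all %d processes in %d rounds\n" n !rounds
```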

[Graphs: average throughput plotted against slowdown, with 25% slow processes.]

Spinglass: Summary of Objectives
 A radically different approach yields stable, scalable protocols with steady throughput
 Small footprint, tunable to match conditions
 Completely asynchronous, hence demands a new style of application development
 But opens the door to a new lightweight reliability technology supporting large, autonomous environments that adapt

Conclusions
 Cornell: a leader in reliable distributed computing
 High impact on important DoD problems, such as AEGIS (DD-21), QUOIN, NSA intelligence gathering, and many other applications
 Demonstrated modular plug-and-play protocols that perform well and can be proved correct
 Transition into standard, off-the-shelf O/S
 Spinglass: the next major step forward