Performance and Availability in Wide-Area Service Composition Bhaskaran Raman ICEBERG, EECS, U.C.Berkeley Presentation at Siemens, June 2001.

Slides:



Advertisements
Similar presentations
Service Encapsulation in ICEBERG Bhaskaran Raman ICEBERG, EECS, U.C.Berkeley Presentation at Ericsson, Sweden, June 2001.
Advertisements

Service Composition Scenarios for Next Generation Networks Bhaskaran Raman, ICEBERG, EECS, U.C.Berkeley Presentation at Siemens, Munich, June 2001.
The Impact of Policy and Topology on Internet Routing Convergence NANOG 20 October 23, 2000 Abha Ahuja InterNap *In collaboration with.
Availability and Performance in Wide-Area Service Composition Bhaskaran Raman EECS, U.C.Berkeley July 2002.
Packet Switching COM1337/3501 Textbook: Computer Networks: A Systems Approach, L. Peterson, B. Davie, Morgan Kaufmann Chapter 3.
Lecture 6 Overlay Networks CPE 401/601 Computer Network Systems slides are modified from Jennifer Rexford.
Wide-Area Service Composition: Evaluation of Availability and Scalability Bhaskaran Raman SAHARA, EECS, U.C.Berkeley Provider Q Texttoaudio Provider R.
Configurable restoration in overlay networks Matthew Caesar, Takashi Suzuki.
Efficient Streaming for Delay-tolerant Multimedia Applications Saraswathi Krithivasan Advisor: Prof. Sridhar Iyer.
1 In VINI Veritas: Realistic and Controlled Network Experimentation Jennifer Rexford with Andy Bavier, Nick Feamster, Mark Huang, and Larry Peterson
Application layer (continued) Week 4 – Lecture 2.
15-441: Computer Networking Lecture 26: Networking Future.
1 Resource Management in IP Telephony Networks Matthew Caesar, Dipak Ghosal, Randy H. Katz {mccaesar,
1 Switching and Forwarding Bridges and Extended LANs.
New Routing Architectures Jennifer Rexford Advanced Computer Networks Tuesdays/Thursdays 1:30pm-2:50pm.
(c) Anirban Banerjee, Winter 2005, CS-240, 2/1/2005. The Impact of Internet Policy and Topology on Delayed Routing convergence C. Labovitz, A. Ahuja, R.
Wide-Area Service Composition: Availability, Performance, and Scalability Bhaskaran Raman SAHARA, EECS, U.C.Berkeley SAHARA Retreat, Jan 2002.
Delayed Internet Routing Convergence Craig Labovitz, Abha Ahuja, Abhijit Bose, Farham Jahanian Presented By Harpal Singh Bassali.
Configurable restoration in overlay networks Matthew Caesar, Takashi Suzuki.
Internet-Scale Research at Universities Panel Session SAHARA Retreat, Jan 2002 Prof. Randy H. Katz, Bhaskaran Raman, Z. Morley Mao, Yan Chen.
RRAPID: Real-time Recovery based on Active Probing, Introspection, and Decentralization Takashi Suzuki Matthew Caesar.
Component-Based Routing for Mobile Ad Hoc Networks Chunyue Liu, Tarek Saadawi & Myung Lee CUNY, City College.
1 Switching and Forwarding Bridges and Extended LANs.
Problem Definition Data path –Created by the Automatic Path Creation (APC) component –Service: program with well-defined interface –Operator: stateless.
Availability in Wide-Area Service Composition Bhaskaran Raman and Randy H. Katz SAHARA, EECS, U.C.Berkeley.
CS448 Computer Networking Chapter 1 Introduction to Computer Networks Instructor: Li Ma Office: NBC 126 Phone: (713)
1 Meeyoung Cha, Sue Moon, Chong-Dae Park Aman Shaikh Placing Relay Nodes for Intra-Domain Path Diversity To appear in IEEE INFOCOM 2006.
1 Pertemuan 20 Teknik Routing Matakuliah: H0174/Jaringan Komputer Tahun: 2006 Versi: 1/0.
CS An Overlay Routing Scheme For Moving Large Files Su Zhang Kai Xu.
An Efficient Topology-Adaptive Membership Protocol for Large- Scale Cluster-Based Services Jingyu Zhou * §, Lingkun Chu*, Tao Yang* § * Ask Jeeves §University.
An Architecture for Optimal and Robust Composition of Services across the Wide-Area Internet Bhaskaran Raman Qualifying Examination Proposal Feb 12, 2001.
© 2006 Cisco Systems, Inc. All rights reserved.Cisco Public 1 Version 4.0 Identifying Application Impacts on Network Design Designing and Supporting Computer.
“Intra-Network Routing Scheme using Mobile Agents” by Ajay L. Thakur.
A Framework for Highly-Available Cascaded Real-Time Internet Services Bhaskaran Raman Qualifying Examination Proposal Feb 12, 2001 Examination Committee:
A Framework for Highly-Available Session-Oriented Internet Services Bhaskaran Raman, Prof. Randy H. Katz {bhaskar, The ICEBERG Project.
Establishing Connections Networking Modes: When you are evaluating a network, you concentrate on circuit switching versus packet switching. But it's also.
Introduction 1-1 EKT355/4 ADVANCED COMPUTER NETWORK MISS HASNAH AHMAD School of Computer & Communication Engineering.
Sharing Information across Congestion Windows CSE222A Project Presentation March 15, 2005 Apurva Sharma.
1 Mobile ad hoc networking with a view of 4G wireless: Imperatives and challenges Myungchul Kim Tel:
A Framework for Highly-Available Real-Time Internet Services Bhaskaran Raman, EECS, U.C.Berkeley.
A Routing Underlay for Overlay Networks Akihiro Nakao Larry Peterson Andy Bavier SIGCOMM’03 Reviewer: Jing lu.
Ch 1. Computer Networks and the Internet Myungchul Kim
Resilient Overlay Networks By David Andersen, Hari Balakrishnan, Frans Kaashoek, and Robert Morris MIT RON Paper from ACM Oct Advanced Operating.
ﺑﺴﻢﺍﷲﺍﻠﺭﺣﻣﻥﺍﻠﺭﺣﻳﻡ. Group Members Nadia Malik01 Malik Fawad03.
A comparison of overlay routing and multihoming route control Hayoung OH
Wide-Area Service Composition: Performance, Availability and Scalability Bhaskaran Raman SAHARA, EECS, U.C.Berkeley Presentation at Ericsson, Jan 2002.
Netprog: Routing and the Network Layer1 Routing and the Network Layer (ref: Interconnections by Perlman)
A Utility-based Approach to Scheduling Multimedia Streams in P2P Systems Fang Chen Computer Science Dept. University of California, Riverside
Latency & Scaling Issues in Mobile-IP Sreedhar Mukkamalla Bhaskaran Raman.
1 Wide Area Network Emulation on the Millennium Bhaskaran Raman Yan Chen Weidong Cui Randy Katz {bhaskar, yanchen, wdc, Millennium.
COS 420 Day 15. Agenda Finish Individualized Project Presentations on Thrusday Have Grading sheets to me by Friday Group Project Discussion Goals & Timelines.
CS 6401 Overlay Networks Outline Overlay networks overview Routing overlays Resilient Overlay Networks Content Distribution Networks.
Service Composition: Breakout Session Summary Randy Katz David Culler Summary: Bhaskaran Raman.
L Subramanian*, I Stoica*, H Balakrishnan +, R Katz* *UC Berkeley, MIT + USENIX NSDI’04, 2004 Presented by Alok Rakkhit, Ionut Trestian.
Network Computing Laboratory Load Balancing and Stability Issues in Algorithms for Service Composition Bhaskaran Raman & Randy H.Katz U.C Berkeley INFOCOM.
A Framework for Composing Services Across Independent Providers in the Wide-Area Internet Bhaskaran Raman Qualifying Examination Proposal Feb 12, 2001.
1 Traffic Engineering By Kavitha Ganapa. 2 Introduction Traffic engineering is concerned with the issue of performance evaluation and optimization of.
Internet Traffic Engineering Motivation: –The Fish problem, congested links. –Two properties of IP routing Destination based Local optimization TE: optimizing.
Placing Relay Nodes for Intra-Domain Path Diversity Meeyoung Cha Sue Moon Chong-Dae Park Aman Shaikh Proc. of IEEE INFOCOM 2006 Speaker 游鎮鴻.
Network Layer Lecture Network Layer Design Issues.
1 Scalability and Accuracy in a Large-Scale Network Emulator Nov. 12, 2003 Byung-Gon Chun.
Towards an integrated multimedia service hosting overlay Dongyan Xu Xuxian Jiang Proceedings of the 12th annual ACM international conference on Multimedia.
Fall, 2001CS 6401 Switching and Routing Outline Routing overview Store-and-Forward switches Virtual circuits vs. Datagram switching.
Design Decisions / Lessons Learned
COS 461: Computer Networks
Packet Switching Outline Store-and-Forward Switches
Routing and the Network Layer (ref: Interconnections by Perlman
Host and Small Network Relaying Howard C. Berkowitz
Research Issues in Middleware (Bhaskar)
Presentation transcript:

Performance and Availability in Wide-Area Service Composition Bhaskaran Raman ICEBERG, EECS, U.C.Berkeley Presentation at Siemens, June 2001

The Case for Services "Service and content providers play an increasing role in the value chain. The dominant part of the revenues moves from the network operator to the content provider. It is expected that value-added data services and content provisioning will create the main growth." Subscriber user Service broker Service mgt. Access network operator Core network operator Value added service providers Value added service providers Value added service providers Content providers Content providers Content providers Access Networks Cellular systems Cordless (DECT) Bluetooth DECT data Wireless LAN Wireless local loop SatelliteCableDSL

Service Composition Provider Q Texttospeech Provider R Cellular Phone repository Provider A Video-on-demand server Provider B Thin Client Provider A Provider B Replicated instances Transcoder

Service Composition Operational model: –Service providers deploy different services at various network locations –Next generation portals compose services Quickly enable new functionality on new devices Possibly through SLAs –Code is NOT mobile (mutually untrusting service providers) Composition across –Service providers –Wide-area Notion of service-level path

Wide-Area Service Composition: Performance and Availability Performance: Choice of Service Instances Availability: Detecting and Handling Failures

Related Work Service composition is complex: –Service discovery, Interface definitions, Semantics of composition Previous efforts have addressed: –Semantics and interface definitions COTS (Stanford), Future Computing Environments (G. Tech) –Fault tolerance composition within a single cluster TACC (Berkeley) –Performance constrained choice of service, but not for composed services SPAND (Berkeley), Harvest (Colorado), Tapestry/CAN (Berkeley), RON (MIT) None address wide-area network performance or failure issues for long-lived composed sessions

Our Architecture Internet Service cluster: compute cluster capable of running services Peering: monitoring & cascading Destination Source Composed services Hardware platform Peering relations, Overlay network Service clusters Logical platform Application plane Handling failures Service-level path creation Service location Network performance Detection Recovery

Architecture: Advantages Overlay nodes are clusters –Compute platform for services –Hierarchical monitoring Within cluster – for process/machine failures Across clusters – for network path failures –Aggregated monitoring Amortized overhead

The Overlay Network Peering relations, Overlay network Handling failures Service-level path creation The overlay network provides the context for service-level path creation and failure handling

Service-Level Path Creation Connection-oriented network –Explicit session setup stage –There’s “switching” state at the intermediate nodes Need a connection-less protocol for connection setup Need to keep track of three things: –Network path liveness –Metric information (latency/bandwidth) for optimality decisions –Where services are located

Service-Level Path Creation Three levels of information exchange –Network path liveness Low overhead, but very frequent –Metric information: latency/bandwidth Higher overhead, not so frequent Bandwidth changes only once in several minutes [Balakrishnan’97] Latency changes appreciably only once in about an hour [Acharya’96] –Information about location of services in clusters Bulky, but does not change very often (once in a few weeks, or months) Could also use independent service location mechanism

Service Level Path Creation Link-state algorithm to exchange information –Lesser overhead of individual measurement  finer time-scale of measurement –Service-level path created at entry node –Link-state because it allows all-pair-shortest-path calculation in the graph

Service Level Path Creation Two ideas: –Path caching Remember what previous clients used Another use of clusters –Dynamic path optimization Since session-transfer is a first-order feature First path created need not be optimal

Session Recovery: Design Tradeoffs End-to-end vs. local-link End-to-end: –Pre-establishment possible –But, failure information has to propagate –And, performance of alternate path could have changed Local-link: –No need for information to propagate –But, additional overhead Overlay n/w Handling failures Service-level path creation Service location Network performance Detection Recovery Finding entry/exit

The Overlay Topology: Design Factors How many nodes? –Large number of nodes  lesser latency overhead –But scaling concerns Where to place nodes? –Close to edges so that hosts have points of entry and exit close to them –Close to backbone to take advantage of good connectivity Who to peer with? –Nature of connectivity –Least sharing of physical links among overlay links

Failure detection in the wide-area: Analysis Provider A Video-on-demand server Provider B Thin Client Transcoder Peering relations, Overlay network Handling failures Service-level path creation Service location Network performance Detection Recovery

Failure detection in the wide-area: Analysis What are we doing? –Keeping track of the liveness of the WA Internet path Why is it important? –10% of Internet paths have 95% availability [Labovitz’99] –BGP could take several minutes to converge [Labovitz’00] –These could significantly affect real-time sessions based on service-level paths Why is it challenging? –Is there a notion of “failure”? –Given Internet cross-traffic and congestion? –What if losses could last for any duration with equal probability?

Failure detection: the trade-off Monitoring for liveness of path using keep-alive heartbeat Time Failure: detected by timeout Timeout period Time False-positive: failure detected incorrectly  unnecessary overhead Timeout period There’s a trade-off between time-to-detection and rate of false-positives

UDP-based keep-alive stream Geographically distributed hosts: –Berkeley, Stanford, UIUC, TU-Berlin, UNSW –Some trans-oceanic links, some within the US UDP heart-beat every 300ms between pairs Measure gaps between receipt of successive heart-beats

UDP-based keep-alive stream gaps above 900ms 6/11 False-positive rate: 6/11

UDP Experiments: What do we conclude? Significant number of outages > 30 seconds –Of the order of once a day But, 1.8 second outage  30 second outage with 50% prob. –If we react to 1.8 second outages by transferring a session can have much better availability than what’s possible today Provider A Video-on-demand server Provider B Thin Client Transcoder

UDP Experiments: What do we conclude? 1.8 seconds good enough for non-interactive applications –On-demand video/audio usually have 5-10 second buffers anyway 1.8 seconds not good for interactive/live applications –But definitely better than having the entire session cut- off

Overhead of Overlay Network: Preliminary Evaluation Overhead of routing over the Overlay Network –As opposed to using the underlying physical network Estimate routing overhead by using simulation and network model Need placement strategy: assume placement near core Overhead is a function of number of overlay nodes Result: overhead of overlay network is negligible for a size of 5% (200/4000 nodes) Number of IP-Address-Prefixes on the Internet: 100,000  5% is 5000

Research Methodology Connection-oriented overlay network of clusters Session-transfer on failure Aggregation – amortization of overhead Simulation –Routing overhead –Effect of size of overlay Implementation –MP3 music for GSM cellular-phones –Codec service for IP- telephony Wide-area monitoring trade- offs –How quickly can failures be detected? –Rate of false- positives Evaluation Analysis Design

Research Methodology: Metrics and Approach Metrics: Overhead, Scalability, Stability Approach for evaluation: –Simulation –Trace-based emulation Leverage the Millennium testbed Hundreds of fast, well-connected cluster machines Can emulate wide-area network based on traces/models –Real implementation testbed Possible collaboration?

Summary Logical overlay network of service clusters –Middleware platform for service deployment –Optimal service-level path creation –Failure detection and recovery Failures can be detected in O(1sec) over the wide- area –Useful for many applications Number of overlay nodes required seems reasonable –O(1000s) for minimal latency overhead Several interesting issues to look at: –Overhead, Scalability, Stability

References [Labovitz’99] C. Labovitz, A. Ahuja, and F. Jahanian, “Experimental Study of Internet Stability and Wide-Area Network Failures”, Proc. Of FTCS’99 [Labovitz’00] C. Labovitz, A. Ahuja, A. Bose, and F. Jahanian, “Delayed Internet Routing Convergence”, Proc. SIGCOMM’00 [Acharya’96] A. Acharya and J. Saltz, “A Study of Internet Round- Trip Delay”, Technical Report CS-TR-3736, U. of Maryland [Yajnik’99] M. Yajnik, S. Moon, J. Kurose, and D. Towsley, “Measurement and Modeling of the Temporal Dependence in Packet Loss”, Proc. INFOCOM’99 [Balakrishnan’97] H. Balakrishnan, S. Seshan, M. Stemm, and R. H. Katz, “Analyzing Stability in Wide-Area Network Performance”, Proc. SIGMETRICS’97