Wide-Area Service Composition: Evaluation of Availability and Scalability. Bhaskaran Raman, SAHARA, EECS, U.C.Berkeley.


Wide-Area Service Composition: Evaluation of Availability and Scalability
Bhaskaran Raman
SAHARA, EECS, U.C.Berkeley
[Figure: example composed service paths. Labels: Provider Q, Text to audio, Provider R, Cellular phone, repository, Provider A, Video-on-demand server, Provider B, Thin client, Transcoder]

Problem Statement and Goals
Goals
–Performance: choose set of service instances
–Availability: detect and handle failures quickly
–Scalability: Internet-scale operation
Problem Statement
–Path could stretch across multiple service providers and multiple network domains
–Inter-domain Internet paths: poor availability [Labovitz’99], poor time-to-recovery [Labovitz’00]
–Take advantage of service replicas
[Figure labels: Provider A, Video-on-demand server, Provider B, Thin client, Transcoder]
Related Work
–TACC: composition within cluster
–Web-server choice: SPAND, Harvest
–Routing around failures: Tapestry, RON
We address: wide-area network performance and failure issues for long-lived composed sessions

Is “quick” failure detection possible?
What is a “failure” on an Internet path?
–Outage periods happen for varying durations
Study outage periods using traces
–12 pairs of hosts: Berkeley, Stanford, UIUC, UNSW (Aus), TU-Berlin (Germany)
–Results could be skewed due to Internet2 backbone?
–Periodic UDP heart-beat, every 300 ms
–Study “gaps” between receive-times
Results:
–Short outage ( sec) → Long outage (> 30 sec); sometimes this is true over 50% of the time
–False-positives are rare: O(once an hour) at most
–Similar results with ping-based study using ping-servers
–Take away: okay to react to short outage periods, by switching service-level path
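The gap analysis on this slide can be sketched in a few lines of Python. This is only an illustration of the idea, not the code used in the study: the function names and the sample trace are hypothetical, and the 1.8 s timeout is an assumption borrowed from the 1,800 ms figure on the results slide. The 300 ms heartbeat period and the 30 s long-outage threshold come from the slide itself.

```python
from typing import List, Tuple

HEARTBEAT_PERIOD_S = 0.3   # from the slide: one UDP heartbeat every 300 ms
TIMEOUT_S = 1.8            # assumption: the 1,800 ms timeout quoted on the results slide
LONG_OUTAGE_S = 30.0       # from the slide: a long outage lasts more than 30 sec


def find_gaps(receive_times: List[float],
              timeout: float = TIMEOUT_S) -> List[Tuple[float, float]]:
    """Return (time_of_last_heartbeat, gap_length) for each gap above the timeout."""
    gaps = []
    for prev, cur in zip(receive_times, receive_times[1:]):
        gap = cur - prev
        if gap > timeout:
            gaps.append((prev, gap))
    return gaps


def fraction_long(gaps: List[Tuple[float, float]],
                  long_outage: float = LONG_OUTAGE_S) -> float:
    """Fraction of timed-out gaps that went on to become long outages."""
    if not gaps:
        return 0.0
    return sum(1 for _, g in gaps if g > long_outage) / len(gaps)


# Hypothetical receive-time trace (seconds): one 2.4 s gap and one 45 s gap.
trace = [0.0, 0.3, 0.6, 3.0, 3.3, 48.3, 48.6]
gaps = find_gaps(trace)
print(len(gaps), fraction_long(gaps))
```

A gap that exceeds the timeout but ends before the long-outage threshold is a candidate false positive in the slide's sense: reacting to it switches the path even though the original path would have come back quickly.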

UDP-based keep-alive stream
HB destination   HB source   Total time   Num. false positives, Num. failures
Berkeley         UNSW        130:48:
UNSW             Berkeley    130:51:45    98
Berkeley         TU-Berlin   130:49:46    278
TU-Berlin        Berkeley    130:50:
TU-Berlin        UNSW        130:48:
UNSW             TU-Berlin   130:46:38    245
Berkeley         Stanford    124:21:
Stanford         Berkeley    124:21:19    26
Stanford         UIUC        89:53:17     41
UIUC             Stanford    76:39:10     741
Berkeley         UIUC        89:54:11     65
UIUC             Berkeley    76:39:40     35
Acknowledgements: Mary Baker, Mema Roussopoulos, Jayant Mysore, Roberto Barnes, Venkatesh Pranesh, Vijaykumar Krishnaswamy, Holger Karl, Yun-Shen Chang, Sebastien Ardon, Binh Thai

Architecture
Application plane: composed services
Logical platform: peering relations, overlay network
Hardware platform: service clusters
–Service cluster: compute cluster capable of running services
–Peering: exchange perf. info.
[Figure labels: Internet, Source, Destination]
Functionalities at the Cluster-Manager
–Finding Overlay Entry/Exit
–Location of Service Replicas
–Service-Level Path Creation, Maintenance, and Recovery
–Link-State Propagation
–At-least-once UDP
–Perf. Meas.
–Liveness Detection
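To make "service-level path creation" concrete, here is a minimal sketch of one way a cluster manager could pick a path once link-state and replica locations are known. The slides do not give the algorithm, so everything below is an assumption: the function name `service_level_path`, the toy overlay, and the choice of Dijkstra's algorithm over (node, stage) states, where the stage counts how many services in the chain have been applied so far.

```python
import heapq
from typing import Dict, List, Optional, Set, Tuple

State = Tuple[str, int]  # (overlay node, number of services applied so far)


def service_level_path(graph: Dict[str, Dict[str, float]],
                       replicas: Dict[str, Set[str]],
                       services: List[str],
                       src: str, dst: str) -> Optional[Tuple[float, List[State]]]:
    """Cheapest path from src to dst that applies every service in order.

    graph maps node -> {neighbor: link cost}; replicas maps service -> hosting nodes.
    """
    start, goal = (src, 0), (dst, len(services))
    dist: Dict[State, float] = {start: 0.0}
    prev: Dict[State, State] = {}
    heap = [(0.0, start)]
    while heap:
        d, state = heapq.heappop(heap)
        if state == goal:
            break
        if d > dist.get(state, float("inf")):
            continue  # stale heap entry
        node, stage = state
        # Either move along an overlay link, or apply the next service here.
        moves = [((nbr, stage), cost) for nbr, cost in graph.get(node, {}).items()]
        if stage < len(services) and node in replicas.get(services[stage], set()):
            moves.append(((node, stage + 1), 0.0))
        for nxt, cost in moves:
            nd = d + cost
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                prev[nxt] = state
                heapq.heappush(heap, (nd, nxt))
    if goal not in dist:
        return None
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return dist[goal], path[::-1]


# Hypothetical 4-node overlay; the only text-to-audio replica sits at node C.
overlay = {"A": {"B": 1, "C": 2}, "B": {"A": 1, "D": 1},
           "C": {"A": 2, "D": 2}, "D": {"B": 1, "C": 2}}
result = service_level_path(overlay, {"text_to_audio": {"C"}},
                            ["text_to_audio"], "A", "D")
print(result)
```

In this toy overlay the network-shortest route A, B, D costs 2 but never passes a replica, so the service-level path goes through C at cost 4. Recovery can reuse the same search after removing the failed link or replica from the inputs.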

Evaluation
What is the effect of recovery mechanism on application?
–Text-to-Speech application
–Two possible places of failure
–20-node overlay network
–One service instance for each service
–Deterministic failure for 10 sec during session
–Metric: gap between arrival of successive audio packets at the client
What is the scaling bottleneck?
–Parameter: #client sessions across peering clusters (measure of instantaneous load when failure occurs)
–5000 client sessions in 20-node overlay network
–Deterministic failure of 12 different links (12 data-points in graph)
–Metric: average time-to-recovery
[Figure labels: Leg-2, Leg-1, Text to audio, Text Source, End-Client. Legend: request-response protocol; data (text, or RTP audio); keep-alive soft-state refresh; application soft-state (for restart on failure)]

Recovery of Application Session: CDF of gaps > 100 ms
[Figure: CDF of gaps > 100 ms, with the following annotations]
–Recovery time: 822 ms (quicker than leg-2 due to buffer at text-to-audio service)
–Recovery time: 2963 ms
–Recovery time: 10,000 ms
–Jump at ms: due to synch. text-to-audio processing (impl. artefact)
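The metric behind this plot, gaps between successive audio packets at the client, restricted to gaps above 100 ms, can be computed as sketched below. The function name `gap_cdf` and the arrival trace are hypothetical; the trace is built so that its largest gap matches the 822 ms figure on the slide, and it is not measured data.

```python
from typing import List, Tuple


def gap_cdf(arrival_ms: List[int], min_gap_ms: int = 100) -> List[Tuple[int, float]]:
    """Empirical CDF of inter-arrival gaps larger than min_gap_ms.

    Returns (gap, fraction of retained gaps <= gap) pairs, sorted by gap.
    """
    gaps = sorted(b - a for a, b in zip(arrival_ms, arrival_ms[1:])
                  if b - a > min_gap_ms)
    n = len(gaps)
    return [(g, (i + 1) / n) for i, g in enumerate(gaps)]


# Hypothetical arrival times (ms): mostly 20 ms spacing, one 150 ms and one 822 ms gap.
arrivals = [0, 20, 40, 190, 210, 1032, 1052]
cdf = gap_cdf(arrivals)
print(cdf)
```

Filtering at 100 ms keeps the ordinary packet spacing out of the distribution, so the curve shows only the interruptions a listener would notice.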

Average Time-to-Recovery vs. Instantaneous Load
–Two services in each path
–Two replicas per service
–Each data-point is a separate run
–End-to-End recovery algorithm
–High variance due to varying path length
–Load: 1,480 paths on failed link; avg. path recovery time: 614 ms

Results: Discussion
Recovery after failure (leg-2): 2,963 = 1,800 + O(700) + O(450)
–1,800 ms: timeout to conclude failure
–700 ms: signaling to set up alternate path
–450 ms: recovery of application soft-state: re-process current sentence
Without recovery algorithm: takes as long as failure duration
O(3 sec) recovery
–Can be completely masked with buffering
–Interactive apps: still much better than without recovery
Quick recovery possible since failure information does not have to propagate across network
12th data point (instantaneous load of 1,480) stresses emulator limits
–1,480 translates to about 700 simul. paths per cluster-manager
–In comparison, our text-to-speech implementation can support O(15) clients per machine
Other scaling limits? Link-state floods? Graph computation?
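The leg-2 breakdown can be checked with a little arithmetic. The sketch below uses only numbers quoted in the slides (the three components, the measured 2,963 ms, and the 10 sec deterministic failure); the variable names are illustrative, and the O(...) terms are approximate, so the sum is not expected to match the measurement exactly.

```python
# Figures quoted in the slides, in milliseconds.
timeout_ms = 1800            # timeout to conclude failure
signaling_ms = 700           # O(700): signaling to set up the alternate path
soft_state_ms = 450          # O(450): re-processing the current sentence
measured_leg2_ms = 2963      # measured recovery time after a leg-2 failure
failure_duration_ms = 10_000 # deterministic failure injected during the session

estimate_ms = timeout_ms + signaling_ms + soft_state_ms
residual_ms = measured_leg2_ms - estimate_ms

# Share of the measured recovery time attributable to each component.
shares = {
    "timeout": timeout_ms / measured_leg2_ms,
    "signaling": signaling_ms / measured_leg2_ms,
    "soft_state": soft_state_ms / measured_leg2_ms,
}

# Without the recovery algorithm the gap lasts as long as the failure itself.
speedup = failure_duration_ms / measured_leg2_ms

print(estimate_ms, residual_ms)
print({k: f"{v:.0%}" for k, v in shares.items()}, f"{speedup:.2f}x")
```

The components sum to 2,950 ms, within 13 ms of the measurement, and the detection timeout is the largest single term, which is why the choice of timeout in the trace study matters so much for the overall recovery time.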

Summary
Service Composition: flexible service creation
We address performance, availability, scalability
Initial analysis: Failure detection -- meaningful to timeout in O( sec)
Design: Overlay network of service clusters
Evaluation: results so far
–Good recovery time for real-time applications: O(3 sec)
–Good scalability -- minimal additional provisioning for cluster managers
Ongoing work:
–Overlay topology issues: how many nodes, peering
–Stability issues
Feedback, Questions?
Presentation made using VMWare
[Figure labels: Analysis, Design, Evaluation]

Emulation Testbed
[Figure labels: App, Lib, Node 1, Node 2, Node 3, Node 4, Emulator. Rules held by the emulator: Rule for 1 → 2, Rule for 1 → 3, Rule for 3 → 4, Rule for 4 → 3]
Operational limits of emulator: 20,000 pkts/sec, for up to 500-byte pkts, on a 1.5 GHz Pentium-4
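The per-pair rules in the figure suggest a lookup table keyed by (source node, destination node). The sketch below is a guess at that shape, not the emulator's actual design: the `Rule` fields (`delay_ms`, `up`), the delay values, and the `forward` function are all hypothetical. The final lines turn the quoted operational limits into a peak data rate.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Rule:
    delay_ms: float   # hypothetical field: one-way delay applied to the packet
    up: bool = True   # hypothetical field: False models a failed link


# One directional rule per node pair, mirroring the four rules in the figure.
rules: Dict[Tuple[int, int], Rule] = {
    (1, 2): Rule(20.0),
    (1, 3): Rule(35.0),
    (3, 4): Rule(10.0),
    (4, 3): Rule(10.0),
}


def forward(src: int, dst: int, send_time_ms: float) -> Optional[float]:
    """Return the emulated delivery time, or None if the packet is dropped."""
    rule = rules.get((src, dst))
    if rule is None or not rule.up:
        return None
    return send_time_ms + rule.delay_ms


# Peak data rate implied by the limits on the slide.
MAX_PKTS_PER_S = 20_000
MAX_PKT_BYTES = 500
peak_mbit_s = MAX_PKTS_PER_S * MAX_PKT_BYTES * 8 / 1e6

print(forward(1, 2, 0.0), forward(2, 1, 0.0), peak_mbit_s)
```

At 500-byte packets the stated packet rate corresponds to 80 Mbit/s through the emulator. Rules are directional, which is why the figure lists 3 → 4 and 4 → 3 separately.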