Wide-Area Service Composition: Availability, Performance, and Scalability. Bhaskaran Raman, SAHARA, EECS, U.C. Berkeley. SAHARA Retreat, Jan 2002.


Service Composition: Motivation
[Diagram: a service-level path composed across providers, e.g. a cellular phone (Provider R) reaching a text repository via a text-to-speech service (Provider Q), and a thin client reaching a video-on-demand server (Provider A) via a transcoder (Provider B), with replicated service instances]
Other examples: ICEBERG, IETF OPES'00

In this work: Problem Statement and Goals
Problem Statement:
–A path could stretch across multiple service providers and multiple network domains
–Inter-domain Internet paths have poor availability [Labovitz'99] and poor time-to-recovery [Labovitz'00]
–Take advantage of service replicas
Goals:
–Performance: choose a good set of service instances
–Availability: detect and handle failures quickly
–Scalability: Internet-scale operation

In this work: Assumptions and Non-goals
Operational model:
–Service providers deploy different services at various network locations
–Next-generation portals compose services
–Code is NOT mobile (mutually untrusting service providers)
We do not address service interface issues
Assume that service instances have no persistent state
–Not very restrictive [OPES'00]

Related Work
Other efforts have addressed:
–Semantics and interface definitions: OPES (IETF), COTS (Stanford)
–Fault-tolerant composition within a single cluster: TACC (Berkeley)
–Performance-constrained choice of a service, but not for composed services: SPAND (Berkeley), Harvest (Colorado), Tapestry/CAN (Berkeley), RON (MIT)
None address wide-area network performance or failure issues for long-lived composed sessions

Solution: Requirements
Failure detection / liveness tracking
–Server and network failures
Performance information collection
–Load, network characteristics
Service location
Global information is required
–A hop-by-hop approach will not work

Design Challenges
Scalability vs. global information
–Information about all service instances, and the network paths in between, should be known
Quick failure detection and recovery
–Internet dynamics → intermittent congestion

Is "quick" failure detection possible?
What is a "failure" on an Internet path?
–Outage periods happen for varying durations
Study outage periods using traces
–12 pairs of hosts
–Periodic UDP heart-beat, every 300 ms
–Study "gaps" between receive-times
Main results:
–Short outage ( sec) → Long outage (> 30 sec): sometimes this is true over 50% of the time
–False-positives are rare: O(once an hour) at most
–Okay to react to short outage periods by switching the service-level path
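The trace analysis above can be sketched as follows. This is a minimal offline sketch, assuming a sorted list of heart-beat receive timestamps in seconds; the thresholds are illustrative, not taken from the talk:

```python
def outage_gaps(recv_times, period=0.3):
    """Given sorted UDP heart-beat receive times (seconds), return the
    list of 'gaps': stretches where successive heart-beats were lost."""
    gaps = []
    for prev, cur in zip(recv_times, recv_times[1:]):
        gap = cur - prev
        if gap > 2 * period:  # more than one missed heart-beat
            gaps.append(gap)
    return gaps

def classify(gaps, short=2.0, long_thresh=30.0):
    """Count short vs. long outage periods, as in the trace study."""
    n_short = sum(1 for g in gaps if short <= g < long_thresh)
    n_long = sum(1 for g in gaps if g >= long_thresh)
    return n_short, n_long

# Example: heart-beats every 300 ms with one roughly 3-second outage
times = [0.0, 0.3, 0.6, 3.6, 3.9]
print(outage_gaps(times))
```

Whether a short gap predicts a long outage is then a question of conditional probability over these counts, which is what makes it "okay to react to short outage periods".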

Towards an Architecture
Service execution platforms
–For providers to deploy services
–First-party, or third-party service platforms
Overlay network of such execution platforms
–Collect performance information
–Exploit redundancy in Internet paths

Architecture
Service cluster: a compute cluster capable of running services
Peering: exchange of performance information
[Diagram: source and destination connected across the Internet at three levels: hardware platform (service clusters), logical platform (peering relations, overlay network), and application plane (composed services)]
Overlay size: how many nodes?
–Akamai: O(10,000) nodes
Cluster → process/machine failures handled within

Key Design Points
Overlay size:
–Could grow much slower than #services or #clients
–How many nodes? A comparison: Akamai cache servers, O(10,000) nodes for Internet-wide operation
Overlay network is virtual-circuit based:
–"Switching state" at each node, e.g. the source/destination of the RTP stream in a transcoder
–Failure information need not propagate for recovery
The problem of service location is separated from that of performance and liveness
Cluster → process/machine failures handled within

Software Architecture
Functionalities at the Cluster-Manager, as a layered stack:
–Service-Composition Layer: service-level path creation, maintenance, recovery; finding overlay entry/exit; location of service replicas
–Link-State Layer: link-state propagation; performance measurement; liveness detection
–Peer-Peer Layer: at-least-once UDP

Layers of Functionality
Why link-state?
–Need full graph information
–Also, quick propagation of failure information
–Link-state flood overheads?
Service-Composition layer:
–Algorithm for service composition: a modified version of Dijkstra's
–Computational overheads?
–Signaling for path creation and recovery: downstream to upstream
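One common way to realize a "modified Dijkstra's" for service composition is a shortest-path search over a layered copy of the overlay graph, where the search state is (node, number of services already applied). This is a sketch of that general idea, not the talk's exact algorithm; the graph encoding and names are assumptions:

```python
import heapq

def compose_path(graph, hosts, services, src, dst):
    """Cheapest service-level path from src to dst that applies the
    given services in order. graph: {node: {neighbor: latency}};
    hosts: {service: set of nodes running an instance of it}."""
    k = len(services)
    dist = {(src, 0): 0.0}
    prev = {}
    pq = [(0.0, src, 0)]
    while pq:
        d, u, stage = heapq.heappop(pq)
        if d > dist.get((u, stage), float("inf")):
            continue  # stale queue entry
        if u == dst and stage == k:
            # Reconstruct the path; repeated nodes mark service execution
            path, state = [], (u, stage)
            while state in prev:
                path.append(state)
                state = prev[state]
            path.append((src, 0))
            return d, list(reversed(path))
        # Apply the next service if this node hosts an instance of it
        if stage < k and u in hosts[services[stage]]:
            if d < dist.get((u, stage + 1), float("inf")):
                dist[(u, stage + 1)] = d
                prev[(u, stage + 1)] = (u, stage)
                heapq.heappush(pq, (d, u, stage + 1))
        # Or take an overlay link, staying at the same stage
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get((v, stage), float("inf")):
                dist[(v, stage)] = nd
                prev[(v, stage)] = (u, stage)
                heapq.heappush(pq, (nd, v, stage))
    return None
```

With k layers over E edges this search costs O(k·E·log N), matching the graph-computation overhead estimated later in the talk.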

Evaluation
What is the effect of the recovery mechanism on the application?
What is the scaling bottleneck?
–Overheads: signaling messages during path recovery, link-state floods, graph computations
Testbed:
–Emulation platform on the Millennium cluster of workstations

Evaluation: Emulation Testbed
Idea: use the real implementation, emulate the wide-area network behavior (NistNET)
Opportunity: Millennium cluster
[Diagram: application nodes 1-4 send traffic through a central emulator, which applies a delay/loss rule per node pair, e.g. 1→2, 1→3, 3→4, 4→3]

Evaluation: Recovery of Application Session
Text-to-speech application
[Diagram: text source → text-to-audio service (leg-1) → end-client (leg-2); two possible places of failure]
–Request-response protocol
–Data (text, or RTP audio)
–Keep-alive soft-state refresh
–Application soft-state (for restart on failure)
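The keep-alive soft-state refresh listed above can be sketched as follows. This is a minimal sketch, not the implementation; the class and the 1.8-second timeout are illustrative (chosen to mirror the failure-detection timeout discussed later):

```python
import time

class SoftState:
    """Per-path soft state kept at a node: a path is considered dead
    unless refreshed by periodic keep-alives, so failed peers simply
    time out instead of requiring explicit teardown."""
    def __init__(self, timeout=1.8):
        self.timeout = timeout
        self.last_refresh = {}  # path id -> time of last keep-alive

    def refresh(self, path_id, now=None):
        self.last_refresh[path_id] = time.monotonic() if now is None else now

    def live_paths(self, now=None):
        now = time.monotonic() if now is None else now
        return {p for p, t in self.last_refresh.items()
                if now - t <= self.timeout}

s = SoftState(timeout=1.8)
s.refresh("path-1", now=0.0)
s.refresh("path-2", now=1.0)
print(s.live_paths(now=2.0))  # path-1 has timed out; only path-2 is live
```

The same timeout is what triggers recovery signaling when a leg of the service-level path fails.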

Evaluation: Recovery of Application Session
Setup:
–20-node overlay network: generate a 6,510-node physical network using GT-ITM, choose 20 nodes at random
–Latency variation: base value of one-way latency from edge weights; variation in accordance with "RTT spikes are isolated" [Acharya'96]
–Failures: deterministic failure for 10 sec during the session
Application metric: gap between arrival of successive audio packets at the client

Recovery of application: Results

Setup                            Gap seen at application
Failure of leg-2; with recovery  2,963 ms
Failure of leg-2; no recovery    10,000 ms

Recovery of Application Session: CDF of gaps > 100 ms
[Plot: CDF of application-level gaps longer than 100 ms]
–Recovery time for leg-1: 822 ms (quicker than leg-2 due to the buffer at the text-to-audio service)
–Recovery time for leg-2: 2,963 ms
–Without recovery: 10,000 ms
–Jump at ms: due to synchronous text-to-audio processing (implementation artefact)

Discussion
Recovery after failure of leg-2
–Breakup: 2,963 = 1,800 + O(700) + O(450)
–1,800 ms: timeout to conclude failure
–700 ms: signaling to set up the alternate path
–450 ms: recovery of application soft-state (re-processing the current sentence)
–Without the recovery algorithm: takes as long as the failure duration
O(3 sec) recovery
–Can be completely masked with buffering
–Interactive apps: still much better than without recovery
Why is quick recovery possible?
–Failure information does not have to propagate across the network
–The overlay network is virtual-circuit based

Evaluation: Scaling
Scaling bottleneck: simultaneous recovery of all client sessions on a failed overlay link
Setup:
–20-node overlay network
–5,000 service-level paths
–Latency variation: same as earlier
–Deterministic failure of 12 different links (12 data points on the graph)
–Metric: average time-to-recovery of all failed paths

Average Time-to-Recovery vs. Instantaneous Load
[Plot: average path recovery time vs. number of paths on the failed link]
–Why the high variance?
–At a load of 1,480 paths on the failed link, the avg. path recovery time is 614 ms

CDF of recovery times of all failed paths
[Plot: CDF of per-path recovery times]
–Flat regions: due to UDP retransmits; the emulator was losing packets
–Emulator limit: 20,000 pkts/sec; working on removing this bottleneck…

Percentage of paths above a threshold recovery time
[Plot]

Scaling: Discussion
Can recover at least 1,500 paths without hitting bottlenecks
–How many client sessions per cluster-manager? Compute using the number of nodes and edges in the graph; translates to about 700 simultaneous client sessions per cluster-manager
–In comparison, our text-to-speech implementation can support O(15) clients per machine
–Minimal additional provisioning needed for the cluster-manager

Time-to-recovery, with varying outage periods
[Plot: recovery-time distribution; 5,000 paths, 15-min run, 24,181 path recoveries]
–85% of paths recovered within 1.5 sec
–Outage counts: 87 > 1.8 sec, 67 > 5 sec, 34 > 10 sec, 23 > 25 sec

Other Scaling Bottlenecks?
Link-state floods:
–Two floods for each failure
–For a 1,000-node graph, estimate #edges = 10,000
–Failures (> 1.8 sec outage): O(once an hour) in the worst case
–Only about 6 floods/second in the entire network!
Graph computation:
–O(k·E·log N) computation time; k = #services composed
–For the 6,510-node network, this takes 50 ms
–Huge overhead, but path caching helps
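The flood-rate estimate above works out as a back-of-the-envelope calculation, reading the worst-case failure rate as once per hour per link:

```python
# Back-of-the-envelope check of the link-state flood rate estimate
edges = 10_000            # estimated #edges for a 1,000-node graph
floods_per_failure = 2    # two floods per failure
failures_per_hour = 1     # worst case: one >1.8 sec outage per link per hour

floods_per_sec = edges * failures_per_hour * floods_per_failure / 3600
print(round(floods_per_sec, 1))  # about 5.6 floods/second network-wide
```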

Summary
Service composition: flexible service creation
We address performance, availability, and scalability
Initial analysis: failure detection -- meaningful to timeout in O( sec)
Design: overlay network of service clusters
Evaluation: results so far
–Good recovery time for real-time applications: O(3 sec)
–Good scalability -- minimal additional provisioning for cluster managers
Ongoing work:
–Overlay topology issues: how many nodes, peering
–Stability issues
Feedback, questions?
(Presentation made using VMWare)

References
[OPES'00] A. Beck et al., "Example Services for Network Edge Proxies", Internet Draft, draft-beck-opes-esfnep-01.txt, Nov 2000
[Labovitz'99] C. Labovitz, A. Ahuja, and F. Jahanian, "Experimental Study of Internet Stability and Wide-Area Network Failures", Proc. FTCS'99
[Labovitz'00] C. Labovitz, A. Ahuja, A. Bose, and F. Jahanian, "Delayed Internet Routing Convergence", Proc. SIGCOMM'00
[Acharya'96] A. Acharya and J. Saltz, "A Study of Internet Round-Trip Delay", Technical Report CS-TR-3736, U. of Maryland
[Yajnik'99] M. Yajnik, S. Moon, J. Kurose, and D. Towsley, "Measurement and Modeling of the Temporal Dependence in Packet Loss", Proc. INFOCOM'99
[Balakrishnan'97] H. Balakrishnan, S. Seshan, M. Stemm, and R. H. Katz, "Analyzing Stability in Wide-Area Network Performance", Proc. SIGMETRICS'97