Capsule Placement in the Service Platform Bhuvan Urgaonkar Timothy Roscoe Systems Group, Sprint ATL.



Service Platform: an overview
[Figure labels: Processors; High speed interconnect; Internet; Management/Control Unit]

Service Platform: Goals
Sell the platform’s resources
Manage the resources efficiently
Provide performance guarantees to customers
Start or stop services within minutes

Services and Capsules
Services:
  – web/game/streaming servers
  – service provider pays the platform
Capsules:
  – Def: component of a service that should run on a single node
  – e.g., consider a replicated web server

Nucleus
Node-specific control/management software:
  – Capsule creation, destruction
  – Health information (process liveness)
  – Resource parameters (memory, CPU, network bandwidth, etc.)

Control Plane
[Figure labels: Capsule Placement; Flow Placement; Node, network, service monitoring; Deployed Service Database; Billing]

Outline of this talk
Service Platform: an overview
Quality of Service
Capsule Placement
Design of the Placement unit
Conclusions and future work

QoS Representations
Application level
  – e.g., 50 transactions per sec
Contract level
  – e.g., “something like a 300 MHz Pentium II”
Platform level
  – e.g., ?
Node level
  – e.g., weights, priorities, etc.

Translation between QoS levels
Application level => Contract level
  – Application specific, customer’s problem
Contract level => Platform level
  – More a business problem
Platform level => Node level
  – OS dependent

Capsule Placement: Desirables
Maximize revenue!
Aware of the “importance” of services.
Overbooking.
Exploit known workload characteristics.
Adapt to changes in workload?
Fast.

Stages in hosting a service
Requirement specification
Placement
Deployment
Activation

Requirement Specification
Contract level representation
  – Many possibilities: 300 MHz PII, best effort, or a CPU instruction token bucket.
Platform level representation
  – Must be uniform across the platform.
  – (rate, burst, ovb tolerance, arch, OS)
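The platform-level tuple and the token-bucket idea mentioned on this slide can be sketched as follows. This is a minimal illustration, not the talk's implementation: the class names, the units (rate as a fraction of one CPU, burst in CPU-seconds) and the meaning given to the overbooking tolerance are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CapsuleRequirement:
    """Hypothetical platform-level requirement: (rate, burst, ovb tolerance, arch, OS)."""
    rate: float           # assumed unit: fraction of one CPU (0.0 to 1.0)
    burst: float          # assumed unit: CPU-seconds that may be used back to back
    ovb_tolerance: float  # assumed meaning: how much overbooking the capsule accepts
    arch: str
    os: str


class TokenBucket:
    """Classic token bucket: tokens refill at `rate` and are capped at `burst`."""

    def __init__(self, rate, burst):
        self.rate = rate
        self.burst = burst
        self.tokens = burst  # start with a full bucket
        self.last = 0.0      # time of the last refill

    def consume(self, now, amount):
        """Return True if `amount` of CPU time conforms to the bucket at time `now`."""
        elapsed = now - self.last
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if amount <= self.tokens:
            self.tokens -= amount
            return True
        return False
```

A capsule with rate 0.1 and burst 0.5 could use half a CPU-second immediately, and would then have to wait for tokens to accumulate before its next large request conforms.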

Translation to Node level
Reservation based scheduler
  – map (rate, burst) to (period, slice)
  – bigger burst => bigger period
Proportional share scheduler
  – burst ?
  – weight in proportion to rate
Priority based scheduler
  – no easy mapping
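One way the first two mappings above might look in code is sketched here. The slide does not give the formulas, so the choice slice = burst and period = burst / rate is an assumption; it is picked only because it keeps slice / period equal to the rate and makes the period grow with the burst. The function names and the `total_weight` parameter are hypothetical.

```python
def to_reservation(rate, burst):
    """Map (rate, burst) to (period, slice) for a reservation based scheduler.

    Assumed mapping: the capsule gets `burst` seconds of CPU every
    `burst / rate` seconds, so a bigger burst gives a bigger period.
    """
    if not 0 < rate <= 1:
        raise ValueError("rate must be a CPU fraction in (0, 1]")
    if burst <= 0:
        raise ValueError("burst must be positive")
    slice_ = burst
    period = burst / rate
    return period, slice_


def to_weights(rates, total_weight=1000):
    """Map per-capsule rates to proportional share weights.

    The burst has no obvious counterpart here, so it is ignored.
    """
    total = sum(rates.values())
    return {name: round(total_weight * r / total) for name, r in rates.items()}
```

For example, `to_reservation(0.25, 0.01)` asks for a 10 ms slice in every 40 ms period, and `to_weights({"a": 0.1, "b": 0.3})` gives capsule b three times the weight of capsule a.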

Placement
Find the set of feasible nodes
  – Compatible architecture and OS
  – No overbooking tolerances violated
Pick one node from this set
  – Best Fit
  – Worst Fit
  – Random Select
  – Close Overbooking
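The two steps above (filter feasible nodes, then pick one) can be sketched as below. The feasibility rule is an assumption: a node is treated as acceptable if its total booked rate stays within capacity times (1 + the strictest overbooking tolerance among its capsules). The `Capsule` and `Node` types are hypothetical, and Close Overbooking is left out because the slides do not define it.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Capsule:
    name: str
    rate: float           # assumed: fraction of one CPU
    ovb_tolerance: float  # assumed: acceptable overbooking as a fraction of capacity
    arch: str
    os: str


@dataclass
class Node:
    name: str
    arch: str
    os: str
    capacity: float = 1.0
    capsules: list = field(default_factory=list)

    def booked(self, extra=0.0):
        """Total rate reserved on this node, plus an optional candidate rate."""
        return sum(c.rate for c in self.capsules) + extra


def is_feasible(node, capsule):
    """Compatible architecture and OS, and no overbooking tolerance violated."""
    if (node.arch, node.os) != (capsule.arch, capsule.os):
        return False
    strictest = min(c.ovb_tolerance for c in node.capsules + [capsule])
    return node.booked(capsule.rate) <= node.capacity * (1.0 + strictest)


def place(capsule, nodes, heuristic="best_fit", rng=random):
    """Pick a feasible node with the given heuristic; return None if none exists."""
    feasible = [n for n in nodes if is_feasible(n, capsule)]
    if not feasible:
        return None

    def slack(node):
        # Capacity left over if the capsule were placed on this node.
        return node.capacity - node.booked(capsule.rate)

    if heuristic == "best_fit":
        chosen = min(feasible, key=slack)   # tightest fit
    elif heuristic == "worst_fit":
        chosen = max(feasible, key=slack)   # most room left
    elif heuristic == "random":
        chosen = rng.choice(feasible)
    else:
        raise ValueError(f"unknown heuristic: {heuristic}")
    chosen.capsules.append(capsule)
    return chosen
```

Best Fit packs capsules onto already loaded nodes, while Worst Fit spreads them out; which of the two admits more services depends on the workload.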

Placement: Example
[Figure: capsules a, b, c and nodes N1, N2, N3, N4]
One possible placement: (a, N1), (b, N2), (c, N3)

Deployment and Activation
Deployment: the process of preparing a capsule for execution on a node.
  – Why? e.g., need to download some files before starting
  – The control plane sends all information needed to deploy the capsule
Activation: starting a deployed service

Capsule State Diagram
[Figure: states undeployed, deploying, deployed, activating, active, deactivating, undeploying]
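The capsule states can be modelled as a small state machine. The arrows of the diagram did not survive in this transcript, so the transition table below is an assumption inferred from the state names; the class and method names are hypothetical.

```python
from enum import Enum


class State(Enum):
    UNDEPLOYED = "undeployed"
    DEPLOYING = "deploying"
    DEPLOYED = "deployed"
    ACTIVATING = "activating"
    ACTIVE = "active"
    DEACTIVATING = "deactivating"
    UNDEPLOYING = "undeploying"


# Assumed transitions: each "-ing" state leads to the state it is named after.
TRANSITIONS = {
    State.UNDEPLOYED: {State.DEPLOYING},
    State.DEPLOYING: {State.DEPLOYED},
    State.DEPLOYED: {State.ACTIVATING, State.UNDEPLOYING},
    State.ACTIVATING: {State.ACTIVE},
    State.ACTIVE: {State.DEACTIVATING},
    State.DEACTIVATING: {State.DEPLOYED},
    State.UNDEPLOYING: {State.UNDEPLOYED},
}


class CapsuleLifecycle:
    """Tracks one capsule's state and rejects transitions not in the table."""

    def __init__(self):
        self.state = State.UNDEPLOYED

    def move_to(self, new_state):
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(f"illegal transition {self.state.value} -> {new_state.value}")
        self.state = new_state
```

Under these assumptions an active capsule cannot be undeployed directly; it has to be deactivated first.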

Example Message Exchange
[Figure: message sequence between Control Plane and Nucleus; message labels: "deployed svc cap", "state svc cap", "deployed", "deployed svc cap", "activated svc cap"]
Control Plane: instruct nucleus to deploy a capsule, start timer
Control Plane: no response! Send again
Nucleus: starts deploying the capsule
Nucleus: still deploying
Nucleus: done deploying, send status message
Control Plane: deployed before timeout, instruct nucleus to activate
Nucleus: starts activating the capsule...
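The retry behaviour in this exchange can be simulated without a network. The sketch below is a simplification under stated assumptions: `FakeNucleus` is a hypothetical stand-in that drops the first request and needs two state queries to finish deploying, message names are plain strings, and the timer is replaced by a bounded number of attempts. In the slide the nucleus also pushes a status message when it is done; here the control plane only polls.

```python
class FakeNucleus:
    """Hypothetical nucleus: loses the first request, then deploys over two polls."""

    def __init__(self):
        self.requests_seen = 0
        self.state = "undeployed"
        self.polls_left = 2

    def handle(self, msg):
        self.requests_seen += 1
        if self.requests_seen == 1:
            return None  # simulate a lost message: no response
        if msg == "deploy" and self.state == "undeployed":
            self.state = "deploying"
        elif msg == "state" and self.state == "deploying":
            self.polls_left -= 1
            if self.polls_left == 0:
                self.state = "deployed"
        elif msg == "activate" and self.state == "deployed":
            self.state = "active"
        return self.state


def send_with_retry(nucleus, msg, max_attempts=3):
    """Send `msg`; resend when there is no response (stands in for a timer expiry)."""
    for _ in range(max_attempts):
        reply = nucleus.handle(msg)
        if reply is not None:
            return reply
    raise TimeoutError(f"no response to {msg!r}")


def deploy_and_activate(nucleus, max_polls=10):
    """Deploy, poll until deployed, then activate. Returns the replies seen."""
    replies = [send_with_retry(nucleus, "deploy")]
    polls = 0
    while replies[-1] != "deployed":
        if polls == max_polls:
            raise TimeoutError("capsule did not deploy in time")
        replies.append(send_with_retry(nucleus, "state"))
        polls += 1
    replies.append(send_with_retry(nucleus, "activate"))
    return replies
```

Because `deploy` is only acted on while the capsule is undeployed, a resent request is harmless, which is what makes the "send again" step safe.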

Placement Unit Architecture
[Figure labels: Listen for new requests; Listen to nuclei; Event Queue; Message Queue; Dispatch Events; Events due to new requests; Events due to msgs from nuclei; Messages from nuclei]
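The two-queue structure named in the figure can be sketched with the standard library's `queue` module. This is an illustrative, single-threaded reading of the labels: the event tuple format, the function names and the handler table are assumptions, and a real placement unit would presumably run the listeners concurrently.

```python
import queue


def listen_to_nuclei(message_queue, event_queue):
    """Drain raw messages from nuclei and turn each one into an event."""
    while True:
        try:
            msg = message_queue.get_nowait()
        except queue.Empty:
            return
        event_queue.put(("nucleus_msg", msg))


def dispatch(event_queue, handlers):
    """Pop events in FIFO order and call the handler registered for each kind."""
    results = []
    while True:
        try:
            kind, payload = event_queue.get_nowait()
        except queue.Empty:
            return results
        results.append(handlers[kind](payload))


# Usage: one new request and one message from a nucleus.
events = queue.Queue()
messages = queue.Queue()
events.put(("new_request", "svc1"))    # event due to a new request
messages.put("svc1 cap0 deployed")     # message from a nucleus
listen_to_nuclei(messages, events)     # becomes an event due to a nucleus message
handled = dispatch(events, {
    "new_request": lambda svc: f"place {svc}",
    "nucleus_msg": lambda msg: f"update {msg}",
})
```

Funnelling both sources into one event queue means all state changes go through a single dispatcher, which keeps updates to the service database serialized.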

Database Consistency
Transactions and exceptions
  – e.g.:
      try:
          transaction_begin()
          deploy_service(svc)   # any failure here skips the commit
          transaction_commit()
      except Exception:
          transaction_abort()   # roll back so the database stays consistent

Performance
Time to compute placement: 1-2 sec
  – the time to deploy is usually much larger
Comparison of heuristics
  – experiments with the following workloads: 1-3 capsules, CPU requirement 0-10%, wide range of overbooking tolerances
  – Random Select admitted the most services, Best Fit admitted the fewest
  – But … more investigation needed

Summary
QoS representation for CPU requirements of services.
Implementation of placement unit.
Some simple experiments to deploy and activate services.

Unfinished...
Experiments:
  – heuristics better suited to specific workloads
  – scalability and efficiency of the system
Integration of placement unit with rest of the Control Plane
Handling various failures
Extend to multiple resources - much harder than a single resource!