
1 Cellular Disco: resource management using virtual clusters on shared-memory multiprocessors. Published by ACM, 1999. Authors: K. Govil, D. Teodosiu, Y. Huang, and M. Rosenblum. Presenter: Soumya Eachempati

2 Motivation
– Large-scale shared-memory multiprocessors
  – Large number of CPUs (32-128)
  – NUMA architectures
– Off-the-shelf OS not scalable
  – Cannot handle a large number of resources
  – Memory management not optimized for NUMA
  – No fault containment

3 Existing Solutions
– Hardware partitioning
  – Provides fault containment
  – Rigid resource allocation
  – Low resource utilization
  – Cannot dynamically adapt to the workload
– New operating system
  – Provides flexibility and efficient resource management
  – Considerable effort and time
Goal: exploit the hardware resources to the fullest with minimal effort, while improving flexibility and fault tolerance.

4 Solution: DISCO (VMM)
– Virtual machine monitor
– Addresses NUMA-awareness and scalability issues
Issues not dealt with by DISCO:
– Hardware fault tolerance/containment
– Resource management policies

5 Cellular DISCO
Approach: convert a multiprocessor machine into a virtual cluster.
Advantages:
– Inherits the benefits of DISCO
– Can support legacy OSes transparently
– Combines the strengths of hardware partitioning and a new OS
– Provides fault containment
– Fine-grained resource sharing
– Less effort than developing a new OS

6 Cellular DISCO
– Internally structured into semi-independent cells
– Much less development effort compared to Hive
– No performance loss, even with fault containment
– Warranted design decision: the code of Cellular Disco is trusted to be correct

7 Cellular Disco Architecture

8 Resource Management
– Over-commits resources
– Gives flexibility to adjust the fraction of resources assigned to each VM
– Fault containment imposes restrictions on resource allocation
– Both CPU and memory load balancing operate under constraints:
  – Scalability
  – Fault containment
  – Avoiding contention
– First-touch allocation, dynamic migration, and replication of hot memory pages (sketched below)
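As a rough illustration of the hot-page policy mentioned on this slide, the sketch below counts remote cache misses per page and decides between migration (one hot remote node) and replication (several hot nodes, read-mostly page). The structure layout, threshold, and function names are assumptions for illustration, not Cellular Disco's actual code.

```c
/* Hypothetical sketch of DISCO-style dynamic page migration/replication.
 * Names and thresholds are illustrative only. */
#include <stdio.h>

#define NNODES 4
#define HOT_THRESHOLD 64   /* remote misses before a page counts as "hot" (assumed) */

struct page_stats {
    int home_node;              /* node whose memory currently backs the page */
    int read_only;              /* 1 if no node has written the page recently */
    int misses[NNODES];         /* remote cache-miss counters, per node */
};

/* Decide what to do with one page after sampling its miss counters. */
static const char *placement_decision(const struct page_stats *p)
{
    int hot_nodes = 0;
    for (int n = 0; n < NNODES; n++) {
        if (n == p->home_node) continue;
        if (p->misses[n] >= HOT_THRESHOLD) hot_nodes++;
    }
    if (hot_nodes == 0) return "leave in place";
    if (hot_nodes == 1) return "migrate to the hot node";   /* single remote consumer */
    return p->read_only ? "replicate on the hot nodes"      /* read-mostly, many readers */
                        : "leave in place";                 /* writable: replication too costly */
}

int main(void)
{
    struct page_stats hot_ro  = { .home_node = 0, .read_only = 1,
                                  .misses = { 0, 80, 90, 70 } };
    struct page_stats hot_one = { .home_node = 0, .read_only = 0,
                                  .misses = { 0, 120, 3, 1 } };
    printf("read-mostly page: %s\n", placement_decision(&hot_ro));
    printf("single-consumer page: %s\n", placement_decision(&hot_one));
    return 0;
}
```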

9 Hardware Virtualization
– The VM's interface mimics the underlying hardware.
– Virtual machine resources (user-defined): VCPUs, physical memory, I/O devices.
– Physical vs. machine resources (machine resources are allocated dynamically, based on VM priority):
  – VCPUs are mapped to CPUs
  – Physical pages are mapped to machine pages
– The VMM intercepts privileged instructions.
  – Three modes: user and supervisor (guest OS), kernel (VMM).
  – In supervisor mode, all memory accesses are mapped.
– The VMM allocates machine memory to back the physical memory, tracked by the pmap and memmap data structures, plus a second-level software TLB (L2TLB). (See the sketch below.)
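To make the physical-to-machine indirection concrete, here is a minimal sketch: a per-VM pmap array maps physical page numbers to machine page numbers, and a small direct-mapped software TLB caches recent translations. The sizes, layouts, and names are assumptions for illustration, not the real Cellular Disco data structures.

```c
/* Hypothetical sketch of a pmap (physical -> machine page) plus a small
 * software second-level TLB, as a VMM might use to back guest "physical"
 * memory with machine memory.  Layouts and names are illustrative only. */
#include <stdio.h>

#define NPHYS_PAGES 16      /* physical pages given to this VM (assumed) */
#define L2TLB_SIZE   4      /* entries in the software TLB (assumed) */
#define INVALID     -1

static int pmap[NPHYS_PAGES];            /* physical page -> machine page */

struct l2tlb_entry { int ppn, mpn; };    /* one cached translation */
static struct l2tlb_entry l2tlb[L2TLB_SIZE];

/* Translate a physical page number, consulting the L2TLB first. */
static int translate(int ppn)
{
    int slot = ppn % L2TLB_SIZE;
    if (l2tlb[slot].ppn == ppn)          /* software TLB hit */
        return l2tlb[slot].mpn;
    int mpn = pmap[ppn];                 /* miss: walk the pmap */
    l2tlb[slot].ppn = ppn;               /* refill the cached entry */
    l2tlb[slot].mpn = mpn;
    return mpn;
}

int main(void)
{
    for (int i = 0; i < L2TLB_SIZE; i++)
        l2tlb[i].ppn = l2tlb[i].mpn = INVALID;
    for (int i = 0; i < NPHYS_PAGES; i++)
        pmap[i] = 100 + i;               /* pretend machine pages 100..115 back this VM */

    printf("physical page 5 -> machine page %d (pmap walk)\n", translate(5));
    printf("physical page 5 -> machine page %d (L2TLB hit)\n", translate(5));
    return 0;
}
```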

10 Hardware fault containment

11 Software Fault Containment
– The VMM provides software fault containment through cells.
– Inter-cell communication:
  – Inter-processor RPC
  – Messages: no need for locking, since handling is serialized
  – Shared memory for some data structures (pmap, memmap)
  – Low latency, exactly-once semantics (see the sketch below)
– The VMM is a trusted system software layer, which enables the use of shared memory across cells.
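One common way to get the exactly-once behavior mentioned above is per-sender sequence numbers: the receiving cell handles a message only if its sequence number is the next one expected, so retransmitted duplicates are dropped. The sketch below shows that idea only; the message format and field names are assumptions, not the actual Cellular Disco protocol.

```c
/* Hypothetical sketch of exactly-once inter-cell message delivery using
 * per-sender sequence numbers.  Field names are illustrative only. */
#include <stdio.h>

#define NCELLS 4

struct cell_msg {
    int src_cell;        /* sending cell */
    unsigned seq;        /* sender-assigned sequence number */
    int payload;         /* stand-in for the request and its arguments */
};

/* Per-sender counter of the next sequence number this cell expects. */
static unsigned next_expected[NCELLS];

/* Deliver a message at the receiving cell; duplicates are discarded. */
static void deliver(const struct cell_msg *m)
{
    if (m->seq != next_expected[m->src_cell]) {
        printf("cell %d seq %u: duplicate or out of order, dropped\n",
               m->src_cell, m->seq);
        return;
    }
    next_expected[m->src_cell]++;
    printf("cell %d seq %u: handling request %d exactly once\n",
           m->src_cell, m->seq, m->payload);
}

int main(void)
{
    struct cell_msg m0 = { .src_cell = 1, .seq = 0, .payload = 42 };
    deliver(&m0);      /* processed */
    deliver(&m0);      /* retransmission: dropped */
    struct cell_msg m1 = { .src_cell = 1, .seq = 1, .payload = 43 };
    deliver(&m1);      /* processed */
    return 0;
}
```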

12 Implementation 1: SGI Origin 2000
– 32-processor MIPS R10000 SGI Origin 2000.
– Piggybacked on IRIX 6.4 (host OS); guest OS is IRIX 6.2.
– Cellular Disco (CD) is spawned as a multi-threaded kernel process.
  – Additional overhead < 2% (time spent in the host IRIX)
  – No fault isolation: the IRIX kernel is monolithic
– Solution: some host OS support is needed, with one copy of the host OS per cell.

13 I/O request execution: Cellular Disco piggybacked on the IRIX kernel

14 32-processor MIPS R10000

15 Characteristics of workloads
– Database: decision-support workload
– Pmake: I/O-intensive workload
– Raytrace: CPU-intensive workload
– Web: kernel-intensive web-server workload

16 Virtualization Overheads

17 Fault-containment Overheads
– Left bar: single-cell configuration
– Right bar: 8-cell system

18 CPU Management
– Load-balancing mechanisms: three types of VCPU migration.
  – Intra-node: loss of CPU cache affinity.
  – Inter-node: cost of copying the L2TLB; higher long-term cost.
  – Inter-cell: loss of both cache and node affinity; increases fault vulnerability.
  – The penalty of lost node affinity is alleviated by replicating pages.
– Load-balancing policies: idle (local load stealer) and periodic (global redistribution) balancers.
– Each CPU has a local run queue of VCPUs.
– Gang scheduling: run all VCPUs of a VM simultaneously (sketched below).
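A minimal sketch of the gang-scheduling idea: a VM is dispatched only when enough idle CPUs are available to run all of its VCPUs at the same time. The structures and the greedy selection below are illustrative assumptions, not the scheduler actually used in Cellular Disco.

```c
/* Hypothetical sketch of gang scheduling: dispatch a VM only when its whole
 * gang of VCPUs can run simultaneously.  Data structures are illustrative. */
#include <stdio.h>

struct vm {
    const char *name;
    int nvcpus;          /* VCPUs in this VM's gang */
    int priority;        /* larger value = higher priority */
};

/* Pick the highest-priority VM whose whole gang fits on the idle CPUs. */
static const struct vm *pick_gang(const struct vm *vms, int nvms, int idle_cpus)
{
    const struct vm *best = NULL;
    for (int i = 0; i < nvms; i++) {
        if (vms[i].nvcpus > idle_cpus)
            continue;                       /* gang could not run together */
        if (!best || vms[i].priority > best->priority)
            best = &vms[i];
    }
    return best;
}

int main(void)
{
    struct vm vms[] = {
        { "db",   8, 5 },   /* needs 8 CPUs at once */
        { "web",  4, 3 },
        { "make", 2, 7 },
    };
    int idle = 4;           /* only 4 CPUs currently idle */
    const struct vm *g = pick_gang(vms, 3, idle);
    if (g)
        printf("dispatch gang '%s' (%d VCPUs) on %d idle CPUs\n",
               g->name, g->nvcpus, idle);
    return 0;
}
```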

19 Load Balancing
– Low-contention distributed data structure: the load tree (sketched below).
– Contention can still occur on higher-level tree nodes.
– Each VCPU carries the list of cells it is vulnerable to.
– Under heavy load, the idle balancer is not enough; a local periodic balancer redistributes load within each 8-CPU region.
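The load tree can be pictured as a complete binary tree over the CPUs in which each node caches the number of runnable VCPUs beneath it; a CPU looking for work descends toward the heavier subtree (in the real system only a limited neighborhood is searched, to avoid contention on the upper nodes). The implicit-array layout and search below are an illustrative assumption, not the actual implementation.

```c
/* Hypothetical sketch of a load tree: a complete binary tree over the CPUs
 * in which each internal node holds the sum of runnable VCPUs below it.
 * An idle CPU descends toward the heavier subtree to find work to steal. */
#include <stdio.h>

#define NCPUS 8                       /* leaves of the tree */
#define TREE_NODES (2 * NCPUS)        /* 1-based implicit binary tree */

static int load[TREE_NODES];          /* load[NCPUS + cpu] = runnable VCPUs on cpu */

/* Update one CPU's run-queue length and propagate the change upward. */
static void set_cpu_load(int cpu, int runnable)
{
    int i = NCPUS + cpu;
    int delta = runnable - load[i];
    for (; i >= 1; i /= 2)
        load[i] += delta;
}

/* Descend from the root toward the most loaded leaf (CPU). */
static int busiest_cpu(void)
{
    int i = 1;
    while (i < NCPUS)
        i = (load[2 * i] >= load[2 * i + 1]) ? 2 * i : 2 * i + 1;
    return i - NCPUS;
}

int main(void)
{
    set_cpu_load(2, 3);               /* CPU 2 has 3 runnable VCPUs */
    set_cpu_load(5, 1);
    printf("idle balancer would steal from CPU %d (total load %d)\n",
           busiest_cpu(), load[1]);
    return 0;
}
```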

20 CPU Scheduling and Results
– Scheduling: pick the highest-priority gang-runnable VCPU that has been waiting, and send out RPCs to schedule the rest of its gang.
– Three configurations on 32 processors:
  a) One VM with 8 VCPUs running an 8-process raytrace
  b) 4 VMs
  c) 8 VMs (a total of 64 VCPUs)
– The pmap is migrated only when all VCPUs have been migrated out of a cell.
– Data pages are also migrated, for independence from the old cell.

21 Memory Management
– Each cell has its own freelist of pages, indexed by the home node.
– Page allocation requests (sketched below):
  – Satisfied from the local node
  – Else satisfied from another node in the same cell
  – Else borrowed from another cell
– Memory balancing:
  – Low-memory thresholds govern borrowing and lending
  – Each VM has a priority list of lender cells
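The allocation order on this slide (local node, then any node in the same cell, then borrow from another cell that is above its low-memory threshold) can be sketched as below. The freelist layout, the iteration order over lender cells, and the threshold value are hypothetical simplifications; in particular, the per-VM priority list of lender cells is reduced here to a plain loop.

```c
/* Hypothetical sketch of the page-allocation preference order:
 * local node first, then another node in the same cell, then borrow from
 * another cell.  Structures, sizes, and thresholds are illustrative only. */
#include <stdio.h>

#define NODES_PER_CELL 2
#define NCELLS         2
#define LOW_THRESHOLD  4          /* a cell at or below this free count will not lend */

struct cell {
    int free_pages[NODES_PER_CELL];   /* freelist length, indexed by home node */
};

static struct cell cells[NCELLS];

static int cell_total(const struct cell *c)
{
    int t = 0;
    for (int n = 0; n < NODES_PER_CELL; n++) t += c->free_pages[n];
    return t;
}

/* Allocate one page for a VM running on (cell, node); returns the cell used. */
static int alloc_page(int cell, int node)
{
    struct cell *c = &cells[cell];
    if (c->free_pages[node] > 0) { c->free_pages[node]--; return cell; }  /* local node */
    for (int n = 0; n < NODES_PER_CELL; n++)                              /* same cell */
        if (c->free_pages[n] > 0) { c->free_pages[n]--; return cell; }
    for (int other = 0; other < NCELLS; other++) {                        /* borrow */
        if (other == cell || cell_total(&cells[other]) <= LOW_THRESHOLD) continue;
        for (int n = 0; n < NODES_PER_CELL; n++)
            if (cells[other].free_pages[n] > 0) {
                cells[other].free_pages[n]--;
                return other;
            }
    }
    return -1;                                                            /* out of memory */
}

int main(void)
{
    cells[0].free_pages[0] = 0; cells[0].free_pages[1] = 1;   /* cell 0 nearly empty */
    cells[1].free_pages[0] = 8; cells[1].free_pages[1] = 8;   /* cell 1 has memory  */
    printf("first allocation came from cell %d\n", alloc_page(0, 0));  /* same cell */
    printf("second allocation came from cell %d\n", alloc_page(0, 0)); /* borrowed  */
    return 0;
}
```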

22 Memory Paging
– Page replacement: second-chance FIFO (sketched below).
– Avoids double-paging overheads.
– Tracking used pages: uses annotated OS routines.
– Page sharing: explicit marking of shared pages.
– Redundant paging: avoided by trapping every access to the virtual paging disk.
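For reference, here is a minimal sketch of the textbook second-chance FIFO algorithm, assuming a referenced bit per frame: pages are scanned in FIFO order, a referenced page has its bit cleared and gets one more trip through the queue, and the first unreferenced page is evicted. This illustrates the algorithm only; it is not Cellular Disco's paging code.

```c
/* Minimal sketch of second-chance FIFO page replacement.
 * The queue layout and the referenced bits are illustrative. */
#include <stdio.h>

#define NFRAMES 4

static int frame_page[NFRAMES];   /* which page occupies each frame */
static int referenced[NFRAMES];   /* per-frame referenced bit */
static int hand = 0;              /* FIFO position of the oldest frame */

/* Pick a victim frame: skip (and clear) referenced frames, evict the first
 * frame whose referenced bit is already clear. */
static int pick_victim(void)
{
    for (;;) {
        if (referenced[hand]) {
            referenced[hand] = 0;         /* give it a second chance */
            hand = (hand + 1) % NFRAMES;
        } else {
            int victim = hand;
            hand = (hand + 1) % NFRAMES;  /* the new page takes the FIFO tail */
            return victim;
        }
    }
}

int main(void)
{
    for (int i = 0; i < NFRAMES; i++) { frame_page[i] = i; referenced[i] = 0; }
    referenced[0] = referenced[1] = 1;    /* pages 0 and 1 were touched recently */

    int v = pick_victim();
    printf("evict page %d from frame %d\n", frame_page[v], v);
    return 0;
}
```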

23 Implementation 2: FLASH Simulation
– FLASH has hardware fault-recovery support.
– The FLASH architecture is simulated on SimOS.
– A fault injector is used to inject:
  – Power failures
  – Link failures
  – Firmware failures (?)
– Results: 100% fault containment.

24 Fault Recovery
– Hardware support is needed to:
  – Determine which resources are still operational
  – Reconfigure the machine to use the good resources
– Cellular Disco recovery (sketched below):
  – Step 1: All cells agree on a liveset of nodes
  – Step 2: Abort RPCs/messages to dead cells
  – Step 3: Kill VMs dependent on the failed cells
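The three recovery steps can be sketched roughly as follows. The liveset is shown as a simple bitmask of surviving cells and each VM carries a bitmask of the cells it depends on; both are illustrative assumptions, and the real recovery additionally relies on hardware support to determine which resources survived.

```c
/* Hypothetical sketch of the recovery steps: agree on a liveset of surviving
 * cells, drop messages to dead cells, and kill any VM that depends on a
 * failed cell.  All data structures here are illustrative. */
#include <stdio.h>

#define NCELLS 4

struct vm {
    const char *name;
    unsigned depends_on;     /* bitmask of cells whose memory/CPUs this VM uses */
};

/* Step 3: kill every VM that touches a cell outside the agreed liveset. */
static void kill_dependent_vms(const struct vm *vms, int nvms, unsigned liveset)
{
    for (int i = 0; i < nvms; i++) {
        if (vms[i].depends_on & ~liveset)
            printf("VM %s killed (depends on a failed cell)\n", vms[i].name);
        else
            printf("VM %s survives\n", vms[i].name);
    }
}

int main(void)
{
    /* Step 1 (assumed outcome): the cells agree that cell 2 has failed. */
    unsigned liveset = 0xFu & ~(1u << 2);

    /* Step 2 would abort in-flight RPCs/messages addressed to cell 2. */

    struct vm vms[] = {
        { "web", (1u << 0) },                /* uses only cell 0 */
        { "db",  (1u << 1) | (1u << 2) },    /* spans cells 1 and 2 */
    };
    kill_dependent_vms(vms, 2, liveset);
    return 0;
}
```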

25 Fault-recovery Times
– Recovery times are higher for larger memory sizes.
  – Recovery requires scanning memory for fault detection.

26 Summary
– Virtual machine monitor:
  – Flexible resource management
  – Legacy OS support
– Cellular Disco:
  – Cells provide fault containment
  – Creates a virtual cluster
  – Needs hardware support

