1
Virtuoso: Distributed Computing Using Virtual Machines Peter A. Dinda Prescience Lab Department of Computer Science Northwestern University http://plab.cs.northwestern.edu
2
2 People and Acknowledgements Students –Ashish Gupta, Ananth Sundararaj, Bin Lin, Alex Shoykhet, Jack Lange, Dong Lu, Jason Skicewicz, Brian Cornell Collaborators –In-Vigo project at University of Florida Renato Figueiredo, Jose Fortes Funders/Gifts –NSF through several awards, VMWare
3
3 Outline Motivation and context Virtuoso model Virtual networking –Its central importance Application traffic load measurement and topology inference Understanding user comfort with resource borrowing –User-centric resource control Related work Conclusions
4
4 How do we deliver arbitrary amounts of computational power to ordinary people?
5
5 Distributed and Parallel Computing Interactive Applications
6
6 How do we deliver arbitrary amounts of computational power to ordinary people? Distributed and Parallel Computing Interactive Applications
7
7 [Diagram of available resources; labels:]
–IBM xSeries virtual cluster (64 CPUs), 1 TB RAID
–Northwestern
–Internet
–Interactivity Environment Cluster, CAVE (~90 CPUs), 8 TB RAID
–2 Distributed Optical Testbed Clusters, IBM xSeries (14-28 CPUs), 1 TB RAID
–Nortel Optera Metro Edge Optical Router
–Distributed Optical Testbed (DOT) Private Optical Network
–DOT clusters with optical connectivity, IBM xSeries (14-28 CPUs), 1 TB RAID: Argonne, U.Chicago, IIT, NCSA, others
8
8 Grid Computing “Flexible, secure, coordinated resource sharing among dynamic collections of individuals, institutions, and resources” I. Foster, C. Kesselman, S. Tuecke, The Anatomy of the Grid: Enabling Scalable Virtual Organizations, International J. Supercomputer Applications, 15(3), 2001 Globus, Condor/G, Avaki, EU DataGrid SW, …
9
9 Complexity from User’s Perspective Process or job model –Lots of complex state: connections, special shared libraries, licenses, file descriptors Operating system specificity –Perhaps even version-specific –Symbolic supercomputer example Need to buy into some Grid API Install and learn potentially complex Grid software
10
10 Users already know how to deal with this complexity at another level
11
11 Complexity from Resource Owner’s Perspective Install and learn potentially complex Grid software Deal with local accounts and privileges –Associated with global accounts or certificates Protection/Isolation Support users with different OS, library, license, etc, needs.
12
12 Virtual Machines
Language-oriented VMs
–Abstract interpreted machine, JIT compiler, large library
–Examples: UCSD p-system, Java VM, .NET VM
Application-oriented VMs
–Redirect library calls to appropriate place
–Examples: Entropia VM
Virtual servers
–Kernel makes it appear that a group of processes are running on a separate instance of the kernel, or the OS runs at user level on top of itself
–Examples: Ensim, Virtuozzo, UML, VServer, FreeVSD, …
Microkernels designed to host OSes
–Xeno VM
Virtual machine monitors (VMMs)
–Raw machine is the abstraction
–VM represented by a single image
–Examples: IBM’s VM, VMWare, Virtual PC/Server, Plex/86, SIMICS, Hypervisor, DesQView/TaskView, VM/386
13
13 VMWare GSX VM
14
14 Isn’t It Going to Be Too Slow?
Application                        Resource             ExecTime (10^3 s)  Overhead
SpecHPC Seismic (serial, medium)   Physical             16.4               N/A
                                   VM, local            16.6               1.2%
                                   VM, Grid virtual FS  16.8               2.0%
SpecHPC Climate (serial, medium)   Physical             9.31               N/A
                                   VM, local            9.68               4.0%
                                   VM, Grid virtual FS  9.70               4.2%
Experimental setup:
–Physical: dual Pentium III 933MHz, 512MB memory, RedHat 7.1, 30GB disk
–Virtual: VMware Workstation 3.0a, 128MB memory, 2GB virtual disk, RedHat 2.0
–NFS-based grid virtual file system between UFL (client) and NWU (server)
Small relative virtualization overhead; compute-intensive
Relative overheads < 5%
15
15 Isn’t It Going To Be Too Slow? Synthetic benchmark: exponential arrivals of compute-bound tasks, background load provided by playback of traces from PSC. Relative overheads < 10%
16
16 Isn’t It Going To Be Too Slow? Virtualized NICs have very similar bandwidth, slightly higher latencies –J. Sugerman, G. Venkitachalam, B-H Lim, “Virtualizing I/O Devices on VMware Workstation’s Hosted Virtual Machine Monitor”, USENIX 2001 Disk-intensive workloads (kernel build, web service): 30% slowdown –S. King, G. Dunlap, P. Chen, “OS support for Virtual Machines”, USENIX 2003 However: May not scale with faster NIC or disk
17
17 Won’t Migration Be Too Slow? Appears daunting Memory + disk! Nonetheless –Stanford Collective: 20 minutes at DSL speeds! Sapuntzakis, et al, OSDI 2002, very deep work Wide variety of techniques –Intel/CMU ISR: 2.5-30 seconds from distributed file system at LAN speeds –Our work: 2-400 seconds with rsync on LAN (Shoykhet) –Current project: versioning file system (Cornell, Patel)
18
18 Virtuoso Approach: Lower level of abstraction –Raw machines connected to user’s network Mechanism: Virtual machine monitors Our Focus: Middleware support to hide complexity –Ordering, instantiation, migration of machines –Virtual networking and remote devices –Connectivity to remote files, machines –Information services –Monitoring and prediction –Resource control
19
19 The Virtuoso Model
1. User orders raw machine(s)
–Specifies hardware and performance
–Basic software installation available: OS, libraries, licenses, etc.
2. Virtuoso creates raw image and returns reference
–Image contains disk, memory, configuration, etc.
3. User “powers up” machine
4. Virtuoso chooses provider
–Information service
5. Virtuoso migrates image to provider
–Efficient network transfer: rsync, demand paging, versioned filesystems
20
20 User Configuring a New VM
21
21 The Virtuoso Model
6. Provider instantiates machine
–Virtual networking ties machine back to user’s home network
–Remote device support makes user’s desktop’s devices available on remote VM
–Remote display support gives user the console of the machine (VNC)
–Resource control to give user expected performance
7. User goes to his network admin to get address, routing for his new machine
8. User customizes machine
–Feeds in CDs, floppies, ftp, up2date, etc.
22
22 VM Running with Browser Console Display
23
23 The Virtuoso Model
9. User uses machine
–Shutdown, hibernate, power-off, throw away
10. Virtuoso continuously monitors and adapts
–Virtual network as a monitoring platform
–Various mechanisms, all invisible to user: migrating the machine, routing traffic between machines, virtual network topology, predictive scheduling versus reservations
–Various goals: price, interactivity, direct user feedback
R. Figueiredo, P. Dinda, J. Fortes, A Case For Grid Computing on Virtual Machines, ICDCS 2003
24
24 Context
[Diagram; project labels:] Virtuoso, Virtualized Audio, Clairvoyance, User Comfort, URGIS, RTSA/Maestro
[Diagram; description labels:]
–Interactive HPC Exemplar Application
–A Framework for Distributed Computing Using Virtual Machines
–Measuring, Inferring, and Predicting Dynamic Resource and Application Behavior
–Measuring Human Comfort
–Achieving Human Comfort
–Representing and Querying the Computing Environment as a Whole
25
25 Outline Motivation and context Virtuoso model Virtual networking –Its central importance Application traffic load measurement and topology inference Understanding user comfort with resource borrowing –User-centric resource control Related work Conclusions
26
26 Why Virtual Networking? (with Sundararaj) A machine is suddenly plugged into your network. What happens? –Does it get an IP address? –Is it a routeable address? –Does firewall let its traffic through? –To any port? How do we make virtual machine hostile environments as friendly as the user’s LAN?
27
27 A Layer 2 Virtual Network (VLAN) for the User’s Virtual Machines Why Layer 2? –Protocol agnostic –Mobility –Simple to understand –Ubiquity of Ethernet on end-systems What about scaling? –Number of VMs limited –Hierarchical routing possible because MAC addresses can be assigned hierarchically A. Sundararaj, P. Dinda, Towards Virtual Networks for Virtual Machine Grid Computing, USENIX VM 2004
28
28 A Simple Layer 2 Virtual Network
[Diagram labels: Client; Server; Remote VM; Physical NIC; VM monitor; Virtual NIC; Physical NIC; SSH; Hostile Remote Network; Friendly Local Network]
30
30 A Simple Layer 2 Virtual Network
[Diagram labels: Client; Server; Remote VM; Physical NIC; VM monitor; vnetd; Virtual NIC; Physical NIC; UDP, TCP, TCP/SSL, or SSH tunnel; Hostile Remote Network; Friendly Local Network]
31
31 More Details
[Diagram labels: Host; VM; Proxy; VNET; Client; vmnet0; ethx; ethz; “eth0”; VNET; ethy; “eth0”; Client LAN; IP Network; “Host Only” Network]
Ethernet Packet Tunneled over TCP/SSL Connection
Ethernet Packet Captured by Promiscuous Packet Filter
Ethernet Packet Injected Directly into VM interface
VNET 0.9 available from http://virtuoso.cs.northwestern.edu
32
32 Initial Performance Results (LAN) Faster than NAT approach Lots of room for improvement This version you can download and use right now
33
33 An Overlay Network
Vnetds and connections form an overlay network for routing traffic among virtual machines and the user’s home network
Links can be added or removed on demand
Forwarding rules can be added or removed on demand
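The overlay idea on this slide can be sketched in a few lines. This is a toy model, not the VNET code: the class `Vnetd`, its methods, the `"local"` marker, the hop limit, and the MAC address are all hypothetical names chosen for illustration. It shows links and forwarding rules being added on demand, first in a star through the user's proxy and then over a direct link.

```python
class Vnetd:
    """Hypothetical stand-in for one overlay daemon; not the real VNET code."""

    def __init__(self, name):
        self.name = name
        self.links = {}      # neighbour name -> Vnetd (stands in for a tunnel)
        self.rules = {}      # destination MAC -> neighbour name, or "local"
        self.delivered = []  # frames handed to a VM on this host

    def add_link(self, other):
        self.links[other.name] = other
        other.links[self.name] = self

    def remove_link(self, other):
        self.links.pop(other.name, None)
        other.links.pop(self.name, None)

    def add_rule(self, dst_mac, next_hop):
        self.rules[dst_mac] = next_hop

    def remove_rule(self, dst_mac):
        self.rules.pop(dst_mac, None)

    def forward(self, dst_mac, payload, ttl=8):
        """Deliver locally or pass to the next hop; False if the frame is dropped."""
        next_hop = self.rules.get(dst_mac)
        if next_hop == "local":
            self.delivered.append((dst_mac, payload))
            return True
        if ttl == 0 or next_hop not in self.links:
            return False
        return self.links[next_hop].forward(dst_mac, payload, ttl - 1)


# Star topology through the user's proxy, then a direct link added on demand.
proxy, host_a, host_b = Vnetd("proxy"), Vnetd("a"), Vnetd("b")
vm_mac = "02:00:00:00:00:0b"          # made-up MAC of a VM on host b
proxy.add_link(host_a)
proxy.add_link(host_b)
host_a.add_rule(vm_mac, "proxy")
proxy.add_rule(vm_mac, "b")
host_b.add_rule(vm_mac, "local")
via_star = host_a.forward(vm_mac, b"hello")

host_a.add_link(host_b)               # new overlay link
host_a.add_rule(vm_mac, "b")          # new forwarding rule bypasses the proxy
direct = host_a.forward(vm_mac, b"again")
```

Keeping rules separate from links mirrors the slide's two knobs: a link can exist without carrying traffic, and a rule pointing at a removed link simply drops the frame in this sketch.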
34
34 Bootstrapping the Virtual Network Star topology always possible Connecting from client must have been possible Better topology may be possible Depends on security at each site Topology may change Virtual machines can migrate VM Vnetd
35
35 VM Layer Vnetd Layer Physical Layer
36
36 VM Layer Vnetd Layer Physical Layer Application communication topology and traffic load; application processor load
37
37 VM Layer Vnetd Layer Physical Layer Application communication topology and traffic load; application processor load Network bandwidth and latency; sometimes topology
38
38 VM Layer Vnetd Layer Physical Layer Application communication topology and traffic load; application processor load Network bandwidth and latency; sometimes topology Vnetd layer can collect all this information as a side effect of packet transfers
39
39 VM Layer Vnetd Layer Physical Layer Application communication topology and traffic load; application processor load Network bandwidth and latency; sometimes topology Vnetd layer can collect all this information as a side effect of packet transfers and invisibly act
40
40 VM Layer Vnetd Layer Physical Layer Application communication topology and traffic load; application processor load Network bandwidth and latency; sometimes topology Vnetd layer can collect all this information as a side effect of packet transfers and invisibly act VM Migration
41
41 VM Layer Vnetd Layer Physical Layer Application communication topology and traffic load; application processor load Network bandwidth and latency; sometimes topology Vnetd layer can collect all this information as a side effect of packet transfers and invisibly act VM Migration Topology change
42
42 VM Layer Vnetd Layer Physical Layer Application communication topology and traffic load; application processor load Network bandwidth and latency; sometimes topology Vnetd layer can collect all this information as a side effect of packet transfers and invisibly act VM Migration Topology change Routing change
43
43 VM Layer Vnetd Layer Physical Layer Application communication topology and traffic load; application processor load Network bandwidth and latency; sometimes topology Vnetd layer can collect all this information as a side effect of packet transfers and invisibly act VM Migration Topology change Routing change Reservation
44
44 Outline Motivation and context Virtuoso model Virtual networking –Its central importance Application traffic load measurement and topology inference Understanding user comfort with resource borrowing –User-centric resource control Related work Conclusions
45
45 Application Traffic Load Measurement and Topology Inference (With Gupta) Parallel and distributed applications display particular communication patterns on particular topologies –Intensity of communication can also vary from node to node or time to time. –Combined representation: Traffic Load Matrix VNET already sees every packet sent or received by a VM Can we use this information to compute a global traffic load matrix? Can we eliminate irrelevant communication from matrix to get at application topology?
46
46 Overall Steps Low level inter-VM traffic monitoring within VNET Compute rows and columns of traffic matrix for local VMs Reduction to a global inter-VM traffic load matrix Matrix denoising to determine application topology Offline to online
47
47 Traffic Monitoring and Reduction
[Diagram labels: Host; VM; VNET; vmnet0; ethz; “eth0”; “Host Only” Network; Packets observed here]
Ethernet Packet Format: SRC|DEST|TYPE|DATA (size)
VMTrafficMatrix[SRC][DEST]+=size
Each VM on the host contributes a row and column to the VM traffic matrix
Global reduction to find overall matrix, broadcast back to VNETs
Each VNET daemon has a view of the global network load
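A minimal sketch of the per-host accumulation and the global reduction, under stated assumptions. The function names (`make_frame`, `parse_ethernet`, `local_rows`, `reduce_global`) are hypothetical, and so is the choice to have each daemon report only the rows of its own VMs so that a frame seen at both ends is counted once; the slide does not say how the reduction avoids double counting. The parser follows the standard Ethernet II header, where the destination address precedes the source on the wire.

```python
import struct
from collections import defaultdict


def make_frame(src, dst, payload, ethertype=0x0800):
    """Build an Ethernet II frame from colon-separated MAC strings (test helper)."""
    to_bytes = lambda mac: bytes.fromhex(mac.replace(":", ""))
    return to_bytes(dst) + to_bytes(src) + struct.pack("!H", ethertype) + payload


def parse_ethernet(frame):
    """Return (src, dst, ethertype, data_size) for an Ethernet II frame."""
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    return src.hex(":"), dst.hex(":"), ethertype, len(frame) - 14


def local_rows(frames, local_macs):
    """Accumulate matrix[(src, dst)] += size for frames sent by local VMs only."""
    matrix = defaultdict(int)
    for frame in frames:
        src, dst, _, size = parse_ethernet(frame)
        if src in local_macs:          # assumption: report rows, not columns
            matrix[(src, dst)] += size
    return dict(matrix)


def reduce_global(row_sets):
    """Sum the per-host contributions into one global traffic load matrix."""
    total = defaultdict(int)
    for rows in row_sets:
        for key, size in rows.items():
            total[key] += size
    return dict(total)


# Two hosts, one VM each; both daemons observe both frames.
A, B = "02:00:00:00:00:01", "02:00:00:00:00:02"   # made-up MAC addresses
observed = [make_frame(A, B, b"x" * 100), make_frame(B, A, b"y" * 40)]
global_matrix = reduce_global([local_rows(observed, {A}),
                               local_rows(observed, {B})])
```

In a deployment the reduced matrix would then be broadcast back to every daemon, as the slide describes; the sketch stops at the reduction.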
48
48 Denoising The Matrix
Throw away irrelevant communication
–ARPs, DNS, ssh, etc.
Find maximum entry, a
Eliminate all entries below alpha*a
Very simple, but seems to work very well for BSP parallel applications
Remains to be seen how general it is
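The thresholding rule on this slide is small enough to sketch directly. The function name `denoise`, the default `alpha=0.1`, the dictionary layout keyed by `(src, dst)`, and the sample values are assumptions for illustration; the slide gives no value for alpha.

```python
def denoise(matrix, alpha=0.1):
    """Keep entries that are at least alpha times the maximum entry.

    matrix maps (src, dst) -> traffic volume; alpha is a tunable fraction
    (the 0.1 default is an arbitrary choice for this sketch).
    """
    if not matrix:
        return {}
    a = max(matrix.values())                 # maximum entry, "a" on the slide
    return {pair: load for pair, load in matrix.items() if load >= alpha * a}


# Made-up numbers: two application flows plus light background chatter.
sample = {
    ("h1", "h2"): 19.0,
    ("h2", "h1"): 22.6,
    ("h1", "dns"): 0.01,
    ("h2", "gateway"): 0.3,
}
topology = set(denoise(sample))
```

The surviving keys form the inferred application topology; the surviving values are the traffic load on each edge.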
49
49 Offline Results: Synthetic Benchmark
50
50 NAS IS Benchmark
51
51 NAS IS Benchmark
      h1    h2    h3    h4    h5    h6    h7    h8
h1    -     19.0  19.6  19.2  19.6  18.8  13.7  19.3
h2    22.6  -     10.7  10.8  10.7  10.9  9.7   10.5
h3    22.2  8.78  -     11.2  10.4  10.1  10.5
h4    22.4  8.9   9.5   -     11.1  10.8  10.6  10.2
h5    22.3  10.0  9.51  9.72  -     11.7  10.9  11.9
h6    24.0  8.9   10.7  9.9   10.8  -     12.2  12.1
h7    23.2  10.0  9.7   9.5   10.3  10.2  -     12.0
h8    24.9  11.2  11.0  11.8  11.5  11.2  10.7  -
*Numbers indicate MB of data transferred.
52
52 Online Challenges When to start? When to stop? –Traffic matrix may not be stationary! Synchronized monitoring –All must start and stop together
53
53 When To Start? When to Stop?
Reactive Mechanisms
–Start when traffic rate exceeds threshold
–Stop when traffic rate exceeds a second threshold
–Non-uniform discrete event sampling
–What is the Traffic Matrix from the last time there was at least one high rate source?
Proactive Mechanisms
–Provide support for queries by external agent
–Keep multiple copies of the matrix, one for each resolution (1s, 2s, 4s, etc)
–What is the Traffic Matrix for the last n seconds?
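The idea of keeping one matrix per resolution and answering "last n seconds" queries can be sketched as follows. This is an illustration under assumptions, not the VNET implementation: the class `MultiResolutionMatrix` and its methods are hypothetical, time is assumed to advance in one-second ticks, and the per-resolution copies are recomputed from a short history rather than stored incrementally.

```python
from collections import Counter, deque


class MultiResolutionMatrix:
    """Hypothetical helper: traffic matrices over the last 1s, 2s, 4s, ... windows."""

    def __init__(self, levels=3):
        self.windows = [2 ** k for k in range(levels)]   # 1, 2, 4, ... seconds
        # One per-second matrix is kept for the longest window only.
        self.history = deque(maxlen=self.windows[-1])

    def tick(self, second_matrix):
        """Record the matrix accumulated during the most recent second."""
        self.history.append(Counter(second_matrix))

    def last(self, n):
        """Traffic matrix summed over the last n seconds (or fewer if not yet seen)."""
        total = Counter()
        for per_second in list(self.history)[-n:]:
            total.update(per_second)                      # adds the counts
        return dict(total)

    def copies(self):
        """One matrix per resolution, keyed by window length in seconds."""
        return {w: self.last(w) for w in self.windows}


mrm = MultiResolutionMatrix(levels=3)
for second in range(1, 6):                 # made-up load: i bytes in second i
    mrm.tick({("vm1", "vm2"): second})
snapshots = mrm.copies()
```

An external agent could poll `copies()` and choose the window that best matches how quickly it wants to react to changes in the application's communication.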
54
54 Overheads (100 Mbit LAN)
Essentially zero latency impact
4.2% throughput reduction versus VNET
A. Gupta, P. Dinda, Inferring the Topology and Traffic Load of Parallel Programs Running In a Virtual Machine Environment, In Submission.
55
55 Online: NAS IS on 4 VMs
56
56 Outline Motivation and context Virtuoso model Virtual networking –Its central importance Application traffic load measurement and topology inference Understanding user comfort with resource borrowing –User-centric resource control Related work Conclusions
57
57 Why Understand User Comfort With Resource Borrowing? (With Gupta, Lin) Provider supports both interactive and batch VMs Provider controls resources –WFQ (Ensim) –Priority (our nascent work) –Periodic real-time schedule (our plans) How to use control to provide good interactive performance cheaply?
58
58 Why Understand User Comfort With Resource Borrowing? Interactive user specifies peak resource demand for his VM What level of resource borrowing is he willing to tolerate? Similar question in SETI@Home style distributed parallel computing
59
59 Understanding User Comfort System Windows-based distributed system for measuring user comfort with resource borrowing Borrowing = degree of contention –CPU Bandwidth –Disk Bandwidth –Memory pages 1.0 contention for CPU ~ user’s tasks run half as fast
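One way to read the CPU example is as a fair-share relation. Assuming the borrowed load competes with the user's task at equal priority (an assumption; the slide gives only the single data point), a contention level c leaves the user's task with the following fraction of its uncontended speed:

```latex
\[
  \text{relative speed}(c) = \frac{1}{1 + c},
  \qquad
  \text{relative speed}(1.0) = \frac{1}{1 + 1.0} = \frac{1}{2}
\]
```

Under the same assumption, a contention of 0 leaves the task at full speed and a contention of 3.0 slows it to one quarter.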
60
60 Understanding User Comfort System Server Client Registration (Machine info) Hot Sync (Result Post) Hot Sync (Testcase Request) Hot Sync (Testcases) Local Result Store Local Testcase Store Global Result Store Global Testcase Store http://comfort.cs.northwestern.edu
61
61 Controlled Study 30+ people, ~90 minutes each 4 application tasks Word, Powerpoint, IE, Quake Ramp, step, and blank testcases {CPU, Disk, Memory} X {Step, Ramp} + 2 blanks
62
62 A. Gupta, B. Lin, P. Dinda, Measuring and Understanding User Comfort with Resource Borrowing, HPDC 2004.
63
63 Insights Users surprisingly tolerant, particularly for disk and memory borrowing Context is critical, user self-classification much less so Frog-in-the-pot only occasionally true
64
64 Using User Feedback Directly Discomfort feedback as congestion indication a la TCP Reno Rate => VM Priority Adaptive gain control for congestion avoidance phase –Target: maintain stable time between feedback events Somewhat promising, but very initial results B. Lin, P. Dinda, D. Lu, User-driven Scheduling Of Interactive Virtual Machines, In Submission.
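The TCP Reno analogy can be illustrated with a plain additive-increase, multiplicative-decrease loop. This is a sketch of the general AIMD technique, not the scheduler from the cited paper: the class `BorrowingController`, its parameters, and the numeric defaults are hypothetical, and the adaptive gain control mentioned on the slide is left out.

```python
class BorrowingController:
    """Hypothetical AIMD controller: 'rate' is how aggressively a background VM
    may borrow resources; it would be mapped onto a VM priority elsewhere."""

    def __init__(self, rate=1.0, increase=1.0, decrease=0.5,
                 min_rate=1.0, max_rate=100.0):
        self.rate = rate
        self.increase = increase      # additive step per quiet interval
        self.decrease = decrease      # multiplicative cut on discomfort
        self.min_rate = min_rate
        self.max_rate = max_rate

    def tick(self):
        """No complaint this interval: probe for more borrowing (additive increase)."""
        self.rate = min(self.max_rate, self.rate + self.increase)
        return self.rate

    def discomfort(self):
        """User signalled discomfort: back off sharply (multiplicative decrease)."""
        self.rate = max(self.min_rate, self.rate * self.decrease)
        return self.rate


controller = BorrowingController()
for _ in range(9):
    controller.tick()                      # rate climbs from 1.0 to 10.0
after_feedback = controller.discomfort()   # one discomfort event halves it
```

A fixed `increase` step makes the time between discomfort events depend on how tolerant the user is; tuning that step toward a stable interval is the role the slide assigns to adaptive gain control.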
65
65 Outline Motivation and context Virtuoso model Virtual networking –Its central importance Application traffic load measurement and topology inference Understanding user comfort with resource borrowing –User-centric resource control Related work Conclusions
66
66 Related Work
Collective / Capsule Computing (Stanford)
–VMM, Migration/caching, Hierarchical image files
Denali (U. Washington)
–Highly scalable VMMs (1000s of VMs per node)
CoVirt (U. Michigan)
Xenoserver (Cambridge)
SODA (Purdue)
–Virtual Server, fast deployment of services
Internet Suspend/Resume (Intel Labs Pittsburgh / CMU)
Ensim
–Virtual Server, widely used for web site hosting
–WFQ-based resource control released into open-source Linux kernel
Virtuozzo (SWSoft)
–Ensim competitor
Available VMMs: IBM’s VM, VMWare, Virtual PC/Server, Plex/86, SIMICS, Hypervisor, DesQView/TaskView, VM/386
67
67 Conclusions and Status
Virtual machines on virtual networks as the abstraction for distributed computing
Virtual network as a fundamental layer for measurement and adaptation
Virtuoso prototype running on our cluster
1st generation VNET released. 2nd generation in progress, versioning file system released
68
68 For More Information Prescience Lab –http://plab.cs.northwestern.edu Virtuoso –http://virtuoso.cs.northwestern.edu Join our user comfort study! –http://comfort.cs.northwestern.edu
69
69 Papers
R. Figueiredo, P. Dinda, J. Fortes, A Case For Grid Computing on Virtual Machines, ICDCS 2003.
A. Gupta, B. Lin, P. Dinda, Measuring and Understanding User Comfort With Resource Borrowing, HPDC 2004.
A. Sundararaj, P. Dinda, Towards Virtual Networks for Virtual Machine Grid Computing, USENIX VM 2004.
B. Cornell, P. Dinda, F. Bustamante, Wayback: A User-level Versioning File System For Linux, USENIX 2004.
A. Sundararaj, P. Dinda, Exploring Inference-based Monitoring of Virtual Machine Resources, In Submission.
A. Gupta, P. Dinda, Inferring the Topology and Traffic Load of Parallel Programs Running In a Virtual Machine Environment, In Submission.
B. Lin, P. Dinda, D. Lu, User-driven Scheduling of Interactive Virtual Machines, In Submission.
70
70 Migration (With Shoykhet)
72
72 Resource Control Owner has an interest in controlling how much and when compute time is given to a virtual machine Our approach: A language for expressing these constraints, and compilation to real-time schedules, proportional share, etc. Very early stages. Trying to avoid kernel modifications.
73
73 Front Page
74
74 Provider Registering A Machine
75
75 Provider Machine List
76
76 User Configuring a New VM
77
77 Options For Registered VM
78
78 Registered VM Configuration
79
79 User Selects Physical Machine
80
80 VM Running with Browser Console Display
81
81 Options for a Suspended Machine
82
82 Choosing A Machine To Migrate To
83
83 Specifics of This Talk
Virtuoso Overview
Virtual Networking
Application Traffic Characterization and Topology Inference
Understanding User Comfort With Resource Borrowing
–User-centric Resource Control
[Diagram labels: Virtuoso; Virtualized Audio; Clairvoyance; User Comfort; URGIS; RTSA/Maestro]