Inferring the Topology and Traffic Load of Parallel Programs in a VM Environment. Ashish Gupta, Peter Dinda. Department of Computer Science, Northwestern University. Introduce the project.
Overview: Motivation behind parallel programs in a VM environment. Goal: to infer the communication behavior. Offline implementation. Evaluating with parallel benchmarks. Online monitoring in a VM environment. Conclusions. Note: make this more specific.
Virtuoso: A VM based abstraction for a Grid environment
Add: we have all these wonderful toys, an environment where we can run parallel applications, but with a lot of complexity. The resources are out there, but they are incredibly complicated to use. Virtuoso provides a clean abstraction on top of all this.
Motivation: A distributed computing environment based on Virtual Machines. Raw machines connected to the user's network. Our Focus: Middleware support to hide the Grid complexity. Virtuoso project. Efficient execution of parallel applications means better performance, cost and resource utilization for both the user and the service providers.
Users can ssh into the machine. They can use the group of machines as if they were using a local cluster. The middleware is responsible for making it look like a cluster.
Motivation: A distributed computing environment based on Virtual Machines. Raw machines connected to the user's network. Our Focus: Middleware support to hide the Grid complexity. Our goal here: Efficient execution of parallel applications in such an environment. Virtuoso project. Efficient execution of parallel applications means better performance, cost and resource utilization for both the user and the service providers. Add: our goal.
Intelligent placement and virtual networking of parallel applications. Our decision-making capability depends on two factors. VM Encapsulation: provides flexibility in placement and migration of parallel applications on the network. Virtual Networks (with VNET): can create custom routing and topologies on the underlying physical network. Application Behavior: parallel applications communicate and compute, and they communicate according to certain topologies. Bandwidth also matters. By combining knowledge of these aspects, we are in a position to make decisions. The goal of this project focuses on the upper right-hand part: understanding parallel application communication behavior.
VNET Abstraction: A set of VMs on the same Layer 2 network. Virtual Ethernet LAN.
Goal of this project: An online topology inference framework for a VM environment that goes from Low Level Traffic Monitoring to Application Topology. Add: Application topology.
Approach: Design an offline framework. Evaluate with parallel benchmarks. If successful, design an online framework for VMs. What is offline? To test the hypothesis, using tcpdump.
An offline topology inference framework Goal: A test-bed for traffic monitoring and evaluating topology inference methods
The offline method. Steps: Synced Parallel Traffic Monitoring; Traffic Filtering and Matrix Generation; Matrix Analysis and Topology Characterization. A set of Perl scripts, one for each step, automates the whole process, since doing it manually can be painful. Use tcpdump to capture traffic for each host. Then do the required filtering to remove any non-application packets. Then do flow aggregation to generate a traffic matrix for the hosts. Add: the steps are simple; explain the filtering and the algorithm here.
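The filtering and flow aggregation steps can be sketched as follows. This is a minimal illustration in Python, not the original Perl scripts: the (src, dst, nbytes) record format, the function name build_traffic_matrix and the orientation of the matrix (rows as receivers, columns as senders) are all assumptions made for the example, and parsing of tcpdump output is left out.

```python
def build_traffic_matrix(packets, app_hosts):
    """Aggregate (src, dst, nbytes) records into a host-to-host byte matrix.

    Hypothetical input format: one tuple per captured packet.
    Assumed orientation: matrix[receiver][sender].
    """
    hosts = sorted(app_hosts)
    index = {host: i for i, host in enumerate(hosts)}
    matrix = [[0] * len(hosts) for _ in hosts]
    for src, dst, nbytes in packets:
        # Filtering step: drop traffic that does not run between
        # two hosts of the parallel application.
        if src not in index or dst not in index:
            continue
        # Flow aggregation step: accumulate bytes per (sender, receiver) pair.
        matrix[index[dst]][index[src]] += nbytes
    return hosts, matrix
```

Records whose source or destination is outside the application's host set are ignored, so unrelated traffic on the same machines does not appear in the matrix.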
The offline method. Steps: Synced Parallel Traffic Monitoring; Traffic Filtering and Matrix Generation; Matrix Analysis and Topology Characterization. Once we have the traffic matrix, we need algorithms that infer the topology from it while leaving out the noise. There are many ways to do this, such as pruning and pattern detection. Currently a simple linear scaling method is used to infer the topology. Explain the technique.
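The slides do not spell out the linear scaling method, so the following is only one plausible reading, offered as a sketch: scale every entry linearly by the largest entry in the matrix, then prune links whose scaled value falls below a cutoff. The function name infer_topology and the default threshold of 0.1 are assumptions for illustration.

```python
def infer_topology(matrix, threshold=0.1):
    """Return the set of (row, column) links that survive pruning.

    Assumption: entries are scaled linearly to [0, 1] by the largest
    entry, and links below `threshold` are treated as noise.
    """
    peak = max((value for row in matrix for value in row), default=0)
    if peak == 0:
        return set()  # no traffic at all, so no topology to report
    edges = set()
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if i != j and value / peak >= threshold:
                edges.add((i, j))
    return edges
```

With such a cutoff, a few stray control packets between two hosts do not show up as a link next to the bulk data exchange of the application.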
The offline method. Steps: Synced Parallel Traffic Monitoring; Traffic Filtering and Matrix Generation; Matrix Analysis and Topology Characterization. A simple visualization module was added to visualize the inferred topology, using SAMBA from Georgia Tech. Example.
The offline method: PVMPOV Inference. Steps: Synced Parallel Traffic Monitoring; Traffic Filtering and Matrix Generation; Matrix Analysis and Topology Characterization. A simple visualization module was added to visualize the inferred topology, using SAMBA from Georgia Tech. Example: PVMPOV.
Infer.pl. Steps: Synced Parallel Traffic Monitoring; Traffic Filtering and Matrix Generation; Matrix Analysis and Topology Characterization. Explain the visualization generated. Finally, a one-step process: Doitall [parallel program name] goes through all the steps and displays the inferred topology. Add: pvmpov running.
Parallel Benchmarks Evaluation Goal: To test the practicality of low level traffic based inference
Parallel Benchmarks used. Synthetic benchmarks: Patterns. N-dimensional mesh-neighbor; N-dimensional toroid-neighbor; N-dimensional hypercubes; Tree reduction; All-to-All. Scheduling mechanism to generate deadlock-free and efficient schemes. The Patterns application was custom-written. Used PVM (Parallel Virtual Machine), a parallel application development library. Speak: Our implementation of these communication patterns is efficient and deadlock free.
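To make the neighbor patterns concrete, here is a small sketch of who talks to whom in a hypercube, a mesh and a toroid. It uses the standard textbook definitions and is not taken from the Patterns application itself; the function names are hypothetical, and the deadlock-free scheduling is not shown.

```python
def hypercube_neighbors(node, dims):
    """Neighbors of `node` in a dims-dimensional hypercube: flip one bit."""
    return sorted(node ^ (1 << d) for d in range(dims))


def grid_neighbors(coord, shape, wrap):
    """Neighbors of `coord` in an N-dimensional grid of the given shape.

    wrap=False gives mesh-neighbor; wrap=True gives toroid-neighbor,
    where each dimension wraps around at its edges.
    """
    result = set()
    for axis, size in enumerate(shape):
        for step in (-1, 1):
            pos = coord[axis] + step
            if wrap:
                pos %= size
            elif not 0 <= pos < size:
                continue  # a mesh has no neighbor beyond its edge
            neighbor = coord[:axis] + (pos,) + coord[axis + 1:]
            if neighbor != coord:
                result.add(neighbor)
    return sorted(result)
```

A corner node of a 2-D mesh has two neighbors, while the same node in a 2-D toroid of side 3 has four, which is the kind of difference the inferred topology should reveal.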
Application benchmarks. NAS PVM benchmarks: popular benchmarks for parallel computing; 5 benchmarks; computational and communication characteristics in computational aerodynamics. PVM-POV: Distributed Ray Tracing. Many others possible... The inference is not PVM specific: it is applicable to all communication, e.g. MPI, even non-parallel apps.
Patterns application: 3-D Toroid, 3-D Hypercube, 2-D Mesh, Reduction Tree, All-to-All. Add: identify the topologies.
PVM NAS benchmarks Parallel Integer Sort
Traffic Matrix for PVM IS benchmark. This traffic matrix shows how crucial the placement of host1 on the network is: it should have high bandwidth connectivity. The virtual network should also facilitate the communication between the other hosts and host1. The decisions also need to be dynamic, since the properties of the physical network may change. Speak: which host talks to which host...
Traffic Matrix for PVM IS benchmark. Speak: host1 is communication intensive. We can see from its column that it sends a lot, and from its row that it receives a lot. Placement of host1 is crucial on the network.
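Reading the matrix this way can be sketched in a few lines: the column sum of a host is what it sends and the row sum is what it receives. The function names are hypothetical, and any numbers fed to this sketch are made up for illustration rather than taken from the IS benchmark run.

```python
def host_loads(hosts, matrix):
    """Return {host: (bytes_sent, bytes_received)}.

    Orientation as read on the slide: column k holds what host k
    sends, row k holds what host k receives.
    """
    loads = {}
    for k, host in enumerate(hosts):
        received = sum(matrix[k])
        sent = sum(row[k] for row in matrix)
        loads[host] = (sent, received)
    return loads


def busiest_host(hosts, matrix):
    """Host with the largest combined send and receive volume."""
    loads = host_loads(hosts, matrix)
    return max(loads, key=lambda host: sum(loads[host]))
```

A host that stands out in both sums is the one whose placement and connectivity matter most for the application.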
An Online Topology Inference Framework: VTTIF. Goal: To automatically detect, monitor and report the global traffic matrix for a set of VMs running on an overlay network.
Overall Design. VNET Abstraction: A set of VMs on the same Layer 2 network. Virtual Ethernet LAN. The framework can aggregate all traffic for all periods of time. We need to detect only interesting communication behavior so that appropriate action can be taken. Add: what is VNET?
A VNET virtual layer: A Virtual LAN over wide area. Diagram layers: VNET Layer, Physical Layer.
Overall Design. VNET Abstraction: A set of VMs on the same Layer 2 network. Extend VNET to include the required features: monitoring at Ethernet packet level. The challenge here: lacks manual control; detecting interesting parallel program communication? The framework can aggregate all traffic for all periods of time. We need to detect only interesting communication behavior so that appropriate action can be taken. Add: what is VNET?
Detecting interesting phenomena. Reactive Mechanisms: like a burglar alarm; triggered by certain address properties, based on traffic rate, etc.; rate based monitoring. Proactive Mechanisms: like video surveillance; provide support for queries by an external agent; non-uniform discrete event sampling; e.g. "What is the Traffic Matrix for the last n seconds?" Explain both mechanisms with examples.
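The two mechanisms can be illustrated with a small sketch. It assumes a sliding time window and a byte-rate threshold; the class name TrafficMonitor, its parameters and the "start"/"stop" signals are hypothetical and are not the VNET daemon's interface. Non-uniform discrete event sampling is not modeled here.

```python
from collections import deque


class TrafficMonitor:
    """Sliding-window monitor with a reactive and a proactive interface."""

    def __init__(self, window_seconds, rate_threshold):
        self.window = window_seconds
        self.threshold = rate_threshold  # bytes per second
        self.events = deque()            # (timestamp, src, dst, nbytes)
        self.active = False

    def record(self, timestamp, src, dst, nbytes):
        """Reactive path: log a packet and report any rate transition."""
        self.events.append((timestamp, src, dst, nbytes))
        return self.tick(timestamp)

    def tick(self, now):
        """Return "start" or "stop" when the rate crosses the threshold."""
        while self.events and self.events[0][0] <= now - self.window:
            self.events.popleft()  # forget packets older than the window
        rate = sum(event[3] for event in self.events) / self.window
        was_active = self.active
        self.active = rate >= self.threshold
        if self.active and not was_active:
            return "start"
        if was_active and not self.active:
            return "stop"
        return None

    def matrix_for_last(self, now, seconds):
        """Proactive path: traffic matrix for the last `seconds`.

        Only packets still inside the retention window are available.
        """
        matrix = {}
        for timestamp, src, dst, nbytes in self.events:
            if now - seconds < timestamp <= now:
                matrix[(src, dst)] = matrix.get((src, dst), 0) + nbytes
        return matrix
```

The reactive path behaves like the burglar alarm: it stays silent until the rate crosses the threshold. The proactive path answers a query from an external agent at any time.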
VM Network Scheduling Agent. Diagram components: Physical Host; VNET daemon; VNET overlay network; Traffic Analyzer; Rate based Change detection; Traffic Matrix; Query Agent; To other VNET daemons. Add: text contrast is bad, text is small.
Traffic Matrix Aggregation. Each VNET daemon keeps track of its local traffic matrix. Need to aggregate this information for a global view. When the rate falls, the local daemons push the traffic matrix. (When do you push the traffic matrix?) Operation is associative: reduction trees for scalability. Scalability issues. Performance is OK in latency and bandwidth tests. The proxy daemon.
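Because merging two traffic matrices is plain element-wise addition, the local matrices can be combined pairwise in any grouping, which is what makes a reduction tree possible. A minimal sketch, assuming each local matrix is a dictionary keyed by (src, dst); the function names are hypothetical and the network transport between daemons is not shown.

```python
def merge(a, b):
    """Element-wise sum of two sparse traffic matrices."""
    merged = dict(a)
    for pair, nbytes in b.items():
        merged[pair] = merged.get(pair, 0) + nbytes
    return merged


def tree_reduce(local_matrices):
    """Combine local matrices level by level, as a reduction tree would.

    Each level merges neighbors pairwise, so the number of levels
    grows with the logarithm of the number of daemons.
    """
    level = list(local_matrices)
    if not level:
        return {}
    while len(level) > 1:
        merged = [merge(level[i], level[i + 1])
                  for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            merged.append(level[-1])  # odd one out moves up unchanged
        level = merged
    return level[0]
```

Since addition is associative, the result matches what a single central pass over all local matrices would produce.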
Evaluation. Used 4 Virtual Machines over VNET. NAS IS benchmark. We run the parallel program, and the framework automatically detects the communication and outputs the aggregated traffic matrix when the communication ends. The traffic matrix was much larger than the one obtained from the physical hosts. Reason: tcpdump dropped packets in the offline case.
Conclusions. It is possible to infer the topology with low level traffic monitoring. A Traffic Inference Framework for Virtual Machines. Ready to move on to future steps: Adaptation for Performance. Add: current and future work... Add: the numbers on adaptation. Add: questions/advertising...
Current Work. Capabilities for dynamic adaptation into VNET. Spatial Inference. Network Adaptation for Improved Performance. Prelim Results: Improved performance of up to 40% in execution time. Looking into benefits of Dynamic Adaptation.
For more information: http://virtuoso.cs.northwestern.edu. VNET is available for download. PLAB web site: plab.cs.northwestern.edu