Simulation, Emulation Sathish Vadhiyar Sources / Credits: Microgrid, Simgrid.

1 Simulation, Emulation Sathish Vadhiyar Sources / Credits: Microgrid, Simgrid

2 Importance. Needed for characterizing the behavior of future Grid systems. During the development period, needed to test methodologies under repeatable conditions. For simulating “what if” scenarios. Needed when there is no real grid. Needed in India.

3 MicroGrid. Enables systematic design and evaluation of middleware, applications, and network services for computational Grids. Provides an environment for scientific and repeatable experiments. MicroGrid can also predict performance on futuristic and fictional topologies. Features: enables use of Globus applications without change by virtualizing the execution environment, providing the illusion of a virtual Grid; uses global virtual time to preserve simulation accuracy; provides basic resource simulation models for computing, memory and networking.

4 Virtualizing resources. Uses a mapping table to map virtual IP addresses to physical IP addresses. Intercepts relevant library calls: gethostbyname; bind, send, receive; process creation (processes are created through Globus resource management functions). The user logs in directly to a physical host and submits jobs to virtual hosts. The Globus gatekeeper, job managers and client hosts run on virtual hosts. All socket interfaces and information services are also virtualized.
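The mapping-table idea on this slide can be illustrated with a small Python sketch. This is only a toy model of the lookup step, not MicroGrid's interception code: the table contents, the addresses and the function name `to_physical` are all hypothetical.

```python
# Hypothetical mapping table: virtual IP address -> physical IP address.
# Several virtual hosts may share one physical host.
MAPPING_TABLE = {
    "10.0.0.1": "192.168.1.10",
    "10.0.0.2": "192.168.1.10",
    "10.0.0.3": "192.168.1.11",
}


def to_physical(virtual_ip, table=MAPPING_TABLE):
    """Translate a virtual address the way an intercepted call would have to."""
    try:
        return table[virtual_ip]
    except KeyError:
        raise KeyError(f"no physical host mapped for virtual address {virtual_ip}")


print(to_physical("10.0.0.2"))
```

An intercepted gethostbyname or bind would perform a translation of this kind before calling the real library function.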

5 Global Coordination. Simulation rate (SR): the rate at which the simulator runs, i.e. how much of the real CPU the simulator is using. The minimum feasible simulation rate depends on the desired virtual resources and the actual capacities of the physical resources. Minimum value of SR over all resources: the fastest rate at which the simulation can be run in a functionally correct manner.

6 Simulation Rate Examples. Given physical = 1 GHz and virtual = 2 GHz, the simulation rate cannot be less than 2; otherwise you would be guaranteeing more than 100% CPU usage! Given physical = 2 GHz and virtual = 1 GHz, the simulation rate cannot be less than 0.5, by the same argument.
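The arithmetic behind these examples can be sketched in Python. The sketch assumes the convention used in these examples, where the simulation rate acts as a slowdown factor and the share of the real CPU a virtual host needs is virtual / (physical × rate). The function names are illustrative only.

```python
def min_simulation_rate(virtual_hz, physical_hz):
    """Smallest rate at which the virtual CPU still fits in 100% of the real CPU."""
    return virtual_hz / physical_hz


def real_cpu_fraction(virtual_hz, physical_hz, rate):
    """Fraction of the real CPU consumed by the virtual CPU at a given rate."""
    fraction = virtual_hz / (physical_hz * rate)
    if fraction > 1.0:
        raise ValueError("rate too small: would need more than 100% of the CPU")
    return fraction


print(min_simulation_rate(2e9, 1e9))     # physical 1 GHz, virtual 2 GHz
print(min_simulation_rate(1e9, 2e9))     # physical 2 GHz, virtual 1 GHz
print(real_cpu_fraction(600e6, 1e9, 2))  # 600 MHz virtual on 1 GHz real, rate 2
```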

7 More. Another parameter (say x) determines how fast time progresses in the application: the greater the value, the faster time progresses in the application. Calls like gettimeofday and select use these parameters to return appropriately adjusted times. Thus, with a virtual CPU twice the speed of the real CPU, simulation rate = 2 and x = 2, a code fragment takes ½ the time.

8 Resource Simulation

9 The simulation rate is divided equally across all processes executing on the physical host. The resulting fractions are then enforced by the local MicroGrid CPU scheduler, a scheduler daemon that uses signals to allocate local physical CPU capacity to local MicroGrid tasks.

10 How to ensure CPU usage. Naïve strategy: calculate the usage for the processes on a virtual machine and give all processes the same usage. E.g. if (virtual / physical) is 25% and 2 processes are running on the virtual machine, assign each process 10 milliseconds every 80 milliseconds. This is not good: an application process should always be ready to run if it has not used its available CPU slots, and a computation-intensive process should be able to fully utilize the quota for the virtual machine.
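The numbers in the naïve strategy can be checked with a short Python sketch. The function name and the 80 ms period parameter are illustrative. The second call shows the weakness noted on the slide, under the assumption that one of the two processes is idle.

```python
def naive_slice_ms(vm_fraction, num_procs, period_ms):
    """Fixed time slice per process when the VM quota is split equally."""
    return vm_fraction * period_ms / num_procs


# 25% virtual/physical ratio, 2 processes, 80 ms period: 10 ms per process.
slice_ms = naive_slice_ms(0.25, 2, 80)
print(slice_ms)

# If one process is idle, the busy one still gets only its fixed slice,
# so the virtual machine uses 12.5% of the CPU instead of its 25% quota.
print(slice_ms / 80)
```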

11 MicroGrid CPU Controller. One CPU controller runs on each physical host. It uses SIGSTOP and SIGCONT to stop and continue processes. It consists of 3 parts. Live process interception: whenever a virtual process is created or destroyed on MicroGrid using main() or exit(), the CPU controller traps it and updates its process table. CPU usage monitoring: every sliding window, the controller reads from /proc the CPU usage of the processes in its process table. Process scheduling: the controller calculates the CPU usage of each virtual host in a time window. If the amount of effective cycles exceeds the speed of the virtual host, the controller sends SIGSTOP to all processes of the virtual host; otherwise, it wakes up the processes and lets them proceed.
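The process-scheduling step can be sketched in Python as a decision function plus a signal sender. This is a simplified illustration, not the MicroGrid controller: the function names, the jiffy-based quota and the way usage is supplied are assumptions, and reading /proc is left out. `os.kill`, `signal.SIGSTOP` and `signal.SIGCONT` are standard on Unix-like systems.

```python
import os
import signal


def schedule_action(used_jiffies, window_jiffies, cpu_fraction):
    """Decide what to do with a virtual host's processes for this window.

    used_jiffies: CPU time consumed by the virtual host in the window.
    cpu_fraction: share of the physical CPU granted to the virtual host.
    """
    quota = cpu_fraction * window_jiffies
    return "stop" if used_jiffies > quota else "continue"


def apply_action(pids, action):
    """Send SIGSTOP or SIGCONT to every process of the virtual host (Unix only)."""
    sig = signal.SIGSTOP if action == "stop" else signal.SIGCONT
    for pid in pids:
        try:
            os.kill(pid, sig)
        except ProcessLookupError:
            pass  # the process exited since the table was last updated


print(schedule_action(used_jiffies=3, window_jiffies=7, cpu_fraction=0.3))
print(schedule_action(used_jiffies=1, window_jiffies=7, cpu_fraction=0.3))
```

Keeping the decision separate from the signalling makes the quota logic easy to check without stopping real processes.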

12 CPU Controller

13 Determining sliding window size. E: design accuracy error. p: scaled virtual machine speed (fraction of physical CPU). w: the sliding window size in jiffies. n: the available jiffies in a sliding window. n should satisfy w = round(n/p) and | 1 - n/(p*w) | < E. Find the smallest n that satisfies | 1 - (n/p)/round(n/p) | < E, then find w.

14 Example. Real machine: 1 GHz. Virtual machine: 600 MHz. Simulation rate: 2. E: 0.05. p = 600/1000 = 60%; with simulation rate 2, it is 30% of the real CPU. Find the smallest n that satisfies | 1 - (10n/3) / round(10n/3) | < 0.05. Try n = 1, 2, 3, … For n = 1 the error is about 0.11; for n = 2 it is about 0.048. Here n = 2 and w = 7.
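The search for the smallest n can be written as a short Python sketch. The function name and the `max_n` search limit are illustrative. Rounding is done with floor(x + 0.5) so that halves round up, since Python's built-in `round` rounds halves to even.

```python
import math


def smallest_window(p, E, max_n=10_000):
    """Return (n, w): the smallest n with |1 - (n/p)/round(n/p)| < E, and w = round(n/p).

    p: fraction of the physical CPU given to the virtual machine.
    E: design accuracy error.
    """
    for n in range(1, max_n + 1):
        w = math.floor(n / p + 0.5)  # round half up
        if w > 0 and abs(1 - (n / p) / w) < E:
            return n, w
    raise ValueError("no window found within max_n jiffies")


# The example from the slide: p = 0.3, E = 0.05.
print(smallest_window(0.3, 0.05))
```

A smaller E forces a larger window: more jiffies are needed before n/w approximates p closely enough.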

15 Network Simulation. Based on MaSSF, a scalable packet-level network simulator that supports direct execution of unmodified applications. Uses a distributed simulation engine. Can model many kinds of network protocols, including TCP/IP, UDP and user-defined protocols. Intercepts live network streams at the socket level using a wrapper library called WrapSocket.

16 Live traffic interception

17 Scalability. Given a network topology and the available cluster nodes, MaSSF partitions the virtual network into multiple blocks and assigns each block to a cluster node. Every cluster node runs a discrete event simulation engine. Events are exchanged among the simulation engines. Cluster nodes also need to synchronize periodically. This involves traffic.

18 Scalability. Hence network mapping has to be done carefully, to minimize communication of simulation events between simulation engine nodes and to achieve load balance across partitions. The network mapping problem is modeled as a graph partitioning problem: the number of simulation events on each link can be estimated and used to calculate the edge weight.

19 Improving scalability. Graph partitioning for the network mapping problem. Input graph: traffic information (defines edge weights) and network structure. Constraints: the weighted sum of computation and memory requirements on each simulation engine node (vertex weight) is to be balanced among multiple vertices. Objectives: communication across partitions (edge-cut) is to be minimized. The partitioned network defines the mapping of simulated network nodes to physical resources.
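The two quantities in this formulation, edge-cut and per-partition load, can be computed for a candidate mapping with a few lines of Python. This sketch only evaluates a given partition; it does not search for one (a real partitioner would). The graph, the weights and the function names are made up for illustration.

```python
def edge_cut(edges, part):
    """Total weight of edges whose endpoints lie in different partitions."""
    return sum(w for u, v, w in edges if part[u] != part[v])


def partition_loads(vertex_weight, part):
    """Sum of vertex weights assigned to each partition."""
    loads = {}
    for v, w in vertex_weight.items():
        loads[part[v]] = loads.get(part[v], 0) + w
    return loads


# Hypothetical 4-node virtual network; edge weight = estimated simulation events.
edges = [("a", "b", 10), ("b", "c", 1), ("c", "d", 10)]
vertex_weight = {"a": 1, "b": 1, "c": 1, "d": 1}

good = {"a": 0, "b": 0, "c": 1, "d": 1}  # cuts only the light b-c link
bad = {"a": 0, "b": 1, "c": 0, "d": 1}   # cuts every link

print(edge_cut(edges, good), partition_loads(vertex_weight, good))
print(edge_cut(edges, bad), partition_loads(vertex_weight, bad))
```

Both mappings are equally balanced, but the first exchanges far fewer simulation events between engine nodes.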

20 Real applications on MicroGrid - Lot more to do…

21 SimGrid You know it

22 References / Sources / Credits. Validating and Scaling the MicroGrid: A Scientific Instrument for Grid Dynamics, Xin Liu, Huaxia Xia, and Andrew Chien, to appear in the Journal of Grid Computing. The MicroGrid: a Scientific Tool for Modeling Computational Grids, in Proceedings of SC2000 (Song, Liu, Jakobsen, Bhagwan, Zhang, Taura and Chien). Simgrid: A Toolkit for the Simulation of Application Scheduling. CCGrid 01.

23 JUNK!

24 Calls. Setting up the simulated application and computation environment. Simulating the application execution once the tasks have been assigned to resources: SG_simulate. Scheduling algorithms: based on performance prediction (SG_getPrediction); implementation of a scheduling decision (SG_scheduleTaskOnResource). Also supports runtime scheduling algorithms: control must be returned from SG_simulate to the scheduling algorithm itself. For a work queue, control is returned after each task completes. For others, the user can specify how long a simulation should run before control is returned. SG_unscheduleTask can be used to modify scheduling decisions for tasks. Many API calls help the user keep track of past scheduling decisions.

25 SG_getclock returns the virtual global time. Post mortem analysis is possible with the help of resource usage and start and end times, to compute various metrics and see how the simulation behaved.

26 SimGrid-2 paper. Simulations allow repeatable experiments and exploration of a wide range of application and resource scenarios. Simgrid: for developing and evaluating scheduling algorithms. Objectives: good usability; fast simulations; configurable, tunable and extensible simulations; scalable. Aim towards simulation standardization.

27 Simgrid components. Agent: implements the scheduling algorithm; contains code, private data and location. Location: where an agent runs; defined by location, mail boxes for communicating with other agents, and private data. Task: defined by amount of computing, data size and private data. Path: routing abstractions. Channel: abstraction representing communication between agents.

28 Simulation program steps. Definition of code for each agent: modeling the application, done with MSG_Task_Get, MSG_Task_Put, MSG_Task_Execute. Creation of resources: modeling the physical platform (hosts, links, routing table paths) with MSG_host_create, MSG_link_create, MSG_routing_table_set. Creation and allocation of agents to locations: application deployment with MSG_process_create. Starting the simulation: MSG_main.

29 Resource sharing is supported by SimGrid through different models: FIFO; FRFO; SHARED (fair sharing or priority-based sharing). Challenges: for users to construct large simulated platforms; to simulate the complex network contention behaviors of applications executing on these platforms.

30 Modeling grid topologies. Simgrid allows users to import platform descriptions obtained with Effective Network View (ENV). Thus SimGrid uses ENV and NWS to instantiate platform models which represent realistic platforms both in terms of topology and in terms of traffic.

31 Bandwidth sharing models. The algorithm first considers all bottleneck links and the flows on these links. It assigns bandwidth to the flows on these links in inverse proportion to their RTTs. The algorithm then reduces the bandwidths on the links traversed by these flows. The process is repeated until bandwidths have been assigned to all flows. Simgrid makes it possible to define two types of links: those where bandwidth is shared and those where bandwidth is not shared. This is good for modeling grid computing topologies where local networks are connected by a shared backbone.
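One possible reading of the iterative procedure above is sketched below in Python. It is an interpretation, not SimGrid's implementation: the bottleneck is taken to be the link whose unassigned flows would receive the smallest share, every link on a flow's path is assumed to appear in `capacity`, and the names `share_bandwidth`, `capacity` and `flows` are illustrative.

```python
def share_bandwidth(capacity, flows):
    """capacity: {link: bandwidth}; flows: {name: (rtt, [links on path])}.

    Returns {name: bandwidth}, with flows on a bottleneck link sharing its
    remaining capacity in inverse proportion to their RTTs.
    """
    remaining = dict(capacity)
    pending = dict(flows)
    rates = {}
    while pending:
        # For each link, find k such that giving every unassigned flow
        # k / rtt would exactly saturate the link; the smallest k marks
        # the bottleneck.
        best_link, best_k = None, None
        for link, cap in remaining.items():
            weight = sum(1.0 / rtt for rtt, path in pending.values() if link in path)
            if weight == 0:
                continue
            k = cap / weight
            if best_k is None or k < best_k:
                best_link, best_k = link, k
        if best_link is None:
            break  # leftover flows cross no link with capacity information
        on_link = [n for n, (_, path) in pending.items() if best_link in path]
        for name in on_link:
            rtt, path = pending.pop(name)
            rates[name] = best_k / rtt
            for link in path:
                remaining[link] -= rates[name]  # reduce every traversed link
    return rates


capacity = {"l1": 10.0, "l2": 4.0}
flows = {"A": (1.0, ["l1"]), "B": (1.0, ["l1", "l2"])}
print(share_bandwidth(capacity, flows))
```

In this toy run, flow B is limited by link l2, and flow A picks up the capacity B leaves unused on l1.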

32 GridSim Individual resource brokers and central schedulers

33 Simjava. Simulations in Simjava contain a number of entities, each running in its own thread. Entities call simulation functions (sim_schedule, sim_hold, sim_wait) and events are generated.

34 Every event has source entity and destination entity

35 NPB with MicroGrid

36 Scheduling quanta length and Modeling Accuracy

37 Internal Performance. NPB was run on a real Alpha cluster of 4 machines and on MicroGrid with a CPU fraction of 4%. The periodic execution times were obtained every 1 second for the Alpha cluster and _? second(s) for MicroGrid. Close match: the root mean square percentage difference is 3.08%.
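The slide does not say how the root mean square percentage difference was computed. A common definition, assumed here, takes the relative difference of each paired measurement, squares it, averages, and takes the square root. The sample values below are invented purely to exercise the function and are not the experiment's data.

```python
import math


def rms_percentage_difference(reference, measured):
    """RMS of the relative differences (measured - reference) / reference, in percent."""
    if len(reference) != len(measured) or not reference:
        raise ValueError("need two non-empty series of equal length")
    squares = [((m - r) / r) ** 2 for r, m in zip(reference, measured)]
    return 100.0 * math.sqrt(sum(squares) / len(squares))


# Made-up sample values, for illustration only.
print(rms_percentage_difference([100.0, 200.0], [110.0, 190.0]))
```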

