What Mum Never Told Me about Parallel Simulation. Karim Djemame, Informatics Research Lab. & School of Computing, University of Leeds.


1 What Mum Never Told Me about Parallel Simulation
Karim Djemame, Informatics Research Lab. & School of Computing, University of Leeds

2 Plan of the Lecture
Goals
- Learn about issues in the design and execution of Parallel Discrete Event Simulation (PDES)
Overview
- Discrete Event Simulation – a Review
- Parallel Simulation – a Definition
- Applications
- Synchronisation Algorithms
  - Conservative
  - Optimistic
  - Synchronous
- Parallel Simulation Languages
- Performance Issues
- Conclusion

3 Why Simulation?
- Mathematical models are too abstract for complex systems
- Building real systems with multiple configurations is too expensive
- Simulation is a good compromise!

4 Discrete Event Simulation (DES)
- A DES system can be viewed as a collection of simulated objects and a sequence of event computations
- Changes in the state of the model occur at discrete points in time
- The passage of time is modelled using a simulation clock
- Event scheduling is the most widely used approach
  - it provides locality in time: each event describes related actions that may all occur in a single instant
- The model maintains a list of events (Event List) that have been scheduled but have not occurred yet

5 Processing the Event List on a Uni-processor Computer
An event contains two fields of information
- the event it represents (e.g. arrival in a queue)
- its time of occurrence: the time when the event should happen, also called its timestamp
The event list (EVL)
- contains the events
- is always ordered by increasing time of occurrence
The events are processed sequentially by a single processor
[Figure: EVL holding events e1 (time 7), e2 (time 9), ..., en (time 20)]

6 Event-Driven Simulation Engine
(1) Remove the first event (lowest time of occurrence) from the EVL
(2) Execute the corresponding event routine; modify the state (S) accordingly
(3) Based on the new S, schedule new future events
[Figure: the EVL initially holds e1 (time 7), e2 (time 9), ..., en (time 20); after e1 is processed and a new event e3 (time 14) is scheduled, it holds e2 (9), e3 (14), ..., en (20)]
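The three steps of the engine can be sketched in a few lines of Python. This is only an illustrative sketch: the function `run`, the handler names and the `state` dictionary are assumptions made for the example, and a binary heap (`heapq`) stands in for the ordered event list.

```python
import heapq

def run(initial_events, handlers, state, end_time):
    """Process events in timestamp order until the EVL is empty or end_time is passed."""
    evl = list(initial_events)            # event list (EVL) of (time, name) pairs
    heapq.heapify(evl)                    # min-heap: earliest event is always on top
    trace = []
    while evl:
        time, name = heapq.heappop(evl)   # (1) remove the event with the lowest time
        if time > end_time:
            break
        new_events = handlers[name](time, state)   # (2) run event routine, update state
        for event in new_events:          # (3) schedule new future events
            heapq.heappush(evl, event)
        trace.append((time, name))
    return trace

def handle_e1(now, state):
    state["count"] += 1
    return [(14, "e3")]                   # e1 schedules e3 at time 14, as in the figure

def handle_other(now, state):
    state["count"] += 1
    return []                             # these events schedule nothing new

state = {"count": 0}
handlers = {"e1": handle_e1, "e2": handle_other,
            "e3": handle_other, "en": handle_other}
trace = run([(7, "e1"), (9, "e2"), (20, "en")], handlers, state, end_time=100)
print(trace)
```

The new event e3 is inserted between e2 and en, so the processing order is e1, e2, e3, en.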

7 Why change? It's so simple!
- Models become larger and larger
- The simulation time is overwhelming, or the simulation is simply intractable
- Examples:
  - parallel programs with millions of lines of code
  - mobile networks with millions of mobile hosts
  - networks with hundreds of complex switches and routers
  - multicast models with thousands of sources
  - the ever-growing Internet
  - and much more...

8 Some Figures to Convince...
- ATM network models
  - simulation at the cell level
  - 200 switches
  - 1000 traffic sources, 50 Mbit/s
  - 155 Mbit/s links
  - 1 simulation event per cell arrival
  - simulation time increases as link speed increases
  - usually more than 1 event per cell arrival
  - how scalable is traditional simulation?
- More than 26 billion events to simulate 1 second!
- 30 hours if 1 event is processed in 1 µs
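A quick back-of-the-envelope check helps to get a feel for these orders of magnitude. The sketch below assumes the standard 53-byte ATM cell (an assumption, the slide does not state the cell size) and the helper names are invented for the example. It computes the cell rate of one source and of one link, and the wall-clock time needed for 26 billion events at 1 µs each.

```python
CELL_BITS = 53 * 8                    # assumption: standard 53-byte ATM cell

def cells_per_second(bit_rate):
    """Cell arrivals per second on a stream of the given bit rate."""
    return bit_rate / CELL_BITS

def hours_to_simulate(events, seconds_per_event):
    """Wall-clock hours needed to process the given number of events."""
    return events * seconds_per_event / 3600

source_rate = cells_per_second(50e6)      # one 50 Mbit/s traffic source
link_rate = cells_per_second(155e6)       # one 155 Mbit/s link
hours = hours_to_simulate(26e9, 1e-6)     # 26 billion events at 1 µs each

print(round(source_rate), round(link_rate), round(hours, 1))
```

At exactly 1 µs per event, 26 billion events take roughly 7 hours; a figure of 30 hours would correspond to about 4 µs per event, or to the higher event counts implied by "usually more than 1 event per cell arrival".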

9 Motivation for Parallel Simulation
- Sequential simulation is very slow
- Sequential simulation does not exploit the parallelism inherent in models
So why not use multiple processors?
- Variety of parallel simulation protocols
- Availability of parallel simulation tools
- Aim: achieve a certain speedup over the sequential simulator

10 Processing the Event List on a Multi-Processor Computer
The events are processed by many processors. Example:
- Processors p1 and p2 process event 1 (time 7) and event 2 (time 14) in parallel
- Processor 1 generates event 3 at time 9, to be processed by processor 2
- Processor 2 has already processed event 2 at time 14
Problem:
- the future can affect the past!
- this is the causality problem
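The causality problem can be reproduced with a toy logical process that executes every message as soon as it arrives. The class `NaiveLP` is a hypothetical name used only for this sketch; it simply records any message whose timestamp lies in the simulated past.

```python
class NaiveLP:
    """A logical process that executes each message as soon as it arrives."""

    def __init__(self, name):
        self.name = name
        self.clock = 0            # local simulation time
        self.violations = []      # messages that arrived "in the past"

    def receive(self, timestamp, label):
        if timestamp < self.clock:
            # The message belongs to the simulated past: causality is violated.
            self.violations.append((timestamp, label))
        else:
            self.clock = timestamp    # execute the event and advance the clock

p2 = NaiveLP("p2")
p2.receive(14, "event 2")    # p2 has already processed event 2 at time 14
p2.receive(9, "event 3")     # p1 then generates event 3 with timestamp 9
print(p2.violations)
```

Conservative protocols prevent the second call from ever happening out of order; optimistic protocols let it happen and repair the damage afterwards.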

11 Causal Dependencies
- EVL: scheduled events in timestamp order: (e1, 7), (e2, 9), (e3, 14), (e4, 20), (e5, 27), (e6, 40)
- The same events can be arranged as sequences ordered by causal dependencies
- Causal dependencies mean restrictions
- The sequence of events (e1, e2, e4, e6) can be executed in parallel with (e3, e5)
- If any event were simulated with e1: violation of causal dependencies

12 Parallel Simulation - Principles
- Execution of a discrete event simulation on a parallel or distributed system with several physical processors
- The simulation model is decomposed into several sub-models (Logical Processes, LPs) that can be executed in parallel
  - spatial partitioning
  - LPs communicate by sending timestamped messages
- Fundamental concepts
  - each LP can be at a different simulation time
  - local causality constraint: events in each LP must be executed in timestamp order

13 Parallel Simulation – example 1
[Figure: a model partitioned into logical processes (LPs) that run in parallel; figure labels: packet, event]

14 Parallel Simulation – example 2
- Logical processes (LPs) model airports, air traffic sectors, aircraft, etc.
- LPs interact by exchanging messages (events modelling aircraft departures, landings, etc.)
[Figure: five LPs exchanging messages]

15 Synchronisation Mechanisms
- Synchronisation algorithms
  - Conservative: avoids local causality violations by waiting until it is safe to process a message or event
  - Optimistic: allows local causality violations, but provisions are made to recover from them at runtime
  - Synchronous: all LPs process messages/events with the same timestamp in parallel

16 PDES Applications
- VLSI circuit simulation
- Parallel computing
- Communication networks
- Combat scenarios
- Health care systems
- Road traffic
- Simulation of models
  - Queueing networks
  - Petri nets
  - Finite state machines

17 Conservative Protocols
- Architecture of a conservative LP
- The Chandy-Misra-Bryant protocol
- The lookahead ability

18 Architecture of a Conservative LP
- LPs communicate by sending messages with non-decreasing timestamps
- each LP keeps a static FIFO channel for each LP from which it receives messages
- each FIFO channel (input channel, IC) has a clock c_i that ticks according to the timestamp of the topmost message, if any; otherwise it keeps the timestamp of the last message
[Figure: LP A receives from LP B, LP C and LP D; the link from B carries t_B1, t_B2 (c_1 = t_B1), the link from C carries t_C3, t_C4, t_C5 (c_2 = t_C3), and the link from D carries t_D4 (c_3 = t_D3)]

19 A Simple Conservative Algorithm
- each LP has to process events in timestamp order to avoid local causality violations
The Chandy-Misra-Bryant algorithm:
  while (simulation is not over) {
      determine the input channel IC_i with the smallest clock c_i
      if (IC_i is empty)
          wait for a message on IC_i
      else {
          remove the topmost event from IC_i
          process the event
      }
  }
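The loop above can be tried out with a small single-threaded sketch. The class `ConservativeLP`, its method names and the timestamps are assumptions chosen for illustration; "waiting for a message" is modelled by `step()` returning `None` rather than by a real blocking receive.

```python
from collections import deque

class ConservativeLP:
    def __init__(self, channel_names):
        self.channels = {n: deque() for n in channel_names}   # one FIFO per sender
        self.clocks = {n: 0 for n in channel_names}           # channel clocks c_i

    def deliver(self, channel, timestamp):
        """Append a message; senders are assumed to use non-decreasing timestamps."""
        queue = self.channels[channel]
        if not queue:
            self.clocks[channel] = timestamp   # the message becomes the topmost one
        queue.append(timestamp)

    def step(self):
        """Process one event, or return None if the LP has to block."""
        ch = min(self.clocks, key=self.clocks.get)   # channel with the smallest clock
        queue = self.channels[ch]
        if not queue:
            return None                # minimum-clock channel is empty: block
        ts = queue.popleft()
        # Clock = timestamp of the new topmost message, else of the last message.
        self.clocks[ch] = queue[0] if queue else ts
        return (ch, ts)

lp = ConservativeLP(["IC1", "IC2", "IC3"])
for ts in (3, 6):
    lp.deliver("IC1", ts)
for ts in (1, 4, 7):
    lp.deliver("IC2", ts)
lp.deliver("IC3", 5)

steps = [lp.step() for _ in range(5)]
print(steps)
```

The LP processes 1, 3, 4 and 5 in timestamp order and then blocks on IC3, even though IC1 and IC2 still hold events: nothing guarantees that IC3 will not deliver a message with a timestamp below 6.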

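20 Safe but Has to Block
[Figure: LP A with input channels IC1, IC2 and IC3 fed by LP B, LP C and LP D; a table lists, step by step, the input channel with the minimum clock and the event processed (events 1, 3, 4 and 5 are processed, then LP A has to BLOCK on IC3 before continuing)]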

21 Blocks and Even Deadlocks!
- S sends all messages to B
[Figure: source S feeds LPs A and B, whose outputs meet at the merge point M; M is BLOCKED]

22 How to Solve Deadlock: Null-Messages
- Use of null-messages for artificial propagation of simulation time
- What frequency?
[Figure: the same S, A, B, M network; with null-messages the merge point M is UNBLOCKED]

23 How to Solve Deadlock: Null-Messages
- a null-message indicates a Lower Bound Time Stamp
- the minimum link delay is 4
- LP C is initially at simulation time 0
- LP C sends a null-message with timestamp 4
- LP A sends a null-message with timestamp 8
- LP B sends a null-message with timestamp 12
- LP C can now process the event with timestamp 7
[Figure: LPs A, B and C connected in a cycle; the null-message timestamps 4, 8 and 12 propagate around it]
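The propagation of lower bounds around the cycle can be sketched as follows. This is a simplified model, not the full protocol: each LP is assumed to have a single input link, the function `null_message_round` and the dictionary `in_bound` are names invented for the example, and every LP simply forwards "bound on my input + lookahead".

```python
LOOKAHEAD = 4    # minimum link delay, taken from the slide

def null_message_round(ring, in_bound, start):
    """Pass one round of null-messages around a ring of LPs.

    in_bound[lp] is the lower bound on the timestamp of any future message
    arriving at lp. Returns the (sender, receiver, timestamp) triples sent.
    """
    sent = []
    i = ring.index(start)
    for _ in ring:
        sender = ring[i]
        receiver = ring[(i + 1) % len(ring)]
        bound = in_bound[sender] + LOOKAHEAD      # promise made by the sender
        sent.append((sender, receiver, bound))
        in_bound[receiver] = max(in_bound[receiver], bound)
        i = (i + 1) % len(ring)
    return sent

ring = ["C", "A", "B"]                  # C sends to A, A to B, B back to C
in_bound = {"A": 0, "B": 0, "C": 0}     # every LP starts at simulation time 0
sent = null_message_round(ring, in_bound, start="C")
can_process = 7 <= in_bound["C"]        # is C's pending event at time 7 safe?
print(sent, can_process)
```

After one round LP C knows that nothing earlier than 12 can still arrive, so its event with timestamp 7 is safe. With a smaller lookahead more rounds of null-messages would be needed, which is why lookahead matters so much for performance.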

24 The Lookahead Ability
- Null-messages are sent by an LP to indicate a lower bound on the timestamp of the future messages it will send
- null-messages rely on the "lookahead" ability
  - communication link delays
  - server processing time (FIFO)
- lookahead is very dependent on the application model and needs to be explicitly identified

25 Conservative: Pros & Cons
- Pros
  - simple, easy to implement
  - good performance when lookahead is large (communication networks, FIFO queue)
- Cons
  - pessimistic in many cases
  - large lookahead is essential for performance
  - no transparent exploitation of parallelism
  - performance may drop even with small changes in the model (adding preemption, adding one small lookahead link…)

26 Optimistic Protocols
- Architecture of an optimistic LP
- Time Warp

27 Architecture of an Optimistic LP
- LPs send timestamped messages, not necessarily in non-decreasing timestamp order
- no static communication channels between LPs; dynamic creation of LPs is easy
- each LP processes events as they are received, with no need to wait for safe events
- local causality violations are detected and corrected at runtime
- Most well-known optimistic mechanism: Time Warp
[Figure: LP A receives messages t_B1, t_B2, t_C3, t_C4, t_C5, t_D4 from LP B, LP C and LP D]

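28 Processing Events as They Arrive
- LP A processes messages from LP B, LP C and LP D as they arrive
- What to do with late messages?
[Figure: messages for LP A, listed in the figure as 11 (LP B), 13 (LP D), 18 (LP B), 22 (LP C), 25 (LP D), 28 (LP C), 36 (LP B), 32 (LP D); some are already marked "processed!"]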

29 TimeWarp Do, Undo, Redo

30 TimeWarp Rollback - How?
- Late messages (stragglers) are handled with a rollback mechanism
  - undo incorrect local computations
    - state saving: save the state variables of an LP
    - reverse computation
  - undo incorrect remote computations
    - anti-messages: anti-messages and (real) messages annihilate each other
  - process late messages
  - re-process previous messages: processed events are NOT discarded!
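The local part of the rollback (state saving, undo, redo) can be sketched as below. The class `TimeWarpLP` and its toy state update are assumptions made for illustration; anti-messages are deliberately left out, so this covers only the "undo local computations" and "re-process previous messages" steps.

```python
class TimeWarpLP:
    def __init__(self):
        self.state = 0          # toy state whose value depends on the event order
        self.lvt = 0            # local virtual time
        self.processed = []     # (timestamp, value) of executed events, kept for redo
        self.saved = []         # (lvt, state) snapshot taken before each event
        self.rollbacks = 0

    def _execute(self, ts, value):
        self.saved.append((self.lvt, self.state))    # state saving
        self.state = self.state * 2 + value          # order-dependent update
        self.lvt = ts
        self.processed.append((ts, value))

    def receive(self, ts, value):
        if ts >= self.lvt:
            self._execute(ts, value)                 # optimistic: just do it
            return
        # Straggler: undo every event with a later timestamp.
        self.rollbacks += 1
        redo = []
        while self.processed and self.processed[-1][0] > ts:
            redo.append(self.processed.pop())
            self.lvt, self.state = self.saved.pop()  # restore the saved state
        self._execute(ts, value)                     # process the late message
        for old_ts, old_value in reversed(redo):     # redo the undone events
            self._execute(old_ts, old_value)

lp = TimeWarpLP()
lp.receive(11, 1)
lp.receive(18, 2)
lp.receive(13, 5)     # straggler: arrives after the event at time 18 was executed
print(lp.state, [ts for ts, _ in lp.processed], lp.rollbacks)
```

The final state is the one a sequential run in timestamp order (11, 13, 18) would produce. In a full Time Warp implementation the undone event at time 18 would also trigger anti-messages for every message it had sent.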

31 Need for a Global Virtual Time
- Motivations
  - an indicator that the simulation time advances
  - reclaim memory (fossil collection)
- Basically, GVT is the minimum of
  - all LPs' logical simulation times
  - the timestamps of messages in transit
- GVT guarantees that
  - events below GVT are definitive events
  - no rollback can occur before the GVT
  - state points before GVT can be reclaimed
  - anti-messages before GVT can be reclaimed
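The definition of GVT and its use for fossil collection can be written down directly, assuming an instantaneous global snapshot is available (real distributed GVT algorithms have to work without one). The function names and sample values are illustrative assumptions; the sketch keeps the latest snapshot at or before GVT so that a rollback to GVT itself remains possible.

```python
def compute_gvt(lp_times, in_transit):
    """GVT = minimum of all LP local times and of timestamps of messages in transit."""
    return min(list(lp_times.values()) + list(in_transit))

def fossil_collect(saved_states, gvt):
    """Drop snapshots that can never be needed again.

    saved_states is a list of (timestamp, state) pairs sorted by timestamp.
    The latest snapshot at or before GVT is kept, together with all later ones.
    """
    older = [s for s in saved_states if s[0] <= gvt]
    newer = [s for s in saved_states if s[0] > gvt]
    return older[-1:] + newer

lp_times = {"A": 25, "B": 32, "C": 28}    # local simulation time of each LP
in_transit = [22, 40]                     # timestamps of messages not yet received
gvt = compute_gvt(lp_times, in_transit)

saved = [(11, "s0"), (18, "s1"), (22, "s2"), (25, "s3")]
kept = fossil_collect(saved, gvt)
print(gvt, kept)
```

Note that the message in transit with timestamp 22, not any LP clock, determines GVT here: ignoring messages in transit would allow a rollback to a time that had already been declared definitive.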

32 Time Warp - Overheads
- Periodic state savings
  - states may be large, very large!
  - copies are very costly
- Periodic GVT computations
  - costly in a distributed architecture
  - may block computations
- Rollback thrashing
  - cascaded rollbacks, no advancement!
- Memory!
  - memory is THE limitation

33 Optimistic Mechanisms: Pros & Cons
- Pros
  - exploits all the parallelism in the model; lookahead is less important
  - transparent to the end-user
  - can be general-purpose
- Cons
  - very complex, needs lots of memory
  - large overheads (state saving, GVT, rollbacks…)

34 Mixed/Adaptive Approaches
- General framework that (automatically) switches to conservative or optimistic
- Adaptive approaches may determine at runtime the amount of conservatism or optimism
[Figure: plot with labels "messages" and "performance", comparing conservative, optimistic and mixed approaches]

35 Synchronous Protocols
- Architecture of a synchronous LP

36 Synchronous Protocols
ALL for ONE and ONE for ALL!
The Three Musketeers, Alexandre Dumas (1802 – 1870)

37 A Simple Synchronous Algorithm
- avoids local causality violations
- LP: same data structures as a single sequential simulator
- Global clock shared among all LPs, with the same value everywhere
- Some data structures are private
[Figure: the LPs report their minimum timestamps (5, 12, 10 and 8); the global clock is set to the smallest, 5]

38 A Simple Synchronous Algorithm
  clock = 0;
  while (simulation is not over) {
      t = minimum_timestamp();
      clock = global_minimum(t);
      simulate_events(clock);
      synchronise();
  }
Basic operations
1. Computation of the minimum timestamp – reduction operation
2. Event consumption
3. Message distribution
4. Message reception – barrier operation
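The four basic operations can be imitated in a single process to see how the global clock advances. Everything here is an illustrative assumption: the function `synchronous_run`, the event tuples `(timestamp, label, destination)` and the fixed delay of 1 added to forwarded events. In a real implementation the reduction and the barrier would be collective operations across processors.

```python
import heapq

def synchronous_run(lps, end_time):
    """lps maps an LP name to a heap of (timestamp, label, destination or None)."""
    trace = []
    while True:
        # 1. Reduction: every LP reports its minimum timestamp.
        local_mins = [queue[0][0] for queue in lps.values() if queue]
        if not local_mins:
            break
        clock = min(local_mins)           # global clock, same value for all LPs
        if clock > end_time:
            break
        outbox = []
        # 2. Event consumption: each LP executes its events stamped `clock`.
        for name in sorted(lps):
            queue = lps[name]
            while queue and queue[0][0] == clock:
                ts, label, dest = heapq.heappop(queue)
                trace.append((clock, name, label))
                if dest is not None:      # the event forwards a message, delay 1
                    outbox.append((dest, (ts + 1, label + "'", None)))
        # 3. and 4. Message distribution and reception; the next iteration
        # starts only once every message is delivered (the barrier).
        for dest, message in outbox:
            heapq.heappush(lps[dest], message)
    return trace

lps = {
    "A": [(5, "a", "B")],     # A's event at time 5 sends a message to B
    "B": [(12, "b", None)],
    "C": [(10, "c", None)],
    "D": [(8, "d", None)],
}
trace = synchronous_run(lps, end_time=100)
print(trace)
```

Only one LP has work in each round of this example, which illustrates the worst case: the simulator behaves like a sequential one while still paying for a reduction and a barrier per round.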

39 Synchronous Mechanisms: Pros & Cons
- Pros
  - simple, easy to implement
  - good performance if parallelism is exploited with a moderate synchronisation cost
- Cons
  - pessimistic in many cases
  - worst case: the simulator behaves like the sequential one
  - performance may drop if the cost of LP synchronisation (reduction, barrier) is high

40 PDES Languages
- A number of PDES languages have been developed in recent years
  - PARSEC
  - Compose
  - ModSim
  - etc.
- Most of these languages are general-purpose languages
- PARSEC
  - Developed at UCLA Parallel Computing Lab.
  - Availability: http://pcl.cs.ucla.edu/projects/parsec/
  - Simplicity
  - Efficient event scheduling mechanism

41 Georgia Tech Time Warp (GTW)
- Optimistic discrete event simulator developed by the PADS group of the Georgia Institute of Technology
  - http://www.cc.gatech.edu/computing/pads/tech-parallel-gtw.html
- Supports small-granularity simulation
- GTW runs on shared-memory multiprocessor machines
  - Sun Enterprise, SGI Origin
- TeD: Telecommunications Description Language
  - a language developed mainly for modelling telecommunication network elements and protocols
- Jane: simulator-independent client/server-based graphical interface and scripting tool for interactive parallel simulations
- TeD/GTW simulations can be executed using the Jane system

42 BYOwS!
- BYOwS: Build Your Own Simulator
- Choose a programming language
  - C, C++, Java
- Learn basic MPI
  - MPI: Message Passing Interface
  - Point-to-point communication
  - Available on the school Linux machines
- Implement a simple PDES protocol
- Case study: a simple queueing network

43 Parallel Simulation Today
- Lots of algorithms have been proposed
  - variations on conservative and optimistic
  - adaptive approaches
- Few end-users
  - parallel simulators compete with sequential simulators in terms of user interface, generality, ease of use, etc.
- Research mainly focuses on
  - applications, ultra-large-scale simulations
  - tools and execution environments (clusters)
  - federated simulations: different simulators interoperate with each other in executing a single simulation (battlefield simulation, distributed multi-user games)

44 Parallel Simulation - Conclusion
- Pros
  - reduction of the simulation time
  - increase of the model size
- Cons
  - causality constraints are difficult to maintain
  - special mechanisms are needed to synchronise the different processors
  - both the model and the simulation kernel become more complex
- Challenges
  - ease of use, transparency

45 References
- Parallel simulation
  - R. Fujimoto, Parallel and Distributed Simulation Systems, John Wiley & Sons, 2000
  - R. Fujimoto, Parallel Discrete Event Simulation, Communications of the ACM, Vol. 33(10), Oct. 1990, pp. 31-53
- Parallel Simulation – Links
  - http://www.cs.utsa.edu/research/ParSim/

