Parallel and Distributed Simulation Techniques: Achieving speedups in model execution
The problem Generating enough samples to draw statistically significant conclusions. Sequential simulation: low performance, and worse as the problem becomes more complex. SOLUTION: devote more resources (for instance, multiple processors). GOAL: achieve speedups in obtaining results.
Decomposition alternatives: independent replications, parallelizing compilers, distributed functions, distributed events, model decomposition.
Independent replications (+) Efficient (-) Memory constraints (-) Complex models cannot be divided into simpler submodels that execute simultaneously
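A minimal sketch of independent replications in Python, assuming a multiprocessing pool where each worker runs the full model with a different seed; run_replication and its placeholder model are illustrative, not from the source.

    # Independent replications: run the same model with different seeds in parallel.
    import random
    from multiprocessing import Pool

    def run_replication(seed):
        rng = random.Random(seed)
        # Placeholder model: estimate the mean of an exponential service time.
        samples = [rng.expovariate(1.0) for _ in range(10_000)]
        return sum(samples) / len(samples)

    if __name__ == "__main__":
        with Pool(processes=4) as pool:
            results = pool.map(run_replication, range(8))   # 8 replications, 8 seeds
        print("per-replication means:", results)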
Parallelizing compilers (+) Transparent to programmers (+) Existing code can be reused (-) Ignore the structure of the problem (-) Parallelism is not fully exploited
Distributed functions Support tasks are assigned to independent processors (+) Transparent to the user (+) No synchronization problems (-) Limited speedup (low degree of parallelism)
Distributed events Global event list: when a processor is free, it executes the next event in the list (+) Speedups (-) Protocols are needed to preserve consistency (-) Complex for distributed-memory systems (+) Effective when much global information is shared
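A minimal sketch of the global-event-list approach, assuming a shared priority queue protected by a lock and worker threads that take the next event when free; the consistency protocols mentioned above are omitted, and event names are illustrative.

    # Shared global event list: free workers take the next (lowest-timestamp) event.
    import heapq
    import threading

    event_list = [(1.0, "arrival"), (2.0, "departure"), (2.5, "arrival")]
    heapq.heapify(event_list)
    lock = threading.Lock()

    def worker(worker_id):
        while True:
            with lock:
                if not event_list:
                    return
                timestamp, kind = heapq.heappop(event_list)  # next event in global order
            print(f"worker {worker_id} processes {kind} at t={timestamp}")

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()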
Model decomposition The model is divided into loosely coupled components. They interact through message passing. (+) Speedups when there is little global information (+) Makes use of the model's parallelism (-) Synchronization problems
Classification of synchronization mechanisms Time-driven simulation: the clock advances one tick at a time, and all events corresponding to that tick are processed. Event-driven simulation: the clock is advanced to the timestamp of the next simulated event. Synchronous communication: a global clock is used; every processor is at the same simulated time. Asynchronous communication: each processor uses its own clock.
Time-driven simulation Time advances in fixed increments (ticks). Each process simulates the events for the current tick. Precision: shorter ticks give higher precision. Synchronous communication Every process must finish with a tick before advancing to the next. Synchronization phase. Central/distributed global clock.
Time-driven simulation (cont.) Asynchronous communication Higher level of concurrency. A processor cannot simulate events for a new tick without knowing that the previous ticks have finished. More overhead (synchronization).
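A minimal sketch of synchronous time-driven simulation, assuming one thread per process and a barrier as the synchronization phase; the per-tick event handling is a placeholder.

    # Time-driven, synchronous: every process finishes the current tick, then all
    # advance together through a barrier.
    import threading

    N_PROCESSES, N_TICKS = 3, 5
    barrier = threading.Barrier(N_PROCESSES)

    def process(pid):
        for tick in range(N_TICKS):
            # Simulate all events assigned to this process for the current tick.
            print(f"process {pid}: simulating tick {tick}")
            barrier.wait()          # synchronization phase: wait for every process

    threads = [threading.Thread(target=process, args=(i,)) for i in range(N_PROCESSES)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()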
Event-driven simulation Time advances with the occurrence of events. Higher potential speedups. Synchronous Global clock: advances to the minimum timestamp of the next event. Centralized or distributed.
Event-driven simulation (cont.) Asynchronous Local clock: the minimum timestamp of this processor's next event. More independence between processes. Independent events are simulated in parallel, improving performance.
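A minimal sketch of asynchronous event-driven clock advancement, assuming two logical processes whose events are independent; each advances its own clock directly to the timestamp of its next local event. Event lists and names are illustrative.

    # Event-driven, asynchronous: each process jumps its own clock to the
    # timestamp of its next local event.
    import heapq

    local_events = {
        "LP0": [(0.5, "e1"), (1.7, "e2")],
        "LP1": [(0.2, "e3"), (3.1, "e4")],
    }

    for lp, events in local_events.items():
        heapq.heapify(events)
        clock = 0.0
        while events:
            timestamp, name = heapq.heappop(events)
            clock = timestamp                      # advance directly to the next event
            print(f"{lp}: clock={clock}, processing {name}")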
PDES (Parallel Discrete Event Simulation) Asynchronous, event-driven simulation. Logical Processes (LPs) represent physical processes. Every LP maintains a clock, a local event list, two link lists (one for input links, one for output links), and a part of the global state.
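One possible layout of a Logical Process in Python, assuming point-to-point links modelled as FIFO queues keyed by neighbour name; the field names are illustrative, not prescribed by the source.

    from collections import deque
    from dataclasses import dataclass, field

    @dataclass
    class LogicalProcess:
        name: str
        clock: float = 0.0                                  # local simulated time
        event_list: list = field(default_factory=list)      # (timestamp, event) heap
        input_links: dict = field(default_factory=dict)     # sender name -> deque of messages
        output_links: dict = field(default_factory=dict)    # receiver name -> deque of messages
        local_state: dict = field(default_factory=dict)     # this LP's part of the global state

    lp = LogicalProcess(name="queue_server")
    lp.input_links["source"] = deque()
    print(lp)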
Causality Causality errors cannot occur if we meet the Local Causality Constraint: each LP processes events in increasing timestamp order. This is a sufficient (but not necessary) condition to avoid causality errors. CONSERVATIVE strategies avoid causality errors. OPTIMISTIC strategies allow errors to occur and then fix them.
Conservative mechanisms (Chandy/Misra) When is it safe to process an event? A message (representing an event) carries a timestamp; if no other message with a smaller timestamp can arrive, the event can be processed.
    while there is an unprocessed message in each input link:
        Clock_i = min(timestamps at the head of the input links)
        process EVERY message with that timestamp
    if the link with the smallest timestamp bound has no message to process:
        block the Logical Process
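A single-LP sketch of this conservative rule in Python: the LP only processes messages that are provably safe, and blocks when the link with the smallest bound is empty. The link contents are illustrative, and delivery of messages from other LPs is not modelled.

    from collections import deque

    # Each input link: a FIFO of (timestamp, event) plus a lower bound on future timestamps.
    input_links = {
        "A": {"queue": deque([(1.0, "a1"), (4.0, "a2")]), "bound": 1.0},
        "B": {"queue": deque([(2.0, "b1")]), "bound": 2.0},
    }

    def link_time(link):
        # Timestamp of the next message, or the promised bound if the link is empty.
        return link["queue"][0][0] if link["queue"] else link["bound"]

    clock = 0.0
    while True:
        limiting = min(input_links.values(), key=link_time)   # link limiting what is safe
        if not limiting["queue"]:
            print(f"blocked: waiting on a link whose lower bound is {limiting['bound']}")
            break       # a real LP would block until a message (or null message) arrives
        clock = link_time(limiting)
        # Process every queued message carrying this (now safe) timestamp.
        for name, link in input_links.items():
            while link["queue"] and link["queue"][0][0] == clock:
                ts, event = link["queue"].popleft()
                print(f"clock={clock}: processing {event} from link {name}")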
Deadlock in conservative (pessimistic) strategies: a cycle of LPs can block, each waiting for a message on the input link with the smallest timestamp bound.
Null-message strategies A null message from A to B is a "promise" that A will not send B any message with a timestamp smaller than the null message's. Input links are used to compute the minimum future timestamp (a lower bound). When an event is processed, a null message carrying this lower bound is sent. The receiver computes new output bounds from it and forwards them to its neighbours.
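A minimal sketch of null-message exchange, assuming a fixed lookahead and a two-LP topology; the LOOKAHEAD value, class layout, and method names are illustrative assumptions.

    # Null messages: after processing, an LP promises on every output link that no
    # future message will carry a timestamp smaller than clock + lookahead.
    LOOKAHEAD = 0.5     # illustrative lookahead value

    class LP:
        def __init__(self, name):
            self.name = name
            self.clock = 0.0
            self.link_bounds = {}       # sender name -> promised minimum future timestamp
            self.neighbors = []

        def receive_null(self, sender, timestamp):
            # The receiver uses the promise as a new lower bound for that input link.
            self.link_bounds[sender] = max(self.link_bounds.get(sender, 0.0), timestamp)
            print(f"{self.name}: bound from {sender} is now {self.link_bounds[sender]}")

        def process_event(self, timestamp):
            self.clock = timestamp
            # Send the promise (null message) to every output neighbour.
            for neighbor in self.neighbors:
                neighbor.receive_null(self.name, self.clock + LOOKAHEAD)

    a, b = LP("A"), LP("B")
    a.neighbors, b.neighbors = [b], [a]
    a.process_event(1.0)    # A processes an event at t=1.0 and promises t >= 1.5 to B
    b.process_event(1.2)    # B processes an event at t=1.2 and promises t >= 1.7 to A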
Avoiding deadlock There can be no cycle in which the timestamp increment is 0. Null messages can also be sent on request. Better results are obtained when good lookahead on future timestamps is available.
Analysing conservative (pessimistic) strategies The degree of lookahead is critical. Parallelism cannot be exploited to the maximum. Not robust: changes in the application mean changing the lookahead times. The programmer must provide lookahead information. Trade-off: optimistic strategies risk causality errors; conservative strategies give reduced performance.