1
Towards a Realistic Scheduling Model
Oliver Sinnen, Leonel Sousa, Frode Eika Sandnes
IEEE TPDS, Vol. 17, No. 3, pp. 263-275, 2006.
2
Parallel processing is the oldest discipline in computer science – yet the general problem is far from solved
3
Why is parallel processing difficult?
"The more cooks, the more mess" (Norwegian proverb)
– Partitioning and transforming problems
– Load balancing
– Inter-processor communication
– Granularity
– Architecture
4
Implementing parallel systems
Manually
– MPI
– PVM
– Linda
Automatically
– Parallelising compilers (Fortran)
– Static scheduling
5
Taskgraph scheduling: Representing static computations
6
Modelling computations
A = B+C
Data dependencies: A depends on B and C
Valid sequences: CBA, BCA
Invalid sequences: ABC, ACB, CAB, BAC
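The valid and invalid sequences listed on this slide can be checked mechanically. The following is a minimal sketch, assuming the dependencies are written as (producer, consumer) pairs; the names `DEPENDENCIES` and `is_valid` are illustrative and not from the slides.

```python
from itertools import permutations

# A = B + C: both B and C must be available before A can be computed.
# Each pair is (producer, consumer).
DEPENDENCIES = [("B", "A"), ("C", "A")]


def is_valid(sequence, dependencies=DEPENDENCIES):
    """Return True if every producer appears before its consumer."""
    position = {task: i for i, task in enumerate(sequence)}
    return all(position[src] < position[dst] for src, dst in dependencies)


all_sequences = ["".join(p) for p in permutations("ABC")]
valid = sorted(s for s in all_sequences if is_valid(s))
invalid = sorted(s for s in all_sequences if not is_valid(s))

print(valid)    # ['BCA', 'CBA']
print(invalid)  # ['ABC', 'ACB', 'BAC', 'CAB']
```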
7
Another example
A = (B-C)/D
F = B+G
[Figure: dependency graph in which A depends on B, C and D, and F depends on B and G]
8
Scheduling
9
Static taskgraph scheduling techniques
The scheduling process: taskgraph, allocation, schedule
[Figure: a taskgraph with tasks A to E is allocated to processors p1 and p2 and laid out as a schedule over time; labels c1 to c5]
10
Topological sorting
– To order the vertices of a graph such that the precedence constraints are not violated
All valid schedules represent a topological sort.
Scheduling algorithms differ in how they topologically sort the graph.
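As an illustration, here is a minimal sketch of one standard way to produce a topological sort (Kahn's algorithm), applied to the earlier example A = (B-C)/D, F = B+G. The edge list is an assumption read off that example, and the function name `topological_sort` is illustrative.

```python
from collections import deque


def topological_sort(vertices, edges):
    """Kahn's algorithm: repeatedly emit a vertex with no unprocessed predecessors."""
    successors = {v: [] for v in vertices}
    in_degree = {v: 0 for v in vertices}
    for src, dst in edges:
        successors[src].append(dst)
        in_degree[dst] += 1

    # Start with the vertices that depend on nothing (sorted for determinism).
    ready = deque(sorted(v for v in vertices if in_degree[v] == 0))
    order = []
    while ready:
        v = ready.popleft()
        order.append(v)
        for w in successors[v]:
            in_degree[w] -= 1
            if in_degree[w] == 0:
                ready.append(w)

    if len(order) != len(vertices):
        raise ValueError("graph has a cycle, so no valid order exists")
    return order


# Assumed edges for A = (B-C)/D and F = B+G, as (producer, consumer) pairs.
VERTICES = "ABCDFG"
EDGES = [("B", "A"), ("C", "A"), ("D", "A"), ("B", "F"), ("G", "F")]

order = topological_sort(VERTICES, EDGES)
print("".join(order))  # BCDGAF
```

The choice of which ready vertex to take next (here simply first in, first out) is the point at which different scheduling heuristics would diverge.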
11
The importance of abstraction
Abstraction is important to preserve generality.
Too specific: float sum = 0; for (int i = 0; i < 8; i++) { sum += a[i]; }
General and flexible: float sum = sumArray(a);
12
Communication
13
Communication is a major bottleneck
Typically a 1:50 to 1:10,000 difference between computation and communication.
Communication cost is not very dependent on data size.
Interconnection network topology affects the overall time.
14
Scheduling work prior to 1995
Assumptions
– Zero interprocessor communication costs
– Fully connected processor interconnection networks
15
Amounts of data transfer Public transport is a good thing?
16
Data size is not a major factor
Multiple single messages: connect, send, connect, send, connect, send
Single compound message: connect, send
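The contrast between several single messages and one compound message can be sketched with a simple cost model in which every message pays a fixed connection cost plus a size-dependent send cost. The model and all numbers below are hypothetical and chosen only for illustration.

```python
def total_time(setup, per_unit, sizes, compound):
    """Time to transfer payloads of the given sizes.

    compound=False: every payload is its own message and pays `setup`.
    compound=True:  all payloads travel in one message with a single `setup`.
    """
    connections = 1 if compound else len(sizes)
    return connections * setup + per_unit * sum(sizes)


# Hypothetical figures: 100 time units to connect, 10 per kilobyte sent.
SETUP = 100
PER_KB = 10
SIZES_KB = [1, 1, 1]

separate = total_time(SETUP, PER_KB, SIZES_KB, compound=False)
combined = total_time(SETUP, PER_KB, SIZES_KB, compound=True)
print(separate, combined)  # 330 130
```

Under these assumed numbers the connection cost dominates, so the amount of data matters far less than the number of messages.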
17
Interconnection topology
18
Fully connected
19
The ring
To send something from here... to here
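To make the difference between the two topologies concrete, the sketch below counts the links a message crosses. It assumes processors are numbered 0 to n-1 around the ring; whether the ring on the slide is unidirectional or bidirectional is not stated, so both cases are covered by a flag. The function names are illustrative.

```python
def ring_hops(src, dst, n, bidirectional=True):
    """Number of links crossed between two processors on an n-node ring."""
    forward = (dst - src) % n
    if not bidirectional:
        return forward
    # On a bidirectional ring the message takes the shorter way round.
    return min(forward, n - forward)


def fully_connected_hops(src, dst):
    """Every pair of distinct processors shares a direct link."""
    return 0 if src == dst else 1


print(ring_hops(0, 4, 8))                       # 4
print(ring_hops(0, 7, 8))                       # 1
print(ring_hops(0, 7, 8, bidirectional=False))  # 7
print(fully_connected_hops(0, 7))               # 1
```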
20
Interprocessor communication
Zero vs non-zero communication overheads
Direct links vs connecting nodes
[Figure: shared memory, bus-based multiprocessor (P1 to P4 and RAM on a bus); distributed memory, mesh multiprocessor (P11 to P44 in a 4×4 grid)]
21
Avoiding communication overheads
22
Duplication
[Figure: taskgraph in which a precedes b and c, with unit costs; schedules on p1 and p2 for plain allocation (t=1 to t=3) and for duplication of a (t=1 to t=2)]
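The effect of duplication can be reproduced with a small schedule evaluator. This is a sketch under assumptions: it reads the slide's figure as a taskgraph where a precedes b and c, with every task and every communication costing 1, and it uses the rule that data moving between different processors pays the communication cost while local data is free. The function `makespan` and its data layout are illustrative.

```python
def makespan(allocation, weight, edges):
    """Finish time of a schedule.

    allocation: {processor: [tasks in execution order]}; a task listed on
                several processors is duplicated.
    weight:     {task: execution time}
    edges:      {(producer, consumer): communication cost}
    """
    copies = {}
    for tasks in allocation.values():
        for task in tasks:
            copies[task] = copies.get(task, 0) + 1
    preds = {}
    for (src, dst), cost in edges.items():
        preds.setdefault(dst, []).append((src, cost))

    finish = {}                            # (task, processor) -> finish time
    free_at = {p: 0 for p in allocation}   # when each processor is next idle
    position = {p: 0 for p in allocation}  # next task index per processor
    remaining = sum(copies.values())

    while remaining:
        progressed = False
        for proc, tasks in allocation.items():
            if position[proc] == len(tasks):
                continue
            task = tasks[position[proc]]
            ready, blocked = 0, False
            for src, cost in preds.get(task, []):
                # Arrival time of the data from every copy of the producer.
                arrivals = [f + (0 if q == proc else cost)
                            for (t, q), f in finish.items() if t == src]
                if not arrivals or len(arrivals) < copies.get(src, 0):
                    blocked = True         # wait until all copies are placed
                    break
                ready = max(ready, min(arrivals))
            if blocked:
                continue
            start = max(ready, free_at[proc])
            finish[(task, proc)] = free_at[proc] = start + weight[task]
            position[proc] += 1
            remaining -= 1
            progressed = True
        if not progressed:
            raise ValueError("allocation violates the precedence constraints")
    return max(free_at.values())


WEIGHT = {"a": 1, "b": 1, "c": 1}
EDGES = {("a", "b"): 1, ("a", "c"): 1}

plain = makespan({"p1": ["a", "b"], "p2": ["c"]}, WEIGHT, EDGES)
duplicated = makespan({"p1": ["a", "b"], "p2": ["a", "c"]}, WEIGHT, EDGES)
print(plain, duplicated)  # 3 2
```

Running a redundantly on p2 removes the only inter-processor transfer, which is why the schedule shortens from 3 to 2 time units under these assumptions.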
23
When considering communication overheads
24
Classic communication model: Assumptions
Local communications have zero communication costs.
Communication is conducted by a dedicated subsystem.
Communications can be performed concurrently.
The network is fully connected.
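Taken together, these assumptions are commonly expressed as a single constraint on when a task may start. The notation below is an assumption and does not appear on the slides: $t_s$ and $t_f$ are start and finish times, $\mathrm{proc}(n)$ is the processor a task is allocated to, and $c(e_{ij})$ is the communication cost of the edge from $n_i$ to $n_j$.

```latex
t_s(n_j) \;\ge\; \max_{n_i \in \mathrm{pred}(n_j)}
  \Bigl( t_f(n_i) +
    \begin{cases}
      0         & \text{if } \mathrm{proc}(n_i) = \mathrm{proc}(n_j) \\
      c(e_{ij}) & \text{otherwise}
    \end{cases}
  \Bigr)
```

Nothing in this constraint depends on what else the network or the processors are doing, which reflects the concurrency and full-connectivity assumptions.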
25
Implications
Network contention (not modelled)
– Tasks compete for communication resources
Contention can be modelled:
– Different types of edges
– Switch vertices (in addition to processor vertices)
26
Processor involvement in communication I
Two-sided involvement (TCP/IP PC cluster)
27
Processor involvement in communication II
One-sided involvement (shared memory, Cray T3E)
28
Processor involvement in communication III
Third-party involvement (dedicated DMA hardware, Meiko CS-2)
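The three kinds of involvement can be contrasted by asking how long each processor is occupied by a transfer. The sketch below is a deliberate simplification and not the model from the paper: it assumes a single hypothetical `overhead` value, and for the one-sided case it assumes the sender is the active side.

```python
def processor_busy_time(involvement, overhead):
    """Return (sender_busy, receiver_busy) for one transfer.

    Simplifying assumptions: an involved processor is occupied for exactly
    `overhead`, and in the one-sided case the sender does the work.
    """
    if involvement == "third-party":
        return (0, 0)                # a dedicated device handles the transfer
    if involvement == "one-sided":
        return (overhead, 0)         # only one processor takes part
    if involvement == "two-sided":
        return (overhead, overhead)  # both processors take part
    raise ValueError(f"unknown involvement type: {involvement}")


def next_task_start(task_finish, involvement, overhead):
    """Earliest time the sending processor can begin its next task."""
    sender_busy, _ = processor_busy_time(involvement, overhead)
    return task_finish + sender_busy


for kind in ("third-party", "one-sided", "two-sided"):
    print(kind, next_task_start(task_finish=5, involvement=kind, overhead=2))
# third-party 5
# one-sided 7
# two-sided 7
```

A scheduler that assumes the third-party case on hardware that behaves like the other two would place the next task too early, which is one way finish-time estimates can drift.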
29
Problems
All classic scheduling models assume third-party involvement.
Very little hardware is equipped with dedicated support for third-party involvement.
Estimated finish times for tasks are hugely inaccurate.
Scheduling algorithms are very suboptimal.
30
Even more problems
31
Results: bobcat, Sun E3500, T3E-900
32
The End