CS294-6 Reconfigurable Computing Day 23 November 10, 1998 Stream Processing.


1 CS294-6 Reconfigurable Computing Day 23 November 10, 1998 Stream Processing

2 Previously Computing Requirements SCORE –stream-based computing model –use streams for linking computations instead of shared memory locations expose parallelism freedom of sequential/spatial implementation

3 Today Streams moderately well developed for –sequential atoms in multithreaded/multiprocessor environment General DF case SDF Expression...thoughts on adapting ideas for SCORE-like execution

4 General Dataflow case Dataflow graph exposes parallelism Operators enabled as soon as data is available Captures partial ordering for computation Adaptive/tolerant to latencies in system => great for exposing parallelism

5 General Dataflow Fine-grained –expose maximum parallelism –…but rendezvous/presence overhead for every operator Who runs when is unpredictable –variable latencies –variable consumption/production –=> force runtime synchronization/scheduling

6 General Dataflow What structure to exploit to reduce requirements?

7 General Dataflow What structure to exploit to reduce requirements? –Spatial operator locality most communication local (sequential) –Operation blocks only do dataflow presence on input to region of code sequential/direct computation of subgraph –all local/deterministic computations in subgraph –Cyclic/predictable dataflow?

8 Dataflow Multithreading Original DF: –synchronize per instruction Hybrid DF -> TAM –synchronize on remote memory access (msgs) –run scheduling quanta (several instructions) Multithreading –coarse-grain tasks –synchronize on input data –(also locking)

9 What to watch for With arbitrary I/O rates –unbounded buffering requirements

10 Synchronous Data Flow Restriction –number of tokens produced/consumed is constant per operator firing –these numbers known at compile time –each edge has predetermined number of initial tokens Consistent –admissible and periodic

11 SDF: Periodic Periodic –invoke each operator at least once –return to initial state (# tokens on each edge) –can determine by balance equations
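
The balance equations can be sketched in code. For every edge, the tokens produced over one period must equal the tokens consumed: q[src] * produced = q[dst] * consumed. The following is a minimal sketch, not the lecture's own algorithm. The function name `repetition_vector`, the edge tuple layout, and the example rates (A produces 2 per firing, B consumes 1; B produces 2, C consumes 1) are assumptions chosen to match the 1 A, 2 B, 4 C example later in the deck. It also assumes a connected graph.

```python
from fractions import Fraction
from math import lcm  # Python 3.9+


def repetition_vector(edges):
    """Solve the SDF balance equations for a connected graph.

    edges: list of (src, dst, produced_per_firing, consumed_per_firing).
    Returns the smallest positive integer firing counts, or None when the
    rates are inconsistent (no periodic schedule exists).
    """
    # Fix the first operator's relative rate to 1 and propagate along edges.
    rate = {edges[0][0]: Fraction(1)}
    changed = True
    while changed:
        changed = False
        for src, dst, prod, cons in edges:
            if src in rate and dst not in rate:
                rate[dst] = rate[src] * prod / cons
                changed = True
            elif dst in rate and src not in rate:
                rate[src] = rate[dst] * cons / prod
                changed = True
    # Every edge must balance, including edges that close a loop.
    for src, dst, prod, cons in edges:
        if rate[src] * prod != rate[dst] * cons:
            return None
    # Scale the rational rates up to the smallest integer solution.
    scale = lcm(*(r.denominator for r in rate.values()))
    return {op: int(r * scale) for op, r in rate.items()}


chain = [("A", "B", 2, 1), ("B", "C", 2, 1)]   # hypothetical rates
print(repetition_vector(chain))                # {'A': 1, 'B': 2, 'C': 4}
```

Adding an edge whose rates contradict the others, for example a direct A to C edge that produces 1 and consumes 1, makes the check fail and the function return None.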

12 SDF: Admissible Admissible –firing sequence does not yield deadlock

13 SDF: Inadmissible

14 SDF: Admissible
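
The figures for the two slides above are not in the transcript, so here is a small illustration of the distinction under assumed graphs. It is a sketch only: `is_admissible`, the five-field edge tuples, and the two-operator cycle are hypothetical. The check simulates one period. In SDF each edge has a single consumer, so firing one enabled operator never disables another, which is why firing greedily in any order is enough to detect deadlock.

```python
def is_admissible(edges, repetitions):
    """Return True if one full period can be fired without deadlock.

    edges: list of (src, dst, produced, consumed, initial_tokens).
    repetitions: firing counts per operator from the balance equations.
    Assumes at most one edge between any ordered pair of operators.
    """
    tokens = {(s, d): init for s, d, _, _, init in edges}
    remaining = dict(repetitions)
    progress = True
    while progress and any(remaining.values()):
        progress = False
        for op in remaining:
            if remaining[op] == 0:
                continue
            inputs_ready = all(tokens[(s, d)] >= c
                               for s, d, _, c, _ in edges if d == op)
            if inputs_ready:
                for s, d, p, c, _ in edges:
                    if d == op:
                        tokens[(s, d)] -= c   # consume inputs
                for s, d, p, c, _ in edges:
                    if s == op:
                        tokens[(s, d)] += p   # produce outputs
                remaining[op] -= 1
                progress = True
    return not any(remaining.values())


# Hypothetical cycle A -> B -> A, one token per firing on each edge.
no_tokens = [("A", "B", 1, 1, 0), ("B", "A", 1, 1, 0)]
one_token = [("A", "B", 1, 1, 0), ("B", "A", 1, 1, 1)]
print(is_admissible(no_tokens, {"A": 1, "B": 1}))  # False: each waits on the other
print(is_admissible(one_token, {"A": 1, "B": 1}))  # True: the initial token breaks the cycle
```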

15 Benefits Periodic schedules Bounded buffer requirements –Acyclic graphs: optimal algorithm –Cyclic graphs: NP-complete, heuristic algorithm … close to optimal buffering

16 SDF Example By Balance Equations –1 A, 2 B, 4 C Firing Sequences: –ABCBCCC –ABCCBCC –ABBCCCC Buffer Costs –5 (AB=2 BC=3) –4 (AB=2 BC=2) –6 (AB=2 BC=4)
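
The buffer costs on this slide can be reproduced by simulating each firing sequence and recording the peak token count on every edge. The graph itself is not in the transcript. The rates below (A produces 2 tokens on AB and B consumes 1; B produces 2 on BC and C consumes 1) are an assumption inferred from the listed repetition counts and costs, and `buffer_cost` is a hypothetical helper name.

```python
def buffer_cost(edges, sequence):
    """Simulate a firing sequence; return (total peak buffering, peak per edge).

    edges: list of (src, dst, produced_per_firing, consumed_per_firing).
    sequence: iterable of operator names in firing order.
    """
    tokens = {(s, d): 0 for s, d, _, _ in edges}
    peak = dict(tokens)
    for op in sequence:
        # Consume from every input edge; refuse an invalid firing.
        for s, d, _, cons in edges:
            if d == op:
                if tokens[(s, d)] < cons:
                    raise ValueError(f"{op} fired without enough input tokens")
                tokens[(s, d)] -= cons
        # Produce on every output edge and track the high-water mark.
        for s, d, prod, _ in edges:
            if s == op:
                tokens[(s, d)] += prod
                peak[(s, d)] = max(peak[(s, d)], tokens[(s, d)])
    return sum(peak.values()), peak


chain = [("A", "B", 2, 1), ("B", "C", 2, 1)]   # assumed rates
for seq in ("ABCBCCC", "ABCCBCC", "ABBCCCC"):
    total, per_edge = buffer_cost(chain, seq)
    print(seq, total, per_edge)
```

Under these assumed rates the three sequences give totals of 5, 4, and 6, matching the slide, with AB peaking at 2 in every case and BC peaking at 3, 2, and 4.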

17 Scheduling (min buffer) F= fireable operator D=deferrable(F) = edge has enough tokens to fire sink While (F ≠ ∅) –if ((F−D) ≠ ∅) fire from F−D –else fire operator which increases number of tokens least

18 Buffer Minimization Repeat –1 A –2 B –4 C F={A}, D=∅ –A F={B}, D=∅ –B F={B,C}, D={B} –C F={B,C}, D={B} –C F={B}, D=∅ –B
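
The min-buffer heuristic and the trace above can be sketched as follows. Several details are assumptions rather than the lecture's definitions. An operator is treated as deferrable when at least one of its output edges already holds enough tokens to fire its sink. Ties are broken by the smallest net token increase and then alphabetically. The graph rates are the same inferred ones (A produces 2, B consumes 1; B produces 2, C consumes 1). The name `min_buffer_schedule` is hypothetical.

```python
def min_buffer_schedule(edges, repetitions):
    """Greedy buffer-minimizing schedule for one period of an SDF graph.

    edges: list of (src, dst, produced_per_firing, consumed_per_firing).
    repetitions: firing counts per operator from the balance equations.
    """
    tokens = {(s, d): 0 for s, d, _, _ in edges}
    remaining = dict(repetitions)
    schedule = []

    def fireable(op):
        return remaining[op] > 0 and all(
            tokens[(s, d)] >= c for s, d, _, c in edges if d == op)

    def deferrable(op):
        # Assumed reading: some output edge can already fire its sink.
        return any(tokens[(s, d)] >= c for s, d, _, c in edges if s == op)

    def net_increase(op):
        produced = sum(p for s, _, p, _ in edges if s == op)
        consumed = sum(c for _, d, _, c in edges if d == op)
        return produced - consumed

    while True:
        F = sorted(op for op in remaining if fireable(op))
        if not F:
            break
        not_deferrable = [op for op in F if not deferrable(op)]   # F - D
        op = min(not_deferrable or F, key=net_increase)
        for s, d, p, c in edges:
            if d == op:
                tokens[(s, d)] -= c
            if s == op:
                tokens[(s, d)] += p
        remaining[op] -= 1
        schedule.append(op)

    if any(remaining.values()):
        raise ValueError("deadlock before the period completed")
    return schedule


chain = [("A", "B", 2, 1), ("B", "C", 2, 1)]   # assumed rates
print("".join(min_buffer_schedule(chain, {"A": 1, "B": 2, "C": 4})))  # ABCCBCC
```

On this example the sketch follows the slide's trace step for step and ends with the sequence whose buffer cost is 4, the lowest of the three listed on the previous slide.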

19 SDF → BDF What is SDF missing? –Restricts range of expression –Allows static scheduling

20 SDF → BDF Sufficient Addition:

21 SDF → BDF BDF –SDF + switch and select operators BDF is Turing Complete
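
To illustrate why switch and select fall outside SDF, here is a minimal sketch using queues as edges. The function signatures and the if-then-else example are assumptions for illustration, not taken from the lecture. The point to notice is that the number of tokens appearing on each switch output, and consumed from each select input, depends on the control value at run time rather than being a compile-time constant.

```python
from collections import deque


def switch(data, control, out_true, out_false):
    """Consume one data token and one Boolean control token; route the data."""
    value = data.popleft()
    (out_true if control.popleft() else out_false).append(value)


def select(in_true, in_false, control, out):
    """Consume one control token, then one token from the chosen input."""
    source = in_true if control.popleft() else in_false
    out.append(source.popleft())


# Hypothetical if-then-else: double the value on the true branch,
# negate it on the false branch, then merge in the original order.
data = deque([1, 2, 3, 4])
ctrl_switch = deque([True, False, True, False])
ctrl_select = deque(ctrl_switch)          # same control stream, second copy
t_in, f_in, t_out, f_out, result = deque(), deque(), deque(), deque(), deque()

while data:
    switch(data, ctrl_switch, t_in, f_in)
    if t_in:
        t_out.append(t_in.popleft() * 2)
    if f_in:
        f_out.append(-f_in.popleft())
    select(t_out, f_out, ctrl_select, result)

print(list(result))  # [2, -2, 6, -4]
```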

22 Expression: Block Diagram Ptolemy example from Buck’94

23 Expression: Stream Language Function AveragePairs(D: Signal returns Signal) –stream integer [(D[0]+D[1])/2] || AveragePairs(stream_rest(D)) Ex: Dennis94
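
The recursive stream definition above can be approximated with a Python generator. This sketch reads the slide literally: since stream_rest drops only one element, successive pairs overlap. Integer (floor) division is an assumption based on the "stream integer" annotation, and the name `average_pairs` is a stand-in for the slide's AveragePairs.

```python
from itertools import count, islice


def average_pairs(stream):
    """Yield the integer average of each adjacent pair in the stream.

    Mirrors [(D[0]+D[1])/2] || AveragePairs(stream_rest(D)): emit one value,
    then continue on the stream with its first element removed.
    """
    it = iter(stream)
    try:
        prev = next(it)
    except StopIteration:
        return                      # empty stream: nothing to emit
    for cur in it:
        yield (prev + cur) // 2     # floor division is an assumption
        prev = cur


print(list(average_pairs([2, 4, 6, 10])))                # [3, 5, 8]
# Works on unbounded streams too, since values are produced on demand.
print(list(islice(average_pairs(count(0, 2)), 3)))       # [1, 3, 5]
```

The generator keeps only one element of state, which is the kind of fixed consume-one, produce-one behavior that makes the operator a candidate for static scheduling.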

24 Convert to Static Data Flow

25 Composition of Stream Operators Function Process(D:ImageStream, w:integer returns MarkStream) –let R:=for I in 1,w return array of –FourForThree(AveragePairs(D[I])) end for –in PeakDetect(TwoDimFilter(R,w)) –end let end function

26 Adapting How different?

27 Adapting How different? –Expensive to change operators –Possibility of spatial pipelining of operators Operator AT Operator copies –Allow dynamic rates… violate fixed firing

28 SDF: Timeslice Multiples of repetition/firing schedule –valid for acyclic graph –require greater buffering

29 SDF: Spatial Can realize spatially Repetition/firing schedule –gives relative throughput rates –simple cases => suggest Area-Throughput points

30 Dynamic Note that adding switch/select gives general, dynamic dataflow Suggests can identify: –static regions (obey SDF restrictions) –dynamic boundaries (where dynamic operators exist) Static schedule static regions Dynamic control at boundary/invocation of static blocks

31 Dynamic Flow Rates Cannot schedule completely at compile time Use feedback to get expected flow rate –schedule like SDF –track data presence at dynamic boundaries –allow additional buffer space (overflow) –stall slower operator as necessary careful check possible deadlock conditions

32 Summary Stream datatype captures computational structure –good for spatial implementations –expose parallelism Rich experience in DF/DSP to exploit Static powerful where applicable Can still help schedule “mostly static” cases

