Compiling Application-Specific Hardware. Mihai Budiu, Seth Copen Goldstein. Carnegie Mellon University.


2 Compiling Application-Specific Hardware. Mihai Budiu, Seth Copen Goldstein. Carnegie Mellon University

3 Resources

4 Problems: Complexity; Power; Global signals; Limited issue window => limited ILP. We propose a scalable architecture.

5 Outline: Introduction; ASH: Application Specific Hardware; Compiling for ASH; Conclusions

6 Application-Specific Hardware: C program -> Compiler -> Dataflow IR -> Reconfigurable hardware

7 Our Solution: General: applicable to today’s software (programming languages, applications); Automatic: compiler-driven; Scalable: at run-time with clock and hardware, at compile-time with program size; Parallelism: exploit application parallelism

8 Asynchronous Computation [Figure: an adder (+) with data, valid, and ack signals]
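
The data/valid/ack signals on this slide suggest a handshake between neighboring operators. A minimal sketch in C, assuming a one-slot channel per wire: the sender raises `valid` with its data, and the receiver's ack clears it. The names `channel`, `chan_send`, `chan_recv`, and `adder_fire` are hypothetical and do not come from the slides.

```c
#include <assert.h>
#include <stdbool.h>

/* One-slot channel: the sender raises `valid`, the receiver's ack lowers it. */
typedef struct {
    int  data;
    bool valid;
} channel;

/* Sender side: succeeds only if the previous value has been acknowledged. */
bool chan_send(channel *c, int value) {
    assert(c != 0);
    if (c->valid) return false;   /* no ack yet: the sender must wait */
    c->data = value;
    c->valid = true;
    return true;
}

/* Receiver side: consuming the value acts as the ack. */
bool chan_recv(channel *c, int *out) {
    assert(c != 0 && out != 0);
    if (!c->valid) return false;  /* nothing to consume yet */
    *out = c->data;
    c->valid = false;             /* ack: the slot is free again */
    return true;
}

/* Adder node: fires only when both inputs are valid and the output is free. */
bool adder_fire(channel *a, channel *b, channel *sum) {
    int x, y;
    if (!a->valid || !b->valid || sum->valid) return false;
    chan_recv(a, &x);
    chan_recv(b, &y);
    return chan_send(sum, x + y);
}
```

In this model no clock or global signal decides when the adder runs; it fires whenever its local handshake conditions hold.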

9 New: Entire C applications; Dynamically scheduled circuits; Custom dataflow machines (application-specific; direct execution, no interpretation; spatial computation)

10 Outline: Scalability; Application Specific Hardware; CASH: Compiling for ASH; Conclusions

11 CASH: Compiling for ASH [Figure labels: C Program, RH, Circuits, Memory partitioning, Interconnection net]

12 Primitives: Arithmetic/logic (+); Multiplexors (data, predicates); Merge; Eta (gateway) (data, predicate); Memory (ld, st)
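
The slide only names the primitives. A minimal sketch of how three of them might behave, assuming a decoded mux with one predicate per data input, an eta that forwards its data only when its predicate is true, and a merge that forwards whichever input carries a value. The `token` type and all function names are illustrative assumptions, not the authors' definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* A value on a wire; `present` is false when no value is produced. */
typedef struct {
    bool present;
    int  value;
} token;

token make_token(int v) { token t = { true, v };  return t; }
token no_token(void)    { token t = { false, 0 }; return t; }

/* Decoded mux: one predicate per data input, at most one predicate true. */
token mux2(bool p0, int d0, bool p1, int d1) {
    assert(!(p0 && p1));
    if (p0) return make_token(d0);
    if (p1) return make_token(d1);
    return no_token();
}

/* Eta (gateway): forwards the data when the predicate is true,
   otherwise consumes it and produces nothing. */
token eta(bool pred, int data) {
    return pred ? make_token(data) : no_token();
}

/* Merge: forwards whichever input carries a value; no predicate is needed. */
token merge(token a, token b) {
    assert(!(a.present && b.present));
    return a.present ? a : b;
}
```

Under these assumptions, muxes join values inside a block of straight-line code, while etas and merges steer values across control-flow edges.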

13 Forward Branches: if (x > 0) y = -x; else y = b*x; [Figure: dataflow graph with inputs x, b, 0 and output y; nodes *, -, >, !; label: "Decoded mux"] Conditionals => Speculation
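
The slide's if/else can be written the way the dataflow graph would evaluate it: both arms run speculatively, and a decoded mux picks the result. This is a sketch under that reading; `forward_branch` is a hypothetical name, and the AND-OR form of the mux is one possible encoding, not necessarily the one the compiler emits.

```c
#include <assert.h>
#include <stdbool.h>

/* Branch-free form of:  if (x > 0) y = -x; else y = b*x; */
int forward_branch(int x, int b) {
    /* Both arms are computed speculatively; neither has side effects. */
    int neg  = -x;       /* then-arm */
    int prod = b * x;    /* else-arm */

    /* The comparison yields a predicate and its complement. */
    bool p     = x > 0;
    bool not_p = !p;
    assert(p != not_p);  /* exactly one predicate is true */

    /* Decoded mux: gate each input with its own predicate, then OR. */
    return (p ? neg : 0) | (not_p ? prod : 0);
}
```

Because exactly one predicate is true, the OR never mixes the two arms.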

14 Critical Paths: if (x > 0) y = -x; else y = b*x; [Figure: dataflow graph with inputs x, b, 0 and output y; nodes *, -, >, !]

15 Lenient Operations: if (x > 0) y = -x; else y = b*x; [Figure: dataflow graph with inputs x, b, 0 and output y; nodes *, -, >, !] Lenient operations solve the problem of unbalanced paths.
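
A sketch of the difference between a strict and a lenient mux, assuming "lenient" means the operator may produce its output as soon as the inputs that determine it have arrived. The `wire` type and both function names are hypothetical. In the slide's example the multiply is presumably the slow path, so when x > 0 a lenient mux would not have to wait for it.

```c
#include <assert.h>
#include <stdbool.h>

/* A wire whose value may not have arrived yet. */
typedef struct {
    bool ready;
    int  value;
} wire;

/* Strict mux: waits for every input before producing an output. */
wire strict_mux(wire p, wire a, wire not_p, wire b) {
    wire out = { false, 0 };
    if (p.ready && a.ready && not_p.ready && b.ready) {
        out.ready = true;
        out.value = p.value ? a.value : b.value;
    }
    return out;
}

/* Lenient mux: fires as soon as a true predicate and its data are ready. */
wire lenient_mux(wire p, wire a, wire not_p, wire b) {
    wire out = { false, 0 };
    /* The two predicates must never both be true. */
    assert(!(p.ready && not_p.ready && p.value && not_p.value));
    if (p.ready && p.value && a.ready) {
        out.ready = true;
        out.value = a.value;
    } else if (not_p.ready && not_p.value && b.ready) {
        out.ready = true;
        out.value = b.value;
    }
    return out;
}
```

With x = 5, the negation and the predicate are ready while the product is still in flight: the strict mux stalls and the lenient mux already outputs -5.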

16 Loops: int sum=0, i; for (i=0; i < 100; i++) sum += i*i; return sum; [Figure: dataflow graph with nodes i, +1, < 100, !, *, +, sum, 0, ret] Control flow => data flow
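
The loop on this slide can be restated so that each statement corresponds to one node of the dataflow graph. This is a sketch under the assumption that the loop-carried values circulate through merge nodes and that the predicate steers them either around the back edge or to `ret`; the function name and the `limit` parameter are illustrative.

```c
#include <assert.h>
#include <stdbool.h>

/* Dataflow-style restatement of:
   for (i = 0; i < limit; i++) sum += i*i; return sum; */
int sum_squares_dataflow(int limit) {
    assert(limit >= 0);
    int i = 0, sum = 0;              /* initial values entering the merges */
    for (;;) {
        bool p = i < limit;          /* loop predicate */
        if (!p) return sum;          /* gated by !p: sum is released to ret */
        int next_sum = sum + i * i;  /* sum's loop */
        int next_i   = i + 1;        /* i's loop */
        sum = next_sum;              /* back edges, gated by p */
        i   = next_i;
    }
}
```

Calling `sum_squares_dataflow(100)` reproduces the slide's loop and returns 328350.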

17 Compilation: Translate C to dataflow machines; Optimizations: software-, hardware-, dataflow-specific; Expose parallelism: predication, speculation, localized synchronization, pipelining

18 Pipelining [Figure: dataflow graph of the loop with nodes i, +, 1, <= 100, *, +, sum; label: "pipelined multiplier"]

19-21 Pipelining [Figure: the same graph, repeated across three animation frames]

22 Pipelining [Figure: the same graph; labels: "i’s loop", "sum’s loop", "Long latency pipe"]

23 Pipelining [Figure: the same graph, without labels]

24 Pipelining [Figure: the same graph; labels: "i’s loop", "sum’s loop", "Long latency pipe", "predicate"]

25 Pipelining: Predicate ack edge is on the critical path. [Figure: the same graph; labels: "critical path", "i’s loop", "sum’s loop"]

26 Pipelining [Figure: the same graph; labels: "i’s loop", "sum’s loop", "decoupling FIFO"]

27 Pipelining [Figure: the same graph; labels: "critical path", "i’s loop", "sum’s loop", "decoupling FIFO"]
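
The effect of the decoupling FIFO can be illustrated with a toy cycle-level model. This is a simplified sketch, not the authors' circuit: it assumes i's loop can start one iteration per cycle, the multiplier is fully pipelined with a fixed latency, and `depth` bounds how many iterations i's loop may run ahead of sum's loop. A depth of 1 stands in for the case where the predicate's ack edge forces the two loops into lockstep. All names are hypothetical.

```c
#include <assert.h>

#define MAX_DEPTH 64

/* One iteration in flight: its product and the cycle the product is ready. */
typedef struct {
    long value;
    long ready_at;
} slot;

/* Returns the number of simulated cycles; the loop result goes to *sum_out. */
long simulate_loop(int n, int mul_latency, int depth, long *sum_out) {
    assert(n >= 0 && mul_latency >= 1);
    assert(depth >= 1 && depth <= MAX_DEPTH);
    assert(sum_out != 0);

    slot fifo[MAX_DEPTH];
    int  head = 0, count = 0;      /* ring buffer of iterations in flight */
    int  issued = 0, retired = 0;
    long sum = 0, cycle = 0;

    while (retired < n) {
        /* sum's loop: retire the oldest iteration once its product is ready. */
        if (count > 0 && fifo[head].ready_at <= cycle) {
            sum += fifo[head].value;
            head = (head + 1) % depth;
            count--;
            retired++;
        }
        /* i's loop: start a new iteration only if a slot is free. */
        if (issued < n && count < depth) {
            int tail = (head + count) % depth;
            fifo[tail].value    = (long)issued * issued;
            fifo[tail].ready_at = cycle + mul_latency;
            count++;
            issued++;
        }
        cycle++;
    }
    *sum_out = sum;
    return cycle;
}
```

In this model, 100 iterations with a 4-cycle multiplier take 401 cycles at depth 1 and 104 cycles at depth 8. Both runs return the same sum; only the schedule changes.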

28 ASH Features: What you code is what you get (no hidden control logic; lean hardware: no CAM, multi-ported files, etc.; no global signals); Compiler has complete control; Dynamic scheduling => latency tolerant; Natural ILP and loop pipelining

29 Conclusions: ASH: compiler-synthesized hardware from HLL; Exposes program parallelism; Dataflow techniques applied to hardware; ASH promises to scale with circuit speed, transistors, and program size

30 Backup slides: Hyperblocks; Predication; Speculation; Memory access; Procedure calls; Recursive calls; Resources; Performance

31 Hyperblocks [Figure label: "Procedure"]

32 Predication: if (p) ...; if (!p) ... [Figure: a hyperblock with predicates p, !p, and q]

33 Speculation: if (!p) ... [Figure: predicate q; label: "ops w/ side-effects"]
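
One reading of this slide is that side-effect-free operations run speculatively, and only operations with side effects keep their predicate. A minimal sketch under that assumption; the function name and the guarded statement are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Source form:  if (!p) { t = a * 2 + 1; *dst = t; } */
void predicated_store(bool p, int a, int *dst) {
    assert(dst != 0);
    bool q = !p;         /* predicate of the guarded block */
    int  t = a * 2 + 1;  /* no side effects: executed speculatively */
    if (q) *dst = t;     /* the store has a side effect: it stays predicated */
}
```

The arithmetic no longer waits for the predicate; only the store does.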

34 Memory Access [Figure labels: load (address, predicate, token, data); store (address, pred, token, data); Load-store queue; Interconnection network; Memory]

35 Procedure calls [Figure labels: caller, call P, args, Interconnection network, Procedure P, Extract args, ret, result]

36 Recursion [Figure labels: hyperblock, recursive call, save live values, restore live values, stack]

37 Resources: Estimated for SpecINT95 and Mediabench; Average < 100 bit-operations/line of code; Routing resources harder to estimate; Detailed data in paper

38 Performance: Preliminary comparison with 4-wide OOO; Assumed same FU latencies; Speed-up on kernels from Mediabench

