Compiling Application-Specific Hardware
Mihai Budiu and Seth Copen Goldstein
Carnegie Mellon University
Resources
Problems
- Complexity
- Power
- Global signals
- Limited issue window => limited ILP
We propose a scalable architecture.
Outline
- Introduction
- ASH: Application-Specific Hardware
- Compiling for ASH
- Conclusions
Application-Specific Hardware
C program -> Compiler -> Dataflow IR -> Reconfigurable hardware
Our Solution
- General: applicable to today's software
  – programming languages
  – applications
- Automatic: compiler-driven
- Scalable:
  – run-time: with clock, hardware
  – compile-time: with program size
- Parallelism: exploit application parallelism
Asynchronous Computation
Each operation (for example, an adder) communicates through a handshake: data and valid signals from the producer, an ack signal from the consumer.
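A minimal software sketch of this handshake (my illustration, not the paper's hardware): a producer asserts valid alongside its data, and a two-input adder fires only once both operands are valid, acknowledging its producers.

    #include <stdbool.h>

    /* One point-to-point link carrying data plus handshake bits. */
    typedef struct {
        int  data;
        bool valid;   /* producer: "data is ready"     */
        bool ack;     /* consumer: "data was consumed" */
    } Channel;

    /* Producer side: offer a value only when the channel is free. */
    static bool send(Channel *c, int v) {
        if (c->valid) return false;   /* previous value not yet acked */
        c->data  = v;
        c->valid = true;
        return true;
    }

    /* Consumer side: an adder fires only when both inputs are valid. */
    static bool add_fire(Channel *a, Channel *b, Channel *out) {
        if (!a->valid || !b->valid || out->valid) return false;
        out->data  = a->data + b->data;
        out->valid = true;
        a->ack   = b->ack   = true;    /* acknowledge both producers */
        a->valid = b->valid = false;
        return true;
    }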
New
- Entire C applications
- Dynamically scheduled circuits
- Custom dataflow machines
  – application-specific
  – direct execution (no interpretation)
  – spatial computation
Outline
- Scalability
- Application-Specific Hardware
- CASH: Compiling for ASH
- Conclusions
CASH: Compiling for ASH
The compiler maps a C program onto reconfigurable hardware (RH), producing circuits, a memory partitioning, and an interconnection net.
Primitives
- Arithmetic/logic
- Multiplexors (data inputs selected by predicates)
- Merge
- Eta (gateway): forwards data under a predicate
- Memory (load/store)
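As a rough illustration (not the actual CASH intermediate representation), these primitives could be represented in a compiler IR along these lines:

    /* Hypothetical node kinds for a dataflow IR; names are mine. */
    typedef enum {
        OP_ARITH,   /* arithmetic/logic: +, -, *, &, ...              */
        OP_MUX,     /* multiplexor: picks one data input by predicate */
        OP_MERGE,   /* merge: forwards whichever input arrives        */
        OP_ETA,     /* eta (gateway): passes data only if predicate   */
        OP_LOAD,    /* memory read                                    */
        OP_STORE    /* memory write                                   */
    } NodeKind;

    typedef struct Node {
        NodeKind     kind;
        struct Node *inputs[4];   /* data and predicate operands */
        int          num_inputs;
    } Node;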
Forward Branches
if (x > 0) y = -x; else y = b*x;
Both arms are evaluated and a decoded mux selects the result.
Conditionals => speculation
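A sketch of the straight-line, speculative form of this conditional (variable names t_then and t_else are mine):

    int forward_branch(int x, int b) {
        int t_then = -x;               /* speculative "then" arm          */
        int t_else = b * x;            /* speculative "else" arm          */
        int p = (x > 0);               /* branch condition as a predicate */
        int y = p ? t_then : t_else;   /* decoded mux selects the result  */
        return y;
    }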
Critical Paths
if (x > 0) y = -x; else y = b*x;
The two arms have unbalanced latencies: the multiply is the long path.
Lenient Operations
if (x > 0) y = -x; else y = b*x;
Lenient operations solve the problem of unbalanced paths: the mux produces its output as soon as the predicate and the selected input arrive.
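A software analogy of a lenient select (my sketch, assuming the mux only needs the chosen arm): the unselected operand is never waited for, modeled here by evaluating only the selected arm.

    typedef int (*arm_fn)(int x, int b);
    static int then_arm(int x, int b) { (void)b; return -x; }
    static int else_arm(int x, int b) { return b * x; }

    int lenient_select(int x, int b) {
        arm_fn arms[2] = { else_arm, then_arm };  /* [0] = else, [1] = then */
        int p = (x > 0);
        return arms[p](x, b);   /* result available without the unused arm */
    }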
Loops
int sum = 0, i;
for (i = 0; i < 100; i++) sum += i*i;
return sum;
Control flow => data flow
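A rough C rendering of how the loop looks once control flow becomes data flow (the merge/eta structure is paraphrased in comments; this is my sketch, not the compiler's output):

    int sum_of_squares(void) {
        int i = 0, sum = 0;        /* initial tokens entering the merge nodes */
        for (;;) {
            int p = (i < 100);     /* loop predicate                          */
            if (!p)
                return sum;        /* eta to the exit: emit the result        */
            sum = sum + i * i;     /* sum's loop                              */
            i   = i + 1;           /* i's loop                                */
            /* etas steer the new i and sum back to the merges */
        }
    }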
Compilation
- Translate C to dataflow machines
- Optimizations: software-, hardware-, and dataflow-specific
- Expose parallelism:
  – predication
  – speculation
  – localized synchronization
  – pipelining
Pipelining
The same sum-of-squares loop, with the multiplier implemented as a pipelined, long-latency unit. The dataflow graph contains two coupled loops: i's loop and sum's loop.
The predicate ack edge is on the critical path: i's loop must wait for sum's loop to acknowledge the loop predicate before starting the next iteration.
Inserting a decoupling FIFO on the predicate takes that edge off the critical path, so i's loop can run ahead and keep the long-latency multiplier pipe full.
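A software analogy of the decoupling FIFO (my sketch; FIFO_DEPTH and the polling loop are illustrative, not the hardware): the index-generating loop runs ahead through a bounded FIFO while the multiply/accumulate loop drains it.

    #define FIFO_DEPTH 8

    int pipelined_sum(void) {
        int fifo[FIFO_DEPTH];
        int head = 0, tail = 0, count = 0;
        int i = 0, sum = 0, consumed = 0;

        while (consumed < 100) {
            /* producer: i's loop runs ahead while the FIFO has room */
            if (i < 100 && count < FIFO_DEPTH) {
                fifo[tail] = i;
                tail = (tail + 1) % FIFO_DEPTH;
                count++;
                i++;
            }
            /* consumer: sum's loop drains the FIFO through the long pipe */
            if (count > 0) {
                int v = fifo[head];
                head = (head + 1) % FIFO_DEPTH;
                count--;
                sum += v * v;          /* long-latency multiply */
                consumed++;
            }
        }
        return sum;
    }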
ASH Features
- What you code is what you get
  – no hidden control logic
  – lean hardware (no CAMs, multi-ported register files, etc.)
  – no global signals
- Compiler has complete control
- Dynamic scheduling => latency tolerant
- Natural ILP and loop pipelining
Conclusions
- ASH: compiler-synthesized hardware from HLL
- Exposes program parallelism
- Dataflow techniques applied to hardware
- ASH promises to scale with:
  – circuit speed
  – transistors
  – program size
Backup slides
- Hyperblocks
- Predication
- Speculation
- Memory access
- Procedure calls
- Recursive calls
- Resources
- Performance
Hyperblocks
A procedure is partitioned into hyperblocks.
Predication
if (p) ... executes under predicate p; if (!p) ... executes under predicate !p. Both are merged into a single hyperblock.
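A small sketch of if-conversion (names are illustrative): the branch becomes a predicated select, so both statements can live in one hyperblock.

    int if_converted(int p, int a, int b) {
        /* original control flow:
         *   if (p)  q = a;
         *   if (!p) q = b;                       */
        int q = p ? a : b;   /* predicated select, no branches */
        return q;
    }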
Speculation
q is computed speculatively, before the guarding if (!p) ... test; operations with side-effects cannot be speculated.
Memory Access
Loads carry an address, a predicate, and a token and return data; stores carry an address, a predicate, a token, and data. Both reach memory through a load-store queue and the interconnection network.
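A sketch of the request format these ports suggest (field names are mine, not from the paper):

    /* One entry flowing into the load-store queue. */
    typedef struct {
        int       is_store;   /* load or store                        */
        unsigned  address;    /* target address                       */
        int       predicate;  /* operation only takes effect if true  */
        unsigned  token;      /* ordering token for the LSQ           */
        int       data;       /* store data (stores) / result (loads) */
    } MemRequest;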
Procedure calls
The caller sends call P with its args over the interconnection network; procedure P extracts the args and returns the result to the caller.
Recursion
At a recursive call, the hyperblock saves its live values on a stack and restores them when the call returns.
Resources
- Estimates for SpecINT95 and Mediabench
- Average < 100 bit-operations per line of code
- Routing resources are harder to estimate
- Detailed data in the paper
Performance
- Preliminary comparison with a 4-wide out-of-order (OOO) processor
- Assumed the same functional-unit latencies
- Speed-up on kernels from Mediabench