Peer-to-peer Hardware-Software Interfaces for Reconfigurable Fabrics
Mihai Budiu, Mahim Mishra, Ashwin Bharambe, Seth Copen Goldstein
Carnegie Mellon University


Peer-to-peer hw/sw interfaces
Resources Galore
[Figure: chip resources in 2002 and 2007; labels: Cache, Logic, Reconfigurable Hardware]

Why RH: Computational Bandwidth
[Figure: CPU bandwidth is fixed; RH bandwidth is “unbounded”]

Using RH Today
[Figure: the application is partitioned into a C program (compiler) and HDL (CAD); labels: OS support, communication]

Computer System Tomorrow
[Figure: CPU and RH tightly coupled, sharing memory; labels: low-ILP computation + OS + VM, high-ILP computation]

This Work
We suggest a high-level mechanism (not a policy).
[Figure: an HLL program is partitioned between cc and CAD; labels: CPU, RH, Memory]

Outline
Motivation
Interfacing RH & CPU
Opportunities
Conclusions

Premises
RH is large
  – can implement large program fragments
RH can access memory
  – does not require CPU support to access data
  – coherent memory view with CPU
RH seen through clean abstraction
  – interface portability

Unit of Partitioning: Procedure
[Figure: program call-graph; node labels: library, leaves, recursive, hot spot, high ILP]

Production-Quality Software
    int foo(….) {
        highly parallel computation;
        ….
        if (!r) {
            fprintf(stderr, "Unexpected input");
            return E_BADIN;
        }
        ….
    }

Peering
Program:
    a( ) { b( ); }
    b( ) { c( ); }
    c( ) { d( ); }
    d( ) { }
[Figure: procedures a, b, c, d distributed between CPU and RH]

Stubs
[Figure: procedures a, b, c, d with stubs b’, c’, d’ between CPU and RH; labels: software procedure call, marshalling, control transfer, hardware dependent, RH “RPC”]

Stubs
Program:
    a( ) { r = b(b_args); }
    b(b_args) { }
CPU:
    a( ) { r = b’(b_args); }
    b’(b_args) {
        send_rh(b_args);
        invoke_rh(b);
        r = receive_rh( );
        return r;
    }
RH:
    b
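
The stub b’ above can be tried out in plain C with a software stand-in for the RH. This is only a sketch: the slide names send_rh, invoke_rh and receive_rh but does not give their signatures, so the single-integer mailbox, the rh_proc type, and the body of b are all assumptions made for illustration.

```c
/* Hypothetical software model of the RH: one mailbox word carries the
   marshalled argument to the fabric and the result back. On real hardware
   these three operations would be device-specific. */
typedef int (*rh_proc)(int);

static int rh_mailbox;

void send_rh(int arg)        { rh_mailbox = arg; }
void invoke_rh(rh_proc proc) { rh_mailbox = proc(rh_mailbox); }
int  receive_rh(void)        { return rh_mailbox; }

/* b: the procedure assumed to be mapped to the RH (placeholder body). */
int b(int b_args) { return 2 * b_args; }

/* b': CPU-side stub with the same signature as b. */
int b_stub(int b_args) {
    send_rh(b_args);      /* marshal the arguments */
    invoke_rh(b);         /* transfer control to the RH */
    return receive_rh();  /* fetch the result */
}

/* a: the caller is unchanged except that it now calls the stub. */
int a(int x) { return b_stub(x) + 1; }
```

Because b_stub has the same signature as b, the caller a does not need to know on which side of the partition b ends up.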

Required Stubs
1 stub to call each RH procedure
1 stub for each procedure called by RH
[Figure: CPU and RH]

Compiling
[Figure: compilation flow; labels: Program, Partitioning (policy), Procedures for CPU, Procedures for RH, Stubs, HLL to HDL, Synthesis, Configuration, Linker, Executable; annotation: automatic]

Outline
Motivation
Interfacing RH & CPU
Opportunities
Conclusions

Evaluation
How much can be mapped to RH?
SpecInt95 & Mediabench
Partition strictly on procedure boundaries
Limit RH to 10^6 bit-operations

Coverage
Program:
    a( ) { b( ); }
    b( ) { c( ); }
    c( ) { }
Running time: 40%, 35%, 25% (total 100%)
On RH (Method 1, Method 2), as extracted: N N, Y Y, Y N
Totals: 40% (Method 1), 75% (Method 2)

Coverage
Program:
    a( ) { b( ); }
    b( ) { c( ); }
    c( ) { }
Running time: 40%, 35%, 25% (total 100%)
On RH (Method 1, Method 2), as extracted: N N, Y Y, N Y
Totals: 25% (Method 1), 65% (Method 2)

Policies
leaves on RH
RH X CPU
arbitrary
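
One way to compare the three policies is to compute, for a profiled call graph, the share of running time that ends up on the RH. The sketch below rests on assumptions: it reads "RH X CPU" as "a procedure on the RH never calls back into the CPU", and the four-procedure call graph, the self times, and the fits_on_rh flags are hypothetical values, not data from the talk.

```c
#include <stdbool.h>

#define NPROC 4   /* procedures a, b, c, d are indices 0..3 */

/* Hypothetical profile: self time in percent of total running time. */
static const int self_time[NPROC] = {30, 30, 25, 15};
/* Hypothetical: whether each procedure would fit in the RH budget. */
static const bool fits_on_rh[NPROC] = {true, true, true, false};
/* calls[p][q] is true when p calls q: a -> b, a -> d, b -> c. */
static const bool calls[NPROC][NPROC] = {
    {false, true,  false, true },
    {false, false, true,  false},
    {false, false, false, false},
    {false, false, false, false},
};

enum policy { LEAVES_ONLY, RH_NEVER_CALLS_CPU, ARBITRARY };

bool is_leaf(int p) {
    for (int q = 0; q < NPROC; q++)
        if (calls[p][q]) return false;
    return true;
}

/* Decides whether p is placed on the RH under the given policy.
   Assumes an acyclic call graph; recursion would need a visited set. */
bool on_rh(int p, enum policy pol) {
    if (!fits_on_rh[p]) return false;
    if (pol == ARBITRARY) return true;
    if (pol == LEAVES_ONLY) return is_leaf(p);
    /* RH_NEVER_CALLS_CPU: every callee must be on the RH as well. */
    for (int q = 0; q < NPROC; q++)
        if (calls[p][q] && !on_rh(q, pol)) return false;
    return true;
}

/* Percentage of running time spent in procedures placed on the RH. */
int coverage(enum policy pol) {
    int total = 0;
    for (int p = 0; p < NPROC; p++)
        if (on_rh(p, pol)) total += self_time[p];
    return total;
}
```

With these made-up numbers, coverage grows from 25 (leaves only) to 55 (RH never calls the CPU) to 85 (arbitrary placement), because each policy admits every procedure the previous one admits.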

RH Stack Models
Locals in registers:
    f(x) { return x+1; }
Locals statically allocated:
    f() { int local; g(&local); }
Dynamic stack:
    f(x) { f(x+1); }
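
The three examples suggest a simple rule for picking the cheapest stack model a procedure needs. The following is a minimal sketch of that rule; the proc_traits structure and its two flags are hypothetical names, and a real compiler would likely consider more than these two properties (for instance, locals too large for registers).

```c
#include <stdbool.h>

enum stack_model { NO_STACK, STATIC_FRAME, DYNAMIC_STACK };

/* Hypothetical summary of what the compiler knows about a procedure. */
struct proc_traits {
    bool recursive;            /* may be active more than once at a time */
    bool takes_local_address;  /* some local must live in memory, e.g. g(&local) */
};

/* Picks the least demanding stack model that still supports the procedure. */
enum stack_model required_stack_model(struct proc_traits p) {
    if (p.recursive)           return DYNAMIC_STACK;  /* f(x) { f(x+1); } */
    if (p.takes_local_address) return STATIC_FRAME;   /* f() { int local; g(&local); } */
    return NO_STACK;                                  /* f(x) { return x+1; } */
}
```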

Potential RH Coverage: SpecINT95
[Figure: % running time covered, by policy (leaves, CPU->RH, CPU->RH->CPU) and stack model (no stack, static stack frames, dynamic stack)]

Potential RH Coverage: Mediabench
[Figure: % running time covered, by policy (leaves, CPU->RH, CPU->RH->CPU) and stack model (no stack, static stack frames, dynamic stack)]

Conclusions
Stubs make RH/CPU interface transparent
Stubs are automatically generated
RH and CPU as peers
RH/CPU interface: (remote) procedure call
RPC used for control transfer (not data)
Peering gives partitioner freedom

The End


Dispatcher Stubs
Program:
    a( ) { r = b(b_args); }
    b(b_args) { if (x) c( ); return r; }
    c( ) { }
Stub:
    b’(b_args) {
        send_rh(b_args);
        invoke_rh(b);
        while (1) {                  /* this loop is independent of b */
            com = get_rh_command( );
            if (! com) break;
            (*com)( );               /* c’s stub */
        }
        r = receive_rh( );
        return r;
    }

C’s Stub
Program:
    a( ) { r = b(b_args); }
    b(b_args) { if (x) c( ); return r; }
    c( ) { }
Stub:
    c’( ) {
        receive_rh(c_args);
        r = c(c_args);
        send_rh(r);
        invoke_rh(return_to_rh);
    }
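
The dispatcher stub b’ and the stub c’ can be exercised together in a small software model. This is a sketch under assumptions, not the authors' implementation: the RH is simulated on the CPU, b is split by hand into two phases around its call to c (rh_b_start and rh_b_resume, both hypothetical), the channel is a single integer, and the bodies of b and c are placeholders.

```c
#include <stddef.h>

typedef void (*command)(void);   /* a CPU-side stub the RH asks us to run */

/* Hypothetical software model of the RH interface. */
static int     rh_data;      /* channel for marshalled values */
static command rh_request;   /* pending request from the RH, or NULL */
static int     rh_saved_x;   /* state b keeps across its call to c */

void send_rh(int v)   { rh_data = v; }
int  receive_rh(void) { return rh_data; }
void invoke_rh(void (*entry)(void)) { entry(); }
command get_rh_command(void) {
    command pending = rh_request;
    rh_request = NULL;
    return pending;
}

int c(int c_args) { return c_args + 100; }   /* c stays on the CPU */
void c_stub(void);

/* b on the RH, modelled as two phases split at its call to c. */
void rh_b_start(void) {
    int x = rh_data;
    if (x > 0) {                 /* b needs c: post a request to the CPU */
        rh_saved_x = x;
        rh_request = c_stub;     /* rh_data still holds c's argument */
    } else {
        rh_data = -1;            /* b finishes without calling c */
    }
}
void rh_b_resume(void) { rh_data = rh_data + rh_saved_x; }

/* c': runs c on behalf of the RH, then hands control back to the RH. */
void c_stub(void) {
    int r = c(receive_rh());
    send_rh(r);
    invoke_rh(rh_b_resume);      /* plays the role of return_to_rh */
}

/* b': the dispatch loop does not depend on what b computes. */
int b_stub(int b_args) {
    command com;
    send_rh(b_args);
    invoke_rh(rh_b_start);
    while ((com = get_rh_command()) != NULL)
        com();                   /* run whichever stub the RH requested */
    return receive_rh();
}
```

The loop in b_stub only forwards requests, so the same loop would serve any RH procedure, whichever CPU procedures it happens to call.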

Attempt 1
Manual partitioning
Interface: ad hoc
Ex: OneChip, NAPA, PAM
Advantage: huge speed-ups
Problem: very hard work
[Figure: Program and RH]

Attempt 2
Select small computations
Interface: RH = functional unit
Ex: PRISC, Chimaera
Advantage: easy to automate
Problem: low speed-up
[Figure: Program with small groups of operations (+, >>, *) selected]

Attempt 3
Select loop body
    while (b) { b[ j+5]; }
Deeply pipelined implementation
No memory access
Interface: I/O or Functional Unit or Coprocessor
Ex: PipeRench
Advantage: very high speed-up
Problems:
  – cannot be automated
  – loop-carried dependences
  – few opportunities

Attempt 4
Select whole loop
    while (b) { if (error) printf("err"); a[x] = y; }
Pipelined implementation
Autonomous memory access
Interface: coprocessor
Ex: GARP
Advantage: many opportunities
Problems:
  – complicated algorithm
  – requires exceptional loop exits
