FPGA-based Prototyping of the Multi-Level Computing Architecture
Presented by Davor Capalija
Supervisor: Prof. Tarek S. Abdelrahman
Connections 2006, June 9, 2006

A modern processor
Superscalar, out-of-order and speculative execution
[Figure: processor block diagram. Labels: control unit, instruction queue, register file, memory, execution units (XU)]

Multi-level Computing Architecture
Control program:
while(…) {
    Allocate(out frame)
    Preprocess(…)
    Analyze(…)
    Output(…)
}
[Figure: MLCA block diagram. Labels: control program, control processor, task scheduler, task instruction, tasks (Allocate(), Preprocess(), Analyze()), PUs, shared memory, universal register file]
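
The control processor's role can be pictured as out-of-order issue applied to tasks rather than to instructions. The following is a minimal Python sketch of that idea, not the MLCA implementation. It assumes each task declares the universal register file (URF) entries it reads and writes, and that a task may start once it has no read-after-write, write-after-read, or write-after-write conflict with an earlier unfinished task. The `Task` class, `make_iteration`, `schedule`, the register names (`frame0`, `pre0`, `res0`), and the wave-by-wave timing model are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    reads: frozenset   # URF registers the task reads (hypothetical names)
    writes: frozenset  # URF registers the task writes (hypothetical names)


def depends_on(later, earlier):
    """True if `later` must wait for `earlier` because of a register hazard."""
    return bool(
        (later.reads & earlier.writes)       # read after write
        or (later.writes & earlier.reads)    # write after read
        or (later.writes & earlier.writes)   # write after write
    )


def schedule(tasks, num_pus):
    """Group tasks into waves of at most `num_pus` mutually independent tasks.

    Assumption: every task takes one time step, so a wave finishes before
    the next one starts. Real task latencies would differ.
    """
    if num_pus < 1:
        raise ValueError("need at least one PU")
    pending = list(tasks)  # kept in program order
    waves = []
    while pending:
        chosen = []
        for i, task in enumerate(pending):
            if len(chosen) == num_pus:
                break
            # Ready only if no earlier pending task conflicts with this one.
            if not any(depends_on(task, earlier) for earlier in pending[:i]):
                chosen.append(i)
        waves.append([pending[i].name for i in chosen])
        pending = [t for i, t in enumerate(pending) if i not in chosen]
    return waves


def make_iteration(i):
    """One loop iteration of the control program, with per-iteration registers."""
    frame, pre, res = f"frame{i}", f"pre{i}", f"res{i}"
    return [
        Task(f"Allocate{i}", frozenset(), frozenset({frame})),
        Task(f"Preprocess{i}", frozenset({frame}), frozenset({pre})),
        Task(f"Analyze{i}", frozenset({pre}), frozenset({res})),
        Task(f"Output{i}", frozenset({res}), frozenset()),
    ]


if __name__ == "__main__":
    program = make_iteration(0) + make_iteration(1)
    for step, wave in enumerate(schedule(program, num_pus=2)):
        print(step, wave)
```

With two PUs, the sketch pairs up the corresponding tasks of the two iterations; with a single PU it falls back to program order. Giving each iteration its own register names is what exposes the parallelism here.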

Previous work in the MLCA group
Automatic task formation
  – Kirk Stewart
Compile-time optimizations to extract parallelism
  – Utku Aydonat
Task memory management
  – Ahmed Abdelkhalek
Power optimization using dynamic voltage scaling
  – Ivan Matosevic
Work done using a high-level functional simulator

Motivation and goal
Realistic cycle-accurate evaluation using an FPGA-based prototype
  – Feasibility of hardware implementation
Deliver scalable performance
  – The control processor is expected to be a bottleneck
Custom hardware design of the control processor
  – Contribution: microarchitecture of the control processor

Challenges
Mapping the architecture to FPGA device resources
High requirements for on-chip memory: blocks, capacity & ports
  – System: shared memory, URF
  – PUs: caches, private and instruction memories
  – CP: renaming tables, task queues
Control processor microarchitecture design space
  – Performance vs. area trade-offs
Support for speculative execution of tasks
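
The slides list renaming tables among the control processor's memory needs but do not describe the mechanism. The sketch below assumes it resembles register renaming in a superscalar processor: every write to a URF register name is given a fresh physical register, so repeated loop iterations that reuse the same name no longer conflict. The `rename` function, the `(name, reads, writes)` tuple layout, and the integer physical register numbers are hypothetical, and the sketch never frees a physical register, which a real design would have to do.

```python
def rename(tasks, num_physical):
    """Map architectural URF names to physical registers, in program order.

    tasks: list of (task_name, reads, writes), where reads and writes are
    lists of architectural register names (hypothetical layout).
    Returns a list of (task_name, source_regs, dest_regs) using physical
    register numbers.
    """
    table = {}                       # architectural name -> physical register
    free = list(range(num_physical)) # physical registers not yet handed out
    renamed = []
    for name, reads, writes in tasks:
        # Sources use the mapping in effect before this task's own writes.
        # Assumes each register is written before it is read (else KeyError).
        sources = [table[r] for r in reads]
        dests = []
        for w in writes:
            if not free:
                raise RuntimeError("out of physical registers")
            table[w] = free.pop(0)   # a fresh register for every write
            dests.append(table[w])
        renamed.append((name, sources, dests))
    return renamed


if __name__ == "__main__":
    # Two loop iterations that reuse the names "frame" and "pre".
    program = [
        ("Allocate", [], ["frame"]),
        ("Preprocess", ["frame"], ["pre"]),
        ("Allocate", [], ["frame"]),
        ("Preprocess", ["frame"], ["pre"]),
    ]
    for entry in rename(program, num_physical=8):
        print(entry)
```

After renaming, the second `Allocate` writes a different physical register than the first, so only true read-after-write dependences remain between tasks. The table itself has to be read and updated for every task instruction, which is one reason it counts against on-chip memory blocks and ports.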

Status
Initial FPGA-based prototype
  – Nios II Development Board, Stratix Pro Edition (1S40)
  – Based on initial implementation by David Han
PUs - Altera Nios II/f processors
Interconnect - Altera Avalon interconnect
Memory - both on-chip & off-chip
Software-based control processor
  – Emulated on one Nios II/f processor
Determining and removing bottlenecks
Next step: microarchitecture of the Control Processor

Bonus
[Figure: FPGA device block diagram. Labels: PU1 to PU4, each with I$, D$, an instruction memory (Ins1 M to Ins4 M), a private memory (Priv1 M to Priv4 M) and Comm1 to Comm4; CP with I$, D$, CP's mem and TQRT; shared memory; universal register file]