Published by Jocelyn Hancock. Modified over 9 years ago.
1
Ideas for the design of an ASIP for LQCD
Target Compiler Technologies
CASTNESS’11, Rome, Italy
© 2011 Target Compiler Technologies
2
Agenda
− ASIPs and IP Designer
− EURETILE platform
− An ASIP for LQCD
3
ASIPs in Multi-Core SoC
− ASIP: Application-Specific Processor
  − Anything between a general-purpose µP and a hardwired data-path
  − Flexibility through programmability and design-time reconfigurability
  − High throughput and low energy through parallelism and specialization
− The ASIP is the foundation of a heterogeneous multi-core SoC
  − A balanced SoC architecture offers best performance at lowest energy and lowest cost
4
Why ASIPs?
− Maximise performance
  − Specialisation
  − Parallelism: VLIW, SIMD, multi-core
− Minimise power dissipation
  − Specialisation
  − Parallelism: VLIW, SIMD, multi-core
  − Power-optimised RTL generation
− Leverage the benefits of programmability
  − React to changing requirements
  − Ship first for evolving standards
  − Remedy defects
  − Extend products to new markets without an SoC respin
5
IP Designer Tool Suite
6
nML – ASIP description language
Example: architectural specialisation
− Absolute-difference instruction in motion estimation
− Application specific data type ‘vector’
− Primitive functions: vsub(), vabs()
− Operation pattern: V = vabs(vsub(V, V))

Structural skeleton (registers, busses, functional units):

    reg V[4] ;
    trn vecr ; trn vecs ; trn vecd ; trn vect ;
    fu vec; fu vabs;
    ...

Instruction-set grammar:

    opn vec_adiff_opn(t:c2u, r:c2u) {
      action {
        stage E1:
          vecd = vsub(vecr=V[r],vecs=V[t]) @vec;
          V[t] = vect = vabs(vecd) @vabs;
      }
      syntax : "vadiff v"t ",v"r ",v"t;
      image : t::r;
    }
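As a reading aid for the vadiff operation above, the following plain-C model sketches what the instruction computes: V[t] = |V[r] - V[t]|, lane by lane. The slide does not state the lane count or element width of the 'vector' type, so the 4 lanes of 16-bit integers, the type name vector_t and the function name vadiff are assumptions made for illustration only. This is not generated by IP Designer.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumption: 4 lanes of 16-bit elements (not given on the slide). */
#define LANES 4
#define NUM_VREGS 4   /* matches "reg V[4]" in the nML skeleton */

typedef struct { int16_t lane[LANES]; } vector_t;

vector_t V[NUM_VREGS];  /* vector register file */

/* Primitive: lane-wise subtraction (the 'vec' functional unit). */
vector_t vsub(vector_t a, vector_t b) {
    vector_t d;
    for (int i = 0; i < LANES; i++)
        d.lane[i] = (int16_t)(a.lane[i] - b.lane[i]);
    return d;
}

/* Primitive: lane-wise absolute value (the 'vabs' functional unit). */
vector_t vabs(vector_t a) {
    vector_t r;
    for (int i = 0; i < LANES; i++)
        r.lane[i] = (int16_t)abs(a.lane[i]);
    return r;
}

/* "vadiff vt,vr,vt": both primitives chained in one instruction. */
void vadiff(unsigned t, unsigned r) {
    assert(t < NUM_VREGS && r < NUM_VREGS);
    vector_t vecd = vsub(V[r], V[t]);  /* vecd = V[r] - V[t]  */
    V[t] = vabs(vecd);                 /* V[t] = |vecd|       */
}
```

In the nML action both primitives sit in stage E1, so the chained subtract and absolute value form a single instruction rather than two.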
7
Agenda
− ASIPs and IP Designer
− EURETILE platform
− An ASIP for LQCD
8
EURETILE hardware platform
− Communication: DNP
− Control: RISC
− Computation: DSP
− ASIPs: specialised towards the application
  − Lattice quantum chromodynamics (LQCD)
  − Neural network (Izhikevich)
9
Agenda
− ASIPs and IP Designer
− EURETILE platform
− An ASIP for LQCD
10
LQCD ASIP
− Goals
  − Increase performance
  − Decrease gate count or usage of FPGA blocks
− Means
  − Task level parallelism (multi tile architecture)
  − Data level parallelism
  − Instruction level parallelism
  − Architecture specialisation
11
LQCD ASIP
− Instruction level parallelism
  − VLIW instruction word: VU_1 … VU_n | LS_0 … LS_m
  − Arithmetic operations in parallel with load/store operations
  − Appropriate mix of n and m based on feedback from compilation of Qphi() function
  − n*m speed improvement over scalar architecture
− Data level parallelism
  − SIMD lanes: c1 | c2 | c3
  − 3-way SIMD fits with SU(3) matrix algebra
  − 3x speed improvement over scalar architecture
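To illustrate why 3-way SIMD fits SU(3) matrix algebra, here is a minimal C sketch of a 3x3 complex matrix times a colour vector, arranged so that every step updates the three colour components c1, c2, c3 together. The names cplx, colour_vec, simd_mac and su3_mul, the single-precision floats and the column-wise matrix storage are assumptions for illustration; the slides do not describe the actual data layout of the LQCD ASIP.

```c
#include <assert.h>

#define NC 3  /* colour components c1, c2, c3 = three SIMD lanes */

typedef struct { float re, im; } cplx;
typedef struct { cplx c[NC]; } colour_vec;  /* one 3-lane SIMD word */

/* Lane-wise complex multiply-accumulate: acc.c[i] += a.c[i] * s.
   On a 3-way SIMD unit the three loop iterations would run in parallel. */
colour_vec simd_mac(colour_vec acc, colour_vec a, cplx s) {
    for (int i = 0; i < NC; i++) {
        acc.c[i].re += a.c[i].re * s.re - a.c[i].im * s.im;
        acc.c[i].im += a.c[i].re * s.im + a.c[i].im * s.re;
    }
    return acc;
}

/* y = U * x, with U stored column by column: U_col[j].c[i] holds U[i][j].
   Each column is scaled by the matching component of x and accumulated. */
colour_vec su3_mul(const colour_vec U_col[NC], colour_vec x) {
    assert(U_col != 0);
    colour_vec y = {{{0.0f, 0.0f}}};  /* all lanes start at zero */
    for (int j = 0; j < NC; j++)
        y = simd_mac(y, U_col[j], x.c[j]);
    return y;
}
```

With this arrangement the matrix-vector product takes three SIMD multiply-accumulate steps instead of nine scalar complex ones, which is where a 3x factor would come from.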
12
LQCD ASIP
− Architecture specialisation: complex floating point operations
  − C + C, C + i*C → 2x speedup over scalar architecture
  − C - C, C - i*C
  − C * R → 4x speedup over scalar architecture
  − C * C → 8x speedup over scalar architecture
  − …
− Behaviour of floating point operations
  − Defined in a C dialect intended for the modelling of functional units
  − Translated into simulation and implementation (RTL) models
  − Synthesis on standard cell library, mapping on FPGA primitives
− Vector types and operators defined for the C compiler

    vector v1, va[4], vb[4];
    v1 += va[0] * vb[1];
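The complex operations listed above can be written out in scalar C to show how many real floating point operations each one replaces. The slide does not say how the 4x and 8x factors are counted. One reading that matches them, assumed here, is the multiply-accumulate form used in "v1 += va[0] * vb[1]". The names cplx, vector_t, c_add, c_add_i, c_mac_real, c_mac and v_mac, the 3-lane vector and the lane-wise meaning of the vector product are all assumptions for illustration.

```c
#include <assert.h>

typedef struct { float re, im; } cplx;

/* C + C: 2 real additions. */
cplx c_add(cplx a, cplx b) {
    return (cplx){a.re + b.re, a.im + b.im};
}

/* C + i*C: i*(x + iy) = -y + ix, so still just 2 real add/sub. */
cplx c_add_i(cplx a, cplx b) {
    return (cplx){a.re - b.im, a.im + b.re};
}

/* acc += C * R: 2 multiplications + 2 additions = 4 real operations. */
cplx c_mac_real(cplx acc, cplx a, float r) {
    acc.re += a.re * r;
    acc.im += a.im * r;
    return acc;
}

/* acc += C * C: 4 multiplications + 4 add/sub = 8 real operations. */
cplx c_mac(cplx acc, cplx a, cplx b) {
    acc.re += a.re * b.re - a.im * b.im;
    acc.im += a.re * b.im + a.im * b.re;
    return acc;
}

#define LANES 3  /* assumption: one lane per colour component */
typedef struct { cplx c[LANES]; } vector_t;

/* Scalar model of "v1 += a * b", assumed to act lane by lane. */
void v_mac(vector_t *v1, const vector_t *a, const vector_t *b) {
    assert(v1 && a && b);
    for (int i = 0; i < LANES; i++)
        v1->c[i] = c_mac(v1->c[i], a->c[i], b->c[i]);
}
```

If each of these functions becomes a single instruction of the ASIP, the real-operation counts in the comments line up with the 2x, 4x and 8x figures under the counting assumption stated above.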
13
LQCD ASIP
− Architecture specialisation: address generation
  − Goal: vector units should be used every cycle, so address generation must be done in parallel
  − How: to be investigated, after feedback from C compiler!
− Deliverables
  − SDK (Compiler, Assembler, Linker, Simulator, Debugger) based on IP Designer
  − SystemC model
  − RTL model + FPGA mapping