1 A language based on ISO-C, extended for hardware design (Handel-C)
Prepared by: Mitra Khorram Abadi
Professor: Dr. Maziar Goudarzi
Addressing System Design Challenges
A Crisis of Complexity
- Existing FPGA flows inadequate for emerging designs
- Design capacity outstripping ability to design and verify
- Systems contain both hardware and software
- Algorithmic design path to performance too long
Celoxica System Design Solutions
- System-level Electronic Design Automation company, established in 1996 with technology from Oxford University
- Design tools apply C-based languages to FPGA design
3 What are Handel-C and DK
Handel-C
- High-level language based on ISO/ANSI-C for the implementation of algorithms in hardware
- Allows software engineers to design hardware without retraining
- Clean extensions for hardware design, including hardware data types, flexible data widths, parallelism and communications
- Higher level than HDLs such as VHDL and Verilog
- Based on extensive hardware compilation research at Oxford
- Core building block for the Celoxica™ DK design suite
4 DK design suite
A design environment for Handel-C, targeting FPGAs and reconfigurable hardware
  – project management
  – hardware compilation with optimization
  – simulation
  – generates output for FPGA place-and-route tools
5 Why Handel-C
- Chip design is becoming prohibitive for all but large production runs
  – increased complexity, high cost, long lead-time, short product life-cycles
- Potential solutions
  – design reuse (e.g., IP blocks)
  – raise the level of abstraction, leading to rapid design methodology
  – use reconfigurable hardware to extend product life-cycles with in-service upgrades
- Handel-C provides for these solutions
  – increased system performance
  – lower system power
  – reduced cost
- Handel-C and DK provide for rapid development
  – bring hardware and software engineers closer together
6 Applications for Handel-C
Handel-C enables concurrent hardware and software application design within a common C language environment. Celoxica's rapid hardware prototyping capability lets designers build optimized applications, boosting performance and reducing costs. Software engineers can reduce development complexity and compress time-to-market by participating directly in the hardware design process.
A number of recent projects developed in Handel-C illustrate the language's wide range of applications:
- Internet Security: DES encryption algorithm in hardware for SSL acceleration
- Digital Music: MP3 decoding in reconfigurable hardware
- Internet Telephony: Voice-over-IP phone implementing H.323 and TCP/IP in hardware
- Image Processing: accelerating complex image processing algorithms in FPGAs
7 Overview
- Overview of Handel-C
  – what is like C
  – what is different from C
- Design flow
- DK Features
- DK Compiler Features
8 Handel-C is Like C
- Standard ISO-C (ANSI-C)
- Basic data types
  – signed/unsigned integers, char, enums
- Composite data types
  – arrays, structs, unions
  – pointers
- Control structures
  – if, while, for, switch, etc.
- Functions with parameters
- Preprocessor, separate compilation, linker
9 Handel-C Extends C
- Arbitrary widths on variables
- Bit manipulation operators
- Timing model
- par{…} construct for parallelism
- Channels for communication and synchronization
- Sharing/copying expressions
- RAMs/ROMs and external pin connections
10 Widths of Variables
- 32-bit integers would consume excessive hardware resources
- Handel-C allows specification of bit-width as part of integer types
    int 6 a;        // signed 6-bit integer
    unsigned 9 b;   // unsigned 9-bit integer
- Variables compile to hardware registers
  – size determined by data type
- Arithmetic operator hardware sized by type of operands
- Compiler can infer bit width in most cases
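The effect of a narrow register can be sketched in plain ISO C. This is only a software model, not Handel-C: the helper names wrap_unsigned and wrap_signed are hypothetical, and the sketch assumes that a value too large for its width wraps around (keeps only its low bits), which the slide does not state. Widths are assumed to lie between 1 and 31.

```c
#include <stdint.h>

/* Keep only the low `width` bits, as an unsigned `width`-bit register would
   (assumption: overflow wraps around). */
static uint32_t wrap_unsigned(uint32_t value, unsigned width)
{
    uint32_t mask = (width >= 32) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    return value & mask;
}

/* Interpret the low `width` bits (1..31) as a two's-complement signed value. */
static int32_t wrap_signed(int32_t value, unsigned width)
{
    uint32_t bits = wrap_unsigned((uint32_t)value, width);
    uint32_t sign = 1u << (width - 1);
    /* Sign-extend: flip the sign bit, then subtract its weight. */
    return (int32_t)(bits ^ sign) - (int32_t)sign;
}
```

Under this model, adding 1 to the largest value of `unsigned 9 b` (511) gives 0, and adding 1 to the largest value of `int 6 a` (31) gives -32.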
11 Bit Manipulation Operators
- The usual C bit-wise operators on integer types can be used
    >>  <<  &  |  ^  ~
- Additional operators on integer types
    a <- 5     // take 5 least significant bits of a
    a \\ 5     // drop 5 least significant bits, return the rest
    a @ b      // concatenate bits of a and b
    a[3]       // select bit 3 of a
    a[4:1]     // select bits 4 through 1 of a
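For readers who know C but not Handel-C, the additional operators can be approximated with shifts and masks. The sketch below is a plain ISO C model with hypothetical helper names (take_lsb, drop_lsb, concat, select_bit, select_range). It works on 32-bit values and assumes all widths and bit positions are below 32, whereas the Handel-C operators also change the width of the result type.

```c
#include <stdint.h>

static uint32_t low_mask(unsigned n)
{
    return (n >= 32) ? 0xFFFFFFFFu : ((1u << n) - 1u);
}

/* Models "a <- n": keep the n least significant bits of a. */
static uint32_t take_lsb(uint32_t a, unsigned n) { return a & low_mask(n); }

/* Models the drop operator: discard the n least significant bits of a. */
static uint32_t drop_lsb(uint32_t a, unsigned n) { return (n >= 32) ? 0u : (a >> n); }

/* Models "a @ b": a in the upper bits, b (b_width bits wide, b_width < 32) below. */
static uint32_t concat(uint32_t a, uint32_t b, unsigned b_width)
{
    return (a << b_width) | take_lsb(b, b_width);
}

/* Models "a[i]": select a single bit (i < 32). */
static uint32_t select_bit(uint32_t a, unsigned i) { return (a >> i) & 1u; }

/* Models "a[hi:lo]": select bits hi down to lo, inclusive (lo <= hi < 32). */
static uint32_t select_range(uint32_t a, unsigned hi, unsigned lo)
{
    return take_lsb(a >> lo, hi - lo + 1u);
}
```

With a = 0xB6 (binary 1011 0110), take_lsb(a, 5) yields 0x16, drop_lsb(a, 5) yields 0x5, and concatenating those two pieces with b_width = 5 reconstructs 0xB6.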
12 Timing model
- An assignment statement takes one cycle
- A delay statement takes one cycle
- Combinatorial expressions computed between clock edges
  – most complex expression determines clock period
- Example: the following takes 1 + n cycles (n is number of iterations)
    index = 0;                        // 1 cycle
    while (index < length) {
        if (table[index] == key) {
            found = index;            // 1 cycle per iteration
            break;
        } else {
            index = index + 1;
        }
    }
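The 1 + n figure can be checked with a small cycle-counting model in plain ISO C. This is a sketch of the timing rule on the slide, not compiler output: the function timed_search and the struct search_result are hypothetical names, and the model charges one cycle per assignment and nothing for evaluating conditions.

```c
#include <stddef.h>

typedef struct {
    int found;        /* index of the match, or -1 if there is none */
    unsigned cycles;  /* cycles under the one-assignment-one-cycle rule */
} search_result;

static search_result timed_search(const unsigned *table, size_t length, unsigned key)
{
    search_result r = { -1, 0 };
    size_t index = 0;
    r.cycles += 1;                    /* index = 0; */
    while (index < length) {          /* condition is combinational: no cycle */
        if (table[index] == key) {
            r.found = (int)index;
            r.cycles += 1;            /* found = index; */
            break;
        } else {
            index = index + 1;
            r.cycles += 1;            /* index = index + 1; */
        }
    }
    return r;
}
```

For a three-entry table, a key found in the last slot and a key that is absent both cost 1 + 3 = 4 cycles, while a key in the first slot costs 1 + 1 = 2.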
13 Parallelism
- Handel-C blocks are by default sequential
- par {...} executes statements in parallel
    par {
        a = 1;
        b = 2;
        c = 3;
    }
- par block completes when all statements complete
  – time for par block is the time for the longest statement
- Can nest sequential blocks in par blocks
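The timing difference between a sequential block and a par block can be modeled in a few lines of plain ISO C. The helpers seq_cycles and par_cycles are hypothetical names; each array entry is the cycle count of one statement (or nested sequential block) inside the braces.

```c
#include <stddef.h>

/* Sequential block: statements run one after another, so cycles add up. */
static unsigned seq_cycles(const unsigned *branch, size_t n)
{
    unsigned total = 0;
    for (size_t i = 0; i < n; i++)
        total += branch[i];
    return total;
}

/* par block: all branches start together and the block finishes
   when the longest branch finishes. */
static unsigned par_cycles(const unsigned *branch, size_t n)
{
    unsigned longest = 0;
    for (size_t i = 0; i < n; i++)
        if (branch[i] > longest)
            longest = branch[i];
    return longest;
}
```

For the three one-cycle assignments above, the sequential form takes 3 cycles and the par form takes 1.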
15 Channels
- Channels allow communication and synchronization between par blocks
  – semantics based on CSP: unbuffered (synchronous) send and receive
- Channel declaration
  – specifies the data type to be communicated, e.g.:
    chan unsigned 6 c;
- Send statement transfers a value when receiver is ready
- Receive statement copies a value into a variable when sender is ready
- One cycle for transfer, plus wait cycles if either party not ready
- Example
    par {
        c ! val + 1;    // send
        c ? x;          // receive
    }
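The "one cycle for transfer, plus wait cycles" rule can be written as a small timing model in plain ISO C. This sketch does not implement a channel; it only computes when a synchronous transfer would happen. The struct rendezvous and the function channel_transfer are hypothetical names, and cycle numbers are assumed to count from a common clock.

```c
typedef struct {
    unsigned start_cycle;    /* cycle in which both parties are ready */
    unsigned done_cycle;     /* cycle after the one-cycle transfer */
    unsigned sender_wait;    /* cycles the sender spent blocked */
    unsigned receiver_wait;  /* cycles the receiver spent blocked */
} rendezvous;

/* sender_ready / receiver_ready: the cycle at which each side reaches
   its send ("c ! ...") or receive ("c ? ...") statement. */
static rendezvous channel_transfer(unsigned sender_ready, unsigned receiver_ready)
{
    rendezvous r;
    /* Unbuffered channel: the transfer starts only when both sides are ready. */
    r.start_cycle = (sender_ready > receiver_ready) ? sender_ready : receiver_ready;
    r.sender_wait = r.start_cycle - sender_ready;
    r.receiver_wait = r.start_cycle - receiver_ready;
    r.done_cycle = r.start_cycle + 1;   /* the transfer itself takes one cycle */
    return r;
}
```

If the sender is ready at cycle 2 and the receiver at cycle 5, the sender waits 3 cycles, the receiver waits none, and both continue at cycle 6.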
17 Sharing Hardware for Expressions
- Functions provide a means of sharing hardware for expressions
- By default, compiler generates separate hardware for each expression
  – hardware is idle when control flow is elsewhere in the program
    {
        ...
        x = x*a + b;
        y = y*c + d;
    }
- Hardware for function body is shared among call sites
    int mult_add(int z, int c1, int c2)
    {
        return z*c1 + c2;
    }
    ...
    {
        ...
        x = mult_add(x, a, b);
        y = mult_add(y, c, d);
    }
18 Replicating Hardware for Expressions
- Inline functions are expanded at the call site
  – provide for functional abstraction of complex hardware
    inline complex mult_complex(complex x, complex y)
    {
        complex z;
        par {
            z.re = x.re*y.re - x.im*y.im;
            z.im = x.re*y.im + x.im*y.re;
        }
        return z;
    }
    ...
    complex x1, y1, x2, y2, z1, z2;
    ...
    par {
        z1 = mult_complex(x1, y1);
        z2 = mult_complex(x2, y2);
    }
19 Memories
- ROM and RAM data types are like arrays
  – implemented using FPGA memory resources
- Example
    #define packet_length 18
    ram unsigned 8 packet_buf[packet_length];
    ...
    packet_buf[index] = received_byte;
    ...
    if (packet_buf[0] == my_addr) { ... }
- ROMs are similar, but initialized with data
- RAMs and ROMs can be internal (on-chip) or external (off-chip)
20 Differences between RAMs and arrays RAMs differ from arrays in that an array is equivalent to declaring a number of variables. Each entry in an array may be used exactly like an individual variable, with as many reads, and as many writes to a different element in the array as required within a clock cycle. RAMs, however, are normally more efficient to implement in terms of hardware resources than arrays, but they only allow one location to be accessed in any one clock cycle. Therefore, you should use an array when you wish to access the elements more than once in parallel and you should use a RAM when you need efficiency.
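The one-access-per-cycle restriction can be illustrated with a small software model in plain ISO C. This is a sketch of the rule only, not of how a Handel-C compiler or an FPGA memory block behaves: the type single_port_ram and the functions ram_next_cycle, ram_write and ram_read are hypothetical names, and the model simply refuses a second access within the same simulated clock cycle.

```c
#include <stdbool.h>
#include <stdint.h>

#define RAM_WORDS 18   /* matches packet_length in the Memories example */

typedef struct {
    uint8_t data[RAM_WORDS];
    bool used_this_cycle;   /* true once the single port has been used */
} single_port_ram;

/* Advance the simulated clock: the port becomes available again. */
static void ram_next_cycle(single_port_ram *ram)
{
    ram->used_this_cycle = false;
}

/* Returns false if the port was already used this cycle or addr is out of range. */
static bool ram_write(single_port_ram *ram, unsigned addr, uint8_t value)
{
    if (ram->used_this_cycle || addr >= RAM_WORDS)
        return false;
    ram->data[addr] = value;
    ram->used_this_cycle = true;
    return true;
}

/* Returns false if the port was already used this cycle or addr is out of range. */
static bool ram_read(single_port_ram *ram, unsigned addr, uint8_t *out)
{
    if (ram->used_this_cycle || addr >= RAM_WORDS)
        return false;
    *out = ram->data[addr];
    ram->used_this_cycle = true;
    return true;
}
```

In this model a write followed by a read in the same cycle fails on the read, and the read succeeds after ram_next_cycle is called. An ordinary C array has no such restriction, which mirrors the array case described above.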
21 Targeting hardware: FPGAs and PLDs
The set family and set part constructs allow you to specify the device you want to target in your source code. You can also set the device using the DK GUI.
Targeting hardware: memory
The ram and rom keywords allow you to create on-chip RAM and ROM, and to interface to external RAM and ROM. If you want to create a block RAM, use the block specification. To interface to off-chip RAMs or ROMs, use the offchip specification. The addr, data, we, cs, oe and clk specifications define the pins used between the FPGA/PLD and external RAM or ROM.
22 Design flow
23 DK Design Suite The Celoxica DK Design Suite is a fully featured development environment for software-compiled system design. It enables all members of the design team, from system architects and hardware developers to software and firmware designers and verification engineers, to share code from system specification through to implementation. Software-compiled system design is a methodology for designing modern electronic systems that contain both software-driven microprocessors and programmable hardware, either as discrete components or as integrated Field Programmable System on Chip (FPSoC) devices. Where both processors and custom logic are required it is clearly beneficial for the design and implementation process of both software and hardware to have a common language base and a connected methodology.
24 Software-compiled system design enables:
- The system specification to be written in a form that both teams can immediately use;
- Improved communication and shared understanding between the development teams;
- Simplified partitioning and migration of code between software and hardware;
- Iterative design exploration and implementation;
- Re-partitioning to optimize the system at any stage in the design process; and,
- Verification of design implementations using high-level test benches derived from the system requirements in the original specification.
The benefits of the DK Design Suite result in increased design productivity, reduced development time, and improved overall quality of design (QoD).
25 Development Environment
The Celoxica DK Design Suite has an easy-to-use Integrated Development Environment (IDE) that provides facilities for:
- Project file management
- Source code editing
- Fast simulation and debugging
- Compilation of the Handel-C language direct to FPGA hardware (EDIF)
- Compilation of the Handel-C language to Hardware Description Languages (VHDL and Verilog)
- Co-simulation between
  - C/C++
  - Handel-C
  - HDLs
  - Instruction Set Simulators (ISSs)
  - Modeling languages such as SystemC and Matlab
27 DK Compiler Features
- Compiler output is
  – optimized
  – deterministic
  – target specific
- Targets Xilinx and Altera netlists directly (EDIF)
- RTL VHDL output
- IP Cores
  – generation of IP cores (Handel-C, EDIF, VHDL)
  – inclusion of IP cores as "black boxes"