Published by Toby McKinney. Modified over 9 years ago.
1
Optimizing the mMIPS
Sander Stuijk
Technische Universiteit Eindhoven, Department of Electrical Engineering, Electronic Systems
2
The mMIPS
  Pipelined core
    Hazard detection
    No forwarding
  mMIPS instruction set
    31 instructions available in hardware (add, bnez, mul, ...)
    Other instructions supported via a C compiler (div, sra, ...)
3
Outline
  LCC compiler for the mMIPS
  Using memories in the mMIPS
  The mMIPS vs Hennessy and Patterson
4
Toolflow
  Software (sw): Application (C source)
  Hardware (hw): mMIPS (C++ sources that use SystemC libraries)
  Test
    sw: LCC C Compiler
    hw: Borland C++ Compiler
  Implementation
    sw: LCC C Compiler
    hw: Synopsys SystemC compiler, Synopsys FPGA Compiler II, Xilinx ISE
5
LCC compiler: it's a C compiler
  Consider the following code fragment, which declares the loop variable inside the for statement:
    for (int i = 0; i < 3; i++)
        a[i] = ...;
  For LCC the declaration must be moved out of the for statement:
    int i;
    for (i = 0; i < 3; i++) {
        a[i] = ...;
    }
6
How does a compiler work?
  Preprocessor: .c to .c
  Compiler: .c to .asm
  Assembler: .asm to .obj
  Linker: .obj and .lib to .exe
  Command:
    lcc prog.c -o mips_rom.bin
7
Adding special functions
  Examples
    Division, multiply, swap, clip, ...
  Constraints
    At most 2 input operands and 1 output operand
    Manifest loop bounds
    Clock frequency
    Chip area
8
Securing our skies
  Measure the height each second
  The airplane may never be below 1000 ft for more than 1 second
  If needed, take appropriate action…
10
missile.c
    #define TRUE 1
    #define FALSE 0

    /* Launch when two successive height measurements are both below 1000 ft. */
    int launch(int height1, int height2)
    {
        int l;
        if (height1 < 1000 && height2 < 1000)
            l = TRUE;
        else
            l = FALSE;
        return l;
    }

    void main(void)
    {
        /* Height measurements; how they are read is not shown on the slide. */
        int height1, height2;
        int l;
        while (TRUE) {
            l = launch(height1, height2);
        }
    }
11
Assembler
  Commands:
    lcc missile.c -o missile
    disas missile
  C source:
    int launch(int height1, int height2)
    {
        int l;
        if (height1 < 1000 && height2 < 1000)
            l = TRUE;
        else
            l = FALSE;
        return l;
    }
  Disassembly:
    80: addiu sp,sp,-8
    84: li t8,1000
    88: slt s8,a0,t8
    8c: beqz s8,0xac
    90: nop
    94: slt s8,a1,t8
    98: beqz s8,0xac
    9c: nop
    a0: li t8,1
    a4: b 0xb0
    a8: sw t8,4(sp)
    ac: sw zero,4(sp)
    b0: lw v0,4(sp)
    b4: jr ra
    b8: addiu sp,sp,8
12
Adding a special function to the mMIPS (overview)
  New mMIPS instruction: launch
    Inputs: height 1, height 2
    Output: launch
  Select an opcode and function code
    opcode → 0
    function code → 0x10 (not yet used)
  Instruction format:
    opcode (6 bits) | rs (5 bits) | rt (5 bits) | rd (5 bits) | shamt (5 bits) | funct (6 bits)
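The field layout above can be made concrete with a small host-side sketch that packs the six fields into one 32-bit word. It assumes the standard MIPS R-type ordering, with the opcode in the most significant bits, which matches the order in which the slide lists the fields. The names `encode_rtype` and `encode_launch` are illustrative and not part of the mMIPS sources, and leaving shamt at zero for launch is an assumption.

```c
#include <stdint.h>

/* Values taken from the slide. */
#define LAUNCH_OPCODE 0x00u
#define LAUNCH_FUNCT  0x10u

/* Pack the six R-type fields (6+5+5+5+5+6 = 32 bits) into one word.
 * Each field is masked to its width before it is shifted into place. */
uint32_t encode_rtype(uint32_t opcode, uint32_t rs, uint32_t rt,
                      uint32_t rd, uint32_t shamt, uint32_t funct)
{
    return ((opcode & 0x3Fu) << 26) |
           ((rs     & 0x1Fu) << 21) |
           ((rt     & 0x1Fu) << 16) |
           ((rd     & 0x1Fu) << 11) |
           ((shamt  & 0x1Fu) <<  6) |
            (funct  & 0x3Fu);
}

/* Hypothetical helper: "launch rd, rs, rt" with shamt assumed to be zero. */
uint32_t encode_launch(uint32_t rd, uint32_t rs, uint32_t rt)
{
    return encode_rtype(LAUNCH_OPCODE, rs, rt, rd, 0u, LAUNCH_FUNCT);
}
```

For example, with rs = 4, rt = 5 and rd = 2, `encode_launch(2, 4, 5)` yields the word 0x00851010.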
13
Adding a special function to the mMIPS (hardware)
  [Figure labels: aluctrl, alu]
14
Converting a C program to the LCC data representation
  C source (the line numbers label the trees below):
    0: int main(void) {
    1:   int a = 3;
    2:   if (a == 3)
    3:     return 1;
    4:   return 0;
    5: }
  Trees, by source line:
    1:  ASGNI4(ADDRLP4 a, CNSTI4 3)
    2:  NEI4(INDIRI4(ADDRLP4 a), CNSTI4 3)
    3:  RETI4(CNSTI4 1)
    2*: JUMPV(ADDRGP4 1)
    4:  RETI4(CNSTI4 0)
  The data representation is converted to assembler using rules.
  Rules map a set of nodes (one or more) onto assembler instructions.
  The LCC data representation is called an Abstract Syntax Tree (AST).
15
What does a rule look like?
  A rule for adding two unsigned integers (4 bytes):
    reg: ADDU4 (reg,reg) "\taddu $%c,$%0,%1\n" 1
  Parts of the rule:
    reg:           one output register
    ADDU4          the node
    (reg,reg)      two source registers
    "\taddu ..."   the assembler instruction
    1              weight
  Placeholders in the assembler instruction:
    %0: the first source operand register
    %1: the second source operand register
    %c: the destination register
  All rules are defined in the file 'lcc/src/minimips.md'
16
Converting the LCC data-structure to assembler
  Trees, by source line:
    1:  ASGNI4(ADDRLP4 a, CNSTI4 3)
    2:  NEI4(INDIRI4(ADDRLP4 a), CNSTI4 3)
    3:  RETI4(CNSTI4 1)
    2*: JUMPV(ADDRGP4 1)
    4:  RETI4(CNSTI4 0)
  Generated assembler:
      .set reorder
      .globl main
      .text
      .align 2
      .ent main
    main:
      .frame $sp,8,$31
      addu $sp,$sp,-8
      la $24,3
      sw $24,-4+8($sp)
      lw $24,-4+8($sp)
      la $15,3
      bne $24,$15,L.2
      la $2,1
      b L.1
    L.2:
      move $2,$0
    L.1:
      addu $sp,$sp,8
      j $31
      .end main
17
Adding a special function to the mMIPS (software)
  The launch function must be detected by LCC
  Use a special pattern to indicate use of the launch function
    Example: ((a) - ((b) + *(int *) 0x12344321))
  The following 4 constructs map to custom operations in LCC:
    ((a) - ((b) + *(int *) 0x12344321))
    ((a) + ((b) + *(int *) 0x12344321))
    ((a) - ((b) - *(int *) 0x12344321))
    ((a) + ((b) - *(int *) 0x12344321))
  More operations (possibly with more operands) can be added. Look at the website for more information.
18
Custom operation in C and assembler
  C source:
    #define TRUE 1
    #define FALSE 0
    /* Pattern that LCC maps onto the custom launch operation. */
    #define launch(h1, h2) ((h1) - ((h2) + *(int *) 0x12344321))

    void main(void)
    {
        int height1, height2;
        int l;
        while (TRUE) {
            l = launch(height1, height2);
        }
    }
  Commands:
    lcc missile.c -o missile
    disas missile
  Disassembly:
    80: addiu sp,sp,-16
    84: sw s5,0(sp)
    88: sw s6,4(sp)
    8c: b 0x98
    90: sw s7,8(sp)
    94: tgeu s7,s6,0x2a0
    98: b 0x94
    9c: nop
    a0: lw s5,0(sp)
    a4: lw s6,4(sp)
    a8: lw s7,8(sp)
    ac: jr ra
    b0: addiu sp,sp,16
19
Comparison
  Original:
    80: addiu sp,sp,-8
    84: li t8,1000
    88: slt s8,a0,t8
    8c: beqz s8,0xac
    90: nop
    94: slt s8,a1,t8
    98: beqz s8,0xac
    9c: nop
    a0: li t8,1
    a4: b 0xb0
    a8: sw t8,4(sp)
    ac: sw zero,4(sp)
    b0: lw v0,4(sp)
    b4: jr ra
    b8: addiu sp,sp,8
  Added custom instruction:
    94: tgeu s7,s6,0x2a0
  The 15 instructions of launch are replaced by a single custom instruction: a reduction of 14 instructions!
20
Outline
  LCC compiler for the mMIPS
  Using memories in the mMIPS
  The mMIPS vs Hennessy and Patterson
21
The mMIPS memory layout
  Region      Address range                        Contents
  .text       0x0 to 0x8000 (32 KB)                The actual program goes here.
  .data       from 0x30000                         Data memory of the compiler. Store arrays, constants, etc.
  .bss        from 0x (value cut off) to 0x38000   Stack and dynamically allocated data structures.
  user data   0x38000 to 0x55000 (116 KB)          Data that is not used by the compiler. This is private memory for I/O.
22
Taking the memory from LCC to the mMIPS
  [Figure: the memory layout (0x0, 0x8000, 0x30000, 0x, 0x38000, 0x55000) split over the files mips_rom.bin and mips_ram.bin]
  The addresses in the data memory are relocated in the mMIPS using a mask (0x2FFFF).
    0x30000 → 0x0
    0x38000 → 0x8000
23
Using the data memory in your program
    /* The program copies the data from str1 to str2. Note that
     * at most 512 characters are copied. */
    char *str1 = (char *)0x0;   /* Memory address 0 in ram */
    char *str2 = (char *)0x200; /* Memory address 0x200 in ram */

    void main(void)
    {
        int i;
        for (i = 0; str1[i] != '\0' && i < 0x200; i++) {
            str2[i] = str1[i];
        }
        str2[i] = '\0';
    }
  Remember that addresses are relocated, so we could have used 0x30000 and 0x30200 to write the data to the same location.
  This program works in the data memory that is also used by the compiler. It is better to use the private memory starting at 0x38000.
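To try the copy loop on a host machine before running it on the mMIPS, the fixed addresses can be turned into parameters. The sketch below is one way to do that; `copy_bounded` is a hypothetical name, and unlike the slide it checks the bound before reading the source character. On the mMIPS the pointers would be casts of fixed addresses, for instance into the private memory starting at 0x38000.

```c
#include <stddef.h>

/* Copy src to dst, stopping at the string terminator or after max_chars
 * characters, whichever comes first, and then terminate dst.
 * dst must have room for max_chars + 1 bytes.
 * Returns the number of characters copied (terminator not counted). */
size_t copy_bounded(char *dst, const char *src, size_t max_chars)
{
    size_t i;
    for (i = 0; i < max_chars && src[i] != '\0'; i++) {
        dst[i] = src[i];
    }
    dst[i] = '\0';
    return i;
}
```

Checking `i < max_chars` first means the loop never reads a source character it is not going to copy, which matters when the source buffer is exactly max_chars bytes long.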
24
Outline
  LCC compiler for the mMIPS
  Using memories in the mMIPS
  The mMIPS vs Hennessy and Patterson
25
Register file and write-back hazards
  [Figure: timing of Input, Write, Read and Output of the register file (REG) for the mMIPS and for H&P]
  mMIPS: data is available on the output of the register file in the next cycle
  H&P: data is available on the output of the register file in the current cycle
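The difference between the two register files can be illustrated with a small behavioural model on the host. This is only a sketch of the timing described on the slide, not the SystemC code of the mMIPS: `regfile_t`, `regfile_cycle` and the `write_first` flag are hypothetical names, and the model has a single read port and does not hardwire register 0 to zero.

```c
#include <stdint.h>

#define NUM_REGS 32

typedef struct {
    uint32_t regs[NUM_REGS];
} regfile_t;

/* Simulate one clock cycle with one read port and one write port.
 * write_first = 1: the write happens before the read, so a value written
 *                  in this cycle is already visible on the output (H&P).
 * write_first = 0: the read happens before the write, so the new value
 *                  only appears on the output in the next cycle (mMIPS). */
uint32_t regfile_cycle(regfile_t *rf, int write_first,
                       int wen, unsigned waddr, uint32_t wdata,
                       unsigned raddr)
{
    uint32_t out;
    if (write_first) {
        if (wen) rf->regs[waddr] = wdata;
        out = rf->regs[raddr];
    } else {
        out = rf->regs[raddr];
        if (wen) rf->regs[waddr] = wdata;
    }
    return out;
}
```

Writing 42 to register 5 while reading register 5 in the same cycle returns 42 with `write_first = 1`, but returns the old contents with `write_first = 0`; the 42 shows up one cycle later. An instruction in the decode stage therefore cannot pick up a value that is being written back in that same cycle on the mMIPS.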
26
Branch hazards
  Hennessy and Patterson
    Branch detection in the decoding phase (after the register file);
    Two cycles needed to determine branch taken (IF and ID);
    The first instruction after the branch is the branch delay slot, filled by the assembler.
  mMIPS
    Branch detection in the execution phase (using the ALU);
    Three cycles needed to determine branch taken (IF, ID, EX);
    Two branch delay slots (one is filled by the assembler, the second is filled with a NOP by the hazard detection unit).
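The cost of the extra delay slot can be estimated with simple arithmetic. The sketch below counts delay-slot cycles that do no useful work, under the assumption that a slot is wasted either because the assembler could only put a NOP in it or because the hardware always fills it with a NOP. The function name and its parameters are illustrative.

```c
/* Delay-slot cycles that do no useful work.
 *   branches     : number of branches executed
 *   unfilled     : branches whose assembler-filled slot holds a NOP
 *   hw_nop_slots : slots per branch that the hardware fills with a NOP
 *                  (0 for the H&P pipeline, 1 for the mMIPS, per the slide) */
unsigned long wasted_slot_cycles(unsigned long branches,
                                 unsigned long unfilled,
                                 unsigned long hw_nop_slots)
{
    return unfilled + branches * hw_nop_slots;
}
```

For 100 executed branches of which 40 have a NOP in the assembler slot, this gives 40 wasted cycles for the H&P pipeline and 140 for the mMIPS.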
27
Questions?
29
Assignment
30
Assignment
  Optimize the run-time of an image processing algorithm running on the mMIPS.
  Allowed
    Add special instructions to the mMIPS;
    Change the design of the mMIPS (e.g. forwarding).
  Not allowed
    Modifications of the image processing algorithm that are not needed to use special instructions (e.g. replacing a multiply with shifts).
31
Testing and implementing the design
  Test for functional correctness
    Run the original mMIPS with the algorithm to produce a reference output.
    Compare the results of your mMIPS to the reference output.
  Implement your design on the FPGA
    You must complete the flow up to and including the FPGA.
    The maximum clock frequency at which your mMIPS can be synthesized is part of the performance.
32
ImageProcessing.zip
  Download the file 'ImageProcessing.zip' at http://www.es.ele.tue.nl/education/Computation/ogo12/
  Content
    ImageProcessing/algorithm
    ImageProcessing/bendime
    ImageProcessing/bennoc
    ImageProcessing/cocentric
    ImageProcessing/lcc
    ImageProcessing/mips
    ImageProcessing/SystemC2.0.1borland
    bennoc_setup.csh
33
Support and Information
  Dominic Gawlowski: FPGA
  Valentin Gheorghita: LCC
  Sander Stuijk: SystemC
  Each Tuesday and Friday between 14.00 and 16.00h.
  See also http://www.es.ele.tue.nl/education/Computation/ogo12/ for information, tips, etc.