Optimizing the mMIPS
Sander Stuijk
Technische Universiteit Eindhoven, Department of Electrical Engineering, Electronic Systems



The mMIPS
- Pipelined core
  - Hazard detection
  - No forwarding
- mMIPS instruction set
  - 31 instructions available in hardware (add, bnez, mul, ...)
  - Other instructions supported via the C compiler (div, sra, ...)

Outline
- LCC compiler for the mMIPS
- Using memories in the mMIPS
- The mMIPS vs Hennessy and Patterson

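Toolflow
[Figure: toolflow for software (sw) and hardware (hw), each with a test path and an implementation path.]
- Software: the application (C source) is compiled with the LCC C compiler, both for test and for implementation.
- Hardware: the mMIPS (C++ sources that use SystemC libraries) is compiled with the Borland C++ compiler for test, and taken through the Synopsys SystemC compiler, Synopsys FPGA Compiler II and Xilinx ISE for implementation.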

LCC compiler: it's a C compiler
LCC accepts ANSI C, so a loop variable cannot be declared inside the for statement as in C++.
- Consider the following code fragment:

    for (int i = 0; i < 3; i++)
        a[i] = ...;

- It should be:

    int i;
    for (i = 0; i < 3; i++) {
        a[i] = ...;
    }

How does a compiler work?
- Preprocessor: .c → .c
- Compiler: .c → .asm
- Assembler: .asm → .obj
- Linker: .obj + .lib → .exe

    lcc prog.c -o mips_rom.bin

Adding special functions
Examples
- Division, multiply, swap, clip, ...
Constraints
- At most 2 input operands and 1 output operand
- Manifest loop bounds
- Clock frequency
- Chip area

Securing our skies
- Measure the height each second
- The airplane may never be below 1000 ft for more than 1 second
- If needed, take appropriate action...


missile.c

    #define TRUE 1
    #define FALSE 0

    /* Returns TRUE when both height samples are below 1000 ft. */
    int launch(int height1, int height2)
    {
        int l;

        if (height1 < 1000 && height2 < 1000)
            l = TRUE;
        else
            l = FALSE;

        return l;
    }

    void main(void)
    {
        int height1, height2;
        int l;

        while (TRUE) {
            l = launch(height1, height2);
        }
    }

Assembler

    lcc missile.c -o missile
    disas missile

C source of the function:

    int launch(int height1, int height2)
    {
        int l;

        if (height1 < 1000 && height2 < 1000)
            l = TRUE;
        else
            l = FALSE;

        return l;
    }

Disassembly:

    80:  addiu  sp,sp,-8
    84:  li     t8,...
    88:  slt    s8,a0,t8
    8c:  beqz   s8,0xac
    90:  nop
    94:  slt    s8,a1,t8
    98:  beqz   s8,0xac
    9c:  nop
    a0:  li     t8,1
    a4:  b      0xb0
    a8:  sw     t8,4(sp)
    ac:  sw     zero,4(sp)
    b0:  lw     v0,4(sp)
    b4:  jr     ra
    b8:  addiu  sp,sp,8

Adding a special function to the mMIPS (overview)
- New mMIPS instruction: launch (inputs height1 and height2, output launch)
- Select an opcode and function code
  - opcode → 0
  - function code → 0x10 (not yet used)

Instruction format:

    field   opcode   rs   rt   rd   shamt   funct
    bits    6        5    5    5    5       6
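The field layout above can be illustrated with a small C sketch that packs the six fields into a 32-bit word. The opcode (0) and function code (0x10) come from the slide; the helper names `encode_rtype` and `encode_launch`, and the choice of registers in the usage note, are assumptions made only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Values taken from the slide: opcode 0, function code 0x10. */
#define LAUNCH_OPCODE 0x00u
#define LAUNCH_FUNCT  0x10u

/* Pack the six fields (6 + 5 + 5 + 5 + 5 + 6 bits) into one 32-bit word. */
static uint32_t encode_rtype(uint32_t opcode, uint32_t rs, uint32_t rt,
                             uint32_t rd, uint32_t shamt, uint32_t funct)
{
    assert(opcode < 64 && funct < 64);
    assert(rs < 32 && rt < 32 && rd < 32 && shamt < 32);
    return (opcode << 26) | (rs << 21) | (rt << 16) |
           (rd << 11) | (shamt << 6) | funct;
}

/* Hypothetical helper: encode "launch rd, rs, rt" with shamt set to 0. */
static uint32_t encode_launch(uint32_t rd, uint32_t rs, uint32_t rt)
{
    return encode_rtype(LAUNCH_OPCODE, rs, rt, rd, 0, LAUNCH_FUNCT);
}
```

For example, `encode_launch(2, 4, 5)` places register 4 in rs, register 5 in rt and register 2 in rd, and yields the word 0x00851010.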

Adding a special function to the mMIPS (hardware)
[Figure: the aluctrl and alu modules of the mMIPS.]
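What the extended ALU has to compute can be modelled in plain C. This is only a behavioural sketch: the real mMIPS is written in C++ with SystemC, and the control codes in `enum alu_op` are hypothetical, since the slides do not show the actual aluctrl encoding.

```c
#include <assert.h>

/* Hypothetical ALU control codes; not the real aluctrl encoding. */
enum alu_op { ALU_ADD, ALU_SUB, ALU_LAUNCH };

#define MIN_HEIGHT 1000 /* threshold used in missile.c */

/* Behavioural model of an ALU extended with the launch operation. */
static int alu(enum alu_op op, int a, int b)
{
    switch (op) {
    case ALU_ADD:
        return a + b;
    case ALU_SUB:
        return a - b;
    case ALU_LAUNCH:
        /* 1 when both height samples are below the threshold. */
        return (a < MIN_HEIGHT && b < MIN_HEIGHT) ? 1 : 0;
    }
    assert(!"unknown ALU operation");
    return 0;
}
```

The launch case takes two input operands and produces one output, which matches the constraint on special functions given earlier.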

Converting a C program to the LCC data representation

    0: int main(void) {
    1:     int a = 3;
    2:     if (a == 3)
    3:         return 1;
    4:     return 0;
    5: }

Trees generated per source line:

    line 1:   ASGNI4(ADDRLP4 a, CNSTI4 3)
    line 2:   NEI4(INDIRI4(ADDRLP4 a), CNSTI4 3)
    line 3:   RETI4(CNSTI4 1)
    line 2*:  JUMPV(ADDRGP4 1)
    line 4:   RETI4(CNSTI4 0)

The LCC data representation is called an Abstract Syntax Tree (AST). It is converted to assembler using rules. A rule maps a set of one or more nodes onto assembler instructions.

What does a rule look like?
A rule for adding two unsigned integers (4 bytes):

    reg: ADDU4(reg,reg)  "\taddu $%c,$%0,%1\n"  1

- reg (before the colon): one output register
- ADDU4: the node
- (reg,reg): two source registers
- "\taddu $%c,$%0,%1\n": the assembler instruction
- 1: the weight
- %0: the first source operand register
- %1: the second source operand register
- %c: the destination register

All rules are defined in the file 'lcc/src/minimips.md'.

Converting the LCC data-structure to assembler

Trees per source line:

    line 1:   ASGNI4(ADDRLP4 a, CNSTI4 3)
    line 2:   NEI4(INDIRI4(ADDRLP4 a), CNSTI4 3)
    line 3:   RETI4(CNSTI4 1)
    line 2*:  JUMPV(ADDRGP4 1)
    line 4:   RETI4(CNSTI4 0)

Generated assembler:

            .set reorder
            .globl main
            .text
            .align 2
            .ent main
    main:
            .frame $sp,8,$31
            addu $sp,$sp,-8
            la $24,3
            sw $24,-4+8($sp)
            lw $24,-4+8($sp)
            la $15,3
            bne $24,$15,L.2
            la $2,1
            b L.1
    L.2:
            move $2,$0
    L.1:
            addu $sp,$sp,8
            j $31
            .end main

Adding a special function to the mMIPS (software)
- The launch function must be detected by LCC
- Use a special pattern to indicate use of the launch function

Example:

    ((a) - ((b) + *(int *) 0x...))

The following 4 constructs map to custom operations in LCC:

    ((a) - ((b) + *(int *) 0x...))
    ((a) + ((b) + *(int *) 0x...))
    ((a) - ((b) - *(int *) 0x...))
    ((a) + ((b) - *(int *) 0x...))

More operations (possibly with more operands) can be added. Look at the website for more information.

Custom operation in C and assembler

    #define TRUE 1
    #define FALSE 0
    #define launch(h1, h2) ((h1) - ((h2) + *(int *) 0x...))

    void main(void)
    {
        int height1, height2;
        int l;

        while (TRUE) {
            l = launch(height1, height2);
        }
    }

    lcc missile.c -o missile
    disas missile

Disassembly:

    80:  addiu  sp,sp,-16
    84:  sw     s5,0(sp)
    88:  sw     s6,4(sp)
    8c:  b      0x98
    90:  sw     s7,8(sp)
    94:  tgeu   s7,s6,0x2a0
    98:  b      0x94
    9c:  nop
    a0:  lw     s5,0(sp)
    a4:  lw     s6,4(sp)
    a8:  lw     s7,8(sp)
    ac:  jr     ra
    b0:  addiu  sp,sp,16

Comparison

Original:

    80:  addiu  sp,sp,-8
    84:  li     t8,...
    88:  slt    s8,a0,t8
    8c:  beqz   s8,0xac
    90:  nop
    94:  slt    s8,a1,t8
    98:  beqz   s8,0xac
    9c:  nop
    a0:  li     t8,1
    a4:  b      0xb0
    a8:  sw     t8,4(sp)
    ac:  sw     zero,4(sp)
    b0:  lw     v0,4(sp)
    b4:  jr     ra
    b8:  addiu  sp,sp,8

Added custom instruction:

    94:  tgeu   s7,s6,0x2a0

Reduction of 14 instructions per execution!

Outline
- LCC compiler for the mMIPS
- Using memories in the mMIPS
- The mMIPS vs Hennessy and Patterson

The mMIPS memory layout

    region      start      size     contents
    .text       0x0        32 KB    The actual program goes here.
    .data       0x30000             Data memory of the compiler. Stores arrays, constants, etc.
    .bss        0x...               Stack and dynamically allocated data structures.
    user data   0x38000    116 KB   Data that is not used by the compiler. This is private memory for I/O.

Taking the memory from LCC to the mMIPS
[Figure: the LCC memory layout mapped onto the files mips_rom.bin and mips_ram.bin.]
The addresses in the data memory are relocated in the mMIPS using a mask (0x2FFFF):
- 0x30000 → 0x0
- 0x38000 → 0x8000

Using the data memory in your program

    /* The program copies the data from str1 to str2. Note that
     * at most 512 characters are copied. */
    char *str1 = (char *)0x0;   /* Memory address 0 in ram */
    char *str2 = (char *)0x200; /* Memory address 0x200 in ram */

    void main(void)
    {
        int i;

        for (i = 0; str1[i] != '\0' && i < 0x200; i++) {
            str2[i] = str1[i];
        }
        str2[i] = '\0';
    }

- Remember that addresses are relocated, so we could have used 0x30000 and 0x30200 to write the data to the same location.
- This program works in the data memory that is also accessible by the compiler. It is better to use the private memory starting at 0x38000.
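A variant of the copy routine that is easy to point at the private memory could look like the sketch below. The base address 0x38000 comes from the slides; the function name `copy_string`, the constants `PRIVATE_BASE` and `MAX_CHARS`, and the 0x200 offset for the destination are assumptions for illustration. The `assert` is meant for testing on a host machine and can be dropped when compiling for the mMIPS.

```c
#include <assert.h>

#define PRIVATE_BASE 0x38000 /* start of the private memory (from the slides) */
#define MAX_CHARS    0x200   /* copy at most 512 characters */

/* Copy at most max characters from src to dst and terminate dst.
 * dst must have room for max + 1 bytes. Returns the number of
 * characters copied, not counting the terminator. */
static int copy_string(char *dst, const char *src, int max)
{
    int i;

    assert(max >= 0);
    /* Test the bound first so that src[max] is never read. */
    for (i = 0; i < max && src[i] != '\0'; i++) {
        dst[i] = src[i];
    }
    dst[i] = '\0';
    return i;
}
```

On the mMIPS the call would then be `copy_string((char *)(PRIVATE_BASE + 0x200), (char *)PRIVATE_BASE, MAX_CHARS);`, which keeps both strings out of the memory that the compiler manages.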

Outline
- LCC compiler for the mMIPS
- Using memories in the mMIPS
- The mMIPS vs Hennessy and Patterson

Registerfile and write-back hazards
[Figure: the register file (REG) with its write input and read output, shown for the mMIPS and for H&P.]
- mMIPS: written data is available on the output of the registerfile in the next cycle
- H&P: written data is available on the output of the registerfile in the current cycle
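The difference between the two register files can be sketched with a small cycle-level model in C. This is an illustrative model, not the mMIPS source: the type `regfile`, the function `regfile_cycle` and its `write_first` flag are hypothetical names, and the hardwired zero register is not modelled.

```c
#include <assert.h>

#define NUM_REGS 32

typedef struct {
    int regs[NUM_REGS];
} regfile;

/* Simulate one clock cycle with an optional write and one read.
 * write_first != 0 models H&P: the write is visible to a read in the
 * same cycle. write_first == 0 models the mMIPS: the read still sees
 * the old value and the new value appears in the next cycle. */
static int regfile_cycle(regfile *rf, int write_first, int wr_en,
                         int wr_idx, int wr_val, int rd_idx)
{
    int out;

    assert(rd_idx >= 0 && rd_idx < NUM_REGS);
    assert(!wr_en || (wr_idx >= 0 && wr_idx < NUM_REGS));

    if (write_first && wr_en) {
        rf->regs[wr_idx] = wr_val;
    }
    out = rf->regs[rd_idx];
    if (!write_first && wr_en) {
        rf->regs[wr_idx] = wr_val;
    }
    return out;
}
```

Writing 7 to register 5 while reading register 5 in the same cycle returns 7 in the H&P model and the old value in the mMIPS model, so without forwarding the mMIPS must keep a dependent instruction back for one more cycle.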

Branch hazards
- Hennessy and Patterson
  - Branch detection in the decoding phase (after the registerfile);
  - Two cycles needed to determine whether the branch is taken (IF and ID);
  - The first instruction after the branch is the branch delay slot, filled by the assembler.
- mMIPS
  - Branch detection in the execution phase (using the alu);
  - Three cycles needed to determine whether the branch is taken (IF, ID, EX);
  - Two branch delay slots (one is used by the assembler, the second is filled with a NOP by the hazard detection unit).
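The cost of the extra delay slot can be estimated with a simple count. The sketch below is a rough model under stated assumptions: every executed branch costs one cycle per NOP forced by the hardware (0 for H&P, 1 for the mMIPS), plus one cycle whenever the assembler could only put a NOP in its own delay slot. The function name and parameters are hypothetical.

```c
#include <assert.h>

/* Cycles spent in branch delay slots that do no useful work.
 * branches:    number of executed branches
 * unfilled:    branches whose assembler-filled slot holds a NOP
 * forced_nops: NOPs inserted per branch by the hardware
 *              (0 for H&P, 1 for the mMIPS) */
static long wasted_branch_cycles(long branches, long unfilled, int forced_nops)
{
    assert(branches >= 0);
    assert(unfilled >= 0 && unfilled <= branches);
    assert(forced_nops >= 0);
    return unfilled + branches * forced_nops;
}
```

With 100 executed branches of which 40 have a NOP in the assembler-filled slot, this model gives 40 wasted cycles for H&P and 140 for the mMIPS.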

Questions?


Assignment
- Optimize the run-time of an image processing algorithm running on the mMIPS.
- Allowed
  - Adding special instructions to the mMIPS;
  - Changing the design of the mMIPS (e.g. forwarding).
- Not allowed
  - Modifications of the image processing algorithm that are not needed to use special instructions (e.g. replacing multiplies with shifts).

Testing and implementing the design
- Test for functional correctness
  - Run the original mMIPS with the algorithm to produce a reference output.
  - Compare the results of your mMIPS to the reference output.
- Implement your design on the FPGA
  - You must complete the flow up to the FPGA. The maximum clock frequency at which your mMIPS can be synthesized is part of the performance.
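The comparison against the reference output can be as simple as a byte-for-byte check. The sketch below assumes both outputs have already been loaded into memory buffers of equal length; how the output is read from the simulation is not covered by the slides, and the name `first_mismatch` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Return the index of the first byte that differs between the
 * reference output and the output under test, or -1 when the first
 * n bytes are identical. */
static long first_mismatch(const unsigned char *ref,
                           const unsigned char *out, size_t n)
{
    size_t i;

    assert(ref != NULL && out != NULL);
    for (i = 0; i < n; i++) {
        if (ref[i] != out[i]) {
            return (long)i;
        }
    }
    return -1;
}
```

Reporting the first differing index, rather than only pass or fail, makes it easier to locate which pixel of the output image went wrong.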

ImageProcessing.zip
- Download the file 'ImageProcessing.zip'.
- Content
  - ImageProcessing/algorithm
  - ImageProcessing/bendime
  - ImageProcessing/bennoc
  - ImageProcessing/cocentric
  - ImageProcessing/lcc
  - ImageProcessing/mips
  - ImageProcessing/SystemC2.0.1borland
  - bennoc_setup.csh

Support and Information
- Dominic Gawlowski: FPGA
- Valentin Gheorghita: LCC
- Sander Stuijk: SystemC
- Each Tuesday and Friday between ... and 16.00h.
- See also the website for information, tips, etc.