Code Optimization Witawas Srisa-an CSCE 496: Embedded Systems Design and Implementation.


Agenda
Talk about possible exam ideas
Code optimization techniques
–Not everyone has reconfigurable processors!
Credits
–Most of the slides in this lecture are based on slides created by Profs. Raj Rajkumar and Priya Narasimhan of the ECE Department at Carnegie Mellon

Exam Ideas

Code Optimization
Programmers can improve program performance by writing better code
–Improve data structures and/or algorithms (e.g. merge sort vs. bubble sort)
–Reorganize code or provide flags to help compilers
–Last option is to write in assembly

Better Algorithms
Merge vs. bubble sorts
–Which one runs faster?
–Which one causes more cache misses?
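The "which one runs faster?" question can be explored with a small experiment. The following is a minimal sketch, assuming plain int arrays and a caller-supplied scratch buffer; the function names are illustrative and not taken from the lecture. It counts element comparisons only, so it says nothing about cache misses, which would have to be measured on the target hardware.

```c
#include <stddef.h>
#include <string.h>

/* Bubble sort without early exit; returns the number of comparisons made. */
unsigned long bubble_sort(int *v, size_t n)
{
    unsigned long cmp = 0;
    for (size_t pass = 0; pass + 1 < n; pass++) {
        for (size_t i = 0; i + 1 < n - pass; i++) {
            cmp++;
            if (v[i] > v[i + 1]) {
                int t = v[i];
                v[i] = v[i + 1];
                v[i + 1] = t;
            }
        }
    }
    return cmp;
}

/* Sorts the half-open range [lo, hi) using tmp as scratch space. */
static unsigned long merge_range(int *v, int *tmp, size_t lo, size_t hi)
{
    if (hi - lo < 2)
        return 0;
    size_t mid = lo + (hi - lo) / 2;
    unsigned long cmp = merge_range(v, tmp, lo, mid)
                      + merge_range(v, tmp, mid, hi);
    size_t i = lo, j = mid, k = lo;
    while (i < mid && j < hi) {
        cmp++;
        tmp[k++] = (v[j] < v[i]) ? v[j++] : v[i++];
    }
    while (i < mid)
        tmp[k++] = v[i++];
    while (j < hi)
        tmp[k++] = v[j++];
    memcpy(v + lo, tmp + lo, (hi - lo) * sizeof *v);
    return cmp;
}

/* Top-down merge sort; tmp must hold at least n ints. */
unsigned long merge_sort(int *v, int *tmp, size_t n)
{
    return merge_range(v, tmp, 0, n);
}
```

Calling both functions on copies of the same input and comparing the returned counts gives a rough feel for the O(n^2) versus O(n log n) gap, at the cost of the extra scratch buffer that merge sort needs.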

Common Optimization Techniques
Sub-expression elimination
Dead code elimination
Induction variables
Strength reduction
Loop unrolling
In-lining

Common Techniques (cont.)
Sub-expression elimination
myfunction:
    index1 = 8 * i
    x = a[index1]
    temp = 8 * i
    index2 = 4 * j
    t = a[index2]
    a[temp] = t
    temp2 = 4 * j
    a[temp2] = x
    goto myfunction
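The slide's pseudocode computes 8 * i and 4 * j twice each. A minimal C sketch of the same swap, before and after common sub-expression elimination, might look like this; the function names are hypothetical and the infinite goto loop is left out so the effect of one iteration can be checked.

```c
#include <stddef.h>

/* Mirrors the slide: 8 * i and 4 * j are each computed twice. */
void swap_scaled_naive(int *a, size_t i, size_t j)
{
    size_t index1 = 8 * i;
    int x = a[index1];
    size_t temp = 8 * i;      /* same value as index1 */
    size_t index2 = 4 * j;
    int t = a[index2];
    a[temp] = t;
    size_t temp2 = 4 * j;     /* same value as index2 */
    a[temp2] = x;
}

/* After common sub-expression elimination: each product is computed once. */
void swap_scaled_cse(int *a, size_t i, size_t j)
{
    size_t index1 = 8 * i;
    size_t index2 = 4 * j;
    int x = a[index1];
    a[index1] = a[index2];
    a[index2] = x;
}
```

An optimizing compiler will usually perform this rewrite itself when it can prove that i and j do not change between the two uses; writing it by hand mostly matters when the compiler cannot prove that.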

Common Techniques (cont.)
Dead code elimination
    int i = 0;
    i = i + 1;
    if (i == 0)
        j = j * 8;
    else
        j = j * 10;
Use ASSERT and #ifdef to advise the compiler about dead code
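In the slide's fragment, i is always 1 when the test runs, so the i == 0 branch can never execute. The sketch below shows the original form, what a compiler could reduce it to, and one way #ifdef can remove code by hand. The function names and the DEBUG_CHECKS flag are assumptions made for illustration.

```c
/* As on the slide: i is always 1 at the test, so the first arm is dead. */
int scale_original(int j)
{
    int i = 0;
    i = i + 1;
    if (i == 0)
        j = j * 8;    /* unreachable */
    else
        j = j * 10;
    return j;
}

/* What remains after constant propagation and dead code elimination. */
int scale_reduced(int j)
{
    return j * 10;
}

/* DEBUG_CHECKS is a hypothetical build flag: when it is not defined,
   the preprocessor drops the check before the compiler ever sees it. */
int scale_checked(int j)
{
#ifdef DEBUG_CHECKS
    if (j < 0)
        return -1;
#endif
    return j * 10;
}
```

Building with -DDEBUG_CHECKS (on compilers that accept that style of flag) would keep the range check; a release build without the flag contains only the multiplication.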

Common Techniques (cont.)
Induction variables and strength reduction
    i = 0
    j = 0
label:
    j = j + 1
    i = 4 * j
    a[i * 2] = b[i]
    if (i < 1000) goto label
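In the slide's loop, i is always 4 * j and the store index is always 8 * j, so both multiplications can be replaced by additions carried from one iteration to the next. A minimal sketch in C, assuming arrays just large enough for the indices the loop touches (the array lengths and function names are not from the lecture):

```c
#define B_LEN 1001   /* largest index read from b is 1000 */
#define A_LEN 2001   /* largest index written in a is 2000 */

/* Direct translation of the slide's loop. */
void copy_original(int a[A_LEN], const int b[B_LEN])
{
    int i = 0, j = 0;
    do {
        j = j + 1;
        i = 4 * j;
        a[i * 2] = b[i];
    } while (i < 1000);
}

/* Induction variables i and k replace the multiplications:
   i tracks 4 * j and k tracks 8 * j, so j is no longer needed. */
void copy_reduced(int a[A_LEN], const int b[B_LEN])
{
    int i = 0;
    int k = 0;
    do {
        i += 4;
        k += 8;
        a[k] = b[i];
    } while (i < 1000);
}
```

Both versions touch the same elements; the second simply trades two multiplications per iteration for two additions and drops the counter j entirely.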

Optimization Techniques (cont.)
In-lining
main:
    addi $s0, $t1, 0
    addi $s1, $t2, 0
    jal mult
    addi $t3, $v0, 0
mult:
    addi $sp, $sp, -12
    sw $s1, 0($sp)
    sw $s0, 4($sp)
    sw $ra, 8($sp)
    sllv $v0, $s0, $s1
    lw $s1, 0($sp)
    lw $s0, 4($sp)
    lw $ra, 8($sp)
    addi $sp, $sp, 12
    jr $ra
What’s wrong with this picture?
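One plausible reading of the slide's question is that the routine does a single instruction of useful work but pays for a jump, a stack adjustment, three stores, three loads, and a return. The C sketch below shows the same idea at source level; the function names are hypothetical, and static inline is only a request that the compiler is free to ignore.

```c
/* Out-of-line helper, analogous to the slide's one-instruction routine. */
unsigned shift_left(unsigned value, unsigned amount)
{
    return value << amount;   /* assumes amount is less than the bit width */
}

/* Same body marked static inline (C99): a hint, not a guarantee. */
static inline unsigned shift_left_inline(unsigned value, unsigned amount)
{
    return value << amount;
}

/* May compile to a real call with its save/restore overhead. */
unsigned scale_with_call(unsigned x, unsigned n)
{
    return shift_left(x, n);
}

/* Likely to compile to a single shift once the helper is inlined. */
unsigned scale_inlined(unsigned x, unsigned n)
{
    return shift_left_inline(x, n);
}
```

In-lining removes the call overhead but copies the body into every call site, so it tends to pay off for small, frequently called functions and can hurt code size for large ones.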

Optimization Techniques (cont.)
Loop unrolling
–Eliminate branches (why?)
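A minimal sketch of loop unrolling in C, assuming a simple array sum (the function names and the unroll factor of four are illustrative choices, not from the lecture). The unrolled version executes roughly one loop branch per four elements instead of one per element, which is one likely answer to the "why?": fewer branch instructions to execute and fewer chances for a branch to stall the pipeline.

```c
#include <stddef.h>

/* One loop test and branch per element. */
long sum_rolled(const int *v, size_t n)
{
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += v[i];
    return s;
}

/* Unrolled by four: one loop test and branch per four elements,
   followed by a short cleanup loop for the leftover 0 to 3 elements. */
long sum_unrolled4(const int *v, size_t n)
{
    long s = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s += v[i];
        s += v[i + 1];
        s += v[i + 2];
        s += v[i + 3];
    }
    for (; i < n; i++)
        s += v[i];
    return s;
}
```

The trade-off is larger code, which can matter on embedded targets with small instruction memories or caches.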

Architecture Dependent Optimizations
X = Y * 64
Convert 8-bit RGB to 8-bit YCC
    Y = 0.299R … G … B
    Cb = … R … G … B
    Cr = 0.500R … G – 0.082B + 128
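Both examples on this slide lend themselves to shift-based arithmetic on a processor with a fast shifter. The sketch below is an assumption-laden illustration: multiplying by 64 becomes a left shift by 6, and the luma equation is evaluated in 8-bit fixed point. Only the 0.299 coefficient survives in the transcript; the G and B weights used here (0.587 and 0.114) are the standard ITU-R BT.601 values and are assumed, not confirmed by the slide. The function names are hypothetical.

```c
#include <stdint.h>

/* X = Y * 64 written as a multiplication. */
uint32_t times64_mul(uint32_t y)
{
    return y * 64u;
}

/* The same result as a left shift by 6, since 64 = 2^6. */
uint32_t times64_shift(uint32_t y)
{
    return y << 6;
}

/* Fixed-point luma: the weights are scaled by 256 and rounded
   (0.299 -> 77, 0.587 -> 150, 0.114 -> 29; they sum to 256).
   Adding 128 before the shift rounds to nearest instead of truncating.
   The 0.587 and 0.114 weights are assumed BT.601 values. */
uint8_t luma_fixed(uint8_t r, uint8_t g, uint8_t b)
{
    return (uint8_t)((77u * r + 150u * g + 29u * b + 128u) >> 8);
}
```

Avoiding floating point this way matters most on cores without a hardware floating-point unit, at the cost of a small rounding error per pixel.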

Architecture Dependent Optimizations (cont.)
[Figure: processor datapath block diagram showing the address register, address incrementer, register bank, barrel shifter, 32-bit ALU, write data register, read data/instruction register, and write buffer (holds address and data), connected by the A, B, ALU, and incrementer buses]

Summary
No magic bullet
–Optimizations sometimes don’t work
–Programmers need to help
–Various techniques may require prior knowledge of the hardware