Announcements MP 3 CS296 (Chase Geigle geigle1@illinois.edu)


Floating Point Numbers
How can we represent 3.14?
What’s wrong with (int_part, frac_part)?
  3.14 and 3.014 have the same representation, (3, 14)!
The leading-zeroes problem can be solved if numbers are normalized:
  write the number in the form d.f × 10^e, where d is a single non-zero digit
  normalized(3.14) = 3.14 × 10^0, normalized(0.314) = 3.14 × 10^-1
In binary, the “d” part will always be 1 (zero is a special case),
  so this implicit 1 does not need to be stored
An ideal representation scheme has these features:
  it can represent positive and negative, low- and high-magnitude numbers
  it is easy to compare two numbers
  it is easy to do basic math

IEEE 754 standard
Format for single-precision (32-bit) and double-precision (64-bit) reals
The normalized (non-zero) binary number ± 1.f × 2^e is stored as:
  single precision (float): 1 sign bit (1 = negative, 0 = positive), 8-bit exponent e (excess-127 notation), 23-bit fraction f
  double precision (double): 1 sign bit (1 = negative, 0 = positive), 11-bit exponent e (excess-1023 notation), 52-bit fraction f
Comparison of floats is almost identical to comparison of ints!
MIPS has separate floating point registers and instructions
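
The single-precision layout can be checked by pulling the three fields out of a real 32-bit float. The following is a minimal sketch in Python, not part of the original slides; the helper names float32_bits, decode_float32, and float32_value are made up for illustration, and only normalized numbers are handled.

```python
import struct

def float32_bits(x):
    """Return the 32-bit IEEE 754 pattern of x as an unsigned integer."""
    return struct.unpack(">I", struct.pack(">f", x))[0]

def decode_float32(x):
    """Split x into (sign, stored exponent, fraction field)."""
    bits = float32_bits(x)
    sign = bits >> 31                 # 1 sign bit
    exponent = (bits >> 23) & 0xFF    # 8-bit exponent, excess-127
    fraction = bits & 0x7FFFFF        # 23-bit fraction f
    return sign, exponent, fraction

def float32_value(sign, exponent, fraction):
    """Rebuild a normalized value: (-1)^sign * 1.f * 2^(exponent - 127)."""
    return (-1) ** sign * (1 + fraction / 2**23) * 2.0 ** (exponent - 127)

# -6.25 = -1.1001 (binary) * 2^2, so the stored exponent is 2 + 127 = 129
sign, exponent, fraction = decode_float32(-6.25)
```

For positive floats the raw bit patterns sort in the same order as the values (float32_bits(1.5) < float32_bits(2.0)), which is the sense in which float comparison is almost identical to integer comparison.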

Instruction Set Architecture (ISA)
The ISA is an abstraction layer between hardware and software
  Software doesn’t need to know how the processor is implemented
  Processors that implement the same ISA appear equivalent to software
An ISA enables processor innovation without changing software
  This is how Intel has made billions of dollars
Before ISAs, software was rewritten/recompiled for each new machine
[Figure: Software sits on top of the ISA, which is implemented by Proc #1 and Proc #2]

ISA history: RISC vs. CISC
1964: IBM System/360, the first computer family
  IBM wanted to sell a range of machines that ran the same software
1960’s, 1970’s: Complex Instruction Set Computer (CISC) era
  Much assembly programming, compiler technology immature
  Hard to optimize, guarantee correctness, teach
1980’s: Reduced Instruction Set Computer (RISC) era
  Most programming in high-level languages, mature compilers
  Simpler, cleaner ISAs facilitated pipelining, high clock frequencies
1990’s: Post-RISC era
  ISA compatibility outweighs any RISC advantage in general purpose
  CISC and RISC chips use the same techniques (pipelining, superscalar, ...)
  Embedded processors prefer RISC for lower power, cost
2000’s: Multi-core era

Comparing x86 and MIPS
x86 is a typical CISC ISA, MIPS is a typical RISC ISA
Much more is similar than different:
  Both use registers and have byte-addressable memories
  Same basic types of instructions (arithmetic, branches, memory)
A few of the differences (x86 compared to MIPS):
  Fewer registers: 8 (vs. 32 for MIPS)
  2-register instruction formats (vs. 3-register format for MIPS)
  Additional, complex addressing modes
  Variable-length instruction encoding (vs. fixed 32-bit length for MIPS)

Why did Intel win?
x86 won because it was the first 16-bit chip by two years
  IBM put it in PCs because there was no competing choice
The rest is inertia and “financial feedback”
x86 is the most difficult ISA to implement for high performance, but:
  Because Intel sells the most processors ...
  It has the most money ...
  Which it uses to hire more and better engineers ...
  Which it uses to maintain competitive performance ...
  And given equal performance, compatibility wins ...
  So Intel sells the most processors!

The compilation process
To produce assembly code: gcc -S test.c produces test.s
To produce object code: gcc -c test.c produces test.o
To produce executable code: gcc test.c produces a.out

The purpose of a linker
The linker is a program that takes one or more object files and combines them into a single executable program.
The linker resolves references to undefined symbols by finding out which other object defines the symbol in question, and replaces placeholders with the symbol's address.
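
Symbol resolution can be illustrated with a toy model. The sketch below is a simplification written for this transcript, not how a real linker is implemented: the dictionary layout ("name", "size", "defines", "refs"), the link function, and the base address 0x1000 are all hypothetical, and real object files carry sections and relocation entries that are ignored here.

```python
def link(objects, base=0x1000):
    """Lay objects out back to back and resolve every referenced symbol.

    Each object is a dict with (all field names are illustrative):
      name    - file name
      size    - size in bytes
      defines - {symbol: offset inside this object}
      refs    - symbols used here but defined elsewhere
    """
    symbol_table = {}
    address = base
    # Pass 1: assign an absolute address to every defined symbol.
    for obj in objects:
        for symbol, offset in obj["defines"].items():
            if symbol in symbol_table:
                raise ValueError(f"duplicate symbol: {symbol}")
            symbol_table[symbol] = address + offset
        address += obj["size"]
    # Pass 2: replace each placeholder with the address of its symbol.
    patched = {}
    for obj in objects:
        for symbol in obj["refs"]:
            if symbol not in symbol_table:
                raise ValueError(f"undefined symbol: {symbol}")
        patched[obj["name"]] = {s: symbol_table[s] for s in obj["refs"]}
    return symbol_table, patched

main_o = {"name": "main.o", "size": 0x40, "defines": {"main": 0x0}, "refs": ["helper"]}
util_o = {"name": "util.o", "size": 0x20, "defines": {"helper": 0x8}, "refs": []}
symbols, patched = link([main_o, util_o])
```

Here main.o refers to helper, which util.o defines at offset 0x8; since util.o is placed right after the 0x40 bytes of main.o, the placeholder is filled with 0x1048.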

Loader
Before we can start executing a program, the O/S must load it.
Loading involves 5 steps:
  1. Allocate memory for the program's execution
  2. Copy the text and data segments from the executable into memory
  3. Copy program arguments (command line arguments) onto the stack
  4. Initialize registers: set $sp to point to the top of the stack, clear the rest
  5. Jump to the start routine, which 1) copies main's arguments off of the stack, and 2) jumps to main

Compiler
Purpose: convert high-level code into low-level assembly
Four key steps: lexing -> parsing -> code optimizations -> code generation
Code generation:
  instruction selection (depends on ISA; easier for RISC or CISC?)
  instruction scheduling (later in the course, CS 433)
  register allocation (today’s topic)
See also: CS 421, 426

Register allocation
The compiler initially produces “intermediate” code that assumes an infinite number of registers $t0, $t1, … and maps each variable to a unique register
To get actual code, variables must share registers
Suppose there are only 3 real registers: $t1, $t2, $t3
An easy case: every scope defines at most three variables
But scope is an over-estimate, as in this example:
    // a, b, c defined in this scope
    for (int i = 0; i < a; i += b) {
        // stuff; c is not live here
    }
    c = 0;
live ⇒ in scope; the converse is not true

Live variable analysis
A variable x is live at a point p in the code if:
  x is defined at a point d before p
  x is read at a point r after p
  x is not redefined between p and r
Intuitively, x holds a value that may be needed in the future
Liveness computed at compile time may have to over-estimate liveness (for correctness)
If at some point p, number_of_live_variables(p) > number_of_registers, we obviously have to spill some variables to memory
Is the converse true?
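
For straight-line code, the definition above can be computed with a single backward pass. This is a minimal sketch under stated assumptions: there are no branches or loops, each instruction is modelled as a hypothetical (defined_variable, [read_variables]) pair, and the function name live_before_each and the variables x, y, z, w are made up for illustration.

```python
def live_before_each(instructions, live_out=()):
    """Return the set of variables live just before each instruction.

    instructions: list of (defined_variable, [read_variables]) pairs.
    live_out: variables still needed after the last instruction.
    """
    live = set(live_out)
    result = []
    # Walk backwards: a definition ends a live range, a read starts one.
    for defined, reads in reversed(instructions):
        live.discard(defined)
        live.update(reads)
        result.append(frozenset(live))
    result.reverse()
    return result

code = [
    ("x", []),          # x = 1
    ("y", ["x"]),       # y = x + 2
    ("z", ["x", "y"]),  # z = x + y
    ("w", ["z"]),       # w = z * 2
]
live = live_before_each(code, live_out={"w"})
max_live = max(len(s) for s in live)  # 2: x and y are both live before "z = x + y"
```

With only one register available, max_live > 1 means some variable has to be spilled, which is the condition stated above.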

Example
Consider the following code snippet:
    a = 0;
    b = a + 1;
    c = b + 1;
    a = c + 1;
We define a graph G where
  vertices are variables
  there is an edge between two variables if their live regions overlap
We want to assign variables to registers so that two variables that share an edge are not assigned the same register
This is graph coloring
At most two variables are live at once
Suppose we have two registers $t1, $t2
[Figure: graph over a, b, c; the code is annotated with a loop (loop: ... j loop) and the assignment a → $t1, b → $t2, c → $t1]
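
Building such a graph is mechanical once the live sets are known. The sketch below assumes one particular notion of overlap: two variables get an edge when they appear together in the live set of some program point. The live sets and the names x, y, z are hypothetical and are not derived from the slide's example.

```python
from itertools import combinations

def interference_graph(live_sets):
    """Build {variable: set of neighbors} from a list of live sets."""
    graph = {}
    for live in live_sets:
        for variable in live:
            graph.setdefault(variable, set())
        # Variables live at the same point cannot share a register.
        for u, v in combinations(sorted(live), 2):
            graph[u].add(v)
            graph[v].add(u)
    return graph

# Hypothetical live sets at four consecutive program points.
live_sets = [{"x"}, {"x", "y"}, {"y", "z"}, {"z"}]
graph = interference_graph(live_sets)
# x and z never appear together, so they may share a register.
```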

Graph Coloring
Color the vertices of a graph with k colors so that no two neighboring vertices get the same color
  A tree is always 2-colorable
  A map is always 4-colorable
No efficient way to decide whether a graph is 3-colorable exists unless “P = NP” (the biggest open problem in CS!)
Fortunately, there are some efficient heuristics that can produce a near-optimal coloring
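
One simple heuristic of this kind is greedy coloring. The sketch below is an illustration rather than the method used by any particular compiler: it visits vertices in order of decreasing degree (an assumed ordering), gives each one the first register not used by an already-colored neighbor, and marks a variable as spilled (None) when no register is free. It is not guaranteed to find an optimal coloring.

```python
def greedy_color(graph, registers):
    """Assign a register to each variable, or None if it must be spilled.

    graph: {variable: set of neighboring variables}
    registers: list of available register names
    """
    assignment = {}
    # Assumed ordering: highest degree first, ties broken by name.
    for variable in sorted(graph, key=lambda v: (-len(graph[v]), v)):
        taken = {assignment[n] for n in graph[variable]
                 if assignment.get(n) is not None}
        free = [r for r in registers if r not in taken]
        assignment[variable] = free[0] if free else None
    return assignment

# A path x - y - z needs two colors; a triangle needs three.
path = {"x": {"y"}, "y": {"x", "z"}, "z": {"y"}}
triangle = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}

path_alloc = greedy_color(path, ["$t1", "$t2"])          # no spills
triangle_alloc = greedy_color(triangle, ["$t1", "$t2"])  # one variable spilled
```

With two registers the path is colored completely, while the triangle leaves exactly one variable without a register.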

Which variables should be spilled?
The Register Allocation Problem asks two key questions:
  Are we forced to spill registers?
    Yes iff min_colors(graph) > number_of_registers
  If so, which registers should we spill?
    Also hard to compute the optimum
Before good heuristics for graph coloring, register allocation was hard for RISC architectures with many registers
Now we have good graph coloring heuristics, so the focus shifts to the second problem
  much more critical with CISC architectures with few registers
