IT 252 Computer Organization and Architecture Number Representation Chia-Chi Teng
Where Are We Now? CS142 & 124 IT344
Review (do you remember from 124/104?)
8-bit signed 2's complement binary # -> decimal #: = ? = ? = ?
Decimal # -> 8-bit signed 2's complement binary #: 32 = ? -2 = ? 200 = ?
Decimal Numbers: Base 10
Digits: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
Example: 3271 = (3x10^3) + (2x10^2) + (7x10^1) + (1x10^0)
Numbers: positional notation
Number Base B => B symbols per digit:
  Base 10 (Decimal): 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
  Base 2 (Binary): 0, 1
Number representation: d31 d30 ... d1 d0 is a 32-digit number
  value = d31 x B^31 + d30 x B^30 + ... + d1 x B^1 + d0 x B^0
Binary: 0, 1 (in binary, digits are called "bits"); #s often written 0b...
  0b11010 = 1x2^4 + 1x2^3 + 0x2^2 + 1x2^1 + 0x2^0 = 16 + 8 + 2 = 26
Here a 5-digit binary # turns into a 2-digit decimal #
Can we find a base that converts to binary easily?
Hexadecimal Numbers: Base 16
Hexadecimal: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F (normal digits + 6 more from the alphabet)
In C, written as 0x... (e.g., 0xFAB5)
Conversion: Binary <-> Hex
  1 hex digit represents 16 decimal values
  4 binary digits represent 16 decimal values
  => 1 hex digit replaces 4 binary digits
One hex digit is a "nibble". Two is a "byte"
2 bits is a "half-nibble". Shave and a haircut...
Example: (binary) = 0x_____ ?
Decimal vs. Hexadecimal vs. Binary
Examples:
  1010 1100 (binary) = 0xAC
  1 0111 (binary) = 0001 0111 (binary) = 0x17
  0x3F9 = 11 1111 1001 (binary)
How do we convert between hex and decimal?
MEMORIZE: A = 1010, B = 1011, C = 1100, D = 1101, E = 1110, F = 1111
Precision and Accuracy
Precision is a count of the number of bits in a computer word used to represent a value.
Accuracy is a measure of the difference between the actual value of a number and its computer representation.
Don't confuse these two terms! High precision permits high accuracy but doesn't guarantee it. It is possible to have high precision but low accuracy.
Example: float pi = 3.14;
pi will be represented using all bits of the significand (highly precise), but is only an approximation (not accurate).
What to do with representations of numbers?
Just what we do with numbers!
  Add them, subtract them, multiply them, divide them, compare them
Example: = 17
...so simple to add in binary that we can build circuits to do it!
Subtraction: just as you would in decimal
Comparison: How do you tell if X > Y?
Visualizing (Mathematical) Integer Addition
Integer addition of 4-bit integers u, v: compute the true sum Add4(u, v)
Values increase linearly with u and v, forming a planar surface
(Figure: surface plot of Add4(u, v) over u and v)
Visualizing Unsigned Addition
Wraps around if true sum >= 2^w: the modular sum is the true sum minus 2^w, at most once
(Figure: surface plot of UAdd4(u, v) over u and v, with true sum vs. modular sum and the overflow region between 2^w and 2^(w+1))
BIG IDEA: Bits can represent anything!!
Characters? 26 letters => 5 bits (2^5 = 32); upper/lower case + punctuation => 7 bits (in 8) ("ASCII"); standard code to cover all the world's languages => 8, 16, 32 bits ("Unicode")
Logical values? 0 => False, 1 => True
Colors? Ex: Red (00), Green (01), Blue (11)
Locations / addresses? Commands?
MEMORIZE: N bits <=> at most 2^N things
How to Represent Negative Numbers?
So far, unsigned numbers. Obvious solution: define leftmost bit to be the sign! 0 => +, 1 => -
Rest of bits can be the numerical value of the number. Representation called sign and magnitude.
x86 uses 32-bit integers. +1 ten would be 0000 0000 0000 0000 0000 0000 0000 0001, and -1 ten in sign and magnitude would be 1000 0000 0000 0000 0000 0000 0000 0001.
Shortcomings of sign and magnitude?
Arithmetic circuit complicated: special steps depending on whether the signs are the same or not
Also, two zeros: 0x00000000 = +0 ten and 0x80000000 = -0 ten
What would two 0s mean for programming?
Therefore sign and magnitude was abandoned.
Another try: complement the bits
Example: 7 ten = 0000 0111 two, so -7 ten = 1111 1000 two. Called One's Complement.
Note: positive numbers have leading 0s, negative numbers have leading 1s.
What is ______? Answer: ______
How many positive numbers in N bits? How many negative numbers?
Shortcomings of One's Complement?
Arithmetic still somewhat complicated.
Still two zeros: 0x00000000 = +0 ten and 0xFFFFFFFF = -0 ten
Although used for a while on some computer products, one's complement was eventually abandoned because another solution was better.
Standard Negative Number Representation
What is the result for unsigned numbers if we tried to subtract a large number from a small one?
Would try to borrow from a string of leading 0s, so the result would have a string of leading 1s: 00...0011 - 00...0100 = 11...1111
With no obvious better alternative, pick the representation that made the hardware simple: as with sign and magnitude, leading 0s => positive, leading 1s => negative
  000000...xxx is >= 0, 111111...xxx is < 0
  except 1...1111 is -1, not -0 (as in sign & mag.)
This representation is Two's Complement
2's Complement Number "line": N = 5
2^(N-1) non-negatives
2^(N-1) negatives
one zero
how many positives?
Numeric Ranges
Unsigned values: UMin = 0 (000...0), UMax = 2^w - 1 (111...1)
Two's complement values: TMin = -2^(w-1) (100...0), TMax = 2^(w-1) - 1 (011...1)
Other values: Minus 1 (111...1)
Values for w = 16: UMax = 65,535; TMax = 32,767; TMin = -32,768; -1 = 0xFFFF
Values for Different Word Sizes
Observations: |TMin| = TMax + 1 (asymmetric range); UMax = 2 * TMax + 1
C programming: #include <limits.h> declares constants, e.g., ULONG_MAX, LONG_MAX, LONG_MIN; values are platform specific
Unsigned & Signed Numeric Values
Equivalence: same encodings for nonnegative values
Uniqueness: every bit pattern represents a unique integer value; each representable integer has a unique bit encoding
Can invert mappings: U2B(x) = B2U^-1(x) is the bit pattern for an unsigned integer; T2B(x) = B2T^-1(x) is the bit pattern for a two's comp integer

  X      B2T(X)   B2U(X)
  1000     -8        8
  1001     -7        9
  1010     -6       10
  1011     -5       11
  1100     -4       12
  1101     -3       13
  1110     -2       14
  1111     -1       15
Two's Complement Formula
Can represent positive and negative numbers in terms of the bit value times a power of 2:
  d31 x -(2^31) + d30 x 2^30 + ... + d2 x 2^2 + d1 x 2^1 + d0 x 2^0
Example: 1101 two = 1 x -(2^3) + 1 x 2^2 + 0 x 2^1 + 1 x 2^0 = -8 + 4 + 0 + 1 = -3 ten
Two's Complement shortcut: Negation
Change every 0 to 1 and 1 to 0 (invert or complement), then add 1 to the result
Proof*: the sum of a number and its (one's) complement must be 111...1 two. However, 111...1 two = -1 ten.
Let x' be the one's complement representation of x.
Then x + x' = -1, so x + x' + 1 = 0, so -x = x' + 1.
Example: -3 to +3 to -3
  x  : 1111...1101 two
  x' : 0000...0010 two
  +1 : 0000...0011 two
  ()': 1111...1100 two
  +1 : 1111...1101 two
You should be able to do this in your head...
*Check out
What if too big?
Binary bit patterns above are simply representatives of numbers. Strictly speaking they are called "numerals".
Numbers really have an infinite number of digits, with almost all being the same (00...0 or 11...1) except for a few of the rightmost digits. We just don't normally show leading digits.
If the result of an add (or -, *, /) cannot be represented by these rightmost HW bits, overflow is said to have occurred.
Peer Instruction Question
X = ________ two, Y = ________ two
A. X > Y (if signed)
B. X > Y (if unsigned)
C. X = -19 (if signed)
Answers (ABC): 0: FFF, 1: FFT, 2: FTF, 3: FTT, 4: TFF, 5: TFT, 6: TTF, 7: TTT
Peer Instruction Question
X = ________ two, Y = ________ two
A. X > Y (if signed)
B. X > Y (if unsigned)
C. X = -19 (if signed)
Answers (ABC): 0: FFF, 1: FFT, 2: FTF, 3: FTT, 4: TFF, 5: TFT, 6: TTF, 7: TTT
A: False (X negative); B: True; C: False (X = -20)
Number summary...
We represent "things" in computers as particular bit patterns: N bits <=> up to 2^N things
Decimal for human calculations, binary for computers, hex to write binary more easily
1's complement - mostly abandoned
2's complement universal in computing: cannot avoid, so learn
Overflow: numbers are infinite but computers are finite, so errors!
META: We often make design decisions to make HW simple
Information units
Basic unit is the bit (has value 0 or 1)
Bits are grouped together in units and operated on together:
  Byte = 8 bits
  Word = 4 bytes
  Double word = 2 words
  etc.
Encoding Byte Values
Byte = 8 bits
Binary: 00000000 two to 11111111 two
Decimal: 0 ten to 255 ten (first digit must not be 0 in C, which would mean octal)
Hexadecimal: 00 hex to FF hex, base 16 number representation using characters '0' to '9' and 'A' to 'F'
Write FA1D37B 16 in C as 0xFA1D37B or 0xfa1d37b
(Table: each hex digit 0-F with its decimal value and 4-bit binary pattern)
Memory addressing
Memory is an array of information units
– Each unit has the same size
– Each unit has its own address
– The address of a unit and the contents of the unit at that address are different
(Figure: memory array with address and contents columns)
Addressing
In most of today's computers, the basic unit that can be addressed is a byte (how many bits in a byte?)
– x86 (and pretty much every CPU today) is byte addressable
The address space is the set of all memory units that a program can reference
– The address space is usually tied to the length of the registers
– x86 has 32-bit registers, hence its address space is 4G bytes
– Older micros (minis) had 16-bit registers, hence a 64 KB address space (too small)
– Some current machines (Alpha, Itanium, Sparc, Athlon) have 64-bit registers, hence an enormous address space
Machine Words
Machine has a "word size": the nominal size of integer-valued data, including addresses
Many current machines still use 32-bit (4-byte) words, which limits addresses to 4 GB; becoming too small for memory-intensive applications
New or high-end systems use 64-bit (8-byte) words: potential address space of 2^64 ≈ 1.8 x 10^19 bytes; x86-64 machines support 48-bit addresses: 256 terabytes
Machines support multiple data formats: fractions or multiples of word size, always an integral number of bytes
Addressing words
Although machines are byte-addressable, 4-byte integers are the most commonly used units
Every 32-bit integer starts at an address divisible by 4:
  int at address 0, int at address 4, int at address 8, ...
Word-Oriented Memory Organization
Addresses specify byte locations: the address of a word is the address of its first byte
Addresses of successive words differ by 4 (32-bit) or 8 (64-bit)
(Figure: bytes with their addresses grouped into 32-bit and 64-bit words; Addr = ?? exercises)
Data Representations
Sizes of C objects (in bytes):

  C Data Type    Typical 32-bit   Intel IA32   x86-64
  char                 1               1          1
  short                2               2          2
  int                  4               4          4
  long                 4               4          8
  long long            8               8          8
  float                4               4          4
  double               8               8          8
  long double          8             10/12      10/16
  char * (or any
  other pointer)       4               4          8
Byte Ordering
How should bytes within a multi-byte word be ordered in memory? Conventions:
– Big Endian (Sun, PPC Mac, Internet): least significant byte has highest address
– Little Endian (x86): least significant byte has lowest address
Byte Ordering Example
Big Endian: least significant byte has highest address ("big end first")
Little Endian: least significant byte has lowest address ("little end first")
Example: variable x has 4-byte representation 0x________, and the address given by &x is 0x100
(Figure: the bytes of x laid out at addresses 0x100-0x103 in big-endian and in little-endian order)
Big-endian vs. little-endian
Byte order within a word: little-endian (we'll use this) vs. big-endian
(Figure: the bytes of Word #0 at memory addresses 0-3 under each ordering, and the resulting value of Word #0)
Reading Byte-Reversed Listings
Disassembly: text representation of binary machine code, generated by a program that reads the machine code
Example fragment:
  Address    Instruction Code        Assembly Rendition
  ______:    5b                      pop %ebx
  ______:    81 c3 ab 12 00 00       add $0x12ab,%ebx
  _____c:    83 bb 28 00 00 00 00    cmpl $0x0,0x28(%ebx)
Deciphering numbers: value 0x12ab; pad to 32 bits: 0x000012ab; split into bytes: 00 00 12 ab; reverse: ab 12 00 00
Examining Data Representations
Code to print the byte representation of data: casting a pointer to unsigned char * creates a byte array

  #include <stdio.h>

  typedef unsigned char *pointer;

  void show_bytes(pointer start, int len) {
      int i;
      for (i = 0; i < len; i++)
          printf("0x%p\t0x%.2x\n", start + i, start[i]);
      printf("\n");
  }

printf directives: %p prints a pointer, %x prints hexadecimal
show_bytes Execution Example

  int a = 15213;
  printf("int a = 15213;\n");
  show_bytes((pointer) &a, sizeof(int));

Result (Linux):
  int a = 15213;
  0x11ffffcb8  0x6d
  0x11ffffcb9  0x3b
  0x11ffffcba  0x00
  0x11ffffcbb  0x00
Representing & Manipulating Sets
Representation: a width-w bit vector represents subsets of {0, ..., w-1}; a_j = 1 if j ∈ A
  01101001 = {0, 3, 5, 6}
  01010101 = {0, 2, 4, 6}
Operations:
  & intersection:          01000001 = {0, 6}
  | union:                 01111101 = {0, 2, 3, 4, 5, 6}
  ^ symmetric difference:  00111100 = {2, 3, 4, 5}
  ~ complement (of 01010101): 10101010 = {1, 3, 5, 7}
Bit-Level Operations in C
Operations &, |, ~, ^ available in C; apply to any "integral" data type: long, int, short, char, unsigned
View arguments as bit vectors; operations are applied bit-wise
Examples (char data type):
  ~0x41 --> 0xBE        (~01000001 --> 10111110)
  ~0x00 --> 0xFF        (~00000000 --> 11111111)
  0x69 & 0x55 --> 0x41  (01101001 & 01010101 --> 01000001)
  0x69 | 0x55 --> 0x7D  (01101001 | 01010101 --> 01111101)
Contrast: Logic Operations in C
Logical operators &&, ||, !: view 0 as "False", anything nonzero as "True"; always return 0 or 1; early termination
Examples (char data type):
  !0x41 --> 0x00
  !0x00 --> 0x01
  !!0x41 --> 0x01
  0x69 && 0x55 --> 0x01
  0x69 || 0x55 --> 0x01
  p && *p (avoids null pointer access)
Shift Operations
Left shift x << y: shift bit-vector x left y positions; throw away extra bits on the left, fill with 0's on the right
Right shift x >> y: shift bit-vector x right y positions, throwing away extra bits on the right
  Logical shift: fill with 0's on the left
  Arithmetic shift: replicate the most significant bit on the left
Undefined behavior if shift amount < 0 or >= word size
(Examples: argument x, x << n, logical >>, arithmetic >>)
The CPU - Instruction Execution Cycle
The CPU executes a program by repeatedly following this cycle:
1. Fetch the next instruction, say instruction i
2. Execute instruction i
3. Compute the address of the next instruction, say j
4. Go back to step 1
Of course we'll optimize this, but it's the basic concept
What's in an instruction?
An instruction tells the CPU
– the operation to be performed via the OPCODE
– where to find the operands (source and destination)
For a given instruction, the ISA specifies
– what the OPCODE means (semantics)
– how many operands are required and their types, sizes, etc. (syntax)
An operand is either
– a register (integer, floating-point, PC)
– a memory address
– a constant
Reference slides
You ARE responsible for the material on these slides (they're just taken from the reading anyway); we've moved them to the end and off-stage to give more breathing room to lecture!
Kilo, Mega, Giga, Tera, Peta, Exa, Zetta, Yotta
Common use prefixes (all SI, except K [= k in SI]). Confusing! Common usage of "kilobyte" means 1024 bytes, but the "correct" SI value is 1000 bytes
Hard disk manufacturers & telecommunications are the only computing groups that use SI factors, so what is advertised as a 30 GB drive will actually only hold about 28 x 2^30 bytes, and a 1 Mbit/s connection transfers 10^6 bps.

  Name   Abbr   Factor                                     SI size
  Kilo   K      2^10 = 1,024                               10^3  = 1,000
  Mega   M      2^20 = 1,048,576                           10^6  = 1,000,000
  Giga   G      2^30 = 1,073,741,824                       10^9  = 1,000,000,000
  Tera   T      2^40 = 1,099,511,627,776                   10^12 = 1,000,000,000,000
  Peta   P      2^50 = 1,125,899,906,842,624               10^15 = 1,000,000,000,000,000
  Exa    E      2^60 = 1,152,921,504,606,846,976           10^18 = 1,000,000,000,000,000,000
  Zetta  Z      2^70 = 1,180,591,620,717,411,303,424       10^21 = 1,000,000,000,000,000,000,000
  Yotta  Y      2^80 = 1,208,925,819,614,629,174,706,176   10^24 = 1,000,000,000,000,000,000,000,000

physics.nist.gov/cuu/Units/binary.html
kibi, mebi, gibi, tebi, pebi, exbi, zebi, yobi
New IEC standard prefixes [only to exbi officially]: the International Electrotechnical Commission (IEC) in 1999 introduced these to specify binary quantities.
Names come from shortened versions of the original SI prefixes (same pronunciation) and bi is short for "binary", but pronounced "bee" :-(
Now SI prefixes only have their base-10 meaning and never have a base-2 meaning.

  Name  Abbr  Factor
  kibi  Ki    2^10 = 1,024
  mebi  Mi    2^20 = 1,048,576
  gibi  Gi    2^30 = 1,073,741,824
  tebi  Ti    2^40 = 1,099,511,627,776
  pebi  Pi    2^50 = 1,125,899,906,842,624
  exbi  Ei    2^60 = 1,152,921,504,606,846,976
  zebi  Zi    2^70 = 1,180,591,620,717,411,303,424
  yobi  Yi    2^80 = 1,208,925,819,614,629,174,706,176

en.wikipedia.org/wiki/Binary_prefix
As of this writing, this proposal has yet to gain widespread use...
What is 2^34? How many bits of address (i.e., what's ceil(log2) = lg of) 2.5 TiB?
Answer! 2^XY means...
  X=0: ---           X=3: gibi ~10^9    X=6: exbi ~10^18
  X=1: kibi ~10^3    X=4: tebi ~10^12   X=7: zebi ~10^21
  X=2: mebi ~10^6    X=5: pebi ~10^15   X=8: yobi ~10^24
The way to remember #s:
  Y=0: 1   Y=1: 2   Y=2: 4   Y=3: 8   Y=4: 16   Y=5: 32   Y=6: 64   Y=7: 128   Y=8: 256
MEMORIZE!
Which base do we use?
Decimal: great for humans, especially when doing arithmetic
Hex: if a human is looking at long strings of binary numbers, it's much easier to convert to hex and look at 4 bits/symbol; terrible for arithmetic on paper
Binary: what computers use; you will learn how computers do +, -, *, /
To a computer, numbers are always binary, regardless of how a number is written: 32 ten == 100000 two == 0x20 == 0b100000
Use subscripts "ten", "hex", "two" in book and slides when it might be confusing
Two's Complement for N=32
0000 ... 0000 0000 0000 0000 two = 0 ten
0000 ... 0000 0000 0000 0001 two = 1 ten
0000 ... 0000 0000 0000 0010 two = 2 ten
...
0111 ... 1111 1111 1111 1101 two = 2,147,483,645 ten
0111 ... 1111 1111 1111 1110 two = 2,147,483,646 ten
0111 ... 1111 1111 1111 1111 two = 2,147,483,647 ten
1000 ... 0000 0000 0000 0000 two = -2,147,483,648 ten
1000 ... 0000 0000 0000 0001 two = -2,147,483,647 ten
1000 ... 0000 0000 0000 0010 two = -2,147,483,646 ten
...
1111 ... 1111 1111 1111 1101 two = -3 ten
1111 ... 1111 1111 1111 1110 two = -2 ten
1111 ... 1111 1111 1111 1111 two = -1 ten
One zero; 1st bit called the sign bit
1 "extra" negative: no positive 2,147,483,648 ten
Two's comp. shortcut: Sign extension
Convert a 2's complement number represented using n bits to more than n bits: simply replicate the most significant bit (sign bit) of the smaller width to fill the new bits
A 2's comp. positive number has infinite 0s; a 2's comp. negative number has infinite 1s
The binary representation hides leading bits; sign extension restores some of them
16-bit -4 ten to 32-bit:
  1111 1111 1111 1100 two
  1111 1111 1111 1111 1111 1111 1111 1100 two
Preview: Signed vs. Unsigned Variables
Java and C declare integers as int: uses two's complement (signed integer)
C also has the declaration unsigned int: declares an unsigned integer; treats the 32-bit number as unsigned, so the most significant bit is part of the number, not a sign bit