Bits and Bytes Spring, 2015 Topics Why bits? Representing information as bits Binary / Hexadecimal Byte representations »Numbers »Characters and strings »Instructions Bit-level manipulations Boolean algebra Expressing in C COMP Comp Org & Assembly Lang
– 2 – COMP 21000, S’15 Why Don’t Computers Use Base 10? Base 10 Number Representation That’s why fingers are known as “digits” Natural representation for financial transactions Floating point number cannot exactly represent $1.20 Even carries through in scientific notation X 10 3 (1.5213e4)
– 3 – COMP 21000, S’15 Why Don’t Computers Use Base 10? Implementing Electronically Hard to store ENIAC (First electronic computer) used 10 vacuum tubes / digit IBM 650 used 5+2 bits (1958, successor to IBM’s Personal Automatic Computer, PAC from 1956) Hard to transmit Need high precision to encode 10 signal levels on single wire Messy to implement digital logic functions Addition, multiplication, etc.
– 4 – COMP 21000, S’15 Binary Representations Base 2 Number Representation Represent as Represent as [0011]… 2 Represent X 10 4 as X 2 13 Electronic Implementation Easy to store with bistable elements Reliably transmitted on noisy and inaccurate wires 0.0V 0.5V 2.8V 3.3V 010
– 5 – COMP 21000, S’15 Encoding Byte Values Byte = 8 bits Binary to Decimal: 0 10 to First digit must not be 0 in C Octal: to Use leading 0 in C Hexadecimal to FF 16 Base 16 number representation Use characters ‘0’ to ‘9’ and ‘A’ to ‘F’ Use leading 0x in C, »write FA1D37B 16 as 0xFA1D37B »Or 0xfa1d37b A B C D E F Hex Decimal Binary
– 6 – COMP 21000, S’15 Bases Every number system uses a base Decimal numbers use base 10 Other common bases used in computer science: Octal (base 8) Hexadecimal (base 16) Binary (base 2)
– 7 – COMP 21000, S’15 Decimal 7537 x x x x x x 10 0
– 8 – COMP 21000, S’15 Octal x x x x x x 8 0
– 9 – COMP 21000, S’15 Counting in octal
– 10 – COMP 21000, S’15 Converting Converting octal to decimal: 753 = 7 x x x 8 0 = = 491
– 11 – COMP 21000, S’15 Translating to octal writing 157 in base remainder: 157 remainder: x x x 8 0 = = 157 Divisor
– 12 – COMP 21000, S’15 Hexadecimal 7537 x x x x x x 16 0
– 13 – COMP 21000, S’15 Hexadecimal A B C D E F A 1B 1C 1D 1E 1F A 2B 2C 2D 2E 2F A 3B 3C 3D 3E 3F A 4B 4C 4D 4E 4F A 5B 5C 5D 5E 5F A 6B 6C 6D 6E 6F A 7B 7C 7D 7E 7F A 8B 8C 8D 8E 8F A 10B 10C 10D 10E 10F 120 …
– 14 – COMP 21000, S’15 Translating to hex 365 remainder: 365 remainder: D D 1 x x D x 16 0 = = 365
– 15 – COMP 21000, S’15 Binary 1011 x x x x x x 2 0
– 16 – COMP 21000, S’15 Counting in binary
– 17 – COMP 21000, S’15 Translating to binary 4 remainder: 4 remainder: Divisor1 0 0
– 18 – COMP 21000, S’15 Translating to binary 11 remainder: 11 remainder:
– 19 – COMP 21000, S’15 Literary Hex Common 8-byte hex filler: 0xdeadbeef Can you think of other 8-byte fillers?
– 20 – COMP 21000, S’15 Relation between hex and binary Hexadecimal: Hexadecimal: 16 characters, Each represents a number between 0 and 15 Binary: Binary: Consider 4 places Represent numbers between 0000 and 1111 Or between 0 and 15 Result: Result: 1 hex digit represents 4 binary digits
– 21 – COMP 21000, S’15 Relation between hex and binary HexBinaryDecimal A B C D E F111116
– 22 – COMP 21000, S’15 Converting main() { octal_print(0);/* result: 00 */ octal_print(-1);/* result: -01*/ octal_print(123456);/* result: */ octal_print( );/* result: */ octal_print( );/* result: since 0 at */ /* beginning indicates octal num */ octal_print( );/* result: */ octal_print(999);/* result: 01747*/ octal_print(-999);/* result: */ octal_print(1978);/* result: 03672*/ }
– 23 – COMP 21000, S’15 Converting #define DIG 20 void octal_print(int x) { int i = 0, od; char s[DIG];/* up to DIG octal digits */ if ( x < 0 )/* treat negative x*/ { putchar ('-');/* output a minus sign*/ x = - x; } do { od = x % 8;/* next higher octal digit */ s[i++] = (od + '0');/* octal digits in char form*/ } while ( (x = x/8) > 0);/* quotient by integer division */ putchar('0');/* octal number prefix*/ do { putchar(s[--i]);/* output characters on ss*/ } while ( i > 0 ); putchar ('\n'); } See /home/barr/Student/Examples/baseConversion/octal.c
– 24 – COMP 21000, S’15 Memory What is memory (RAM)? What is memory (RAM)? To the processor, it’s just a big array of cells To the processor, it’s just a big array of cells Each cell has an address Each cell contains a specific value Address Contents “John” Sunday ‘a’‘a’‘a’‘a’
– 25 – COMP 21000, S’15 Memory Of course, addresses and content are actually in binary Of course, addresses and content are actually in binary Address Contents Memory is byte addressable i.e., every byte has an address.
– 26 – COMP 21000, S’15 Memory When we look at memory in a debugger (like gdb), usually use hex notation When we look at memory in a debugger (like gdb), usually use hex notation Address Contents 9B 39 C3 5F 09
– 27 – COMP 21000, S’15 Memory How do we talk about the size of memory? How do we talk about the size of memory? in terms of powers of 2 (addresses are binary) x 2 10 = ,048, x 2 10 x 2 10 = ,073,741, x 2 10 x 2 10 x 2 10 = ,099,511,627,776
– 28 – COMP 21000, S’15 Memory How do we talk about the size of memory? How do we talk about the size of memory? in bytes (how may bytes of memory do you have?) in terms of powers of 2 (addresses are binary) 1 bit 1 byte 1 Kilobyte (Kb) 1 Megabyte (Mb) 1 Gigabyte (Gb) 1 Terabyte (Tb) 1 bit 8 bits 1 byte 2 10 bytes 1024 bytes 2 10 x 2 10 = ,048,576 bytes 2 10 x 2 10 x 2 10 = ,073,741,824 bytes 2 10 x 2 10 x 2 10 x 2 10 = ,099,511,627,776bytes 1,099,511,627,776 bytes
– 29 – COMP 21000, S’15 Byte-Oriented Memory Organization Programs Refer to Virtual Addresses Conceptually very large array of bytes Actually implemented with hierarchy of different memory types SRAM, DRAM, disk Only allocate for regions actually used by program In Unix and Windows NT, address space is private to particular “process” Program being executed Program can clobber its own data, but not that of others
– 30 – COMP 21000, S’15 Byte-Oriented Memory Organization Virtual memory is organized by the OS VM is implemented by the Compiler + Run-Time System Control Allocation (or process manager) Where different program objects should be stored Multiple mechanisms: static, stack, and heap In any case, all allocation within single virtual address space
– 31 – COMP 21000, S’15 Machine Words Machine (or processor) Has “Word Size” Nominal size of integer-valued data Including addresses Last generation machines used 32 bits (4 bytes) words Limits addresses to 4GB Becoming too small for memory-intensive applications We’ll use 32 bits as the word size in this class Most current systems use 64 bits (8 bytes) words Potential address space = 2 64 1.8 X bytes Machines support multiple data formats Fractions or multiples of word size Always integral number of bytes
– 32 – COMP 21000, S’15 Word-Oriented Memory Organization Addresses Specify Byte Locations Address of first byte in word Addresses of successive words differ by 4 (32-bit) or 8 (64-bit) bit Words BytesAddr bit Words Addr = ?? Addr = ?? Addr = ?? Addr = ?? Addr = ?? Addr = ??
– 33 – COMP 21000, S’15 Range A computer cell holds a fixed number of bits. The range is the numbers that will fit in a cell. Examples: = = 7range is 0 to = = 15range is 0 to 15
– 34 – COMP 21000, S’15 Range In general range is 0 to 2 n – 1 where there are n bits in the representation Intuition: 111 = 1 x x x 2 0 Which is 1 less than 1000 = 1 x x x x 2 0 Number of numbers represented: n bits can represent 2 n different numbers i.e., from 0 to 2 n – 1
– 35 – COMP 21000, S’15 Data Representations Sizes of C Objects (in Bytes) C Data TypeAlpha (RIP)Typical 32-bitIntel IA32 unsigned444 int444 long int844 char111 short222 float444 double888 long double8/16 † 810/12 char *844 »Or any other pointer ( †: Depends on compiler&OS, 128bit FP is done in software)
– 36 – COMP 21000, S’15 Byte Ordering How should bytes within a multi-byte word be ordered in memory? Conventions Sun’s, PowerPC Mac’s are “Big Endian” machines Least significant byte has highest address Alphas, intel machines (including PC’s and Mac’s) are “Little Endian” machines Least significant byte has lowest address Most networking hardware is “Big-Endian” See for a history of “endian-ness” Aside: “endian” is from Gulliver’s Travels, a book written by Jonathan Swift, in 1726.
– 37 – COMP 21000, S’15 Byte Ordering Example Big Endian Least significant byte (e.g., 67) has highest address Little Endian Least significant byte has lowest addressExample Variable x has 4-byte representation 0x Address given by &x is 0x100 0x1000x1010x1020x x1000x1010x1020x Big Endian Little Endian Endian is a function of hardware
– 38 – COMP 21000, S’15 Reading Byte-Reversed Listings Disassembly Text representation of binary machine code Generated by program that reads the machine code Example Fragment AddressInstruction CodeAssembly Rendition :5b pop %ebx :81 c3 ab add $0x12ab,%ebx c:83 bb cmpl $0x0,0x28(%ebx) Deciphering Numbers Value: 0x12ab Pad to 4 bytes: 0x000012ab Split into bytes: ab Reverse: ab
– 39 – COMP 21000, S’15 Examining Data Representations Code to Print Byte Representation of Data Casting pointer to unsigned char * creates byte array typedef unsigned char *byte_pointer; void show_bytes(byte_pointer start, int len) { int i; for (i = 0; i < len; i++) printf("0x%p\t0x%.2x\n", start+i, start[i]); printf("\n"); } Printf directives: %p :Print pointer %x :Print Hexadecimal %.2x means print only 2 digits of hex Printf directives: %p :Print pointer %x :Print Hexadecimal %.2x means print only 2 digits of hex Don’t need the 0x for pointers in some compilers This program is described in section of the textbook See /home/barr/Student/Examples/binOps
– 40 – COMP 21000, S’15 show_bytes Execution Example int a = 15213; printf("int a = %d;\n”, a); show_bytes((byte_pointer) &a, sizeof(int)); Result (Linux): int a = 15213; 0x11ffffcb80x6d 0x11ffffcb90x3b 0x11ffffcba0x00 0x11ffffcbb0x00 Real representation: B 6 D Address of bytes Content stored at the address
– 41 – COMP 21000, S’15 show_bytes Execution Example int a = 15213; printf("int a = %d;\n”, a); show_bytes((byte_pointer) &a, sizeof(int)); float b = ; printf(”float b = %f;\n”, b); show_bytes((byte_pointer) &b, sizeof(float)); char c[10] = “abcd”; printf(”string c = %s;\n”, c); show_bytes((byte_pointer) &c, 10*sizeof(char));
– 42 – COMP 21000, S’15 Why does endian-ness matter? Web Client (browser) Web Server What is Lady Gaga’s age? 28 0x C (solaris box) (intel box) 469,762,048 0x C 0x 1C Interpreted as Displayed as
– 43 – COMP 21000, S’15 Why does endian-ness matter? Spreadsheet created on big-endian machine Final cost: $ 28 0x C Your machine (solaris box) Your client’s machine (intel box) Final cost: $ 469,762,048 0x C 0x 1C Interpreted as Displayed as Spreadsheet read on big- endian machine Stored as
– 44 – COMP 21000, S’15 Summary of the Main Points It’s All About Bits & Bytes Numbers Programs Text Different Machines Follow Different Conventions for Word size Byte ordering Representations Boolean Algebra is the Mathematical Basis Basic form encodes “false” as 0, “true” as 1 General form like bit-level operations in C Good for representing & manipulating sets