Presentation is loading. Please wait.

Presentation is loading. Please wait.

Bits and Bytes Spring, 2015 Topics Why bits? Representing information as bits Binary / Hexadecimal Byte representations »Numbers »Characters and strings.

Similar presentations


Presentation on theme: "Bits and Bytes Spring, 2015 Topics Why bits? Representing information as bits Binary / Hexadecimal Byte representations »Numbers »Characters and strings."— Presentation transcript:

1 Bits and Bytes Spring, 2015 Topics Why bits? Representing information as bits Binary / Hexadecimal Byte representations »Numbers »Characters and strings »Instructions Bit-level manipulations Boolean algebra Expressing in C COMP 21000 Comp Org & Assembly Lang

2 – 2 – COMP 21000, S’15 How then do we represent data? Each different “type” in a programming language has its own representation. Each different “type” in a programming language has its own representation. Next few slides discuss the representation for the common types: Next few slides discuss the representation for the common types: Ints Pointers Floats Strings

3 – 3 – COMP 21000, S’15 Representing Integers int A = 15213; int B = -15213; long int C = 15213; Decimal: 15213 Binary: 0011 1011 0110 1101 Hex: 3 B 6 D 6D 3B 00 Linux/Alpha A 3B 6D 00 Sun A 93 C4 FF Linux/Alpha B C4 93 FF Sun B Two’s complement representation (Covered next lecture) 00 6D 3B 00 Alpha C 3B 6D 00 Sun C 6D 3B 00 Linux C Linux/Win/OS X: little endian Solaris: big endian

4 – 4 – COMP 21000, S’15 Representing Pointers int B = -15213; int *P = &B; Alpha Address Hex: 1 F F F F F C A 0 Binary: 0001 1111 1111 1111 1111 1111 1100 1010 0000 01 00 A0 FC FF Alpha P Sun Address Hex: E F F F F B 2 C Binary: 1110 1111 1111 1111 1111 1011 0010 1100 Different compilers & machines assign different locations to objects FB 2C EF FF Sun P FF BF D4 F8 Linux P Linux Address Hex: B F F F F 8 D 4 Binary: 1011 1111 1111 1111 1111 1000 1101 0100 Linux/Win/OS X: little endian Solaris: big endian

5 – 5 – COMP 21000, S’15 Representing Floats Float F = 15213.0; IEEE Single Precision Floating Point Representation Hex: 4 6 6 D B 4 0 0 Binary: 0100 0110 0110 1101 1011 0100 0000 0000 15213: 1110 1101 1011 01 Not same as integer representation, but consistent across machines 00 B4 6D 46 Linux/W in/OS X F B4 00 46 6D Solaris F Can see some relation to integer representation, but not obvious IEEE Single Precision Floating Point Representation Hex: 4 6 6 D B 4 0 0 Binary: 0100 0110 0110 1101 1011 0100 0000 0000 15213: 1110 1101 1011 01 IEEE Single Precision Floating Point Representation Hex: 4 6 6 D B 4 0 0 Binary: 0100 0110 0110 1101 1011 0100 0000 0000 15213 1110 1101 1011 01 (in binary)

6 – 6 – COMP 21000, S’15 char S[6] = "15213"; Representing Strings Strings in C Represented by array of characters Each character encoded in ASCII format Standard 7-bit encoding of character set Character “0” has code 0x30 »Digit i has code 0x30 +i String should be null-terminated Final character = 0Compatibility Byte ordering not an issue Text files generally platform independent Except for different conventions of line termination character(s)! »Unix ( ‘\n’ = 0x0a = ^J ) »Mac ( ‘\r’ = 0x0d = ^M ) »DOS and HTTP ( ‘\r\n’ = 0x0d0a = ^M^J ) Linux/Win/ OS X S Solaris S 32 31 35 33 00 32 31 35 33 00

7 – 7 – COMP 21000, S’15 ASCII Chart ASCII ‘0’ is 0x30 Reference: http://jimprice.com/jim-asc.htm

8 – 8 – COMP 21000, S’15 Characters Other popular encoding standards: Unicode 16 bit code Allows more characters: most languages can be encoded See http://en.wikipedia.org/wiki/Unicodehttp://en.wikipedia.org/wiki/Unicode The most commonly used encodings are UTF-8, UTF-16 and the now-obsolete UCS-2 UTF-8 »uses one byte for any ASCII character, »These bytes have the same code values in both UTF-8 and ASCII encoding, »and up to four bytes for other characters ISO Latin-8 8-bit encoding standard with dotted consonants International standard for Latin letters

9 – 9 – COMP 21000, S’15 Machine-Level Code Representation Encode Program as Sequence of Instructions Each instruction is a simple operation Arithmetic operation Read or write memory Conditional branch Instructions encoded as bytes Alpha’s, Sun’s, Mac’s use 4 byte instructions »Reduced Instruction Set Computer (RISC) PC’s (and intel Mac’s) use variable length instructions »Complex Instruction Set Computer (CISC) Different instruction types and encodings for different machines Most code is not binary compatible Programs are Byte Sequences Too!

10 – 10 – COMP 21000, S’15 Representing Instructions int sum(int x, int y) { return x+y; return x+y;} Different machines use totally different instructions and encodings 00 30 42 Alpha sum 01 80 FA 6B E0 08 81 C3 Solaris sum 90 02 00 09 For this example, Alpha & Sun use two 4-byte instructions Use differing numbers of instructions in other cases PC uses 7 instructions with lengths 1, 2, and 3 bytes Same for NT, OSX (intel) and for Linux NT / Linux/OSX not fully binary compatible E5 8B 55 89 PC sum 45 0C 03 45 08 89 EC 5D C3

11 – 11 – COMP 21000, S’15 What do we know now? A computer is a processor and RAM. Everything else is peripheral Everything is 1’s and 0’s. A processor executes instructions Instructions are encoded in 1’s and 0’s each instruction has binary representation RAM stores 1’s and 0’s We interpret these 1’s and 0’s based on context Every type is interpreted from binary integers: binary float: special representation in binary char: binary (ASCII encoding)

12 – 12 – COMP 21000, S’15 How do we get to bits? We write programs in an abstract language like C How do we get to the binary representation? Compilers & Assemblers! We write x = 5. The compiler changes it into mov 5, 0x005F The assembler changes it into: 80483c7: 8b 45 5F

13 – 13 – COMP 21000, S’15 Manipulating bits Since all information is in the form of bits, we must manipulate bits when programming at low levels We did some of this with the show_bytes program. To understand how to do more, we must understand Boolean algebra We can manipulate bits from C; this is where C and hardware meet.

14 – 14 – COMP 21000, S’15 Boolean Algebra Developed by George Boole in 19th Century Algebraic representation of logic Encode “True” as 1 and “False” as 0And A&B = 1 when both A=1 and B=1 Not ~A = 1 when A=0 Or A|B = 1 when either A=1 or B=1 Exclusive-Or (Xor) A^B = 1 when either A=1 or B=1, but not both

15 – 15 – COMP 21000, S’15 A ~A ~B B Connection when A&~B | ~A&B Application of Boolean Algebra Applied to Digital Systems by Claude Shannon 1937 MIT Master’s Thesis Reason about networks of relay switches Encode closed switch as 1, open switch as 0 A&~B ~A&B = A^B

16 – 16 – COMP 21000, S’15 General Boolean Algebras Operate on Bit Vectors Operations applied bitwise All of the Properties of Boolean Algebra Apply 01101001 & 01010101 01000001 01101001 | 01010101 01111101 01101001 ^ 01010101 00111100 ~ 01010101 10101010 01000001011111010011110010101010

17 – 17 – COMP 21000, S’15 Representing & Manipulating Sets Representation Width w bit vector represents subsets of {0, …, w–1} a j = 1 if j  A 01101001 { 0, 3, 5, 6 } 76543210 01010101 { 0, 2, 4, 6 } 76543210Operations & Intersection 01000001 { 0, 6 } | Union 01111101 { 0, 2, 3, 4, 5, 6 } ^Symmetric difference 00111100 { 2, 3, 4, 5 } ~Complement 10101010 { 1, 3, 5, 7 }

18 – 18 – COMP 21000, S’15 Bit-Level Operations in C Operations &, |, ~, ^ Available in C Apply to any “integral” data type long, int, short, char, unsigned View arguments as bit vectors Arguments applied bit-wise Examples (Char data type) ~0x41 --> 0xBE ~01000001 2 -->10111110 2 ~0x00 --> 0xFF ~00000000 2 -->11111111 2 0x69 & 0x55 --> 0x41 01101001 2 & 01010101 2 --> 01000001 2 0x69 | 0x55 --> 0x7D 01101001 2 | 01010101 2 --> 01111101 2

19 – 19 – COMP 21000, S’15 Contrast: Logic Operations in C Contrast to Logical Operators &&, ||, ! View 0 as “False” Anything nonzero as “True” Always return 0 or 1 Early termination Examples (char data type) !0x41 --> 0x00 !0x00 --> 0x01 !!0x41 --> 0x01 0x69 && 0x55 --> 0x01 0x69 || 0x55 --> 0x01 p && *p++ ( avoids null pointer access) Or p && (*p = 10) also protects Works because of short- circuit evaluation in C

20 – 20 – COMP 21000, S’15 Shift Operations Left Shift: x << y Shift bit-vector x left y positions Throw away extra bits on left Fill with 0’s on right Right Shift: x >> y Shift bit-vector x right y positions Throw away extra bits on right Logical shift Fill with 0’s on left Arithmetic shift Replicate most significant bit on right Useful with two’s complement integer representation 01100010 Argument x 00010000<< 3 00011000 Log. >> 2 00011000 Arith. >> 2 10100010 Argument x 00010000<< 3 00101000 Log. >> 2 11101000 Arith. >> 2 00010000 00011000 00010000 00101000 11101000 00010000 00101000 11101000 Which right shift does C do? Depends on the compiler! Often: int do ASR While unsign does LSR

21 – 21 – COMP 21000, S’15 Effect of LSL Each shift doubles the number. If the carry bit is 1 after a shift, then overflow (if the number is unsigned). Decimal: shift multiplies by 10. 35 << 350 Binary: 0 1 0 1 0 = 1 x 2 3 + 0 x 2 2 + 1 x 2 1 + 0 x 2 0 1 0 1 0 0 = 1 x 2 4 + 0 x 2 3 + 1 x 2 2 + 0 x 2 1 + 0 x 2 0 = 2 x (1 x 2 3 + 0 x 2 2 + 1 x 2 1 + 0 x 2 0 )

22 – 22 – COMP 21000, S’15 Effect of LSR Each shift halves the number. Round down (because you drop the least significant bit) Decimal: shift divides by 10. 350 >> 35 Binary: 1 0 1 0 0 = 1 x 2 4 + 0 x 2 3 + 1 x 2 2 + 0 x 2 1 + 0 x 2 0 0 1 0 1 0 = 1 x 2 3 + 0 x 2 2 + 1 x 2 1 + 0 x 2 0 = (1 x 2 4 + x 2 3 + 0 x 2 2 + 1 x 2 1 + 0 x 2 0 ) ÷ 2

23 – 23 – COMP 21000, S’15 Truncating Casting to a shorter form truncates the number. int x = 53191; short sx = (short) x;/* -12345 */ int y = sx;/* -12345*/ On a typical 32-bit machine: Casting int to short truncate 32-bit to 16-bits When cast back to 32 bits, do sign extension (see next lecture) so value doesn’t change.

24 – 24 – COMP 21000, S’15 Masks Can use bit patterns as masks Example Mask = 0xFFFFFF00when used with & filters out (i.e., zeros out) the last byte Mask = 0x000000FFwhen used with | places 1’s in the last byte, rest is unaffected.

25 – 25 – COMP 21000, S’15 Masks Can perform a mod by a power of 2 with a mask Examples: x % 2 = 0 (x is even) or 1 (x is odd) x % 2 = 0 (x is even) or 1 (x is odd) 1111 = 1 x 2 3 + 1 x 2 2 + 1 x 2 1 + 1 x 2 0 : all places except the last are divisible by 2. So if the last bit is 0, mod is 0. If last bit is 1, mod is 1 x % 4 x % 4 1111 = 1 x 2 3 + 1 x 2 2 + 1 x 2 1 + 1 x 2 0 : all places except the last two are divisible by 4. So if the last two bits are 0, mod is 0. Otherwise the mod is whatever value the last two bits have

26 – 26 – COMP 21000, S’15 Masks Can perform a mod by a power of 2 with a mask Examples: x % 8 = 0 x % 8 = 0 1111 = 1 x 2 3 + 1 x 2 2 + 1 x 2 1 + 1 x 2 0 : all places except the last three are divisible by 8. So if the last three bits are 0, mod is 0.. Otherwise the mod is the value in the last three bits so so x % 2 == x & 00000001 (or 0x01) x % 4 == x & 00000011 (or 0x03) x % 8 == x & 00000111 (or 0x07)

27 – 27 – COMP 21000, S’15 Converting (revisited) #include #define LOWDIGIT 07 #define DIG 20 void octal_print(register int x) { register int i = 0; char s[DIG]; if ( x < 0 ){ /* treat negative x */ putchar ('-'); /* output a minus sign */ x = - x; } do{ /* mod performed with & */ s[i++] = ( (x & LOWDIGIT) + '0'); } while ( ( x >>= 3) > 0 ); /* divide x by 8 via shift */ putchar('0'); /* octal number prefix */ do{ putchar(s[--i]); /* output characters on s */ } while ( i > 0 ); putchar ('\n'); }

28 – 28 – COMP 21000, S’15 Converting to binary void printBits(unsigned int word, int nBits){ for (int i=nBits-1;i>=0;i--){ for (int i=nBits-1;i>=0;i--){ printf("%u", !!(word&(1<<i))); printf("%u", !!(word&(1<<i))); } } printf("\n"); printf("\n");}

29 – 29 – COMP 21000, S’15 Cool Stuff with Xor void funny(int *x, int *y) { *x = *x ^ *y; /* #1 */ *x = *x ^ *y; /* #1 */ *y = *x ^ *y; /* #2 */ *y = *x ^ *y; /* #2 */ *x = *x ^ *y; /* #3 */ *x = *x ^ *y; /* #3 */} Bitwise Xor is form of addition With extra property that every value is its own additive inverse A ^ A = 0 BA Begin BA^B 1 (A^B)^B = AA^B 2 A(A^B)^A = B 3 AB End *y*x

30 – 30 – COMP 21000, S’15 Bit techniques INTEGER CODING RULES: Replace the "return" statement in each function with one Replace the "return" statement in each function with one or more lines of C code that implements the function. Your code or more lines of C code that implements the function. Your code must conform to the following style: must conform to the following style: int Funct(arg1, arg2,...) { int Funct(arg1, arg2,...) { /* brief description of how your implementation works */ /* brief description of how your implementation works */ int var1 = Expr1; int var1 = Expr1;...... int varM = ExprM; int varM = ExprM; varJ = ExprJ; varJ = ExprJ;...... varN = ExprN; varN = ExprN; return ExprR; return ExprR; }

31 – 31 – COMP 21000, S’15 Bit techniques Each "Expr" is an expression using ONLY the following: Each "Expr" is an expression using ONLY the following: 1. Integer constants 0 through 255 (0xFF), inclusive. You are 1. Integer constants 0 through 255 (0xFF), inclusive. You are not allowed to use big constants such as 0xffffffff. not allowed to use big constants such as 0xffffffff. 2. Function arguments and local variables (no global variables). 2. Function arguments and local variables (no global variables). 3. Unary integer operations ! ~ 3. Unary integer operations ! ~ 4. Binary integer operations & ^ | + > 4. Binary integer operations & ^ | + > Some of the problems restrict the set of allowed operators even further. Some of the problems restrict the set of allowed operators even further. Each "Expr" may consist of multiple operators. You are not restricted to Each "Expr" may consist of multiple operators. You are not restricted to one operator per line. one operator per line. You are expressly forbidden to: You are expressly forbidden to: 1. Use any control constructs such as if, do, while, for, switch, etc. 1. Use any control constructs such as if, do, while, for, switch, etc. 2. Define or use any macros. 2. Define or use any macros. 3. Define any additional functions in this file. 3. Define any additional functions in this file. 4. Call any functions. 4. Call any functions. 5. Use any other operations, such as &&, ||, -, or ?: 5. Use any other operations, such as &&, ||, -, or ?: 6. Use any form of casting. 6. Use any form of casting. 7. Use any data type other than int. This implies that you 7. Use any data type other than int. This implies that you cannot use arrays, structs, or unions. cannot use arrays, structs, or unions.

32 – 32 – COMP 21000, S’15 Bit techniques /* pow2plus1 - returns 2^x + 1, where 0 <= x <= 31 */ int pow2plus1(int x) { int pow2plus1(int x) { /* exploit ability of shifts to compute powers of 2 */ /* exploit ability of shifts to compute powers of 2 */ return (1 << x) + 1; return (1 << x) + 1; } } /* pow2plus4 - returns 2^x + 4, where 0 <= x <= 31 */ /* pow2plus4 - returns 2^x + 4, where 0 <= x <= 31 */ int pow2plus4(int x) { int pow2plus4(int x) { /* exploit ability of shifts to compute powers of 2 */ /* exploit ability of shifts to compute powers of 2 */ int result = (1 << x); int result = (1 << x); result += 4; result += 4; return result; return result; }

33 – 33 – COMP 21000, S’15 Bit techniques / * bitNor - ~(x|y) using only ~ and & * Example: bitNor(0x6, 0x5) = 0xFFFFFFF8 * Example: bitNor(0x6, 0x5) = 0xFFFFFFF8 * Legal ops: ~ & * Legal ops: ~ & * Max ops: 8 * Max ops: 8 * Rating: 1 * Rating: 1 */ */ int bitNor(int x, int y) { } /* evenBits - return word with all even-numbered bits set to 1 * Legal ops: ! ~ & ^ | + > * Legal ops: ! ~ & ^ | + > * Max ops: 8 * Max ops: 8 * Rating: 1 * Rating: 1 */ */ int evenBits(void) { } return (~x & ~y) int byte = 0x55; int word = byte | byte<<8; return word | word<<16; use DeMorgan’s Law: ~(A & B) = (~A) | (~B) And ~(A | B) = (~A) & (~B) use DeMorgan’s Law: ~(A & B) = (~A) | (~B) And ~(A | B) = (~A) & (~B)

34 – 34 – COMP 21000, S’15 Some Other Uses for Bitvectors Representation of small sets Representation of polynomials: Important for error correcting codes (see later slides) Arithmetic over finite fields, say GF(2^n) Example 0x15213 : x 16 + x 14 + x 12 + x 9 + x 4 + x + 1 Representation of graphs (intersection of sys & algo) A ‘1’ represents the presence of an edge Representation of bitmap images, icons, cursors, … Exclusive-or cursor patent (see later slides) Representation of Boolean expressions and logic circuits

35 – 35 – COMP 21000, S’15 UDP The User Datagram Protocol (UDP) is one of the core members of the Internet protocol suite, the set of network protocols used for the Internet. With UDP, computer applications can send messages, in this case referred to as datagrams, to other hosts on an Internet Protocol (IP) network without prior communications to set up special transmission channels or data paths. The protocol was designed by David P. Reed in 1980 and formally defined in RFC 768. http://en.wikipedia.org/wiki/User_Datagram_Protocol

36 – 36 – COMP 21000, S’15 Transport Layer 3-36 UDP: User Datagram Protocol [RFC 768] “no frills,” “bare bones” Internet transport protocol “best effort” service, UDP segments may be: lost delivered out of order to appconnectionless: no handshaking between UDP sender, receiver each UDP segment handled independently of others Why is there a UDP? no connection establishment (TCP: 3-way handshake) simple: no connection state at sender, receiver (TCP keeps buffers, sequence no., ack, etc.) small segment header (20 bytes for TCP, 8 for UDP) no congestion control: UDP can blast away as fast as desired With UDP applications basically talk directly to IP

37 – 37 – COMP 21000, S’15 Transport Layer UDP checksum Goal: detect “errors” (e.g., flipped bits) in transmitted segment Result: UDP does error detection not error correction 1. may just discard a segment with errors 2. may pass it to the application with a warning.

38 – 38 – COMP 21000, S’15 Transport Layer UDP checksum Sender: treat segment contents as sequence of 16-bit integers checksum: addition (1’s complement sum) of segment contents sender puts checksum value into UDP checksum field Receiver: compute checksum of received segment check if computed checksum equals checksum field value: NO - error detected YES - no error detected. But maybe errors nonetheless? More later …. Goal: detect “errors” (e.g., flipped bits) in transmitted segment

39 – 39 – COMP 21000, S’15 Transport Layer Internet Checksum Example Note When adding numbers, a carryout from the most significant bit needs to be added to the result sender: add two 16-bit integers 1 1 1 1 0 0 1 1 0 0 1 1 0 0 1 1 0 1 1 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 1 1 1 0 1 1 1 0 1 1 1 0 1 1 1 1 0 0 1 0 1 0 0 0 1 0 0 0 1 0 0 0 0 1 1 wraparound sum Checksum (complement the sum)

40 – 40 – COMP 21000, S’15 Transport Layer Internet Checksum Example Note When adding numbers, a carryout from the most significant bit needs to be added to the result receiver: add two 16-bit integers plus the checksum: should get all 1’s (why?) 1 1 1 1 0 0 1 1 0 0 1 1 0 0 1 1 0 1 1 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 1 1 1 0 0 0 1 0 0 0 1 0 0 0 1 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 wraparound sum Checksum Result For an explanation of 1’s complement addition see http://mathforum.org/library/drmath/view/54379.htmlhttp://mathforum.org/library/drmath/view/54379.html For an explanation of 1’s complement addition see http://mathforum.org/library/drmath/view/54379.htmlhttp://mathforum.org/library/drmath/view/54379.html Do XOR of sum & checksum; should get all 1’s

41 – 41 – COMP 21000, S’15 UDP checksum code (part 1) typedef unsigned short u16; typedef unsigned long u32; u16 udp_sum_calc(u16 len_udp, u16 src_addr[],u16 dest_addr[], BOOL padding, u16 buff[]) { u16 prot_udp=17; u16 padd=0; u16 word16; u32 sum; // Find out if the length of data is even or odd number. If odd, // add a padding byte = 0 at the end of packet if (padding&1==1){ padd=1;buff[len_udp]=0;} //initialize sum to zero sum=0; See http://www.netfor2.com/udpsum.htm for the full code.http://www.netfor2.com/udpsum.htm See http://www.netfor2.com/udpsum.htm for the full code.http://www.netfor2.com/udpsum.htm // make 16 bit words out of every two adjacent 8 bit words and // calculate the sum of all 16 vit words // make 16 bit words out of every two adjacent 8 bit words and // calculate the sum of all 16 vit words

42 – 42 – COMP 21000, S’15 UDP checksum code (part 2) for (i=0; i < len_udp+padd; i=i+2){ word16 =((buff[i]<<8)&0xFF00)+(buff[i+1]&0xFF); sum = sum + (unsigned long)word16; } // add the UDP pseudo header which contains the IP source and destinationn addresses for (i=0;i<4;i=i+2){ word16 =((src_addr[i]<<8)&0xFF00)+(src_addr[i+1]&0xFF); sum=sum+word16;} for (i=0;i<4;i=i+2){ word16 =((dest_addr[i]<<8)&0xFF00)+(dest_addr[i+1]&0xFF); sum=sum+word16;} See http://www.netfor2.com/udpsum.htm for the full code.http://www.netfor2.com/udpsum.htm See http://www.netfor2.com/udpsum.htm for the full code.http://www.netfor2.com/udpsum.htm // make 16 bit words out of every two adjacent 8 bit words and // calculate the sum of all 16 vit words // make 16 bit words out of every two adjacent 8 bit words and // calculate the sum of all 16 vit words // overflow bits are added on the next slide

43 – 43 – COMP 21000, S’15 UDP checksum code (part 3) // the protocol number and the length of the UDP packet sum = sum + prot_udp + len_udp; // keep only the last 16 bits of the 32 bit calculated sum and add the carries while (sum>>16) while (sum>>16) sum = (sum & 0xFFFF)+(sum >> 16); // Take the one's complement of sum sum = ~sum; return ((u16) sum); } See http://www.netfor2.com/udpsum.htm for the full code.http://www.netfor2.com/udpsum.htm See http://www.netfor2.com/udpsum.htm for the full code.http://www.netfor2.com/udpsum.htm /* here’s where the carry’s are added in. Turns out you can save the overflow bits and add them back at the end */ /* here’s where the carry’s are added in. Turns out you can save the overflow bits and add them back at the end */

44 – 44 – COMP 21000, S’15 XOR patent dispute for (y=0; y < y_height; ++y) { int loc = (y+y_off)*stride + x_off; int loc = (y+y_off)*stride + x_off; vidmem[loc] ^= 255; /* XOR */ vidmem[loc] ^= 255; /* XOR */ } // another way of doing this: for (y=0; y < y_height; ++y) { for (y=0; y < y_height; ++y) { int loc = (y+y_off)*stride + x_off; int loc = (y+y_off)*stride + x_off; for (x=0; x < 8; ++x) for (x=0; x < 8; ++x) if (vidmem[loc] & (1 << x)) if (vidmem[loc] & (1 << x)) vidmem[loc] &= ~(1 << x); vidmem[loc] &= ~(1 << x); else else vidmem[loc] |= (1 << x); vidmem[loc] |= (1 << x); } See http://nothings.org/computer/patents.html for a full discussionhttp://nothings.org/computer/patents.html See http://nothings.org/computer/patents.html for a full discussionhttp://nothings.org/computer/patents.html Closely related is a patent which apparently covers results, not process. The exclusive-or- cursor patent is a simple example of this. There are a number of mathematically equivalent ways of expressing the xor algorithm, but it appears to be widely believed that all of them are covered by the patent. The patent apparently is understood to cover the idea of representing an onscreen cursor by complementing each "visible" pixel of the cursor with the contents of whatever is on screen. This is somewhat surprising, since it is predated by hardware implementations that do the exact same thing for text-only modes--rendering a block or underline cursor so its appearance is the same as above. Closely related is a patent which apparently covers results, not process. The exclusive-or- cursor patent is a simple example of this. There are a number of mathematically equivalent ways of expressing the xor algorithm, but it appears to be widely believed that all of them are covered by the patent. The patent apparently is understood to cover the idea of representing an onscreen cursor by complementing each "visible" pixel of the cursor with the contents of whatever is on screen. This is somewhat surprising, since it is predated by hardware implementations that do the exact same thing for text-only modes--rendering a block or underline cursor so its appearance is the same as above.

45 – 45 – COMP 21000, S’15 More Bitvector Magic Count the number of 1’s in a word MIT Hackmem 169: int bitcount(unsigned int n) { unsigned int tmp; tmp = n - ((n >> 1) & 033333333333) - ((n >> 2) & 011111111111); return ((tmp + (tmp >> 3)) & 030707070707)%63; } MIT Hackmem Count. The main idea. 1. Consider a 3 bit unsigned number as being 4a+2b+c, example: consider the three bit number 101 then the unsigned number represents 1 x 2 2 + 0 x 2 1 + 1 x 2 0 = 1 x 4 + 0 x 2 + 1 let “a” represent the leftmost bit (the 1), “b” the next bit (the 0) and c the rightmost bit (the rightmost 1). then we get a x 4 + b x 2 + c or 4a+2b+c 2. If we shift the number right 1 bit, we get 010. If we follow the convention of using “abc” for the bits, then after the shift (logical) we get 0ab or 0 x 2 2 + a x 2 1 + b x 2 0 = 2a+b. 3. Subtracting this shifted number (2a+b) from the original number (4a+2b+c) gives 2a+b+c. 4. If we right-shift the original 3-bit number by two bits, we go from “abc” to “00a” which is just a x 2 0 5. If we subract this new shifted number (a) from the result of 3, we get 2a+b+c – a = a+b+c, 6. Since a, b, and c represent the bits (1, 0, and 1 in our example), a+b+c is just the number of bits in the original number!

46 – 46 – COMP 21000, S’15 More Bitvector Magic Count the number of 1’s in a word MIT Hackmem 169: int bitcount(unsigned int n) { unsigned int tmp; tmp = n - ((n >> 1) & 033333333333) - ((n >> 2) & 011111111111); return ((tmp + (tmp >> 3)) & 030707070707)%63; } MIT Hackmem Count. (continued) How is this insight employed in the code? 1. The key insight is that we’re actually going to consider the parameter n to be composed of 11 groups of 3 (3 x 11 = 33, or one more bit than we actually have in a 32-bit number). 2. So we’re going to do the algorithm from the previous page one each three bit group of bits. 3. Consider the bits 101 111 010 To apply the previous algorithm on each group we want to: a. (101 – 010 – 001) (111 – 011 – 001) (010 – 001 – 000) b. If we shift n we get: 010 111 101 Note that the shifted number in each position has a “wrong” first bit (it’s the shifted bit from the previous group) but the correct last two digits. c. We know that the first bit must be 0 since we’re doing logical shift right d. So we can take the result of shifting and mask out the first bit in each group. e. Since each group can be represented by an octal number, the mask for each group will be 3 in octal (011). This explains the line ((n >> 1) & 033333333333). Note that a number beginning in “0” in C is taken as an octal number. f. A similar explanation can be made for the next part of the expression, i.e., to shift each group twice, shift n right twice and mask with 001 (or octal 1) for each group. Now each octal digit in the number n represents the number of 1’s the original number in that position. In our example, the 9 digits become

47 – 47 – COMP 21000, S’15 More Bitvector Magic Count the number of 1’s in a word MIT Hackmem 169: int bitcount(unsigned int n) { unsigned int tmp; tmp = n - ((n >> 1) & 033333333333) - ((n >> 2) & 011111111111); return ((tmp + (tmp >> 3)) & 030707070707)%63; } MIT Hackmem Count. (continued) The last return statement sums these octal digits to produce the final answer. The key idea is to add adjacent pairs of octal digits together and then compute the remainder modulus 63. 1. Right-shifting tmp by three bits, 2. adding it to tmp itself and 3. ANDing with a suitable mask. This step saves every other group of 3 bits. We want this because we’ve added each 3-bit group with its right neighbor group; if we save every group we will have counted each group twice. 4. The result is a number in which groups of six adjacent bits (starting from the LSB) contain the number of 1's among those six positions in n. 5. This number modulo 63 yields the final answer. For 64-bit numbers, we would have to add triples of octal digits and use modulus 1023. This is HACKMEM 169, as used in X11 sources. Source: MIT AI Lab memo, late 1970's.

48 – 48 – COMP 21000, S’15 Summary of the Main Points It’s All About Bits & Bytes Numbers Programs Text Different Machines Follow Different Conventions for Word size Byte ordering Representations Boolean Algebra is the Mathematical Basis Basic form encodes “false” as 0, “true” as 1 General form like bit-level operations in C Good for representing & manipulating sets


Download ppt "Bits and Bytes Spring, 2015 Topics Why bits? Representing information as bits Binary / Hexadecimal Byte representations »Numbers »Characters and strings."

Similar presentations


Ads by Google