Bits and Bytes
Chapter 1
Bits

Bit = Binary Digit = {0, 1}

Assume you wish to send a message using a light switch. This is a binary condition, since the light switch can be either:
    Off (0) OR On (1)

Any binary condition can be represented with a single light switch:
    Good OR Bad
    Yes OR No
    Male OR Female
    Dead OR Alive
But what if there are more than two states? What if I want to represent the conditions GOOD, SO-SO, and BAD?

Simple: add more light switches.

If there are 2 light switches, the total combinations are:

    Combination   Switch 1   Switch 2   Bits   Interpreted as
    #1            Off        Off        00     Bad
    #2            Off        On         01     So-So
    #3            On         Off        10     Good
    #4            On         On         11     Not Used
If I can transmit 4 messages with two bits, how many could I transmit if I had 3 bits? Or 4 bits?

With 3 bits, there are 8 possible combinations:

    000   001   010   011
    100   101   110   111

And, with 4 bits, there are 16 possible combinations:

    0000   0001   0010   0011
    0100   0101   0110   0111
    1000   1001   1010   1011
    1100   1101   1110   1111
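The combinations listed above can also be generated mechanically. The following is a minimal Python sketch; the function name `bit_patterns` is made up for illustration and is not part of the course material.

```python
from itertools import product

def bit_patterns(n):
    """Return every n-bit pattern as a string, in counting order."""
    return ["".join(bits) for bits in product("01", repeat=n)]

for n in (2, 3, 4):
    patterns = bit_patterns(n)
    # The number of patterns doubles with every extra bit.
    print(n, len(patterns), " ".join(patterns))
```

Listing the patterns this way makes it easy to confirm that none is repeated and none is missing.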
Is there any way to know how many messages we could transmit for a given number of bits without having to test all possible combinations?

Yes. In decimal (base 10), it is possible to determine how many messages can be transmitted for any number of decimal places. In binary (base 2), the same calculation is made, but using bits (instead of decimal places).

    Decimal places   Number of messages           Bits   Number of messages
    0                $10^{0}$ = 1                 0      $2^{0}$ = 1
    1                $10^{1}$ = 10                1      $2^{1}$ = 2
    2                $10^{2}$ = 100               2      $2^{2}$ = 4
    3                $10^{3}$ = 1,000             3      $2^{3}$ = 8
    4                $10^{4}$ = 10,000            4      $2^{4}$ = 16
    5                $10^{5}$ = 100,000           5      $2^{5}$ = 32
    6                $10^{6}$ = 1,000,000         6      $2^{6}$ = 64
    7                $10^{7}$ = 10,000,000        7      $2^{7}$ = 128
    8                $10^{8}$ = 100,000,000       8      $2^{8}$ = 256
    9                $10^{9}$ = 1,000,000,000     9      $2^{9}$ = 512
    10               $10^{10}$ = 10,000,000,000   10     $2^{10}$ = 1,024
The general formula is:

    $I = B^n$

where:
    I = the amount of information (messages) available
    B = the base we are working in (decimal or binary)
    n = the number of digits (decimal places or bits) we have

Applying the formula to both decimal and binary values:

    $10^{0}$ = 1                 $2^{0}$ = 1
    $10^{1}$ = 10                $2^{1}$ = 2
    $10^{2}$ = 100               $2^{2}$ = 4
    $10^{3}$ = 1,000             $2^{3}$ = 8
    $10^{4}$ = 10,000            $2^{4}$ = 16
    $10^{5}$ = 100,000           $2^{5}$ = 32
    $10^{6}$ = 1,000,000         $2^{6}$ = 64
    $10^{7}$ = 10,000,000        $2^{7}$ = 128
    $10^{8}$ = 100,000,000       $2^{8}$ = 256
    $10^{9}$ = 1,000,000,000     $2^{9}$ = 512
    $10^{10}$ = 10,000,000,000   $2^{10}$ = 1,024
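The formula translates directly into code. This is a small Python sketch; the function name `messages` is an illustrative choice, not something defined in the slides.

```python
def messages(base, digits):
    """I = B**n: how many distinct messages n digits in base B can carry."""
    return base ** digits

# Print the decimal and binary columns side by side for n = 0 .. 10.
for n in range(11):
    print(f"{n:2d}  {messages(10, n):>14,}  {messages(2, n):>5,}")
```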
What if I know how much information (I = number of messages) I want to transmit? How do I determine the number of bits I need?

Just reverse the process (all logarithms are base 10).

    Decimal:                                          Binary:
    If   $I = 10^n$                                   $I = 2^n$
    then $\log(I) = n \log(10)$                       $\log(I) = n \log(2)$
    and  $n = \log(I)/\log(10) = \log(I)/1.000$       $n = \log(I)/\log(2) = \log(I)/0.30103$
           $= \log(I)$                                since $10^{0.30103} = 2$

    Information (I)   Decimals needed        Bits needed
    10                log(10) = 1.000        log(10)/log(2) = 1.000/.30103 = 3.32
    50                log(50) = 1.699        log(50)/log(2) = 1.699/.30103 = 5.64
    100               log(100) = 2.000       log(100)/log(2) = 2.000/.30103 = 6.64
    500               log(500) = 2.699       log(500)/log(2) = 2.699/.30103 = 8.97
    1,000             log(1000) = 3.000      log(1000)/log(2) = 3.000/.30103 = 9.97
    10,000            log(10000) = 4.000     log(10000)/log(2) = 4.000/.30103 = 13.29
How can we have partial bits (or decimals)? For example, how can we have 5.64 bits to represent 50 messages?

We can't.

The formula given should have been:

    $n = \lceil \log(I)/\log(2) \rceil = \lceil \log(I)/0.30103 \rceil$

where $\lceil x \rceil$ is the ceiling of the result (i.e., rounded up to the next whole number).

And the number of bits needed would be:

    Messages   Bits needed
    10         log(10)/log(2) = 1.000/.30103 = 3.32, rounded up = 4
    50         log(50)/log(2) = 1.699/.30103 = 5.64, rounded up = 6
    100        log(100)/log(2) = 2.000/.30103 = 6.64, rounded up = 7
    500        log(500)/log(2) = 2.699/.30103 = 8.97, rounded up = 9
    1,000      log(1000)/log(2) = 3.000/.30103 = 9.97, rounded up = 10
    10,000     log(10000)/log(2) = 4.000/.30103 = 13.29, rounded up = 14
Notice that we could have predicted that, for example, it would take 6 bits to represent 50 pieces of information since:

    $2^5 = 32$ and $2^6 = 64$

If we need 6 bits to represent 50 pieces of information, and 6 bits could represent 64 pieces of information, what happens to the remaining 14 (64 - 50) combinations?

They either remain unused, or are available for future use.
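The rounding-up rule and the leftover combinations can be checked with a short Python sketch. The function names `bits_needed` and `unused_codes` are illustrative assumptions. The sketch uses integer arithmetic (`int.bit_length`) rather than floating-point logarithms, which avoids rounding surprises at exact powers of two.

```python
import math

def bits_needed(messages):
    """Smallest n such that 2**n >= messages (messages must be at least 1)."""
    if messages < 1:
        raise ValueError("need at least one message")
    return (messages - 1).bit_length()

def unused_codes(messages):
    """Combinations left over once every message has its own bit pattern."""
    return 2 ** bits_needed(messages) - messages

for i in (10, 50, 100, 500, 1000, 10000):
    exact = math.log10(i) / math.log10(2)   # the unrounded value from the formula
    print(f"{i:6d}  {exact:5.2f}  {bits_needed(i):2d} bits  {unused_codes(i)} unused")
```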
What does this have to do with computers?

If we were to look inside a computer (especially earlier ones) we might see a series of 'doughnuts', which were merely metal rings with wires running through them.

Depending on whether there was voltage running through them or not (actually, high voltage or low voltage), the series represented a sequence of messages.

A BINARY SITUATION!
Notice that since there are 5 'doughnuts', there are $2^5$, or 32, combinations.

[Figure: rows of 5 'doughnuts', drawn with two different symbols to represent the two voltage states]
How many bits (or 'doughnuts') do we really need?

Good question! What symbols/information do we wish to convey?

    Symbols                                                 Pieces of information
    The digits (0, …, 9)                                    10
    The lowercase alphabet (a, …, z)                        26
    The uppercase alphabet (A, …, Z)                        26
    Special characters (! + ( ) . ? / * - % & # =, etc.)    32 (?)
    Total                                                   94

Since:

    $n = \log(I)/\log(2) = \log(94)/\log(2) = 1.973/0.301 = 6.55$

we need 7 bits, which we could have predicted since:

    $2^6 = 64$ and $2^7 = 128$
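The tally above can be reproduced with Python's standard `string` module. This is only a sketch: it assumes that `string.punctuation` is an acceptable stand-in for the slide's "special characters" (it happens to contain 32 symbols).

```python
import string

# Character groups from the table; string.punctuation stands in for the
# "special characters" row (an assumption, not the course's exact list).
groups = {
    "digits": string.digits,
    "lowercase": string.ascii_lowercase,
    "uppercase": string.ascii_uppercase,
    "special": string.punctuation,
}

total = sum(len(chars) for chars in groups.values())
bits = (total - 1).bit_length()       # smallest n with 2**n >= total
spare = 2 ** bits - total             # combinations not yet assigned

for name, chars in groups.items():
    print(f"{name:10s} {len(chars):3d}")
print(f"total {total}, bits {bits}, spare {spare}")
```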
What about the remaining 34 (128 - 94) combinations?

There are a number of additional special characters and a number of 'hidden' characters which we didn't account for:
    Carriage Return (CR)
    Back Space (BS)
    End of File (EOF)
    etc.

So the additional combinations will be used.

Are 7 bits normally used to represent a character set?

Yes. The standard coding scheme consists of 128 characters.
LIAR!! LIAR!! PANTS ON FIRE!!!

Doesn't a byte represent a character? And isn't a byte equal to 8 bits, not 7?

Yes, sort of.
    1 byte = 8 bits
    A byte is used to represent a character
    A byte is the basic addressable unit in RAM

BUT, the standard character set still contains only 128 characters, which requires 7 bits.

Then why does a byte contain 8 bits?
There are a few reasons. Primarily, however, it is because earlier machines suffered some reliability problems (remember what the term debugging really means) [1]:

    There were problems with storage and data transmission

One additional bit was added to help detect errors:

    The Parity Bit

How does adding one additional bit help detect errors?

[1] In the days of vacuum tubes, bugs were attracted to the heat given off by the tubes. Programmers frequently spent much of their time scraping dead bugs off the circuitry, or 'de-bugging'.
Assume that we wished to send the series of bits:

    1001100

But, because of transmission errors, the message actually received was:

    1001101

How can we tell that an error was made? How do we know that the sequence 1001101 was not the true message?

As it stands now, we can't.
If we were to send a transmission using an extra bit:

    1001100 1    (the final bit is the parity bit)

we could determine if the message was correctly transmitted by counting the total number of on bits (1s).

E.g.: If the total number of on bits is an EVEN number, the message was correctly transmitted.

Since the message sent contains 4 on bits (an even number), the message sent was correct.
IF, however, we received the message:

    1001101 1    (the final bit is the parity bit)

we know it is incorrect because the message contains 5 (an odd number) on bits.

Other examples using EVEN parity:

    Message sent   Message received   No. of on bits   Judged
    1101101 1      1101101 1          6 (Even)         Correct
    0001100 0      0101100 0          3 (Odd)          Incorrect
    1101011 1      1001011 1          5 (Odd)          Incorrect
    0101110 0      1010110 0          4 (Even)         Correct
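The even-parity rule can be sketched in a few lines of Python. The helper names `even_parity_bit` and `passes_even_parity` are made up for this illustration, and messages are kept as strings of '0' and '1' purely for readability.

```python
def even_parity_bit(data):
    """Parity bit (0 or 1) that makes the total number of on bits even."""
    return data.count("1") % 2

def passes_even_parity(data, parity):
    """True if the data bits plus the parity bit contain an even number of 1s."""
    return (data.count("1") + parity) % 2 == 0

sent = "1001100"
parity = even_parity_bit(sent)            # 3 on bits, so the parity bit is 1
print(sent, parity)

for received in ("1001100", "1001101"):
    verdict = "Correct" if passes_even_parity(received, parity) else "Incorrect"
    print(received, parity, verdict)
```

The receiver only needs the count of on bits, so the check is cheap enough to run on every character received.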
What gives? The last message:

    Message sent   Message received   No. of on bits   Judged
    0101110 0      1010110 0          4 (Even)         Correct

was NOT correct, even though the total number of on bits received was even???

Yes. Four bits were flipped in transit, and a parity check can only catch an odd number of flipped bits. The system is NOT perfect, but if there are thousands or millions of messages sent, it is highly unlikely that mistakes will not be caught.

All it takes is one incorrect message.
Must parity always be even?

No, it can be ODD (or there can be NO parity). That decision is made by the software designer.

If we look at our previous examples using ODD parity:

    Message sent   Message received   No. of on bits   Judged
    1101101 0      1101101 0          5 (Odd)          Correct
    0001100 1      0101100 1          4 (Even)         Incorrect
    1101011 0      1001011 0          4 (Even)         Incorrect
    0101110 1      1010110 1          5 (Odd)          Correct

Notice that errors can still go undetected.
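A short Python sketch shows why the last example slips through under either parity scheme. The names `parity_bit`, `passes_parity`, and `flipped_bits` are illustrative assumptions, not part of any standard library.

```python
def parity_bit(data, mode="even"):
    """Parity bit that makes the total number of on bits even (or odd)."""
    bit = data.count("1") % 2
    return bit if mode == "even" else 1 - bit

def passes_parity(data, parity, mode="even"):
    """True if the received bits are consistent with the chosen parity mode."""
    total = data.count("1") + parity
    return total % 2 == (0 if mode == "even" else 1)

def flipped_bits(sent, received):
    """Number of positions in which the two messages differ."""
    return sum(a != b for a, b in zip(sent, received))

pairs = [("1101101", "1101101"), ("0001100", "0101100"),
         ("1101011", "1001011"), ("0101110", "1010110")]

for mode in ("even", "odd"):
    for sent, received in pairs:
        p = parity_bit(sent, mode)
        print(mode, sent, received, flipped_bits(sent, received),
              "passes" if passes_parity(received, p, mode) else "fails")
```

Whenever an even number of bits is flipped, the count of on bits keeps the same parity, so the check passes even though the message is wrong.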
There was one other problem with bytes: Compatibility

Given the binary sequences:

    Sequence   Manufact. #1   Manufact. #2   Manufact. #3
    0000000    A              0              +
    0000001    B              1              -
    0000010    C              2              *
    0000011    D              3              ?
    …
    1111100    6              v              TAB
    1111101    7              x              CR
    1111110    8              y              LF
    1111111    9              z              FF

Manufacturers interpreted them differently.
Which is the correct interpretation???

Each is equally correct:
    0000010 could be either a 'C' OR a '2'
    The letter 'C' could be pronounced either 'cee' OR 'ess'

What's the solution???

ASCII: the American Standard Code for Information Interchange
Sample ASCII codes:

    Binary sequence   Value   Character   Description
    0000000           0       NULL        NULL/Tape feed
    0000111           7       BEL         Rings Bell
    0001000           8       BS          Back Space
    0001101           13      CR          Carriage Return
    0011011           27      ESC         Escape
    0100000           32      SP          Space
    0110000           48      0           Zero
    0110001           49      1           One
    1000001           65      A           Capital 'A'
    1000010           66      B           Capital 'B'
    1100001           97      a           Lower Case 'a'
    1100010           98      b           Lower Case 'b'
A preview of things to come:

For the first exam, memorize the numeric values for:

    NULL                       Value: 0
    BEL (Ring the Bell)        Value: 7
    BS (Backspace)             Value: 8
    CR (Carriage Return)       Value: 13
    ESC (Escape)               Value: 27
    SP (Space)                 Value: 32
    The digits (0, 1, …, 9)    NOTE: The digit 0 (zero) has the value 48
    The uppercase alphabet     NOTE: The character 'A' has the value 65
    The lowercase alphabet     NOTE: The character 'a' has the value 97
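These values are easy to check interactively. The Python sketch below uses the built-in `ord` and `chr` functions; the helper name `ascii_row` is an illustrative assumption.

```python
def ascii_row(ch):
    """Return (numeric value, 7-bit binary string) for one ASCII character."""
    code = ord(ch)
    return code, format(code, "07b")

# NULL, BEL, BS, CR, ESC, SP, then the first digit and letters.
for ch in ("\0", "\a", "\b", "\r", "\x1b", " ", "0", "A", "a"):
    code, bits = ascii_row(ch)
    print(f"{code:3d}  {bits}  {ch!r}")

# Only the starting points need memorizing; the rest follow by counting.
print(ord("7") - ord("0"))    # numeric value of the digit character '7'
print(chr(ord("a") - 32))     # uppercase and lowercase letters differ by 32
```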
Are we limited to only 128 (= $2^7$) characters??

Yes and no:

    The STANDARD ASCII character set consists of 128 characters (as given in Addendum 1.1)

    There is an EXTENDED ASCII character set which uses ALL 8 bits (1 byte) available (parity is NOT an issue)

    The extended ASCII character set consists of 256 (= $2^8$) characters (see Addendum 1.2)

    The majority of the characters included in the extended ASCII character set are extensions of the Greco-Roman alphabet (e.g., ß, Ü, å) or 'graphics' characters (e.g., line- and box-drawing symbols)
What does the term 'ASCII file' mean??

An ASCII file assumes that every 8 bits (1 byte) in the file are grouped together and interpreted according to the ASCII tables.

Aren't ALL files ASCII files??

NO. As we will see later, not all data is stored according to ASCII formats.

That helps (sort of) to explain why, when we display non-ASCII files, we sometimes get strange-looking symbols and graphics characters on the screen.
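The difference between ASCII and non-ASCII data can be sketched in Python. The helper `is_ascii` is a made-up name, and the 4-byte little-endian integer is just one assumed example of data that is not stored as text.

```python
import struct

def is_ascii(data):
    """True if every byte falls in the standard 7-bit ASCII range (0..127)."""
    return all(byte < 128 for byte in data)

text = "Bits and Bytes".encode("ascii")   # one byte per character
number = struct.pack("<i", 1000)          # the integer 1000 stored in binary

print(is_ascii(text))      # True
print(is_ascii(number))    # False: the byte 0xE8 is outside the ASCII range

# Forcing the binary bytes through a character table yields odd symbols.
print(repr(number.decode("latin-1")))
```

The same four bytes are a perfectly good integer; they only look like garbage when a program insists on treating them as characters.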
Do ALL computers use ASCII to represent symbols???

NO, although most do.

IBM had the first coding scheme: its punched-card codes date back to the 1880s, and its later character code for computers is

    EBCDIC: Extended Binary Coded Decimal Interchange Code

EBCDIC is still used (?) in IBM mainframes and to store data on large reel-to-reel tape drives.
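Python ships with codecs for several EBCDIC code pages, which makes the difference from ASCII easy to see. The sketch below assumes code page `cp037` (one common EBCDIC variant); other EBCDIC code pages assign some characters differently. The function name `codes` is illustrative.

```python
def codes(ch):
    """Return the (ASCII, EBCDIC cp037) numeric codes for one character."""
    return ch.encode("ascii")[0], ch.encode("cp037")[0]

for ch in ("A", "a", "0", " "):
    ascii_code, ebcdic_code = codes(ch)
    print(f"{ch!r}  ASCII {ascii_code:3d} ({ascii_code:08b})  "
          f"EBCDIC {ebcdic_code:3d} ({ebcdic_code:08b})")
```

The same character gets a different bit pattern in each scheme, which is exactly the compatibility problem a shared standard is meant to solve.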
And so that's it?? There is only ASCII and EBCDIC??

Well, … no.

It became obvious that even the extended ASCII and EBCDIC character sets were insufficient.

How so, Kemosabe??

Suppose you wanted to represent ALL the characters used by ALL the languages in the world. How many are there????

I don't know, how many??

I don't know, either. But it's a lot!!!
Enter Unicode (version 1.0 published in 1991):

If we were to use 16 bits, instead of 8, to represent characters we could represent:

    $2^{16}$ = 65,536 characters

AHA!! So everyone is using Unicode now, right??

Well, … no.

Well, why not??

Life is not so simple …
There are a lot of problems still to be worked out:

    There is a lot of disagreement about what should be included (even though there are 65,536 combinations, you would be surprised at how quickly those combinations can be used up)

    The large number of characters in this set poses a severe problem for a font vendor (no fonts, no characters)

    By doubling the number of bits (or bytes), we are doubling the storage and processing requirements

Result: It will take years to get this straightened out.
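The storage cost of moving from 8 to 16 bits per character can be sketched in Python. The example assumes the `utf-16-le` codec as a stand-in for the fixed 16-bit scheme described above (it uses two bytes for each of the characters shown here); the function name `storage_bytes` is illustrative.

```python
def storage_bytes(text, encoding):
    """Number of bytes needed to store the text in the given encoding."""
    return len(text.encode(encoding))

text = "Bits and Bytes"
print(2 ** 16)                                # combinations available with 16 bits
print(storage_bytes(text, "ascii"))           # one byte per character
print(storage_bytes(text, "utf-16-le"))       # two bytes per character

# Characters beyond the 128 ASCII codes still get a single numeric value.
for ch in ("A", "ß", "€"):
    print(ch, ord(ch), format(ord(ch), "016b"))
```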
SO, what have we learned????

    What a bit is
    How a bit corresponds to computer architecture
    How combinations of bits can be used to store information
    How to calculate how much information a given number of bits yields
    How to calculate how many bits we need to store information
    What a byte is and why it is 8 bits
    What parity is and why it is/was necessary
    What ASCII is and why it was developed
    What EBCDIC is
    What Unicode is and why it was developed
    … And many other things in between …

Do I have to know this stuff??

Of course not!! I just like to waste my time and yours!!
So what do we need to do??

    Make sure you THOROUGHLY understand ALL of the concepts covered in these slides
    Answer ALL of the relevant questions on the Review Page
    Memorize the assigned ASCII codes
    Submit your References
    Submit your Question(s)
    Look at the Bits/Bytes/ASCII C/C++ Programming Assignment (it's not due yet, but it can't hurt to look at it)

??? Any Questions ??? (Please!!)