A few more things about binary numbers CSIT 301 (Blum)
Errors
Recall that using a binary representation maximizes our tolerance of fluctuations without loss of the information represented. Still, errors occur. They fall into two categories:
- Signals that cannot be interpreted as either a 1 or a 0
- Flipped bits: interpreting a 1 where a 0 should be, and vice versa

Error Detection and Correction
One may worry about errors wherever data is held temporarily, stored semi-permanently, or transmitted.
- The discovery that a bit or bits were flipped is known as error detection.
- The restoration of the correct data is known as error correction.

Error-checking protocols
Data being transmitted by a modem is chunked into "blocks" of a certain byte size and sent to the destination modem. The destination modem checks each block for errors and returns
- ACK if no errors are found
- NAK if errors are found, which leads to a retransmission
The kind of checking (checksum or cyclic redundancy checking) varies from protocol to protocol.

BER
The bit error rate (BER) is the ratio of bits having errors (the opposite of the value they are supposed to have: 0→1 or 1→0) to the total number of bits sent. It is often expressed as ten to a negative power. For example, a BER of 10^-6 means that on average, for every million bits sent, one error occurs.

BER (Cont.)
The BER indicates how often a packet has to be resent. Sometimes increasing the data transmission rate also increases the BER, making it actually beneficial to reduce the transmission rate so that fewer packets need to be resent. A BERT (bit error rate test or tester) is a procedure or device that measures the BER.

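The link between the BER and how often a packet has to be resent can be sketched numerically. The following is a minimal sketch, assuming bit errors are independent of one another (a simplification, since real errors often come in bursts); the function names and the 12,000-bit packet size are illustrative choices, not part of any protocol.

```python
def packet_error_probability(ber: float, packet_bits: int) -> float:
    """Chance that at least one bit in a packet is flipped.

    Assumes each bit is flipped independently with probability `ber`.
    """
    return 1.0 - (1.0 - ber) ** packet_bits


def expected_transmissions(ber: float, packet_bits: int) -> float:
    """Average number of sends until a packet arrives with no flipped bits."""
    return 1.0 / (1.0 - packet_error_probability(ber, packet_bits))


if __name__ == "__main__":
    # Hypothetical 12,000-bit packet at two different bit error rates.
    for ber in (1e-6, 1e-4):
        p = packet_error_probability(ber, 12_000)
        n = expected_transmissions(ber, 12_000)
        print(f"BER={ber:g}: packet error probability={p:.4f}, average sends={n:.3f}")
```

Under this assumption, a higher BER raises the resend rate quickly for long packets, which is why lowering the transmission rate can sometimes pay off.
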
Error Detection (Cont.)
Error detection schemes involve transmitting additional information which the receiver can use to validate the data. Error detection can inform the receiver of the presence of an error but cannot fix the error, so the packet must be resent. Error correction schemes (which are distinct) attempt to pinpoint the flipped bit.

Parity
Parity is a simple error detection scheme. One decides ahead of time on the type of parity to be used (even or odd). Let us assume even. Then one takes a string of binary data and calculates whether the number of 1’s in the string is even or odd.

Parity (Cont.)
The transmitter includes an extra bit:
- 0 if the number of data-bit 1’s is even
- 1 if the number of data-bit 1’s is odd
The string consisting of the data plus parity bit is now guaranteed to have an even number of 1’s. When the data is going to be used (or received, in the case of transmission), one checks that the data plus parity bit has even parity (an even number of 1’s). If it now has odd parity, some error must have occurred. But we do not know which bit was flipped (detection, not correction).

Even Parity (Cont.)
[Table: data bits (D) and the corresponding even-parity bit (P).]

Parity (Cont.)
(You may read that the number of 1’s is counted. That is not what the hardware does; counting would be unnecessarily complicated.) A multi-input XOR (exclusive OR) gate determines the parity bit; this circuitry is much simpler than what is required for counting the number of 1’s.

2-input XOR Truth Table
A B Output
0 0 0
0 1 1
1 0 1
1 1 0

3-input XOR Truth Table
A B C Output
0 0 0 0
0 0 1 1
0 1 0 1
0 1 1 0
1 0 0 1
1 0 1 0
1 1 0 0
1 1 1 1

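The multi-input XOR described above can be imitated in software by XOR-ing the bits together one after another. This is a minimal sketch of even parity; the function names and the sample data are illustrative assumptions.

```python
from functools import reduce
from operator import xor


def even_parity_bit(data_bits):
    """XOR of all data bits: 0 if the count of 1's is even, 1 if it is odd."""
    return reduce(xor, data_bits, 0)


def has_even_parity(bits):
    """True when data plus parity bit contain an even number of 1's."""
    return reduce(xor, bits, 0) == 0


if __name__ == "__main__":
    data = [1, 0, 1, 1, 0, 0, 1]            # four 1's, so the parity bit is 0
    word = data + [even_parity_bit(data)]   # what the transmitter sends
    print(has_even_parity(word))            # no error: parity check passes

    one_flip = word.copy()
    one_flip[2] ^= 1                        # a single flipped bit is detected
    print(has_even_parity(one_flip))

    two_flips = one_flip.copy()
    two_flips[5] ^= 1                       # a second flip hides the error
    print(has_even_parity(two_flips))
```

The last case shows the limit of the scheme: an even number of flipped bits leaves the parity unchanged and goes unnoticed.
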
Checksum
A checksum is another error-detection scheme often used in transmission.
- The transmission string of data bits is broken up into units, for instance consisting of 16 bits each.
- These 16-bit numbers are then added.
- The sum is sent along as part of the frame’s trailer field.
- The receiver repeats the calculation. If the sums match, it is assumed that no transmission error occurred.

Checksum (Cont.)
Actually, instead of including the sum in the trailer, one can transmit its negative. The receiver then repeats the sum over the data and includes in that summation the negative of the sum sent in the trailer. The result should be zero. It is easier to test for a result of all zeros (a multi-input OR gate).

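The negated-sum idea can be sketched as follows. This is a minimal sketch, assuming 16-bit units and two's-complement negation with the sum truncated to 16 bits; the function names and sample words are illustrative. Real protocols such as IP and TCP use a one's-complement variant of the same idea.

```python
MASK = 0xFFFF  # keep every result within 16 bits


def checksum16(words):
    """Sum of the 16-bit units, truncated to 16 bits."""
    return sum(words) & MASK


def negated_checksum16(words):
    """Two's-complement negative of the sum, as sent in the trailer."""
    return (-checksum16(words)) & MASK


def verify(words, trailer):
    """Receiver side: data plus trailer should add up to all zeros."""
    return (sum(words) + trailer) & MASK == 0


if __name__ == "__main__":
    words = [0x1234, 0xABCD, 0x0042]
    trailer = negated_checksum16(words)
    print(hex(trailer), verify(words, trailer))

    damaged = [0x1234, 0xABCD ^ 0x0100, 0x0042]  # one flipped bit in transit
    print(verify(damaged, trailer))
```
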
Catching errors
Since the 16-bit string might end up in any of 2^16 (65,536) possible states, you might assume that the checksum would catch errors 65535/65536 ≈ 99.9985% of the time. But this would assume that all possible errors are equally likely, which is not true.

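Two kinds of corruption that a plain sum cannot see illustrate why the 99.9985% figure is optimistic. This is a minimal sketch, assuming a simple 16-bit truncated sum; the sample words are arbitrary.

```python
def checksum16(words):
    """Sum of the 16-bit units, truncated to 16 bits."""
    return sum(words) & 0xFFFF


original = [0x1234, 0xABCD, 0x0042]

# Two units arrive in the wrong order: addition does not care about order.
swapped = [0xABCD, 0x1234, 0x0042]

# The same bit position flips 1->0 in one unit and 0->1 in another,
# so the two changes cancel in the sum.
compensating = [0x1234 ^ 0x0010, 0xABCD ^ 0x0010, 0x0042]

if __name__ == "__main__":
    for label, words in (("original", original),
                         ("swapped", swapped),
                         ("compensating", compensating)):
        print(label, hex(checksum16(words)))
```

All three lists produce the same checksum even though two of them differ from the original data.
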
IP Datagram Protocol

TCP Segment Protocol

Burst errors
For multi-bit errors, it is more likely that a group of consecutive bits is affected than that randomly selected bits are. Such an error is known as a burst error. It is also common for periodic errors to occur (e.g. the first two bits in every byte).

CRC
A Cyclic Redundancy Check (CRC) is better at catching burst errors than a checksum. The idea is the same: perform some mathematical operation on the data, send the result, have the receiver do the same calculation, and check that the same answer is obtained.

CRC (Cont.)
CRC is somewhat like division.
- If one thinks of the data as a large number, one can divide it by another number N, giving a whole-number answer and a remainder.
- The remainder could be any number between 0 and N-1.
- The problem here is that division is a fairly difficult computation (more circuitry and a slow process).

CRC (Cont.)
CRC uses a variation on division that, while mathematically abstract, is very simple to build a circuit for. The circuitry needs only a shift register and some (2-input) XOR gates.

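The "variation on division" can be sketched in software as long division in which every subtraction is replaced by XOR, which is the calculation the shift-register circuit carries out. This is a minimal sketch; the generator 1011 and the data string are hypothetical choices for illustration, not the values used by any particular protocol.

```python
def mod2_remainder(bits, generator):
    """Remainder of `bits` divided by `generator`, using XOR for subtraction.

    Both arguments are lists of 0/1 with the most significant bit first.
    The generator must start with a 1.
    """
    work = list(bits)
    k = len(generator) - 1              # number of remainder bits
    for i in range(len(work) - k):
        if work[i] == 1:                # leading bit set: "subtract" the generator
            for j, g in enumerate(generator):
                work[i + j] ^= g
    return work[-k:]


def crc(data, generator):
    """Check bits for `data`: remainder after appending k zero bits."""
    k = len(generator) - 1
    return mod2_remainder(data + [0] * k, generator)


def check(codeword, generator):
    """Receiver side: data followed by its CRC should leave a zero remainder."""
    return not any(mod2_remainder(codeword, generator))


if __name__ == "__main__":
    generator = [1, 0, 1, 1]                                  # hypothetical generator
    data = [1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0]         # hypothetical data
    codeword = data + crc(data, generator)
    print(crc(data, generator), check(codeword, generator))

    burst = codeword.copy()
    for i in (4, 5, 6):                                       # three consecutive flips
        burst[i] ^= 1
    print(check(burst, generator))
```

With a generator of k+1 bits, any burst of k or fewer consecutive flipped bits leaves a nonzero remainder, which is the burst-detection property the slides refer to.
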
Shift Register
A register is a small piece of memory that holds values. In addition to holding values, a shift register performs a simple operation on the values: it moves them to the left or to the right.

Shift Register
[Figure: a shift register over time, with its input and output labeled.]

How are shift registers used?
- Multiplication
- Adding floats
- Converting parallel data (the form inside the computer) to serial (the form sent over transmission lines)
- Cyclic Redundancy Check (CRC)

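A shift register and its use for parallel-to-serial conversion can be modeled with a list. This is a minimal sketch, assuming a left shift in which the leftmost bit leaves the register; the function names are illustrative.

```python
def shift_left(register, incoming_bit):
    """One clock step: every bit moves one place left.

    Returns the new register contents and the bit that was pushed out.
    """
    outgoing = register[0]
    return register[1:] + [incoming_bit], outgoing


def serialize(parallel_bits):
    """Load all bits at once, then shift them out one per step."""
    register = list(parallel_bits)
    serial = []
    for _ in range(len(register)):
        register, bit = shift_left(register, 0)
        serial.append(bit)
    return serial


if __name__ == "__main__":
    print(shift_left([1, 0, 1, 1], 0))   # ([0, 1, 1, 0], 1)
    print(serialize([1, 0, 1, 1]))       # bits leave in order, leftmost first
```
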
CRC (Cont.)
Again, the 16-bit string might end up in any of 2^16 (65,536) possible states, so you might expect that CRC would catch errors 65535/65536 ≈ 99.9985% of the time, the same as a 16-bit checksum. But CRC is better at detecting burst errors, which are more likely than purely random errors. The positions of the XORs are important in determining what kinds of burst errors are detected.

CRC: Transmission Only
The data must be serialized for a Cyclic Redundancy Check. This is fine for transmission error checking, since the data was serialized for transmission. However, serializing the data would waste a lot of time if the data were in a parallel form (as it is inside the computer).

CRC = Shift register + XORs
Basically one has a shift register with a few exclusive OR gates inserted in strategic positions.

Ethernet Frame Protocol
Ethernet uses CRC because at that low level the data is serialized. The higher levels in the stack (e.g. IP and TCP) use checksums.

XOR Truth Table Reminder in CRC Context
Leftmost Bit | Result
0 | same as newer bit
1 | opposite of newer bit

[Figure sequence: step-by-step CRC shift-register example, feeding in the bit string 11000001010 one bit at a time.]

Hamming Code
Hamming code extends the idea of parity. It can be used as either
- An extended error detection scheme: parity will discover if an odd number of bits have been flipped; Hamming code can be used to detect two-bit errors.
- Or an error correction scheme: if a single-bit error is assumed, Hamming code can locate the offending bit.

Hamming code (Cont.)
Hamming code breaks the bit string into a few overlapping groups.
- By overlapping here we mean that a given bit can belong to more than one group.
- But each bit should belong to a unique set of groups. That is how the bit is located.
One generates a parity bit for each group. On checking for errors, one identifies which groups violate parity. From these one can locate the bit in error.

Hamming code (Cont.)
The data and parity bits are numbered such that the parity bits correspond to numbers which are powers of 2: 1, 2, 4, 8, etc. Recall that in binary the powers of 2 consist of one 1 and the rest 0’s.
1: 0001
2: 0010
4: 0100
8: 1000

Hamming code (Cont.)
The groups:
- Any bit whose count in binary has a 1 in the 1’s position belongs to the first group.
- Any bit whose count in binary has a 1 in the 2’s position belongs to the second group.
- Any bit whose count in binary has a 1 in the 4’s position belongs to the third group.
- Etc.

Groups for Hamming code
[Figure labels: Parity bits, Data bits]

Groups for Hamming code

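The group membership rule can be computed directly from the binary form of each bit's position number. This is a minimal sketch; the function names are illustrative, and positions are numbered from 1 as in the slides.

```python
def check_groups(position):
    """Groups (numbered from 1) that the bit at `position` belongs to."""
    return [g + 1 for g in range(position.bit_length()) if (position >> g) & 1]


def group_members(group, total_bits):
    """All positions from 1 to `total_bits` that belong to `group`."""
    weight = 1 << (group - 1)      # group 1 -> 1's place, group 2 -> 2's place, ...
    return [p for p in range(1, total_bits + 1) if p & weight]


if __name__ == "__main__":
    for group in (1, 2, 3):
        print(f"group {group}:", group_members(group, 7))
    print("position 5 is in groups", check_groups(5))
```

For seven bits this gives group 1 = positions 1, 3, 5, 7; group 2 = positions 2, 3, 6, 7; group 3 = positions 4, 5, 6, 7. Each parity position (1, 2, 4) appears in exactly one group.
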
Locating the offending bit
Let us assume that, in a case having four data bits and three parity bits, parity errors were found in check groups 1 and 3. We look for the bit that belongs to groups 1 and 3 and does not belong to group 2.
- If it belongs to group 1, it has a 1 in the 1’s position.
- If it does not belong to group 2, it has a 0 in the 2’s position.
- If it belongs to group 3, it has a 1 in the 4’s position.
It must be 5.

Locating the offending bit
Binary 101 = decimal 5

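The case of four data bits and three parity bits can be sketched end to end. This is a minimal sketch, assuming even parity, parity bits at positions 1, 2 and 4, and data bits at positions 3, 5, 6 and 7; the function names and sample data are illustrative.

```python
PARITY_POSITIONS = (1, 2, 4)


def hamming74_encode(data_bits):
    """Build a 7-bit codeword from four data bits (list index 0 = position 1)."""
    code = [0] * 8                       # index 0 unused so indices match positions
    code[3], code[5], code[6], code[7] = data_bits
    for p in PARITY_POSITIONS:
        parity = 0
        for pos in range(1, 8):
            if pos & p and pos != p:     # every other member of this group
                parity ^= code[pos]
        code[p] = parity                 # makes the group's parity even
    return code[1:]


def locate_error(codeword):
    """Position (1..7) of a single flipped bit, or 0 if every group passes."""
    position = 0
    for p in PARITY_POSITIONS:
        parity = 0
        for pos in range(1, 8):
            if pos & p:
                parity ^= codeword[pos - 1]
        if parity:                       # this group violates even parity
            position += p
    return position


if __name__ == "__main__":
    word = hamming74_encode([1, 0, 1, 1])
    print(word, locate_error(word))

    word[5 - 1] ^= 1                     # flip the bit at position 5
    print(word, locate_error(word))      # groups 1 and 3 fail: 1 + 4 = 5
```

Flipping the located bit back restores the original codeword, which is the correction step.
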
Note that each row represents a different set of data.

Now reverse the situation! Neither row violates even parity for the first group.

Both rows have parity violations for group 2. Neither row has a parity violation for group 3.

Both rows have parity violations in group 4. It just so happens in this example that the two rows have the same parity violations. That is a coincidence; it does not have to be like that. If there are parity violations in groups 2 and 4, then the binary number identifying the offending bit is 1010 (for both rows, an unfortunate example), which is the 10th bit.

Big-endian/Little-endian
In memory, one addresses bytes. Often the data being written to memory involves more than one byte. Big-endian and little-endian refer to the two different ways in which bytes are placed in memory when the word being written consists of more than one byte.

Big-endian/Little-endian
- In big-endian architectures, the more significant bytes are placed in the locations with the lower addresses: “Big end first”. Examples: many mainframe computers, particularly IBM.
- In little-endian architectures, the less significant bytes are placed in the locations with the lower addresses: “Little end first”. Examples: most modern computers, including PCs.
- The PowerPC is bi-endian because it can understand both ways.

12345678 in Big-endian
The more significant bytes go at the lower addresses: “Big end first”.

12345678 in Little-endian
The less significant bytes go at the lower addresses: “Little end first”.

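The two byte orders can be displayed with Python's built-in integer conversion. This is a minimal sketch, assuming 12345678 is a hexadecimal value occupying four bytes; the leftmost byte printed is the one at the lowest address.

```python
value = 0x12345678  # assumed to be hexadecimal, four bytes wide

# Byte sequences as they would sit in memory, lowest address first.
big = value.to_bytes(4, byteorder="big")
little = value.to_bytes(4, byteorder="little")

print("big-endian:   ", big.hex(" "))     # 12 34 56 78
print("little-endian:", little.hex(" "))  # 78 56 34 12

# Reading the bytes back with the matching order recovers the value.
print(int.from_bytes(little, byteorder="little") == value)
```

Reading bytes with the wrong order gives a different number, which is why both ends of a transfer must agree on the convention.
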
Byte versus bit
Note that the previous example shows only big- and little-endian byte orders. One can also consider the bit ordering within each byte, which can likewise be either big- or little-endian. Some architectures use a mix: big-endian ordering for bits and little-endian ordering for bytes, or vice versa.

Origin of the term
“The terms big-endian and little-endian are derived from the Lilliputians of Gulliver's Travels, whose major political issue was whether soft-boiled eggs should be opened on the big side or the little side. Likewise, the big-/little-endian computer debate has much more to do with political issues than technological merits.” www.webopedia.com

Other references
- http://www.webopedia.com
- http://www.whatis.com
- Understanding Data Communications & Networks, William Shay
- http://www.cs.utsa.edu/~wagner/laws/hamming.html