The Mathematics of Star Trek Data Transmission Michael A. Karls Ball State University.


Topics
- Binary Codes
- ASCII
- Error Correction
- Parity-Check Sums
- Hamming Codes
- Binary Linear Codes
- Data Compression
M. A. Karls, Ball State University

Binary Codes A code is a group of symbols that represent information together with a set of rules for interpreting the symbols. The process of turning a message into code form is called encoding. The reverse process is called decoding. A binary code is a coding scheme that uses two symbols, usually 0 and 1. Mathematically, binary codes represent numbers in base 2. For example, 1011 would represent the number 1 x 2^0 + 1 x 2^1 + 0 x 2^2 + 1 x 2^3 = 1 + 2 + 0 + 8 = 11.
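The base-2 expansion above can be checked with a short script. This is a minimal illustrative sketch, not part of the original slides (the function name binary_to_decimal is ours):

```python
# Convert a binary string to its decimal value by summing digit * 2^position,
# reading the digits from right (position 0) to left, as in the 1011 example.
def binary_to_decimal(bits: str) -> int:
    total = 0
    for position, digit in enumerate(reversed(bits)):
        total += int(digit) * 2 ** position
    return total

print(binary_to_decimal("1011"))  # -> 11
```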

ASCII One example of a binary code is the American Standard Code for Information Interchange (ASCII). This code is used by computers to turn letters, numbers, and other characters into strings (lists) of binary digits or bits. When a key is pressed, a computer will interpret the corresponding symbol as a string of bits unique to that symbol.

ASCII (cont.) Here are the ASCII bit strings for the capital letters in our alphabet:

Letter  ASCII     Letter  ASCII
A       01000001  N       01001110
B       01000010  O       01001111
C       01000011  P       01010000
D       01000100  Q       01010001
E       01000101  R       01010010
F       01000110  S       01010011
G       01000111  T       01010100
H       01001000  U       01010101
I       01001001  V       01010110
J       01001010  W       01010111
K       01001011  X       01011000
L       01001100  Y       01011001
M       01001101  Z       01011010

ASCII (cont.) Thus, in binary, using ASCII, the text “MR SPOCK” would be encoded as: 01001101 01010010 00100000 01010011 01010000 01001111 01000011 01001011. HW: What would be the decimal equivalent of this bit string?
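The encoding step can be sketched in a few lines of Python. This is an illustration under our own naming (text_to_ascii_bits is not from the slides); spaces are inserted between code words only for readability:

```python
# Encode each character as its 8-bit ASCII code word and join the code words.
def text_to_ascii_bits(text: str) -> str:
    return " ".join(format(ord(ch), "08b") for ch in text)

print(text_to_ascii_bits("MR SPOCK"))
```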

Error Correction When data is transmitted, it is important to make sure that errors are corrected! This is done all the time by computers, fax machines, cell phones, CD players, iPods, satellites, etc. In the Star Trek universe, this would be especially important for the transporter to work correctly!

Error Correction (cont.) We use error correction in languages such as English! For example, consider the phrase: “Bean me up Scotty!” Most likely, there has been an error in transmission, which can be corrected by looking at the extra information in the sentence. The word “bean” is most likely “beam”. Other possibilities: bear, been, lean, which don’t really make sense. Languages such as English have redundancy (extra information) built into them so that we can infer the correct message, even if the message may have been received incorrectly!

Error Correction (cont.) Over the past 40 years, mathematicians and engineers have developed sophisticated schemes to build redundancy into binary strings to correct errors in transmission! One example can be illustrated with Venn diagrams! Venn diagrams are illustrations used in the branch of mathematics known as set theory. They are used to show the mathematical or logical relationship between different groups of things (sets). [Photo: Claude Shannon, “Father of Information Theory”]

Error Correction (cont.) Suppose we wish to send the message 1001. Using the Venn diagram at the right, we can append three bits to our message to help catch errors in transmission! [Venn diagram: three overlapping circles A, B, and C dividing the plane into regions I-VII.]

Error Correction (cont.) The message bits 1001 are placed in regions I, II, III, and IV, respectively. For regions V, VI, and VII, choose either a 0 or a 1 to make the total number of 1’s in each circle even!

Error Correction (cont.) Thus, we place a 1 in region V, a 0 in region VI, and a 1 in region VII. The message 1001 is therefore encoded as 1001101.

Error Correction (cont.) Suppose the message is received as 0001101, so there is an error in the first bit. To check for (and correct) this error, we use the Venn diagram! Put the bits of the message into regions I-VII in order. Notice that in circle A there is an odd number of 1’s. (We say that the parity of circle A is odd.) The same is true for circle B. This means that there has been an error in transmission, since we sent a message for which each circle had even parity!

Error Correction (cont.) To correct the error, we need to make the parity of all three circles even. Since circle C has an even number of 1’s, we leave it alone. It follows that the error is located in the portion of the diagram outside of circle C, i.e. in region V, I, or VI. Switching a 1 to a 0 or vice versa, one region at a time, we find that the error is in region I!

Error Correction (cont.) [Three Venn diagrams showing the candidate corrections in turn: one flip leaves A even but B with odd parity; one gives both A and B even parity; one leaves A with odd parity and B even.]

Error Correction (cont.) Thus, the correct message is 1001101! This scheme allows the encoding of the 16 possible 4-bit strings! Any single-bit error will be detected and corrected. Note that if there are two or more errors, this method may not detect the error or yield the correct message! (We’ll see why later!)

Parity-Check Sums In practice, binary messages are made up of strings that are longer than four digits (for example, MR SPOCK in ASCII). We now look at a mathematical method to encode binary strings that is equivalent to the Venn diagram method and can be applied to longer strings! Given any binary string of length four, a1a2a3a4, we wish to append three check digits so that any single error in any of the seven positions can be corrected.

Parity-Check Sums (cont.) We choose the check digits as follows:
c1 = 0 if a1 + a2 + a3 is even; c1 = 1 if a1 + a2 + a3 is odd.
c2 = 0 if a1 + a2 + a4 is even; c2 = 1 if a1 + a2 + a4 is odd.
c3 = 0 if a2 + a3 + a4 is even; c3 = 1 if a2 + a3 + a4 is odd.
These sums are called parity-check sums!
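The three parity-check sums can be computed directly with modular arithmetic (sum mod 2 is 0 exactly when the sum is even). A minimal sketch, with our own function name encode:

```python
# Append the parity-check digits c1, c2, c3 to a 4-bit message string,
# following the sums defined above: each check digit is a sum taken mod 2.
def encode(message: str) -> str:
    a1, a2, a3, a4 = (int(b) for b in message)
    c1 = (a1 + a2 + a3) % 2
    c2 = (a1 + a2 + a4) % 2
    c3 = (a2 + a3 + a4) % 2
    return message + "".join(str(c) for c in (c1, c2, c3))

print(encode("1001"))  # -> 1001101
```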

Parity-Check Sums (cont.) As an example, for a1a2a3a4 = 1001, we find that:
c1 = 1, since a1 + a2 + a3 = 1 + 0 + 0 = 1 is odd.
c2 = 0, since a1 + a2 + a4 = 1 + 0 + 1 = 2 is even.
c3 = 1, since a2 + a3 + a4 = 0 + 0 + 1 = 1 is odd.
Thus 1001 is encoded as 1001101, just as with the Venn diagram method!

Parity-Check Sums (cont.) Try this scheme with the message 1000! Solution: 1000110. Suppose that the message u = 1000110 is received as v = 1010110 (so there is an error in position 3). To decode the message v, we compare v with the 16 possible code words that could have been sent. For this comparison, we define the distance between strings of equal length to be the number of positions in which the strings differ. Thus, for example, the distance between v = 1010110 and the code word w = 0101100 would be 5.

Parity-Check Sums (cont.) Here are the distances between message v and all possible code words:

v        code word  distance    v        code word  distance
1010110  0000000    4           1010110  1000110    1
1010110  0001011    5           1010110  1001101    4
1010110  0010101    3           1010110  1010011    2
1010110  0011110    2           1010110  1011000    3
1010110  0100111    4           1010110  1100001    5
1010110  0101100    5           1010110  1101010    4
1010110  0110010    3           1010110  1110100    2
1010110  0111001    6           1010110  1111111    3

Parity-Check Sums (cont.) Comparing our message v = 1010110 to the possible code words, we find that the minimum distance is 1, for code word 1000110. For all other code words, the distance is greater than or equal to 2. Therefore, we decode v as u = 1000110, i.e. the message 1000. This method is known as nearest-neighbor decoding. Note that this method will only correct an error in one position. (We’ll see why later!) If there is more than one possibility for the decoded message, we don’t decode.
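Nearest-neighbor decoding can be sketched by generating all 16 code words and picking the one closest to the received string. This is an illustrative implementation under our own naming (encode, distance, decode are not from the slides):

```python
from itertools import product

# Parity-check encoding as defined in the slides: append c1, c2, c3.
def encode(message: str) -> str:
    a1, a2, a3, a4 = (int(b) for b in message)
    checks = ((a1 + a2 + a3) % 2, (a1 + a2 + a4) % 2, (a2 + a3 + a4) % 2)
    return message + "".join(str(c) for c in checks)

# All 16 code words, one per 4-bit message.
CODE_WORDS = [encode("".join(bits)) for bits in product("01", repeat=4)]

# Distance: the number of positions in which two equal-length strings differ.
def distance(u: str, v: str) -> int:
    return sum(x != y for x, y in zip(u, v))

# Nearest-neighbor decoding: return the message part of the closest code word.
def decode(received: str) -> str:
    nearest = min(CODE_WORDS, key=lambda w: distance(received, w))
    return nearest[:4]

print(decode("1010110"))  # -> 1000
```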

Binary Linear Codes The error-correcting scheme we just saw is a special case of a Hamming code. These codes were first proposed in 1948 by Richard Hamming, a mathematician working at Bell Laboratories. Hamming was frustrated with losing a week’s worth of work due to an error that a computer could detect, but not correct.

Binary Linear Codes (cont.) A binary linear code consists of words composed of 0’s and 1’s and is obtained from all possible k-tuple messages by using parity-check sums to append check digits to the messages. The resulting strings are called code words. Generic code word: a1a2…an, where a1a2…ak is the message part and ak+1ak+2…an is the check digit part.

Binary Linear Codes (cont.) Given a binary linear code, two natural questions to ask are: 1. How can we tell if it will correct errors? 2. How many errors will it detect? To answer these questions, we need the idea of the weight of a code. The weight, denoted t, of a binary linear code is the minimum number of 1’s that occur among all nonzero code words of that code. For example, the weight of the code in the examples above is t = 3.

Binary Linear Codes (cont.) If the weight t is odd, the code will correct any (t-1)/2 or fewer errors. If the weight t is even, the code will correct any (t-2)/2 or fewer errors. If we just want to detect errors, a code of weight t will detect any t-1 or fewer errors. Thus, our binary linear code of weight 3 can correct (3-1)/2 = 1 error or detect 3-1 = 2 errors. Note that we need to decide in advance whether we want to correct or detect errors! For correcting, we apply the nearest-neighbor method. For detecting, if we find an error, we ask for the message to be re-sent.
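The weight of our (7,4) code, and the resulting correction and detection limits, can be computed directly. A sketch assuming the parity-check scheme above (the names encode, t, correctable, detectable are ours):

```python
from itertools import product

# Parity-check encoding from the slides: append c1, c2, c3 to a 4-bit message.
def encode(message: str) -> str:
    a1, a2, a3, a4 = (int(b) for b in message)
    checks = ((a1 + a2 + a3) % 2, (a1 + a2 + a4) % 2, (a2 + a3 + a4) % 2)
    return message + "".join(str(c) for c in checks)

code_words = [encode("".join(bits)) for bits in product("01", repeat=4)]

# Weight t: the minimum number of 1's among the nonzero code words.
t = min(word.count("1") for word in code_words if "1" in word)

# Correctable errors: (t-1)/2 if t is odd, (t-2)/2 if t is even.
correctable = (t - 1) // 2 if t % 2 == 1 else (t - 2) // 2
# Detectable errors: t - 1.
detectable = t - 1

print(t, correctable, detectable)  # -> 3 1 2
```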

Binary Linear Codes (cont.) The key to the error-correcting schemes in binary linear codes is that the possible code words differ from each other in at least t positions, where t is the weight of the code. Thus, as many as t-1 errors in a code word can be detected, as any valid code word will differ from another in at least t positions! If t is odd, say t = 3, then a code word with an error in one position will differ from the correct code word in one position and differ from all other code words by at least two positions.

Data Compression Binary linear codes are fixed-length codes, since each word in the code is represented by the same number of digits. Morse code, developed for the telegraph in the 1840s by Samuel Morse, is an example of a variable-length code, in which the number of symbols for a word may vary. Morse code is an example of data compression. One great example of where data compression is used is the MP3 format for compressing music files! For the Star Trek universe, data compression would be useful for encoding information for the transporter!

Letter  Morse  Letter  Morse
A       .-     N       -.
B       -...   O       ---
C       -.-.   P       .--.
D       -..    Q       --.-
E       .      R       .-.
F       ..-.   S       ...
G       --.    T       -
H       ....   U       ..-
I       ..     V       ...-
J       .---   W       .--
K       -.-    X       -..-
L       .-..   Y       -.--
M       --     Z       --..

Data Compression (cont.) Data compression is the process of encoding data so that the most frequently occurring data are represented by the fewest symbols. Comparing the Morse code symbols to a relative frequency chart for the letters in the English language, we find that the letters that occur the most have shorter Morse code symbols!

Letter  Percentage  Letter  Percentage
A       8.2         N       6.7
B       1.5         O       7.5
C       2.8         P       1.9
D       4.3         Q       0.1
E       12.7        R       6.0
F       2.2         S       6.3
G       2.0         T       9.1
H       6.1         U       2.8
I       7.0         V       1.0
J       0.2         W       2.4
K       0.8         X       0.2
L       4.0         Y       2.0
M       2.4         Z       0.1

Percentage of letters out of a sample of 100,362 alphabetic characters taken from newspapers and novels.

Data Compression (cont.) As an illustration of data compression, let’s use the idea of gene sequences. Biologists are able to describe genes by specifying sequences composed of the four letters A, T, G, and C, which stand for the four nucleotides adenine, thymine, guanine, and cytosine, respectively. Suppose we wish to encode the sequence AAACAGTAAC.

Data Compression (cont.) One way is to use the (fixed-length) code: A → 00, C → 01, T → 10, and G → 11. Then AAACAGTAAC is encoded as: 00000001001110000001. From experience, biologists know that the frequency of occurrence from most frequent to least frequent is A, C, T, G. Thus, it would be more efficient to choose the following binary code: A → 0, C → 10, T → 110, and G → 111. With this new code, AAACAGTAAC is encoded as: 0001001111100010. Notice that this new binary code word has 16 letters versus 20 letters for the fixed-length code, a decrease of 20%. This new code is an example of data compression!

Data Compression (cont.) Suppose we wish to decode a sequence encoded with the new data compression scheme, such as 0001001111100010. Since 0 only occurs at the end of a code word, and the code words that end in 0 are 0, 10, and 110, we can put a mark after every 0, as this will be the end of a code word. The only time a sequence of 111 occurs is for the code word 111, so we can put a mark after every triple of 1’s. Thus, we have: 0,0,0,10,0,111,110,0,0,10, which is AAACAGTAAC.
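The encoding and decoding steps above can be sketched in Python. This is an illustration with our own naming (CODE, encode, decode); the left-to-right scan works because no code word is a prefix of another, so a letter can be emitted as soon as the buffer matches a code word:

```python
# The variable-length gene code from the slides.
CODE = {"A": "0", "C": "10", "T": "110", "G": "111"}
DECODE = {bits: letter for letter, bits in CODE.items()}

# Encode: replace each letter by its code word and concatenate.
def encode(sequence: str) -> str:
    return "".join(CODE[letter] for letter in sequence)

# Decode: scan bits left to right, emitting a letter whenever the
# accumulated buffer matches a code word (valid for a prefix code).
def decode(bits: str) -> str:
    letters, buffer = [], ""
    for bit in bits:
        buffer += bit
        if buffer in DECODE:
            letters.append(DECODE[buffer])
            buffer = ""
    return "".join(letters)

encoded = encode("AAACAGTAAC")
print(encoded)          # -> 0001001111100010
print(decode(encoded))  # -> AAACAGTAAC
```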

References
- The Code Book, by Simon Singh
- For All Practical Purposes (5th ed.), COMAP
- St. Andrews University History of Mathematics archive: and.ac.uk/~history/index.html