ECE 101 An Introduction to Information Technology Information Coding

Information Path (block diagram): Source of Information, Digital Sensor, Information Processor & Transmitter, Transmission Medium, Information Receiver and Processor, Information Display, and Information Storage.

Information Coding
Fixed length (same number of bits per word)
–Error detection
–Error correction
–Standard codes
–Bar and credit card codes
Variable length (frequently used words = small number of bits): data compression
–Huffman code
–Facsimile (fax) code
Encryption

Error Detection and Correction
Codes can be written that detect errors
–add redundant bits to the code words that do not add information (other than the possible presence of an error)
–where the probability of errors is small, they detect single errors only
–one could simply repeat each word (doubling the data size); an error would then be detected but not corrected

Parity Bit
Add a “parity” bit to each word
–“even parity” adds a (redundant) bit to each word to form a word that contains an even number of 1’s; similarly for “odd parity” with an odd number of 1’s
–more efficient than bit repetition
–identifies the existence of an error but does not correct it
Error Correction
–addition of a redundancy-check code word
–size of data (plus parity bit) increased by one word
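As a minimal sketch (not part of the slides), an even-parity bit for a 7-bit ASCII word can be computed like this in Python:

def even_parity_bit(bits):
    """Return the extra bit that makes the total number of 1s even."""
    return sum(bits) % 2

# The 7-bit ASCII code for 'C' is 1000011 (three 1s), so the even-parity bit is 1.
word = [1, 0, 0, 0, 0, 1, 1]
print(even_parity_bit(word))  # 1 -> the transmitted word 10000111 has an even number of 1s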

Redundancy Check
Information to be sent: a block of symbol words (the bit values appear in the slide figure).
With even parity, each symbol word is extended with an even-parity bit.
A redundancy-check word is then formed column by column: an odd-parity bit over the first bits of all the words, an odd-parity bit over the second bits, and so on, finished with an even-parity bit of its own.
Transmitted: the parity-extended symbol words followed by the redundancy-check word.

Redundancy Check Error
Comparing the received block with its parity bits:
–even parity tells us that the second symbol has an error
–comparing the odd parity with the first bit in each symbol shows us that the first bit in the second symbol should be a 0
–comparing the odd parity with the second bit in each symbol shows us that everything is OK
The row check locates the word in error and the column check locates the bit, so the error can be corrected.
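The row/column scheme above can be sketched in a few lines of Python. The bit values here are made up for illustration (the slide's own numbers live only in its figure); rows use even parity and bit columns use odd parity, as on the slides.

# Hypothetical 4-bit symbol words; the slide's actual bits are not in the transcript.
words = [[1, 0, 1, 1],
         [0, 1, 1, 0]]

row_parity = [sum(w) % 2 for w in words]              # even parity per word
col_parity = [(sum(c) + 1) % 2 for c in zip(*words)]  # odd parity per bit position

# Flip one bit to simulate a channel error, then locate and correct it.
received = [w[:] for w in words]
received[1][0] ^= 1

bad_row = next(i for i, w in enumerate(received) if sum(w) % 2 != row_parity[i])
bad_col = next(j for j, c in enumerate(zip(*received)) if (sum(c) + 1) % 2 != col_parity[j])
received[bad_row][bad_col] ^= 1    # the row check finds the word, the column check finds the bit
assert received == words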

Fixed Length Codes
Same number of bits in each word
Use of n bits can create 2^n different words
ASCII Code
–American Standard Code for Information Interchange
–computer memories structured with 8-bit (one byte) words
–ASCII: conventional code for representing alphanumeric symbols as bytes
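For instance, with n = 8 there are 2^8 = 256 possible words; a one-line Python check of the byte for one (arbitrarily chosen) character:

ch = 'G'
print(ord(ch), format(ord(ch), '08b'))  # 71 01000111: every character occupies the same 8 bits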

Digital Watermark
Vertically shift particular lines by 1/600 of an inch, then superimpose the original and observe the shading of the output.
Any number of lines can be shifted, or symbols used, to quickly create a large number of variations.

Variable Length Codes
Reduce the number of bits by assigning short code words to common symbols and longer code words to less common symbols
Huffman Coding Procedure
–uses a code tree, consisting of nodes connected by branches that ultimately terminate in leaves
–node at the top is the root
–branches from a node are 1 or 0
–so only 2 branches from any node

Entropy
Minimum average number of bits to encode a domain of probabilities:
H = - Σ_{i=1..n} P[X_i] · log2(P[X_i]) bits/symbol, where n is the number of possible outcomes, or equivalently
H = 3.32 · Σ_{i=1..n} P[X_i] · log10(1/P[X_i]) bits/symbol
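A small Python sketch of this formula, using made-up probabilities:

from math import log2

p = [0.5, 0.25, 0.125, 0.125]            # hypothetical symbol probabilities
H = -sum(pi * log2(pi) for pi in p)
print(H)                                  # 1.75 bits/symbol: no code can average fewer bits here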

Huffman Coding Procedure
–Determine the probabilities of occurrence of all possible values
–List the symbols in order of decreasing probability
–Start at the bottom of the list and assign a 0 to the least probable symbol and a 1 to the next least probable
–Combine the two least probable symbols into one composite symbol (sum of their probabilities)
–Revise the list of symbols, placing the composite symbol in order of decreasing probability
–Repeat these steps until only two symbols remain, then assign a 0 to the less probable entry and a 1 to the other (NOTE: in your textbook it is the other way around, i.e. 1 to the least probable entry; it does not matter which convention is used)

Huffman Coding Creates a Binary Code Tree
–Nodes connected by branches that terminate in leaves
–Top node is the root
–Two branches from each node
(Slide figure: a code tree with the root at the top and leaves A, B, C, D.)
The Huffman coding procedure finds the optimum, uniquely decodable, variable-length code associated with a set of events, given their probabilities of occurrence.

Huffman Coding
Code: A = 0, B = 10, C = 110, D = 111
Given the adjacent Huffman code tree (root at the top, leaves A, B, C, D), a received bit sequence is decoded by following branches from the root until a leaf is reached: 110 decodes to C, 10 to B, 0 to A, 111 to D, and so on.
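A decoding sketch using the code from this slide (A = 0, B = 10, C = 110, D = 111); the bit string below is an assumed example, since the slide's own sequence appears only in its figure:

code = {'A': '0', 'B': '10', 'C': '110', 'D': '111'}
decode_table = {bits: sym for sym, bits in code.items()}

def huffman_decode(bitstring):
    out, buf = [], ''
    for b in bitstring:
        buf += b
        if buf in decode_table:           # a leaf of the code tree has been reached
            out.append(decode_table[buf])
            buf = ''
    return ''.join(out)

print(huffman_decode('110100011100'))     # assumed input -> 'CBAADAA'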

Huffman Code Construction
First list all events in descending order of probability: Event A (.3), Event B (.3), Event C (.13), Event D (.12), Event E (.1), Event F (.05).
Pair the two events with the lowest probabilities (E and F) and add their probabilities: .1 + .05 = 0.15.

Huffman Code Construction (continued)
Repeat, pairing the two lowest remaining probabilities each time, until a single node is left; then add 0s to the left branches and 1s to the right branches.
(Events as before: A .3, B .3, C .13, D .12, E .1, F .05.)
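One way to carry out this construction in code, for the six events on the slides (probabilities .3, .3, .13, .12, .1, .05). This is a sketch of the standard greedy procedure; tie-breaking may give code words that differ from the slides' while having the same lengths.

import heapq

probs = {'A': 0.30, 'B': 0.30, 'C': 0.13, 'D': 0.12, 'E': 0.10, 'F': 0.05}

# Each heap entry: (probability, tie-breaker, {symbol: code-so-far}).
heap = [(p, i, {s: ''}) for i, (s, p) in enumerate(probs.items())]
heapq.heapify(heap)
counter = len(heap)

while len(heap) > 1:
    p0, _, low = heapq.heappop(heap)      # lowest remaining probability -> 0 branch
    p1, _, high = heapq.heappop(heap)     # next lowest -> 1 branch
    merged = {s: '0' + c for s, c in low.items()}
    merged.update({s: '1' + c for s, c in high.items()})
    heapq.heappush(heap, (p0 + p1, counter, merged))
    counter += 1

codes = heap[0][2]
for s in sorted(codes):
    print(s, codes[s])

avg = sum(probs[s] * len(codes[s]) for s in probs)
print('average length:', avg)             # about 2.4 bits/symbol for this distribution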

Data Compression
Two approaches to data compression:
Lossless compression
–retains all information present in the original data
–tenfold data compression is typical (WinZip or hard-drive compression)
–uniquely decodable
–Huffman codes are often used to compress large data files into smaller files without losing any information

Data Compression
Lossy compression
–further reduction in data by permitting some loss of information
–uses close approximations to the data rather than the actual data
–can be 100-fold compression
–can specify the perceptual quality of the result
–little perceptible distortion

Huffman Fax Code
To further reduce the number of bits to transmit, use the Huffman code
–this involves estimating the relative frequency of occurrence of different runs of black and white
–i.e. common runs get short code words
–code the make-up words (runs of 64·m) and terminating code words (remainder r) separately
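The idea can be sketched in a few lines of Python (the run-length code words themselves are omitted; the real fax standard's tables are much larger): a scan line is first turned into runs of white and black, and each run length is then given a variable-length code word.

from itertools import groupby

def run_lengths(line):
    """Convert a scan line of 0s (white) and 1s (black) into (colour, run-length) pairs."""
    return [(colour, len(list(g))) for colour, g in groupby(line)]

line = [0] * 9 + [1] * 3 + [0] * 4        # a tiny made-up scan line
print(run_lengths(line))                   # [(0, 9), (1, 3), (0, 4)]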

Fax Code Errors
Fax must transmit alternating black and white codes, otherwise an error is detected
If errors are detected, the page must be sent again
Fax machines do not currently use correction codes
–a trade-off for faster transmission times (no redundant bits in the code words) against the expense of resending a fax when errors occur

Encryption
Encryption is a way to randomly scramble data so that only the intended recipient can use the information
–the easiest way is to use an exclusive-or (XOR) gate with the data and a random binary sequence as inputs; the output is then sent as the encrypted message
–the random binary sequence can be produced using a pseudo-random number generator (PRNG)
–the data is retrieved by applying the encrypted data and the same random binary sequence to an XOR gate

Encryption using PRNG
Standard encryption requires a key: a number that determines the random binary sequence used to both encode and decode the data
Choose a PRNG that produces random byte-sized (8-bit) patterns with the generator equation
–X_n = [A · X_(n-1) + B] mod 256, where
–A is an arbitrary multiplier of X_(n-1)
–B prevents the sequence from degenerating into a set of zeroes
–to get started we need an arbitrary X_0, or seed
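A sketch of this scheme in Python (the constants A = 177 and B = 59 are taken from the exercise later in these slides; the message and seed below are made up). The generator equation produces one pseudo-random byte per message byte, and XOR both encrypts and decrypts:

def prng_bytes(seed, count, A=177, B=59):
    """X_n = (A * X_(n-1) + B) mod 256, starting from the seed X_0."""
    x, out = seed, []
    for _ in range(count):
        x = (A * x + B) % 256
        out.append(x)
    return out

def xor_with_keystream(data, seed):
    return bytes(d ^ k for d, k in zip(data, prng_bytes(seed, len(data))))

msg = b'HI'                                         # made-up 2-byte message
cipher = xor_with_keystream(msg, seed=42)           # made-up seed
print(cipher, xor_with_keystream(cipher, seed=42))  # XOR with the same keystream recovers b'HI'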

Transmitting the Seed
Assume that A, B, and N = 256 for the PRNG are known by everyone
Need only to transmit the seed X_0, the key:
–T (the transmitting person) selects x privately and transmits to R (the receiving person): X = [a^x] mod N (note that this N need not be the same as the value 256 given above)
–R selects y privately and transmits to T: Y = [a^y] mod N
–T computes [Y^x] mod N and R computes [X^y] mod N; these equal one another, so this becomes the seed value, which they both now know but the outside world does not

Sample Exercise (1)
Four groups, 2 G and 2 Y (with calculators)
Each group privately selects two (of the 128) ASCII symbols from the page of the text
Arbitrarily we'll use X_n = [177 · X_(n-1) + 59] mod 256
To find X_1, the key now is to determine X_0. To find that, use N = 11 and a = 9
The two G groups each select privately an odd number, g, from 3 to 9 (arbitrary choice)
The two Y groups each select privately an even number, y, from 2 to 10 (arbitrary choice)

Sample Exercise (2)
G groups compute G = [9^g] mod 11 and send the result to the corresponding group Y
Y groups compute Y = [9^y] mod 11 and send the result to the corresponding group G
G groups compute X_0 = [Y^g] mod 11
Y groups compute X_0 = [G^y] mod 11
Each group computes X_1 = [177 · X_0 + 59] mod 256 and X_2 = [177 · X_1 + 59] mod 256 for the PRNG random binary sequence, and converts them to binary

Sample Exercise (3)
Each group XORs this 16-bit PRN (X_1 X_2) with the two symbols it selected.
Each group sends its coded message to the other group.
Notice that this message is secure against people who do not know the PRN selected.
Each group XORs the received message with the PRN to decode the message sent.
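Putting the three exercise steps together in one Python sketch (the private values g and y and the two ASCII symbols are made-up choices; the constants follow the slides):

g, y = 7, 6                       # private picks: G group odd 3 to 9, Y group even 2 to 10 (made up)
G = pow(9, g, 11)                 # sent by the G group
Y = pow(9, y, 11)                 # sent by the Y group
x0_G = pow(Y, g, 11)              # both groups arrive at the same X0
x0_Y = pow(G, y, 11)
assert x0_G == x0_Y

x1 = (177 * x0_G + 59) % 256      # two PRNG bytes form the 16-bit pseudo-random pattern
x2 = (177 * x1 + 59) % 256

plain = b'OK'                     # the two ASCII symbols a group selected (made up)
cipher = bytes(p ^ k for p, k in zip(plain, (x1, x2)))
recovered = bytes(c ^ k for c, k in zip(cipher, (x1, x2)))
print(cipher, recovered)          # the other group decodes with the same X1 and X2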