CompSci 100e 10.1 Binary Digits (Bits): Yes or No, On or Off, One or Zero, 10010010.

Presentation transcript:

CompSci 100e 10.1 Binary Digits (Bits)
- Yes or No
- On or Off
- One or Zero
- 10010010

CompSci 100e 10.2 Data Encoding
- Text: each character (letter, punctuation, etc.) is assigned a unique bit pattern.
  - ASCII: uses 7-bit patterns to represent most symbols used in written English text
  - Unicode: uses 16-bit patterns to represent the major symbols used in languages worldwide
  - ISO standard: uses 32-bit patterns to represent most symbols used in languages worldwide
- Numbers: use bits to represent a number in base two
- Limitations of computer representations of numeric values
  - Overflow: happens when a value is too big to be represented
  - Truncation: happens when a value falls between two representable values

CompSci 100e 10.3 Images, Sound, & Compression
- Images
  - Store as bit map: define each pixel
    - RGB
    - Luminance and chrominance
  - Vector techniques
    - Scalable
    - TrueType and PostScript
- Audio
  - Sampling
- Compression
  - Lossless: Huffman, LZW, GIF
  - Lossy: JPEG, MPEG, MP3

CompSci 100e 10.4 Memory management preview
- Outside of this class, memory is a finite resource
- 3 different types of storage with different lifetimes
  1. Static: lives in static storage for the execution of the program
  2. Local: lives on the stack within a method or block
  3. Dynamic: lives in the heap; starts with a new and ends with some deallocation
[Figure: memory layout with regions labeled Executable code, Static Storage, Heap, Stack, Unallocated]

CompSci 100e 10.5 Decimal (Base 10) Numbers
- Each digit in a decimal number is chosen from ten symbols: { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 }
- The position (right to left) of each digit represents a power of ten.
- Example: Consider the decimal number ... [worked example not preserved; each digit is multiplied by the power of ten for its position, down to $10^0$ for the rightmost digit]

CompSci 100e 10.6 Binary (Base 2) Numbers
- Each digit in a binary number is chosen from two symbols: { 0, 1 }
- The position (right to left) of each digit represents a power of two.
- Example: Convert binary number 1101 to decimal
    digits:    1 1 0 1
    position:  3 2 1 0
    $1101_2 = 1 \cdot 2^3 + 1 \cdot 2^2 + 0 \cdot 2^1 + 1 \cdot 2^0 = 8 + 4 + 0 + 1 = 13$
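The positional rule above can be sketched in a few lines of Java. The class and method names here are hypothetical, chosen only for illustration; the loop processes digits left to right, doubling the running value at each step, which is equivalent to summing digit times power of two.

```java
public class BinaryToDecimal {
    // Hypothetical helper: converts a string of '0'/'1' characters to its int value.
    // Doubling the running total shifts every earlier digit up one power of two.
    static int toDecimal(String bits) {
        int value = 0;
        for (int i = 0; i < bits.length(); i++) {
            char c = bits.charAt(i);
            if (c != '0' && c != '1') {
                throw new IllegalArgumentException("not a binary digit: " + c);
            }
            value = value * 2 + (c - '0');
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(toDecimal("1101"));            // the slide's example
        System.out.println(Integer.parseInt("1101", 2));  // library equivalent
    }
}
```

The standard library call `Integer.parseInt(s, 2)` does the same job; the hand-written loop is only there to make the place-value arithmetic visible.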

CompSci 100e 10.7 Famous Powers of Two [images not preserved]

CompSci 100e 10.8 Other Number Systems [images not preserved]

CompSci 100e 10.9 Truth Tables [images not preserved]

CompSci 100e From bits to bytes to ints
- At some level everything is stored as either a zero or a one
  - A bit is a binary digit; a byte is a binary term (8 bits)
  - We should be grateful we can deal with Strings rather than sequences of 0's and 1's.
  - We should be grateful we can deal with an int rather than the 32 bits that make an int
- Int values are stored as two's complement numbers with 32 bits; for 64 bits use the type long; a char is 16 bits
  - Standard in Java, different in C/C++
  - Facilitates addition/subtraction for int values
  - We don't need to worry about this, except to note:
      Integer.MAX_VALUE + 1 = Integer.MIN_VALUE
      Math.abs(Integer.MIN_VALUE) != Infinity
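The two notes at the end of this slide can be checked directly. The sketch below uses only standard `Integer` and `Math` members; the class and method names are made up for this example.

```java
public class IntLimits {
    // Adding 1 to the largest int wraps around to the smallest int (two's complement).
    static boolean maxPlusOneWraps() {
        return Integer.MAX_VALUE + 1 == Integer.MIN_VALUE;
    }

    // +2^31 does not fit in an int, so Math.abs returns MIN_VALUE unchanged.
    static boolean absOfMinIsStillNegative() {
        return Math.abs(Integer.MIN_VALUE) < 0;
    }

    // Math.addExact throws instead of wrapping silently.
    static boolean addExactDetectsOverflow() {
        try {
            Math.addExact(Integer.MAX_VALUE, 1);
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Bit widths of int, long, and char in Java.
        System.out.println(Integer.SIZE + " " + Long.SIZE + " " + Character.SIZE);
        System.out.println(maxPlusOneWraps());
        System.out.println(absOfMinIsStillNegative());
        System.out.println(addExactDetectsOverflow());
    }
}
```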

CompSci 100e More details about bits
- How is 13 represented?
  - … _0_ _0_ _1_ _1_ _0_ _1_
  - Total is 8 + 4 + 1 = 13
- What is bit representation of 32? Of 15? Of 1023?
  - What is bit-representation of $2^n - 1$?
  - What is bit-representation of 0? Of -1?
    - Study later, but -1 is all 1's; left-most bit determines < 0
- How can we determine what bits are on? How many on?
  - Useful in solving problems, understanding machine
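One way to answer "what bits are on, and how many" is to shift and mask. This is a minimal sketch with hypothetical method names; it assumes bit 0 is the rightmost (least significant) bit.

```java
public class BitsOn {
    // True when bit number `position` (0 = rightmost) of val is 1.
    static boolean isOn(int val, int position) {
        return ((val >>> position) & 1) == 1;
    }

    // Counts the 1 bits by testing each of the 32 positions in turn.
    static int countOn(int val) {
        int count = 0;
        for (int i = 0; i < Integer.SIZE; i++) {
            if (isOn(val, i)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(Integer.toBinaryString(32));
        System.out.println(Integer.toBinaryString(15));
        System.out.println(Integer.toBinaryString(1023));
        System.out.println(Integer.toBinaryString(-1));
        System.out.println(countOn(13) + " " + Integer.bitCount(13));
    }
}
```

`Integer.bitCount` is the library version of `countOn`; the loop is shown only to make the masking step explicit.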

CompSci 100e How are data stored?
- To facilitate Huffman coding we need to read/write one bit
  - Why do we need to read one bit?
  - Why do we need to write one bit?
  - When do we read 8 bits at a time? Read 32 bits at a time?
- We can't actually write one bit at a time. We can't really write one char at a time either.
  - Output and input are buffered to minimize memory accesses and disk accesses
  - Why do we care about this when we talk about data structures and algorithms?
    - Where does data come from?

CompSci 100e How do we buffer char output?
- Done for us as part of InputStream and Reader classes
  - InputStreams are for reading bytes
  - Readers are for reading char values
  - Why do we have both and how do they interact?
      Reader r = new InputStreamReader(System.in);
  - Do we need to flush our buffers?
- In the past Java IO has been notoriously slow
  - Do we care about I? About O?
  - This is changing, and the java.nio classes help
    - Map a file to a region in memory in one operation
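The byte/char distinction shows up as soon as the input contains a character that needs more than one byte. The sketch below is an illustration, not course code: it reads the same in-memory data once through an `InputStream` and once through an `InputStreamReader`, assuming UTF-8 encoding. The class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class BytesVersusChars {
    // Counts raw bytes by reading through an InputStream.
    static int countBytes(String text) {
        try (InputStream in = new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8))) {
            int n = 0;
            while (in.read() != -1) {
                n++;
            }
            return n;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Counts char values by wrapping the same bytes in a Reader that decodes UTF-8.
    static int countChars(String text) {
        InputStream in = new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8));
        try (Reader r = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            int n = 0;
            while (r.read() != -1) {
                n++;
            }
            return n;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String word = "caf\u00e9"; // the last letter takes two bytes in UTF-8
        System.out.println(countBytes(word) + " bytes, " + countChars(word) + " chars");
    }
}
```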

CompSci 100e Buffer bit output
- To buffer bit output we need to store bits in a buffer
  - When the buffer is full, we write it.
  - The buffer might overflow, e.g., in the process of writing 10 bits to a 32-bit capacity buffer that has 29 bits in it
  - How do we access bits, add to buffer, etc.?
- We need to use bit operations
  - Mask bits: access individual bits
  - Shift bits: to the left or to the right
  - Bitwise and/or/negate bits
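A minimal sketch of such a buffer follows. It is not the course's BitOutputStream: the class name and methods are hypothetical, and for brevity it assumes an 8-bit buffer rather than the 32-bit one mentioned above. Adding bits one at a time sidesteps the overflow case, because the buffer is written out the moment it fills.

```java
import java.io.ByteArrayOutputStream;

public class BitBufferSketch {
    private final ByteArrayOutputStream out = new ByteArrayOutputStream();
    private int buffer = 0; // bits collected so far, in the low-order positions
    private int count = 0;  // how many bits are currently in the buffer

    // Writes the low `howMany` bits of value, most significant of those bits first.
    void writeBits(int howMany, int value) {
        for (int i = howMany - 1; i >= 0; i--) {
            int bit = (value >>> i) & 1;   // mask out one bit
            buffer = (buffer << 1) | bit;  // shift left and add it
            count++;
            if (count == 8) {              // buffer full: write it and start over
                out.write(buffer);
                buffer = 0;
                count = 0;
            }
        }
    }

    // Pads any leftover bits with zeros on the right and writes the final byte.
    void flush() {
        if (count > 0) {
            out.write(buffer << (8 - count));
            buffer = 0;
            count = 0;
        }
    }

    byte[] toByteArray() {
        return out.toByteArray();
    }

    public static void main(String[] args) {
        BitBufferSketch w = new BitBufferSketch();
        w.writeBits(3, 0b101);
        w.writeBits(10, 0b1111111111);
        w.flush();
        for (byte b : w.toByteArray()) {
            System.out.println(Integer.toBinaryString(b & 0xff));
        }
    }
}
```

Padding on flush means a reader needs some other way to know where the real data ends, which is one reason compressed formats store a length or an end marker.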

CompSci 100e Representing pixels
- A pixel typically stores RGB and alpha/transparency values
  - Each RGB is a value in the range 0 to 255
  - The alpha value is also in range 0 to 255
      Pixel red = new Pixel(255,0,0,0);
      Pixel white = new Pixel(255,255,255,0);
- Typically store these values as int values
  - A picture is simply an array of int values
      void process(int pixel){
          int blue = pixel & 0xff;
          int green = (pixel >> 8) & 0xff;
          int red = (pixel >> 16) & 0xff;
      }
[Figure: a color int divided into R, G, and B fields]

CompSci 100e Bit masks and shifts
    void process(int pixel){
        int blue = pixel & 0xff;
        int green = (pixel >> 8) & 0xff;
        int red = (pixel >> 16) & 0xff;
    }
- Hexadecimal number: 0,1,2,3,4,5,6,7,8,9,a,b,c,d,e,f
  - Note that f is 15; in binary this is 1111, one less than 10000
  - The hex number 0xff is an 8 bit number, all ones
- The bitwise & operator creates an 8 bit value, 0 to 255 (why?)
  - 1&1 == 1, otherwise we get 0, similar to logical and
  - Similarly we have |, bitwise or
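The masks in `process` undo a packing step that the slides do not show. The sketch below adds a hypothetical `pack` method (an assumption, not part of the original code) that uses shifts and bitwise or to build the int, then extracts the channels with the same shifts and masks as above.

```java
public class PixelBits {
    // Hypothetical inverse of process(): red in bits 16-23, green in 8-15, blue in 0-7.
    static int pack(int red, int green, int blue) {
        return ((red & 0xff) << 16) | ((green & 0xff) << 8) | (blue & 0xff);
    }

    // Same extraction as in the slide: shift the channel down, then keep the low 8 bits.
    static int red(int pixel)   { return (pixel >> 16) & 0xff; }
    static int green(int pixel) { return (pixel >> 8) & 0xff; }
    static int blue(int pixel)  { return pixel & 0xff; }

    public static void main(String[] args) {
        int pixel = pack(255, 128, 7);
        System.out.println(Integer.toHexString(pixel));
        System.out.println(red(pixel) + " " + green(pixel) + " " + blue(pixel));
    }
}
```

Masking with 0xff keeps only the low 8 bits, which is why each extracted value is guaranteed to land in the range 0 to 255.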

CompSci 100e Bit operations revisited
- How do we write out all of the bits of a number?
    /**
     * writes the bit representation of an int
     * to standard out
     */
    void bits(int val) {
        // body not preserved in the transcript
    }
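The body of `bits` is cut off in the transcript. One possible way to produce all 32 bits, not necessarily the version shown in lecture, is to walk from the leftmost bit to the rightmost and mask each one. The helper `bitString` is a hypothetical addition that makes the result easy to check.

```java
public class BitPrinter {
    // Builds the 32-character bit representation, leftmost (sign) bit first.
    static String bitString(int val) {
        StringBuilder sb = new StringBuilder(Integer.SIZE);
        for (int i = Integer.SIZE - 1; i >= 0; i--) {
            sb.append((val >>> i) & 1); // unsigned shift, then keep the lowest bit
        }
        return sb.toString();
    }

    // Writes the bit representation of an int to standard out.
    static void bits(int val) {
        System.out.println(bitString(val));
    }

    public static void main(String[] args) {
        bits(13);
        bits(-1);
    }
}
```

Unlike `Integer.toBinaryString`, this version keeps leading zeros, so every value prints at the same width.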

CompSci 100e Problem: finding subsets
- See CodeBloat APT, requires finding sums of all subsets
  - Given {72, 33, 41, 57, 25} what is sum closest (not over) 100?
  - How do we do this in general?
- Consider three solutions (see also SubsetSums.java)
  - Recursively generate all sums: similar to backtracking
    - Current value part of sum or not, two recursive calls
  - Use technique like sieve to form all sums
    - Why is this so fast?
  - Alternative solution for all sums: use bit patterns to represent subsets
    - What do 10110, 10001, 00111, 00000, and … represent?
    - How do we generate sums from these representations?
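The bit-pattern approach can be sketched as follows. This is an illustration rather than the referenced SubsetSums.java: the names are hypothetical, and it assumes that bit i of the pattern, counting from the right, decides whether `values[i]` is in the subset.

```java
public class SubsetSumsSketch {
    // Sums the elements selected by the 1 bits of mask (bit i selects values[i]).
    static int sumOf(int[] values, int mask) {
        int sum = 0;
        for (int i = 0; i < values.length; i++) {
            if (((mask >> i) & 1) == 1) {
                sum += values[i];
            }
        }
        return sum;
    }

    // Tries every pattern from 00...0 to 11...1 and keeps the largest sum not over limit.
    static int closestNotOver(int[] values, int limit) {
        int best = 0;
        for (int mask = 0; mask < (1 << values.length); mask++) {
            int sum = sumOf(values, mask);
            if (sum <= limit && sum > best) {
                best = sum;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        int[] values = {72, 33, 41, 57, 25};
        System.out.println(closestNotOver(values, 100));
    }
}
```

With n values there are 2^n patterns, so this only suits small sets; counting from 0 to 2^n - 1 visits each subset exactly once.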

CompSci 100e Text Compression
- Input: String S
- Output: String S′
  - Shorter
  - S can be reconstructed from S′

CompSci 100e Huffman Coding
- D.A. Huffman in early 1950's
- Before compressing data, analyze the input stream
- Represent data using variable length codes
- Variable length codes through prefix codes
  - Each letter is assigned a codeword
  - The codeword for a given letter is produced by traversing the Huffman tree
  - Property: no codeword produced is the prefix of another
  - Letters appearing frequently have short codewords, while those that appear rarely have longer ones
- Huffman coding is an optimal per-character coding method

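CompSci 100e Text Compression: Examples
[Table: columns Symbol, ASCII, Fixed length, Var. length; rows a, b, c, d, e; code values not preserved]
- "abcde" in the different formats: ASCII, Fixed, Var. [bit strings not preserved]
[Figure: code trees over the symbols a, b, c, d, e]
- Encodings
  - ASCII: 8 bits/character
  - Unicode: 16 bits/character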

CompSci 100e Huffman coding: go go gophers
- Encoding uses tree:
  - 0 left/1 right
  - How many bits? 37!!
  - Savings? Worth it?
[Table: columns ASCII, 3 bits, Huffman; rows g, o, p, h, e, r, s, sp; code values not preserved]
[Figure: Huffman tree for "go go gophers" with leaf counts g 3, o 3, * 2, p 1, h 1, e 1, r 1, s 1, where * marks the space]

CompSci 100e Building a Huffman tree
- Begin with a forest of single-node trees (leaves)
  - Each node/tree/leaf is weighted with character count
  - Node stores two values: character and count
  - There are n nodes in forest, n is size of alphabet?
- Repeat until there is only one node left: root of tree
  - Remove two minimally weighted trees from forest
  - Create new tree with minimal trees as children
    - New tree root's weight: sum of children (character ignored)
- Does this process terminate? How do we get minimal trees?
  - Remove minimal trees, hummm……
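The steps above can be sketched with `java.util.PriorityQueue` playing the role of the forest. This is a minimal illustration under stated assumptions, not the course's implementation: the class, the `Node` type, and the method names are hypothetical, and the input is assumed to contain at least two distinct characters.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;

public class HuffmanSketch {
    // A tree node: leaves carry a character, internal nodes only a weight.
    static class Node implements Comparable<Node> {
        final char ch;
        final int weight;
        final Node left, right;

        Node(char ch, int weight, Node left, Node right) {
            this.ch = ch;
            this.weight = weight;
            this.left = left;
            this.right = right;
        }

        boolean isLeaf() {
            return left == null && right == null;
        }

        @Override
        public int compareTo(Node other) {
            return Integer.compare(weight, other.weight);
        }
    }

    // Counts characters, then repeatedly merges the two lightest trees.
    static Node build(String text) {
        Map<Character, Integer> counts = new HashMap<>();
        for (char c : text.toCharArray()) {
            counts.merge(c, 1, Integer::sum);
        }
        PriorityQueue<Node> forest = new PriorityQueue<>();
        for (Map.Entry<Character, Integer> e : counts.entrySet()) {
            forest.add(new Node(e.getKey(), e.getValue(), null, null));
        }
        while (forest.size() > 1) {
            Node a = forest.poll();
            Node b = forest.poll();
            forest.add(new Node('\0', a.weight + b.weight, a, b)); // character ignored
        }
        return forest.poll();
    }

    // Root-to-leaf paths give the codewords: 0 for left, 1 for right.
    static void codes(Node node, String path, Map<Character, String> table) {
        if (node.isLeaf()) {
            table.put(node.ch, path);
            return;
        }
        codes(node.left, path + "0", table);
        codes(node.right, path + "1", table);
    }

    // Total number of bits needed to encode text with its own Huffman code.
    static int encodedBits(String text) {
        Map<Character, String> table = new HashMap<>();
        codes(build(text), "", table);
        int total = 0;
        for (char c : text.toCharArray()) {
            total += table.get(c).length();
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(encodedBits("go go gophers"));
    }
}
```

Ties between equal weights may be broken differently from the trees drawn in the slides, so individual codewords can differ, but the total encoded length is the same for every Huffman tree built from the same counts.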

CompSci 100e Building a tree
"A SIMPLE STRING TO BE ENCODED USING A MINIMAL NUMBER OF BITS"
[Figure: animation frames building the Huffman tree from the leaf counts space 11, I 6, E 5, N 5, S 4, M 4, A 3, B 3, O 3, T 3, G 2, D 2, L 2, R 2, U 2, P 1, F 1, C 1]

CompSci 100e Huffman Complexities
- How do we measure? Size of input file, size of alphabet
  - Which is typically bigger?
- Accumulating character counts: ______
  - How can we do this in O(1) time, though not really
- Building the heap/priority queue from counts: ____
  - Initializing heap guaranteed
- Building Huffman tree: ____
  - Why?
- Create table of encodings from tree: ____
  - Why?
- Write tree and compressed file: _____

CompSci 100e Properties of Huffman coding
- Want to minimize weighted path length L(T) of tree T
- $L(T) = \sum_{i} w_i \cdot \operatorname{depth}(d_i)$
  - $w_i$ is the weight or count of each codeword i
  - $d_i$ is the leaf corresponding to codeword i
- How do we calculate character (codeword) frequencies?
- Huffman coding creates pretty full bushy trees?
  - When would it produce a "bad" tree?
- How do we produce coded compressed data from input efficiently?

CompSci 100e Writing code out to file
- How do we go from characters to encodings?
  - Build Huffman tree
  - Root-to-leaf path generates encoding
- Need way of writing bits out to file
  - Platform dependent?
  - Complicated to write bits and read in same ordering
- See BitInputStream and BitOutputStream classes
  - Depend on each other, bit ordering preserved
- How do we know bits come from compressed file?
  - Store a magic number
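The magic-number idea can be sketched with the standard `DataOutputStream` and `DataInputStream` classes. The constant, class, and method names below are assumptions made for illustration; the value actually used with BitInputStream and BitOutputStream is not given in the slides.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class MagicNumberSketch {
    // Hypothetical magic number: the ASCII bytes for "HUFF".
    static final int MAGIC = 0x48554646;

    // Prepends the magic number to the (already compressed) payload.
    static byte[] withHeader(byte[] payload) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(MAGIC);
            out.write(payload);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Checks whether the first four bytes match the magic number.
    static boolean looksCompressed(byte[] file) {
        if (file.length < 4) {
            return false;
        }
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(file))) {
            return in.readInt() == MAGIC;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] file = withHeader(new byte[] {1, 2, 3});
        System.out.println(looksCompressed(file));
        System.out.println(looksCompressed(new byte[] {1, 2, 3, 4, 5}));
    }
}
```

A decompressor that checks the header first can refuse to process a file it did not produce instead of emitting garbage.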

CompSci 100e Decoding a message
[Figure: animation frames tracing a bit string through the Huffman tree for "A SIMPLE STRING TO BE ENCODED USING A MINIMAL NUMBER OF BITS"; each time a leaf is reached its character is emitted, and the decoded output grows G, GO, GOO, GOOD]

CompSci 100e Decoding
1. Read in tree data: O( )
2. Decode bit string with tree: O( )
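Step 2 can be sketched as a walk down the tree, one edge per bit. The example below is an illustration with hypothetical names; it assumes the bits arrive as a string of '0' and '1' characters and uses a small hand-built tree (a = 0, b = 10, c = 11) rather than reading tree data from a file.

```java
public class HuffmanDecodeSketch {
    // Leaves carry a character; internal nodes only have children.
    static class Node {
        final char ch;
        final Node left, right;

        Node(char ch) {
            this.ch = ch;
            this.left = null;
            this.right = null;
        }

        Node(Node left, Node right) {
            this.ch = '\0';
            this.left = left;
            this.right = right;
        }

        boolean isLeaf() {
            return left == null && right == null;
        }
    }

    // Follows 0 to the left and 1 to the right; emits a character at each leaf
    // and restarts from the root. Assumes bits is a valid encoding for this tree.
    static String decode(Node root, String bits) {
        StringBuilder out = new StringBuilder();
        Node current = root;
        for (int i = 0; i < bits.length(); i++) {
            current = (bits.charAt(i) == '0') ? current.left : current.right;
            if (current.isLeaf()) {
                out.append(current.ch);
                current = root;
            }
        }
        return out.toString();
    }

    // Hand-built example tree with codes a = 0, b = 10, c = 11.
    static Node exampleTree() {
        return new Node(new Node('a'), new Node(new Node('b'), new Node('c')));
    }

    public static void main(String[] args) {
        System.out.println(decode(exampleTree(), "010110"));
    }
}
```

The loop does a constant amount of work per input bit, and the prefix property is what lets it restart at the root without any separator between codewords.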

CompSci 100e Huffman coding: go go gophers
- Choose two smallest weights
  - Combine nodes + weights
  - Repeat
  - Priority queue?
- Encoding uses tree:
  - 0 left/1 right
  - How many bits?
[Table: columns ASCII, 3 bits, Huffman; rows g, o, p, h, e, r, s, sp; Huffman entries shown as ?? to be filled in]
[Figure: intermediate frames of the Huffman tree for "go go gophers" being assembled from the leaves g 3, o 3, * 2, p 1, h 1, e 1, r 1, s 1]

CompSci 100e Huffman Tree 2
- "A SIMPLE STRING TO BE ENCODED USING A MINIMAL NUMBER OF BITS"
  - E.g. " A SIMPLE" → [bit string not preserved]
[Figure: animation frames of the Huffman tree for this string; tree contents not preserved]

CompSci 100e Other methods
- Adaptive Huffman coding
- Lempel-Ziv algorithms
  - Build the coding table on the fly while reading document
  - Coding table changes dynamically
  - Protocol between encoder and decoder so that everyone is always using the right coding scheme
  - Works well in practice (compress, gzip, etc.)
- More complicated methods
  - Burrows-Wheeler (bunzip2)
  - PPM statistical methods