Math 221 Huffman Codes.


Suppose you have a file…

Letter    Code   Frequency   Total Bits
a         000    10          30
e         001    15          45
i         010    12          36
s         011     3           9
t         100     4          12
space     101    13          39
newline   110     1           3
Total                       174

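Represent the Code with a Tree

[Figure: the binary tree for this code, with leaves a, e, i, s, t, sp, nl from left to right; each edge to a right child is labeled 1.]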

Some Terminology

A tree is a collection of nodes where any path that ends at the same node it started from must intersect itself, i.e. it has no "closed circuits". A node with no edges coming out of it is a leaf. A node connected to a node directly above it is a child of that node. We will only consider binary trees, i.e. each node will have at most two children. Convention: 0 means go to the left, 1 means go to the right.

Important

If every codeword of a code is a leaf of a binary tree, then no codeword is the beginning of another codeword, so a binary string built from the codewords can be uniquely decoded!
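To make the decoding claim concrete, here is a minimal sketch in Python that walks the tree one bit at a time, using the 3-bit code from the table at the start. The names `build_tree` and `decode`, the nested-dictionary tree representation, and the sample bit string are illustrative assumptions, not part of the original slides.

```python
# Fixed-length code from the first table (3 bits per character).
CODE = {
    "a": "000", "e": "001", "i": "010", "s": "011",
    "t": "100", " ": "101", "\n": "110",
}

def build_tree(code):
    """Build a binary tree as nested dicts keyed by '0' (left) and '1' (right).

    Assumes every codeword ends at a leaf (no codeword is a prefix of another).
    """
    root = {}
    for symbol, bits in code.items():
        node = root
        for bit in bits[:-1]:
            node = node.setdefault(bit, {})
        node[bits[-1]] = symbol  # the last bit leads to a leaf
    return root

def decode(bits, root):
    """Follow the bits down from the root; emit a character at each leaf."""
    out = []
    node = root
    for bit in bits:
        node = node[bit]           # 0 = go left, 1 = go right
        if isinstance(node, str):  # reached a leaf
            out.append(node)
            node = root            # start over for the next codeword
    if node is not root:
        raise ValueError("bit string ends in the middle of a codeword")
    return "".join(out)

tree = build_tree(CODE)
message = decode("100001000101011010100", tree)  # hypothetical sample input
```

Because every character sits at a leaf, the decoder never has to guess where one codeword ends and the next begins.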

Improving the Code

Notice that the newline does not have a sibling. Thus, we can place it in its parent node and get a shorter code! This reduces the number of bits needed to represent a newline from three (110) to two (11). Since the newline occurs once, the total drops by one bit, from 174 to 173.
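Moving the newline up is safe only if the code can still be decoded uniquely, i.e. if no codeword is the beginning of another. The following sketch checks that property for the original and the shortened code. The function name `is_prefix_free` and the labels `sp` and `nl` used as dictionary keys are assumptions made for illustration.

```python
def is_prefix_free(code):
    """Return True if no codeword is a prefix of (or equal to) another."""
    words = sorted(code.values())
    # In sorted order, a codeword that is a prefix of some other codeword
    # is always a prefix of its immediate successor, so checking
    # neighbouring pairs is enough.
    return all(not b.startswith(a) for a, b in zip(words, words[1:]))

original = {
    "a": "000", "e": "001", "i": "010", "s": "011",
    "t": "100", "sp": "101", "nl": "110",
}

# Same code, with the newline moved into its parent node.
improved = dict(original, nl="11")

original_ok = is_prefix_free(original)
improved_ok = is_prefix_free(improved)
```

Both checks should come out true. A code such as `{"x": "0", "y": "01"}` would fail, because `0` is the beginning of `01`.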

Huffman’s Algorithm

Give every node its frequency as a weight, and treat each node as a one-node tree. The weight of a tree is the sum of the weights of its leaves. Join the two trees with the lowest weight under a new parent node. At each later stage, again join the two trees with the lowest weight, and stop when only one tree remains.
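The joining procedure can be sketched in Python with a priority queue from the standard `heapq` module. This is one possible implementation under stated assumptions, not the method used to produce the slides: the function name `huffman_code`, the tuple representation of trees, and the tie-breaking counter are all illustrative choices. Which subtree goes left or right is arbitrary, so the exact codewords may differ from those in the slides, but the codeword lengths and the total bit count should agree.

```python
import heapq
from itertools import count

def huffman_code(freq):
    """Return a dict mapping each symbol to its Huffman codeword."""
    tiebreak = count()  # keeps heap entries comparable when weights tie
    heap = [(w, next(tiebreak), sym) for sym, w in freq.items()]
    heapq.heapify(heap)

    # Repeatedly join the two trees with the lowest weight.
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)
        w2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next(tiebreak), (t1, t2)))

    _, _, tree = heap[0]
    code = {}

    def walk(node, prefix):
        if isinstance(node, tuple):      # internal node: (left, right)
            walk(node[0], prefix + "0")  # 0 = go left
            walk(node[1], prefix + "1")  # 1 = go right
        else:                            # leaf: a symbol
            code[node] = prefix or "0"   # single-symbol edge case

    walk(tree, "")
    return code

# Frequencies from the example file; "sp" and "nl" are labels for
# space and newline.
FREQ = {"a": 10, "e": 15, "i": 12, "s": 3, "t": 4, "sp": 13, "nl": 1}

code = huffman_code(FREQ)
lengths = {sym: len(bits) for sym, bits in code.items()}
total = sum(FREQ[sym] * lengths[sym] for sym in FREQ)
```

A heap is used so that finding the two lightest trees takes logarithmic time at each stage instead of a full scan.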

Our example

We start with seven single-node trees:
  a 10, e 15, i 12, s 3, t 4, sp 13, nl 1

which becomes (join s and nl):
  T1 4 = (s 3, nl 1); a 10, e 15, i 12, t 4, sp 13

which becomes (join T1 and t):
  T2 8 = (T1 4, t 4); a 10, e 15, i 12, sp 13

which becomes (join T2 and a):
  T3 18 = (T2 8, a 10); e 15, i 12, sp 13

which becomes (join i and sp):
  T3 18; T4 25 = (i 12, sp 13); e 15

which becomes (join T3 and e):
  T5 33 = (T3 18, e 15); T4 25

which becomes (join T5 and T4):
  T6 58 = (T5 33, T4 25)

And we are done!

Our New Code

Letter    Code    Frequency   Total Bits
a         001     10          30
e         01      15          30
i         10      12          24
s         00000    3          15
t         0001     4          16
space     11      13          26
newline   00001    1           5
Total                        146
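As a quick check of the two tables, the totals can be recomputed by multiplying each codeword length by its frequency. The snippet below is a small sketch; the names `FIXED`, `HUFFMAN`, and `total_bits`, and the labels `sp` and `nl`, are assumptions made for illustration.

```python
# Frequencies and both codes, copied from the two tables.
FREQ = {"a": 10, "e": 15, "i": 12, "s": 3, "t": 4, "sp": 13, "nl": 1}

FIXED = {
    "a": "000", "e": "001", "i": "010", "s": "011",
    "t": "100", "sp": "101", "nl": "110",
}

HUFFMAN = {
    "a": "001", "e": "01", "i": "10", "s": "00000",
    "t": "0001", "sp": "11", "nl": "00001",
}

def total_bits(code, freq):
    """Sum of (codeword length x frequency) over all symbols."""
    return sum(len(code[sym]) * freq[sym] for sym in freq)

fixed_bits = total_bits(FIXED, FREQ)      # 174
huffman_bits = total_bits(HUFFMAN, FREQ)  # 146
saved = fixed_bits - huffman_bits         # 28
```

For this file the new code saves 28 of 174 bits, roughly 16%.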