Heaps, Priority Queues, Compression


Heaps, Priority Queues, Compression
- Compression is a high-profile application
  - .zip, .mp3, .jpg, .gif, .gz, …
  - What property of MP3 was a significant factor in what made Napster work? (Why did Napster ultimately fail?)
- What's the difference between compression for .mp3 files and compression for .zip files? Between .gif and .jpg?
  - What's the source, what's the destination?
- What is lossy vs. lossless compression? Are the differences important?
- Is it possible to compress (lossless compression rather than lossy) every file? Every file of a given size?
  - What are the repercussions?

Priority Queue
- Compression motivates the study of the ADT priority queue
- Supports three basic operations
  - insert: add an element to the priority queue
  - removeMin: remove and return the minimal element
  - peekMin: find (don't delete) the minimal element
  - peekMin/removeMin are analogous to stack top/pop; insert/removeMin to queue enqueue/dequeue
- Simple sorting using a priority queue (see PQdemo.java); what is the complexity of this sorting method?

    void sort(ArrayList list) {
        PriorityQueue pq = new ArrayPriorityQueue();
        int size = list.size();
        // insert every element into the priority queue
        for (int k = 0; k < size; k++) {
            pq.add(list.get(k));
        }
        // remove the elements in increasing order, overwriting the list
        for (int k = 0; k < size; k++) {
            list.set(k, pq.removeMin());
        }
    }
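The sort above depends on the course's own PriorityQueue and ArrayPriorityQueue. As a rough sketch of the same idea, the version below swaps in java.util.PriorityQueue (a binary heap) as a stand-in; the class name PQSortDemo is made up for illustration. With a heap, each of the n inserts and n removals costs O(log n), so the sort is O(n log n). With an unsorted-array or sorted-array priority queue, one of the two loops costs O(n) per step and the sort becomes O(n^2).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

class PQSortDemo {
    // Sorts the list in place: push everything into a min-priority queue,
    // then pull the elements back out in increasing order.
    static <T extends Comparable<T>> void sort(List<T> list) {
        PriorityQueue<T> pq = new PriorityQueue<>();
        for (T item : list) {
            pq.add(item);              // O(log n) per insert on a binary heap
        }
        for (int k = 0; k < list.size(); k++) {
            list.set(k, pq.remove());  // remove() returns the current minimum
        }
    }

    public static void main(String[] args) {
        List<Integer> data = new ArrayList<>(List.of(17, 6, 25, 10, 7));
        sort(data);
        System.out.println(data);      // [6, 7, 10, 17, 25]
    }
}
```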

Priority Queue implementations
- Implementing priority queues: average and worst case

                      Insert                  removeMin (delete)
                      average    worst        average    worst
    Unsorted array    O(1)       O(1)         O(n)       O(n)
    Sorted array      O(n)       O(n)         O(1)       O(1)
    Search tree       O(log n)   O(n)         O(log n)   O(n)
    Balanced tree     O(log n)   O(log n)     O(log n)   O(log n)
    Heap              O(1)       O(log n)     O(log n)   O(log n)

- Heap has O(1) peekMin (no delete) and O(n) build heap

Priority Queue implementation
- The interface PriorityQueue is AP specific, not java.util
  - Implementations provided: ArrayPriorityQueue
  - What are the complexities? Is it easy to implement?
- Could use TreeSet; see the Set API for the method first()
  - How do we implement removeMin using first() (the equivalent of peekMin)?
- In general, elements must implement Comparable
- Changing the method of comparison when calculating priority?
  - Could provide a Comparator to the PriorityQueue
  - Could wrap Comparable objects in another Comparable
  - How do we implement ReverseComparable?
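Two of these options can be sketched with standard library classes. This is only an illustration, not the course's AP interface: the class and method names (ComparatorPQDemo, largest, removeMin) are hypothetical, and java.util.PriorityQueue stands in for the AP PriorityQueue. Note that a TreeSet silently drops duplicates, so it only works as a priority queue when elements are distinct.

```java
import java.util.Comparator;
import java.util.PriorityQueue;
import java.util.TreeSet;

class ComparatorPQDemo {
    // Changing the comparison: a reversed Comparator turns the min-queue
    // into a max-queue without touching the element class.
    static int largest(int... values) {
        PriorityQueue<Integer> maxFirst =
                new PriorityQueue<>(Comparator.reverseOrder());
        for (int v : values) {
            maxFirst.add(v);
        }
        return maxFirst.peek();
    }

    // removeMin on a TreeSet: look at the smallest element with first()
    // (the "peek"), then delete it.
    static int removeMin(TreeSet<Integer> set) {
        Integer min = set.first();
        set.remove(min);
        return min;
    }

    public static void main(String[] args) {
        System.out.println(largest(3, 9, 4));   // 9
        TreeSet<Integer> set = new TreeSet<>();
        set.add(5); set.add(2); set.add(8);
        System.out.println(removeMin(set));     // 2
    }
}
```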

Reverse Comparable: Decorator

    class ReverseComparable implements Comparable {
        private Comparable myObject;   // the wrapped (decorated) object

        public ReverseComparable(Comparable c) {
            myObject = c;
        }

        public int compareTo(Object o) {
            ReverseComparable c = (ReverseComparable) o;
            // swap the operands to reverse the ordering
            // (avoids negating Integer.MIN_VALUE, which -1 * compareTo(...) could do)
            return c.getObject().compareTo(myObject);
        }

        public Comparable getObject() {
            return myObject;
        }
    }

Creating Heaps
- A heap is an array-based implementation of a binary tree used for implementing priority queues; it supports:
  - insert, findmin, deletemin: complexities?
- Using an array minimizes storage (no explicit pointers) and is faster too: children are located by index/position in the array
- A heap is a binary tree with a shape property and a heap/value property
  - shape: tree filled at all levels (except perhaps the last), with the last level filled left-to-right (complete binary tree)
  - value: each node's value is no larger than the values of its children

Array-based heap
- Store "node values" in an array beginning at index 1
- For the node with index k
  - left child: index 2*k
  - right child: index 2*k+1
- Why is this conducive to maintaining the heap shape?
- What about the heap property?
- Is the heap a search tree?
- Where is the minimal node?
- Where are nodes added? Deleted?

    index:   1   2   3   4   5   6   7   8   9  10
    value:   6  10   7  17  13   9  21  19  25

[Figure: the same heap drawn as a tree, level by level: 6 / 10 7 / 17 13 9 21 / 19 25]
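The index arithmetic above can be sketched as a small int-only min-heap. This is a minimal illustration, not the course's implementation: the class name ArrayMinHeap and its method names are assumptions. New nodes always go in the next free array slot, which is what keeps the tree complete; removal takes the root and refills it with the last leaf.

```java
import java.util.Arrays;

class ArrayMinHeap {
    private int[] a = new int[16];   // a[0] is unused; the root lives at index 1
    private int size = 0;

    void insert(int value) {
        if (size + 1 == a.length) {
            a = Arrays.copyOf(a, 2 * a.length);   // grow when full
        }
        a[++size] = value;                        // next free slot keeps the shape
        int k = size;
        while (k > 1 && a[k] < a[k / 2]) {        // bubble up past larger parents
            swap(k, k / 2);
            k /= 2;
        }
    }

    int peekMin() {
        if (size == 0) throw new IllegalStateException("empty heap");
        return a[1];
    }

    int removeMin() {
        int min = peekMin();
        a[1] = a[size--];                         // last leaf replaces the root
        int k = 1;
        while (2 * k <= size) {                   // while node k has a child
            int child = 2 * k;                    // left child
            if (child < size && a[child + 1] < a[child]) {
                child++;                          // right child is smaller
            }
            if (a[k] <= a[child]) break;          // heap property restored
            swap(k, child);
            k = child;
        }
        return min;
    }

    int size() { return size; }

    private void swap(int i, int j) {
        int t = a[i]; a[i] = a[j]; a[j] = t;
    }

    public static void main(String[] args) {
        ArrayMinHeap heap = new ArrayMinHeap();
        for (int v : new int[] {25, 19, 21, 9, 13, 17, 7, 10, 6}) heap.insert(v);
        while (heap.size() > 0) System.out.print(heap.removeMin() + " ");
        // prints: 6 7 9 10 13 17 19 21 25
    }
}
```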

Thinking about heaps
- Where is the minimal element?
  - Root, why?
- Where is the maximal element?
  - Leaves, why?
- How many leaves are there in an N-node heap (big-Oh)?
  - O(n), but exact?
- What is the complexity of find max in a minheap? Why?
  - O(n), but ½ N?
- Where is the second smallest element? Why?
  - Near the root?

    index:   1   2   3   4   5   6   7   8   9  10
    value:   6  10   7  17  13   9  21  19  25

[Figure: the same heap drawn as a tree, level by level: 6 / 10 7 / 17 13 9 21 / 19 25]
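One way to check the answers, sketched under the 1-based array layout used on these slides (the class HeapQuestions and its method names are made up for illustration). In a complete tree stored this way, node k is a leaf exactly when 2*k > n, so the leaves are indices n/2 + 1 through n and there are n - n/2 of them (integer division), roughly half the nodes. The maximum must be a leaf, so find-max scans only those, which is still O(n). The second smallest must be a child of the root.

```java
class HeapQuestions {
    // heap[1..n] holds a min-heap; heap[0] is unused.

    // Exact number of leaves in an n-node complete binary tree.
    static int leafCount(int n) {
        return n - n / 2;
    }

    // The maximum of a min-heap is always a leaf, so scan only the leaves.
    static int findMax(int[] heap, int n) {
        int firstLeaf = n / 2 + 1;
        int max = heap[firstLeaf];
        for (int k = firstLeaf + 1; k <= n; k++) {
            max = Math.max(max, heap[k]);
        }
        return max;
    }

    // The second smallest element is the smaller child of the root (needs n >= 2).
    static int secondSmallest(int[] heap, int n) {
        if (n < 2) throw new IllegalArgumentException("need at least two nodes");
        return (n == 2) ? heap[2] : Math.min(heap[2], heap[3]);
    }

    public static void main(String[] args) {
        int[] heap = {0, 6, 10, 7, 17, 13, 9, 21, 19, 25};  // the slide's heap
        System.out.println(leafCount(9));             // 5
        System.out.println(findMax(heap, 9));         // 25
        System.out.println(secondSmallest(heap, 9));  // 7
    }
}
```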

Towards Compression
- Each ASCII character is represented by 8 bits, one byte
  - a bit is a binary digit; a byte is a group of 8 bits
  - compress text: use fewer bits for frequent characters (does this come free?)
- 256 character values, 2^8 = 256; how many bits are needed for 7 characters? for 38 characters? for 125 characters?
- go go gophers: 8 different characters

    char   ASCII   binary    3 bits
    g      103     1100111   000
    o      111     1101111   001
    p      112     1110000   010
    h      104     1101000   011
    e      101     1100101   100
    r      114     1110010   101
    s      115     1110011   110
    sp.    32      0100000   111

- ASCII: 13 x 8 = 104 bits
- 3 bit code: 13 x 3 = 39 bits
- compressed: ???
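The "how many bits" questions come down to finding the smallest b with 2^b at least the number of distinct characters. A small sketch (the class FixedLengthBits and the method bitsNeeded are names invented here):

```java
class FixedLengthBits {
    // Smallest number of bits b such that 2^b >= symbols.
    static int bitsNeeded(int symbols) {
        int bits = 0;
        while ((1 << bits) < symbols) {
            bits++;
        }
        return bits;
    }

    public static void main(String[] args) {
        System.out.println(bitsNeeded(7));    // 3
        System.out.println(bitsNeeded(38));   // 6
        System.out.println(bitsNeeded(125));  // 7
        System.out.println(bitsNeeded(256));  // 8
        // "go go gophers": 13 characters, 8 distinct, so 13 * 3 = 39 bits
        System.out.println("go go gophers".length() * bitsNeeded(8));  // 39
    }
}
```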

Huffman coding: go go gophers

    char   ASCII   binary    3 bits   Huffman
    g      103     1100111   000      ??
    o      111     1101111   001      ??
    p      112     1110000   010
    h      104     1101000   011
    e      101     1100101   100
    r      114     1110010   101
    s      115     1110011   110
    sp.    32      0100000   111

- Choose the two smallest weights
  - combine nodes + weights
  - Repeat
  - Priority queue?
- Encoding uses the tree:
  - 0 left/1 right
  - How many bits?

[Figure: building the tree. Leaf weights: g 3, o 3, p 1, h 1, e 1, r 1, s 1, * (space) 2. Successive merges give subtrees of weight 2 (p, h), 2 (e, r), 3 (s, *), 4 (p, h, e, r), and 6 (g, o).]

Huffman coding: go go gophers

    char   ASCII   binary    3 bits   Huffman
    g      103     1100111   000      00
    o      111     1101111   001      01
    p      112     1110000   010      1100
    h      104     1101000   011      1101
    e      101     1100101   100      1110
    r      114     1110010   101      1111
    s      115     1110011   110      100
    sp.    32      0100000   111      101

- Encoding uses the tree:
  - 0 left/1 right
  - How many bits? 37!!
  - Savings? Worth it?

[Figure: the finished tree. Root 13 has left child 6 (leaves g 3, o 3) and right child 7; node 7 has left child 3 (leaves s 1, * 2) and right child 4; node 4 has children 2 (leaves p 1, h 1) and 2 (leaves e 1, r 1).]
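The "choose two smallest, combine, repeat" loop maps directly onto a priority queue. Below is a sketch using java.util.PriorityQueue; the class HuffmanDemo, its Node type, and the method names are assumptions made for this example. Ties between equal weights may be broken differently than in the figure, so individual codes can differ from the table, but the total for "go go gophers" still comes to 37 bits. The sketch assumes text with at least two distinct characters.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.TreeMap;

class HuffmanDemo {
    static class Node implements Comparable<Node> {
        final char ch;            // meaningful only for leaves
        final int weight;
        final Node left, right;   // both null for leaves

        Node(char ch, int weight, Node left, Node right) {
            this.ch = ch; this.weight = weight;
            this.left = left; this.right = right;
        }
        boolean isLeaf() { return left == null; }
        public int compareTo(Node other) {
            return Integer.compare(weight, other.weight);
        }
    }

    // Count frequencies, then repeatedly merge the two lightest trees.
    static Node buildTree(String text) {
        Map<Character, Integer> freq = new TreeMap<>();
        for (char c : text.toCharArray()) freq.merge(c, 1, Integer::sum);

        PriorityQueue<Node> pq = new PriorityQueue<>();
        for (Map.Entry<Character, Integer> e : freq.entrySet()) {
            pq.add(new Node(e.getKey(), e.getValue(), null, null));
        }
        while (pq.size() > 1) {
            Node a = pq.remove();   // smallest weight
            Node b = pq.remove();   // second smallest
            pq.add(new Node('\0', a.weight + b.weight, a, b));
        }
        return pq.remove();
    }

    // Walk the tree: 0 for a left edge, 1 for a right edge.
    static void collectCodes(Node n, String prefix, Map<Character, String> codes) {
        if (n.isLeaf()) { codes.put(n.ch, prefix); return; }
        collectCodes(n.left, prefix + "0", codes);
        collectCodes(n.right, prefix + "1", codes);
    }

    static Map<Character, String> codesFor(String text) {
        Map<Character, String> codes = new HashMap<>();
        collectCodes(buildTree(text), "", codes);
        return codes;
    }

    static int encodedBits(String text) {
        Map<Character, String> codes = codesFor(text);
        int bits = 0;
        for (char c : text.toCharArray()) bits += codes.get(c).length();
        return bits;
    }

    public static void main(String[] args) {
        System.out.println(encodedBits("go go gophers"));  // 37
    }
}
```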

Properties of Huffman code
- Prefix property: no code is a prefix of another code
  - optimal per-character compression
- Where do the frequencies come from?
- decode: need the tree

[Figure: a Huffman tree with leaves a, t, r, s, e, *]

1000111101001110100000110101111011110001
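The prefix property is what makes decoding unambiguous: read bits until they match a code, emit that character, and start over. The sketch below illustrates this with a lookup table rather than a tree walk, and it uses the go go gophers codes instead of the tree in the figure, whose shape is not recoverable here. The class name PrefixDecodeDemo is hypothetical, and s is taken as 100 and space as 101 so that no two characters share a code.

```java
import java.util.HashMap;
import java.util.Map;

class PrefixDecodeDemo {
    // Prefix-free code table for "go go gophers" (assumed for this example).
    static final Map<Character, String> CODES = Map.of(
            'g', "00", 'o', "01", 'p', "1100", 'h', "1101",
            'e', "1110", 'r', "1111", 's', "100", ' ', "101");

    // Assumes every character of text appears in CODES.
    static String encode(String text) {
        StringBuilder bits = new StringBuilder();
        for (char c : text.toCharArray()) bits.append(CODES.get(c));
        return bits.toString();
    }

    static String decode(String bits) {
        Map<String, Character> byCode = new HashMap<>();
        for (Map.Entry<Character, String> e : CODES.entrySet()) {
            byCode.put(e.getValue(), e.getKey());
        }
        StringBuilder out = new StringBuilder();
        StringBuilder current = new StringBuilder();
        for (char bit : bits.toCharArray()) {
            current.append(bit);
            // No code is a prefix of another, so the first match is the only one.
            Character ch = byCode.get(current.toString());
            if (ch != null) {
                out.append(ch);
                current.setLength(0);
            }
        }
        if (current.length() != 0) {
            throw new IllegalArgumentException("trailing bits do not form a code");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String bits = encode("go go gophers");
        System.out.println(bits.length());   // 37
        System.out.println(decode(bits));    // go go gophers
    }
}
```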