CPS 100 9.1 Heaps, Priority Queues, Compression l Compression is a high-profile application .zip,.mp3,.jpg,.gif,.gz, …  Why is compression important?

Slides:



Advertisements
Similar presentations
Introduction to Computer Science 2 Lecture 7: Extended binary trees
Advertisements

Michael Alves, Patrick Dugan, Robert Daniels, Carlos Vicuna
COL 106 Shweta Agrawal and Amit Kumar
Binary Trees CSC 220. Your Observations (so far data structures) Array –Unordered Add, delete, search –Ordered Linked List –??
Priority Queues. 2 Priority queue A stack is first in, last out A queue is first in, first out A priority queue is least-first-out The “smallest” element.
Trees Chapter 8.
1 Chapter 6 Priority Queues (Heaps) General ideas of priority queues (Insert & DeleteMin) Efficient implementation of priority queue Uses of priority queues.
CS 315 March 24 Goals: Heap (Chapter 6) priority queue definition of a heap Algorithms for Insert DeleteMin percolate-down Build-heap.
Fall 2007CS 2251 Trees Chapter 8. Fall 2007CS 2252 Chapter Objectives To learn how to use a tree to represent a hierarchical organization of information.
Binary Heaps CSE 373 Data Structures Lecture 11. 2/5/03Binary Heaps - Lecture 112 Readings Reading ›Sections
Trees Chapter 8. Chapter 8: Trees2 Chapter Objectives To learn how to use a tree to represent a hierarchical organization of information To learn how.
Version TCSS 342, Winter 2006 Lecture Notes Priority Queues Heaps.
A Data Compression Algorithm: Huffman Compression
1 TCSS 342, Winter 2005 Lecture Notes Priority Queues and Heaps Weiss Ch. 21, pp
Priority Queues. Priority queue A stack is first in, last out A queue is first in, first out A priority queue is least-first-out –The “smallest” element.
Priority Queues. Priority queue A stack is first in, last out A queue is first in, first out A priority queue is least-first-out –The “smallest” element.
Source: Muangsin / Weiss1 Priority Queue (Heap) A kind of queue Dequeue gets element with the highest priority Priority is based on a comparable value.
CSE 373 Data Structures and Algorithms Lecture 13: Priority Queues (Heaps)
CSC 172 DATA STRUCTURES. Priority Queues Model Set with priorities associatedwith elements Priorities are comparable by a < operator Operations Insert.
1 Priority Queues (Heaps)  Sections 6.1 to The Priority Queue ADT  DeleteMin –log N time  Insert –log N time  Other operations –FindMin  Constant.
1 HEAPS & PRIORITY QUEUES Array and Tree implementations.
Data Structures Arrays both single and multiple dimensions Stacks Queues Trees Linked Lists.
Geoff Holmes and Bernhard Pfahringer COMP206-08S General Programming 2.
PRIORITY QUEUES (HEAPS). Queues are a standard mechanism for ordering tasks on a first-come, first-served basis However, some tasks may be more important.
Lecture Objectives  To learn how to use a Huffman tree to encode characters using fewer bytes than ASCII or Unicode, resulting in smaller files and reduced.
CPS 100, Fall YAQ, YAQ, haha! (Yet Another Queue) l What is the dequeue policy for a Queue?  Why do we implement Queue with LinkedList Interface.
CPS Balanced Search Trees l BST: efficient lookup, insertion, deletion  Average case: O(log n) for all operations since find is O(log n) [complexity.
Trees Chapter 8. Chapter 8: Trees2 Chapter Objectives To learn how to use a tree to represent a hierarchical organization of information To learn how.
Priority Queues and Binary Heaps Chapter Trees Some animals are more equal than others A queue is a FIFO data structure the first element.
data ordered along paths from root to leaf
Data Structures Week 8 Further Data Structures The story so far  Saw some fundamental operations as well as advanced operations on arrays, stacks, and.
CPS 100, Fall PF-AT-EOS l Data compression  Seriously important, why? l Priority Queue  Seriously important, why? l Assignment overview  Huff.
Chapter 21 Priority Queue: Binary Heap Saurav Karmakar.
CSC 213 – Large Scale Programming Lecture 15: Heap-based Priority Queue.
Priority Queue. Priority Queues Queue (FIFO). Priority queue. Deletion from a priority queue is determined by the element priority. Two kinds of priority.
CompSci 100E 24.1 Data Compression  Compression is a high-profile application .zip,.mp3,.jpg,.gif,.gz, …  What property of MP3 was a significant factor.
COSC2007 Data Structures II Chapter 12 Tables & Priority Queues III.
1 Joe Meehean.  We wanted a data structure that gave us... the smallest item then the next smallest then the next and so on…  This ADT is called a priority.
CPS Heaps, Priority Queues, Compression l Compression is a high-profile application .zip,.mp3,.jpg,.gif,.gz, …  What property of MP3 was a significant.
CPS 100, Spring Huffman Coding l D.A Huffman in early 1950’s l Before compressing data, analyze the input stream l Represent data using variable.
CSE373: Data Structures & Algorithms Lecture 6: Priority Queues Kevin Quinn Fall 2015.
Data StructuresData Structures Priority Queue. Recall Queues FIFO:First-In, First-Out Some contexts where this seems right? Some contexts where some things.
AVL Trees and Heaps. AVL Trees So far balancing the tree was done globally Basically every node was involved in the balance operation Tree balancing can.
CSE 373: Data Structures and Algorithms Lecture 11: Priority Queues (Heaps) 1.
Java Methods A & AB Object-Oriented Programming and Data Structures Maria Litvin ● Gary Litvin Copyright © 2006 by Maria Litvin, Gary Litvin, and Skylight.
CompSci 100e 8.1 Scoreboard l What else might we want to do with a data structure? AlgorithmInsertionDeletionSearch Unsorted Vector/array Sorted vector/array.
CompSci 100e 8.1 Plan for the Course! l Understand Huffman Coding  Data compression  Priority Queues  Bits and Bytes  Greedy Algorithms l Algorithms.
1. What is it? It is a queue that access elements according to their importance value. Eg. A person with broken back should be treated before a person.
CS 367 Introduction to Data Structures Lecture 8.
Priority Queues CS 110: Data Structures and Algorithms First Semester,
1 Chapter 6 Heapsort. 2 About this lecture Introduce Heap – Shape Property and Heap Property – Heap Operations Heapsort: Use Heap to Sort Fixing heap.
Properties: -The value in each node is greater than all values in the node’s subtrees -Complete tree! (fills up from left to right) Max Heap.
Priority Queues CS /02/05 L7: PQs Slide 2 Copyright 2005, by the authors of these slides, and Ateneo de Manila University. All rights reserved.
Lecture on Data Structures(Trees). Prepared by, Jesmin Akhter, Lecturer, IIT,JU 2 Properties of Heaps ◈ Heaps are binary trees that are ordered.
CSE373: Data Structures & Algorithms Priority Queues
Data Compression Compression is a high-profile application
Source: Muangsin / Weiss
Compsci 201 Priority Queues & Autocomplete
Heaps, Priority Queues, Compression
Map interface Empty() - return true if the map is empty; else return false Size() - return the number of elements in the map Find(key) - if there is an.
CMSC 341: Data Structures Priority Queues – Binary Heaps
CSE 373: Data Structures and Algorithms
Priority Queues.
Priority Queues & Heaps
Priority Queues.
Priority Queues CSE 373 Data Structures.
CSE 373, Copyright S. Tanimoto, 2002 Priority Queues -
Scoreboard What else might we want to do with a data structure?
Priority Queues (Heaps)
Presentation transcript:

CPS Heaps, Priority Queues, Compression l Compression is a high-profile application .zip,.mp3,.jpg,.gif,.gz, …  Why is compression important? What property of MP3 was a significant part of what make Napster succeed. l What’s the difference between compression for.mp3 files and compression for.zip files? Between.gif and.jpg?  What’s the source, what’s the destination?  Why does the difference make a difference? What is lossy vs. lossless compression? l Is it possible to compress (lossless compression rather than lossy) every file? Every file of a given size?  What are repercussions?

CPS Priority Queue l Compression motivates the study of the ADT priority queue  Supports two basic operations insert -– an element into the priority queue delete – the minimal element from the priority queue  Implementations may allow getmin separate from delete Analogous to top/pop, front/dequeue in stacks, queues l Simple sorting using priority queue (see pqdemo.cpp and simplepq.cpp), what is the complexity of this sorting method? string s; priority_queue pq; while (cin >> s) pq.insert(s); while (pq.size() > 0) { pq.deletemin(s); cout << s << endl; }

CPS Priority Queue implementations l Implementing priority queues: average and worst case Insert average Getmin (delete) Insert worst Getmin (delete) Unsorted vector Sorted vector Search tree Balanced tree Heap log n O(n) O(1)log n O(n)O(1)O(n)O(1) O(n)O(1)O(n)O(1) l Heap has O(1) find-min (no delete) and O(n) build heap

CPS Class tpqueue, see tpq.h l Templated class like tstack, tqueue, tvector, tmap, …  If deletemin is supported, what properties must types put into tpq have, e.g., can we insert string? double? struct?  Can we change what minimal means (think about anaword and sorting)?  Implementation in tpq.h, tpq.cpp uses heap l If we use a compare function object for comparing entries we can make a min-heap act like a max-heap, see pqdemo.cpp  Notice that RevComp inherits from Comparer  Where is class Comparer declaration? How used? STL standard C++ class priority_queue  See stlpq.cpp, changing comparison requires template

CPS Sorting with tapestrypq.cpp, stlpq.cpp void sort(tvector & v) // pre: v contains v.size() entries // post: v is sorted { tpqueue pq; for(int k=0; k < v.size(); k++) pq.insert(v[k]); for(int k=0; k < v.size(); k++) pq.deletemin(v[k]); } l How does this work, regardless of tpqueue implementation? l What is the complexity of this method?  insert O(1), deletemin O(log n)? If insert O(log n)?  In practice heapsort uses the vector as the priority queue rather than separate pq.  From a big-Oh perspective no difference: O(n log n) Is there a difference? What’s hidden with O notation?

CPS Priority Queue implementation The class tpqueue uses heaps, fast and reasonably simple  Why not use inheritance hierarchy as was used with tmap?  Trade-offs when using HMap and BSTMap: Time, space Ordering properties, e.g., what does BSTMap support? l Changing method of comparison when calculating priority?  Create a function that replaces operator < We want to pass the function, most general approach creates an object to hold the function Also possible to pass function pointers, we avoid that  The function object replacing operator < must: Compare two objects, so has two parameters Returns –1, 0, +1 depending on

CPS Creating Heaps l Heap is an array-based implementation of a binary tree used for implementing priority queues, supports:  insert, findmin, deletemin: complexities? l Using array minimizes storage (no explicit pointers), faster too --- children are located by index/position in array l Heap is a binary tree with shape property, heap/value property  shape: tree filled at all levels (except perhaps last) and filled left-to-right (complete binary tree)  each node has value smaller than both children

CPS Array-based heap l store “node values” in array beginning at index 1 l for node with index k  left child: index 2*k  right child: index 2*k+1 l why is this conducive for maintaining heap shape? l what about heap property? l is the heap a search tree? l where is minimal node? l where are nodes added? deleted?

CPS Adding values to heap l to maintain heap shape, must add new value in left-to-right order of last level  could violate heap property  move value “up” if too small l change places with parent if heap property violated  stop when parent is smaller  stop when root is reached l pull parent down, swapping isn’t necessary (optimization) insert 8 bubble 8 up

CPS Adding values, details void pqueue::insert(int elt) { // add elt to heap in myList myList.push_back(elt); int loc = myList.size(); while (1 < loc && elt < myList[loc/2]) { myList[loc] = myList[loc/2]; loc /= 2; // go to parent } // what’s true here? myList[loc] = elt; } tvector myList

CPS Removing minimal element l Where is minimal element?  If we remove it, what changes, shape/property? l How can we maintain shape?  “last” element moves to root  What property is violated? l After moving last element, subtrees of root are heaps, why?  Move root down (pull child up) does it matter where? l When can we stop “re-heaping”? 

CPS Huffman codes and compression l Compression exploits redundancy  Run-length encoding: Coded as Useful? Problems?  What about ? l Encoding can be based on characters, chunks, …  Instead of using 8-bits for ‘A’, use 2-bits and 14 bits for ‘Z’ Why might this be advantageous?  Methods can exploit local information abcabcabc is 3(abc) or is for alphabet ‘abc’ l Huffman coding is optimal per-character coding method

CPS Towards Compression l Each ASCII character is represented by 8 bits, one byte  bit is a binary digit, byte is a binary term  compress text: use fewer bits for frequent characters (does this come free?) l 256 character values, 2 8 = 256, how many bits needed for 7 characters? for 38 characters? for 125 characters? go go gophers: 8 different characters ASCII 3 bits g o p h e r s sp ASCII: 13 x 8 = 104 bits 3 bit code: 13 x 3 = 39 bits compressed: ???

CPS Huffman coding: go go gophers l choose two smallest weights  combine nodes + weights  Repeat  Priority queue? l Encoding uses tree:  0 left/1 right  How many bits? ASCII 3 bits Huffman g o p h e r s sp goers* 33 p 1 h p 1 h 1 2 e 1 r 1 3 s 1 * 2 2 p 1 h 1 2 e 1 r 1 4 g 3 o 3 6

CPS Properties of Huffman code l Prefix property, no code is prefix of another code l optimal per character compression l Where do frequencies come from? l decode: need tree e a r s * t

CPS Rodney Brooks l Flesh and Machines : “We are machines, as are our spouses, our children, and our dogs... I believe myself and my children all to be mere machines. But this is not how I treat them. I treat them in a very special way, and I interact with them on an entirely different level. They have my unconditional love, the furthest one might be able to get from rational analysis.” l Director of MIT AI Lab