
1 Data Structures for Media Introduction

2 Lecturer
Dr. Minming Li, Department of Computer Science
Room: Y6426, Phone: 27889538
Email: minmli@cs.cityu.edu.hk, mli000@cityu.edu.hk
Mail box: P16 (outside the CS General Office, Yellow zone, 6/F)

3 Course Web Page
What you can find at the course webpage: http://www.cs.cityu.edu.hk/~minmli/course.htm
- Lecture slides: ready at least 24 hours before the lecture
- Tutorial exercises: selected questions are to be done during tutorials or as homework
- Assignments and solutions
- Announcements from the teacher (important ones will also be distributed through email), e.g., the tutorial class room changed to MMW-2478
- Q&A

4 Reference Books
- Weiss M. Data Structures & Algorithm Analysis in C++, 3rd Ed. Addison Wesley (1999)
- Hanan Samet. The Design and Analysis of Spatial Data Structures. Addison Wesley (1989)
- Harvey M. Deitel, Paul J. Deitel. Visual C++ .NET: How to Program. Prentice Hall (2003)

5 Assessment Pattern
- Exam (70%); compulsory: exam mark >= 30
- 2 Tests (7% + 8%)
- 2 Assignments (6% + 9%)
- Tutorial exercises (0%)

6 How to learn this course?
- Better not read the textbook ahead of time
- Try to keep up with the lecture
- Read the materials carefully after the lecture
- Do the assignments on your own
  - You may discuss with each other
  - You may study materials available on the internet
  - You may refer to any book
  - But the details should be entirely your work
- Respond in time if you have any suggestions for the course or problems with the course

7 OVERVIEW OF CONTENTS

8 Overview
Why data structures? What do we do when we have a large amount of data to deal with?
- Organize it in ways that are easy to understand
- Space efficiency
- Time efficiency
- Easy to display and transform

9 Overview
- Linked List
- Tree
- Stack
- Queue
- Hashing

10 Overview: PART I
- Program Complexities
- Abstract Data Types
- Linked lists
- Trees
- Stacks
- Queues
- Heaps
- Hash tables

11 Overview: PART II
- Vectors and Bitmaps
- Quadtrees and Octrees
- The handling of 2D and 3D data
- Geometric Structures
- Spatial Layout
- Shape and Attributes
- Connectivity of Components

12 Data Structures for Media: Program Complexities

13 Algorithms
- What is an algorithm? A sequence of elementary computational steps that transform the input into the output.
- What is it for? A tool for solving well-specified computational problems, e.g., sorting, matrix multiplication.
- What do we need to do with an algorithm?
  - Correctness proof: show that for every input instance, it halts with the correct output.
  - Performance analysis: how does the algorithm behave as the problem size gets large, both in running time and in storage requirement?

14 A Sorting Problem
Input: a sequence of n numbers a_0, a_1, ..., a_(n-1)
Output: a permutation (re-ordering) a'_0, a'_1, ..., a'_(n-1) of the input sequence such that a'_0 <= a'_1 <= ... <= a'_(n-1)
Example: 5, 3, 1, 2, 6, 4  =>  1, 2, 3, 4, 5, 6

15 Insertion Sort
5, 3, 1, 2, 6, 4
3, 5, 1, 2, 6, 4
1, 3, 5, 2, 6, 4
1, 2, 3, 5, 6, 4
1, 2, 3, 4, 5, 6
Note that when we are dealing with the k-th number, the first k-1 numbers are already sorted.

16 Insertion Sort
To sort A[0, 1, ..., n-1] in place. Steps:
- Pick element A[j]
- Move A[j-1], ..., A[0] to the right until the proper position for A[j] is found, then insert A[j] there
(Diagram: the array is split into a currently sorted part A[0..j-1] and a currently unsorted part A[j..n-1].)
Example, inserting A[3] = 2 into the sorted prefix 1 3 5:
1 3 5 2 6 4
1 3 5 5 6 4
1 3 3 5 6 4
1 2 3 5 6 4

17 Insertion Sort
Insertion-Sort(A)
1. for j = 1 to n-1
2.   key = A[j]
3.   i = j-1
4.   while i >= 0 and A[i] > key
5.     A[i+1] = A[i]
6.     i = i-1
7.   A[i+1] = key

Trace on A[0..5] = 5 3 1 2 6 4 (state at the start of each iteration):
j=1: 5 3 1 2 6 4
j=2: 3 5 1 2 6 4
j=3: 1 3 5 2 6 4
j=4: 1 2 3 5 6 4
j=5: 1 2 3 5 6 4
done: 1 2 3 4 5 6
Inner loop at j=3 (inserting key = 2): 1 3 5 2 6 4 -> 1 3 5 5 6 4 -> 1 3 3 5 6 4 -> 1 2 3 5 6 4
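A runnable C++ version of the pseudocode above (added here as a minimal sketch; the function name insertionSort and the use of std::vector are choices made for this example, not part of the slides):

#include <iostream>
#include <vector>

// Direct translation of the Insertion-Sort pseudocode (0-indexed, in place,
// non-decreasing order).
void insertionSort(std::vector<int>& A) {
    int n = static_cast<int>(A.size());
    for (int j = 1; j <= n - 1; ++j) {
        int key = A[j];                      // element to insert
        int i = j - 1;
        while (i >= 0 && A[i] > key) {       // shift larger elements right
            A[i + 1] = A[i];
            --i;
        }
        A[i + 1] = key;                      // place key in its proper position
    }
}

int main() {
    std::vector<int> A = {5, 3, 1, 2, 6, 4};
    insertionSort(A);
    for (int x : A) std::cout << x << ' ';   // prints: 1 2 3 4 5 6
    std::cout << '\n';
    return 0;
}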

18 Correctness of Algorithm
We only consider algorithms with loops: find a property that serves as a loop invariant.
How to show that something is a loop invariant?
- Initialization: it is true prior to the first iteration of the loop.
- Maintenance: if it is true before an iteration, it remains true before the next iteration.
- Termination: when the loop terminates, the invariant gives a useful property that helps to show the algorithm is correct.

19 Correctness of Insertion Sort
Loop invariant: at the start of each iteration of the for loop, A[0..j-1] consists of the elements originally in A[0..j-1], but in sorted order.
- Initialization: before the first iteration, j=1, so A[0..j-1] contains only A[0]. => The loop invariant holds prior to the first iteration.
- Maintenance: in each iteration, the algorithm moves A[j-1], A[j-2], A[j-3], ... to the right until the proper position for A[j] is found, and then A[j] is inserted. => If the loop invariant is true before an iteration, it remains true before the next iteration.
- Termination: the outer loop ends with j=n. Substituting n for j in the loop invariant, we get "A[______] consists of the n sorted elements."
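As an added illustration (not on the original slide), the loop invariant can be checked at run time by asserting, at the start of every iteration, that the prefix A[0..j-1] is sorted; a minimal sketch in the same style as the insertionSort example above:

#include <cassert>
#include <algorithm>
#include <vector>

// Insertion sort instrumented with the loop invariant of slide 19.
void insertionSortChecked(std::vector<int>& A) {
    int n = static_cast<int>(A.size());
    for (int j = 1; j < n; ++j) {
        // Invariant: A[0..j-1] is sorted at the start of each iteration.
        assert(std::is_sorted(A.begin(), A.begin() + j));
        int key = A[j];
        int i = j - 1;
        while (i >= 0 && A[i] > key) { A[i + 1] = A[i]; --i; }
        A[i + 1] = key;
    }
    // Termination: the whole array is sorted.
    assert(std::is_sorted(A.begin(), A.end()));
}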

20 Running time of Insertion Sort
Insertion-Sort(A)                         cost   times
1  for j = 1 to n-1                       c1     n
2    key = A[j]                           c2     n-1
3    i = j-1                              c3     n-1
4    while i >= 0 and A[i] > key          c4     Σ_{j=1..n-1} (t_j + 1)
5      A[i+1] = A[i]                      c5     Σ_{j=1..n-1} t_j
6      i = i-1                            c6     Σ_{j=1..n-1} t_j
7    A[i+1] = key                         c7     n-1

c1, c2, ... = running time for executing line 1, line 2, etc.
t_j = number of times lines 5 and 6 are executed, for each j.
The running time:
T(n) = c1*n + c2*(n-1) + c3*(n-1) + c4*Σ_{j=1..n-1}(t_j+1) + c5*Σ_{j=1..n-1} t_j + c6*Σ_{j=1..n-1} t_j + c7*(n-1)

21 Analyzing Insertion Sort
T(n) = c1*n + c2*(n-1) + c3*(n-1) + c4*Σ_{j=1..n-1}(t_j+1) + c5*Σ_{j=1..n-1} t_j + c6*Σ_{j=1..n-1} t_j + c7*(n-1)
Worst case: reversely sorted input => the inner loop body is executed for all previous elements => t_j = j.
=> T(n) = c1*n + c2*(n-1) + c3*(n-1) + c4*Σ_{j=1..n-1}(j+1) + c5*Σ_{j=1..n-1} j + c6*Σ_{j=1..n-1} j + c7*(n-1)
=> T(n) = An^2 + Bn + C
Note: Σ_{j=1..n-1} j = n(n-1)/2 and Σ_{j=1..n-1} (j+1) = (n+2)(n-1)/2
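For completeness (added here, not on the slide), substituting the two sums and collecting terms shows where the constants A, B, and C come from:

\begin{aligned}
T(n) &= c_1 n + (c_2+c_3+c_7)(n-1) + c_4\,\frac{(n+2)(n-1)}{2} + (c_5+c_6)\,\frac{n(n-1)}{2} \\
     &= \frac{c_4+c_5+c_6}{2}\,n^2 + \Bigl(c_1+c_2+c_3+c_7+\frac{c_4}{2}-\frac{c_5+c_6}{2}\Bigr)n - (c_2+c_3+c_4+c_7) \\
     &= A n^2 + B n + C.
\end{aligned}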

22 Analyzing Insertion Sort
T(n) = c1*n + c2*(n-1) + c3*(n-1) + c4*Σ_{j=1..n-1}(t_j+1) + c5*Σ_{j=1..n-1} t_j + c6*Σ_{j=1..n-1} t_j + c7*(n-1)
- Worst case: reverse sorted => inner loop body executed for all previous elements, so t_j = j => T(n) is quadratic: T(n) = An^2 + Bn + C
- Average case: half of the elements in A[0..j-1] are less than A[j], so t_j = j/2 => T(n) is also quadratic: T(n) = An^2 + Bn + C
- Best case: already sorted => inner loop body never executed, so t_j = 0 => T(n) is linear: T(n) = An + B

23 Kinds of Analysis
- (Usually) Worst case analysis: T(n) = maximum time on any input of size n. Knowing it gives us a guarantee about the upper bound, and in some cases the worst case occurs fairly often.
- (Sometimes) Average case analysis: T(n) = average time over all inputs of size n. The average case is often as bad as the worst case, though "good" examples do exist.
- (Rarely) Best case analysis: one can cheat with a slow algorithm that works fast on some input; good only for showing a bad lower bound.
- (New) Smoothed analysis: average over a local region of inputs instead of over all inputs.

24 Worst case: maximum value. Average case: average value. Best case: minimum value. (Chart: running times over all inputs of size n, with the worst-case, average-case, and best-case values marked.)

25 Order of Growth
Examples: running time of an algorithm in microseconds, in terms of the data size n.

Algorithm   f(n)        n=20                n=40                n=60
A           log2 n      4.32 * 10^-6 sec    5.32 * 10^-6 sec    5.91 * 10^-6 sec
B           sqrt(n)     4.47 * 10^-6 sec    6.32 * 10^-6 sec    7.75 * 10^-6 sec
C           n           20 * 10^-6 sec      40 * 10^-6 sec      60 * 10^-6 sec
D           n log2 n    86 * 10^-6 sec      213 * 10^-6 sec     354 * 10^-6 sec
E           n^2         400 * 10^-6 sec     1600 * 10^-6 sec    3600 * 10^-6 sec
F           n^4         0.16 sec            2.56 sec            12.96 sec
G           2^n         1.05 sec            12.73 days          36571 years
H           n!          77147 years         2.56 * 10^34 years  2.64 * 10^68 years

26 Order of Growth
Assume an algorithm can solve a problem of size n in f(n) microseconds (10^-6 seconds).
Note: for example, for all f(n) in Θ(n^4), the shapes of their curves are nearly the same as that of f(n) = n^4.
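As a quick check (added here, not from the slides), the table entries can be reproduced by evaluating f(n) and reading the result as microseconds; a small C++ sketch (n! is omitted to avoid overflow):

#include <cmath>
#include <cstdio>

// Print the running time f(n) microseconds, expressed in seconds,
// for several growth functions at n = 20, 40, 60.
int main() {
    const int ns[] = {20, 40, 60};
    const char* names[] = {"log2 n", "sqrt(n)", "n", "n log2 n", "n^2", "n^4", "2^n"};
    for (int n : ns) {
        double logn = std::log2(static_cast<double>(n));
        double micros[] = { logn, std::sqrt(static_cast<double>(n)),
                            static_cast<double>(n), n * logn,
                            std::pow(n, 2.0), std::pow(n, 4.0), std::pow(2.0, n) };
        std::printf("n = %d\n", n);
        for (int i = 0; i < 7; ++i)
            std::printf("  %-9s %.3g seconds\n", names[i], micros[i] * 1e-6);
    }
    return 0;
}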

27 Asymptotic Notation
- Asymptotic tight bound Θ: f(n) = Θ(g(n)), intuitively like "=", i.e., f(n) grows as fast as g(n).
- Asymptotic upper bound O: f(n) = O(g(n)), intuitively like "≤", i.e., f(n) grows no faster than g(n).
- Asymptotic lower bound Ω: f(n) = Ω(g(n)), intuitively like "≥", i.e., f(n) grows no slower than g(n).
Formal formulations (we will not go into the details of these definitions):
O(g(n)) = { f(n): there exist positive constants c and n_0 such that 0 <= f(n) <= c*g(n) for all n >= n_0 }
Θ(g(n)) = { f(n): there exist positive constants c_1, c_2, and n_0 such that 0 <= c_1*g(n) <= f(n) <= c_2*g(n) for all n >= n_0 }
(Figures: f(n) stays below c*g(n) beyond n_0 when f(n) is O(g(n)); f(n) stays between c_1*g(n) and c_2*g(n) beyond n_0 when f(n) is Θ(g(n)).)
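A small worked example of using the definition directly (added here, not on the slide), with explicitly chosen constants:

For $f(n) = 3n^2 + 5n$, take $c = 4$ and $n_0 = 5$. Then for all $n \ge n_0$,
\[
0 \le 3n^2 + 5n \le 3n^2 + n \cdot n = 4n^2 = c\,g(n) \quad\text{with } g(n) = n^2,
\]
so $f(n) = O(n^2)$. Since also $0 \le 3n^2 \le f(n)$ for all $n \ge 1$, we get $f(n) = \Omega(n^2)$, and hence $f(n) = \Theta(n^2)$.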

28 Asymptotic Notation
Note that "the running time of Insertion Sort is Θ(n^2)" is incorrect. Why?
- Worst case running time of Insertion Sort is Θ(n^2) -- correct / incorrect*
- Best case running time of Insertion Sort is Θ(n) -- correct / incorrect*
- Running time of Insertion Sort is O(n^2) -- correct / incorrect*
Recap: asymptotic tight bound Θ is intuitively like "=", asymptotic upper bound O is like "≤", asymptotic lower bound Ω is like "≥".
What about "<"? o versus O: o means strictly smaller, e.g., n log n = o(n^2) means n log n grows slower than n^2.

29 Asymptotic Notation
Relationships between typical functions:
- log n = o(n)
- n = o(n log n)
- n^c = o(2^n), where n^c may be n^2, n^4, etc.
- If f(n) = n + log n, we call log n a lower order term.
(You are not required to analyze these, but remember the relations:)
log n < sqrt(n) < n < n log n < n^2 < n^4 < 2^n < n!
Rule of combination (for positive functions): if f(n) = O(g(n)) and h(n) = O(k(n)), then
- f(n)h(n) = O(g(n)k(n))
- f(n)+h(n) = O(g(n)+k(n))
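A brief illustration of the combination rule (added, not from the slide):

\[
f(n) = 4n = O(n), \qquad h(n) = 2n\log n = O(n\log n)
\]
\[
\Rightarrow\; f(n)\,h(n) = 8n^2\log n = O(n^2\log n), \qquad
f(n)+h(n) = 4n + 2n\log n = O(n + n\log n) = O(n\log n).
\]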

30 Asymptotic Notation
When calculating asymptotic running time:
- Drop lower order terms
- Ignore leading constants
Example 1: T(n) = An^2 + Bn + C  =>  keep An^2  =>  T(n) = O(n^2)
Example 2: T(n) = An log n + Bn^2 + Cn + D  =>  keep Bn^2  =>  T(n) = O(n^2)
Remember: we can write T(n) = O(n^2) and T(n) = Θ(n^2), but not T(n) ≤ O(n^2).

31 Asymptotic Performance
Very often the complexity can be observed directly from a simple algorithm. For Insertion-Sort below it is O(n^2):
Insertion-Sort(A)
1  for j = 1 to n-1
2    key = A[j]
3    i = j-1
4    while i >= 0 and A[i] > key
5      A[i+1] = A[i]
6      i = i - 1
7    A[i+1] = key
There are 4 very useful rules for such Big-Oh analysis...

32 Asymptotic Performance
General rules for Big-Oh analysis:
Rule 1. FOR LOOPS: the running time of a for loop is at most the running time of the statements inside the for loop (including the tests) times the number of iterations.
    for (i=0; i<N; i++) a++;                        // O(N)
Rule 2. NESTED FOR LOOPS: the total running time of a statement inside a group of nested loops is the running time of the statement multiplied by the product of the sizes of all the loops.
    for (i=0; i<N; i++) for (j=0; j<N; j++) k++;    // O(N^2)
Rule 3. CONSECUTIVE STATEMENTS: count the maximum one.
    for (i=0; i<N; i++) a++;
    for (i=0; i<N; i++) for (j=0; j<N; j++) k++;    // O(N^2) overall
Rule 4. IF/ELSE: for the fragment "if (condition) S1 else S2", take the test plus the maximum of S1 and S2 (a short sketch follows below).
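An illustration of Rule 4 (added here; the function and its branch costs are assumptions made for this example, not part of the slide):

// Rule 4 (IF/ELSE): cost = test + max(S1, S2).
// The test is O(1), S1 is O(n), S2 is O(n^2), so the fragment is O(n^2).
void ifElseExample(int n, bool flag, int* a) {
    if (flag) {                          // O(1) test
        for (int i = 0; i < n; ++i)      // S1: O(n)
            a[0] += i;
    } else {
        for (int i = 0; i < n; ++i)      // S2: O(n^2)
            for (int j = 0; j < n; ++j)
                a[0] -= j;
    }
}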

33 Asymptotic Performance
Examples of Big-Oh analysis:

void function1(int n)
{
    int i, j;
    int x = 0;
    for (i = 0; i < n; i++)
        x++;
    for (i = 0; i < n; i++)
        for (j = 0; j < n; j++)
            x++;
}
This function is O(__)

void function2(int n)
{
    int i;
    int x = 0;
    for (i = 0; i < n/2; i++)
        x++;
}
This function is O(__)

34 Asymptotic Performance
Examples of Big-Oh analysis:

void function3(int n)
{
    int i, j;
    int x = 0;
    if (n > 10)
        for (i = 0; i < n/2; i++)
            x++;
    else {
        for (i = 0; i < n; i++)
            for (j = 0; j < n/2; j++)
                x--;
    }
}
This function is O(__)

void function4(int n)
{
    int i, j;
    int x = 0;
    for (i = 0; i < 10; i++)
        for (j = 0; j < n/2; j++)
            x--;
}
This function is O(__)

35 Asymptotic Performance
Example of Big-Oh analysis:

void function5(int n)
{
    int i;
    for (i = 0; i < n; i++)
        if (IsSignificantData(i))
            SpecialTreatment(i);
}
Suppose IsSignificantData is O(n) and SpecialTreatment is O(n log n).
This function is O(____)

36 Asymptotic Performance
Recursion:

int Power(int base, int pow)
{
    if (pow == 0)
        return 1;
    else
        return base * Power(base, pow - 1);
}

Example: 3^2 = 9
Power(3,2) = 3 * Power(3,1);  Power(3,1) = 3 * Power(3,0);  Power(3,0) = 1
Let T(n) be the number of multiplications needed to compute Power(3,n):
T(n) = T(n-1) + 1, T(0) = 0  =>  T(n) = n
So the running time of Power(3,n) is O(n).

37 Asymptotic Performance
Why recursion? Can't we just use iteration (a loop)?
- The reason for recursion: it is easy to program in some situations
- Disadvantage: more time and space required
- Example: the Tower of Hanoi problem
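For comparison (added here, not from the slides), the same Power computation written with a loop instead of recursion; a minimal sketch:

// Iterative version of Power: the same O(pow) number of multiplications,
// but without the recursive call stack.
int PowerIterative(int base, int pow) {
    int result = 1;
    for (int k = 0; k < pow; ++k)   // multiply 'pow' times
        result *= base;
    return result;
}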

38 Tower of Hanoi
Given some rods for stacking disks. Rules: (1) the disks must be stacked in order of size; (2) move one disk at a time.
The problem: using the fewest steps, move all disks from the source rod to the target rod without violating the rules at any point, given one intermediate rod for buffering.
(Figure: a source rod, an intermediate rod, and a target rod.)

39 Tower of Hanoi
Suppose you can manage n-1 disks. How do you solve the case of n disks?
A recursive solution:
- Step 1: move the top n-1 disks from the source rod to the intermediate rod, via the target rod
- Step 2: move the largest disk from the source rod to the target rod
- Step 3: move the n-1 disks from the intermediate rod to the target rod, via the source rod

40 Tower of Hanoi

void Towers(int n, int Source, int Target, int Interm)
{
    if (n == 1)
        Console::Write(S"\nFrom {0} to {1}", Source.ToString(), Target.ToString());
    else {
        Towers(n-1, Source, Interm, Target);
        Towers(1,   Source, Target, Interm);
        Towers(n-1, Interm, Target, Source);
    }
}

How many times is Console::Write executed? T(n) = 2T(n-1) + 1
Example call tree: Towers(3,'A','C','B') calls Towers(2,'A','B','C'), Towers(1,'A','C','B'), and Towers(2,'B','C','A').
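The slide's code uses the Visual C++ .NET Console class; below is a minimal standard C++ sketch of the same recursion (added for reference), with a counter that confirms T(n) = 2^n - 1 moves:

#include <iostream>

long long moves = 0;   // counts the number of prints, i.e. T(n)

void towers(int n, char source, char target, char interm) {
    if (n == 1) {
        std::cout << "From " << source << " to " << target << '\n';
        ++moves;
    } else {
        towers(n - 1, source, interm, target);   // step 1: n-1 disks to the intermediate rod
        towers(1,     source, target, interm);   // step 2: largest disk to the target rod
        towers(n - 1, interm, target, source);   // step 3: n-1 disks onto the target rod
    }
}

int main() {
    towers(3, 'A', 'C', 'B');
    std::cout << "moves = " << moves << '\n';    // prints 7 = 2^3 - 1
    return 0;
}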

41 Recursive Relation
- T(n) = T(n-1) + A, T(1) = 1  =>  T(n) = O(n)
- T(n) = T(n-1) + n, T(1) = 1  =>  T(n) = O(n^2)
- T(n) = 2T(n/2) + n, T(1) = 1  =>  T(n) = O(n log n)
- T(n) = 2T(n-1) + 1  =>  T(n) = O(2^n)
More general form: T(n) = aT(n/b) + cn, solved by the Master Theorem (you are not required to know it).
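As an added illustration of where one of these bounds comes from (not on the slide), expanding T(n) = 2T(n/2) + n for n a power of 2:

\begin{aligned}
T(n) &= 2T(n/2) + n \\
     &= 4T(n/4) + 2n \\
     &= \cdots \\
     &= 2^k\,T(n/2^k) + kn \qquad (\text{stop at } k = \log_2 n) \\
     &= n\,T(1) + n\log_2 n \;=\; O(n\log n).
\end{aligned}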

42 Summary
- Introduction / Insertion sort
- Correctness of algorithm
- Worst / average case analysis
- Order of growth
- Asymptotic performance
- 4 rules for asymptotic analysis
- Recursive programs

