Computational Complexity, Choosing Data Structures Svetlin Nakov Telerik Corporation www.telerik.com.

Presentation transcript:

Computational Complexity, Choosing Data Structures Svetlin Nakov Telerik Corporation

1. Algorithm Complexity and Asymptotic Notation
   - Time and Memory Complexity
   - Best, Average and Worst Case
2. Fundamental Data Structures: Comparison
   - Arrays vs. Lists vs. Trees vs. Hash Tables
3. Choosing the Proper Data Structure

- Data structures and algorithms are the foundation of computer programming
- Algorithmic thinking, problem solving and data structures are vital for software engineers
- All .NET developers should know when to use T[], LinkedList<T>, List<T>, Stack<T>, Queue<T>, Dictionary<K,V>, HashSet<T>, SortedDictionary<K,V> and SortedSet<T>
- Computational complexity is important for algorithm design and efficient programming

Asymptotic Notation

Why should we analyze algorithms?
- To predict the resources that the algorithm requires:
  - Computational time (CPU consumption)
  - Memory space (RAM consumption)
  - Communication bandwidth consumption
- The running time of an algorithm is:
  - The total number of primitive operations executed (machine-independent steps)
  - Also known as algorithm complexity

What to measure?
- Memory
- Time
- Number of steps
- Number of particular operations
- Number of disk operations
- Number of network packets
- Asymptotic complexity

- Worst-case: an upper bound on the running time for any input of a given size
- Average-case: the expected running time, assuming all inputs of a given size are equally likely
- Best-case: the lower bound on the running time

Sequential search in a list of size n:
- Worst-case: n comparisons
- Best-case: 1 comparison
- Average-case: n/2 comparisons
- The algorithm runs in linear time: the number of operations grows linearly with n
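
The comparison counts above can be checked with a small sketch. The class and method names (SequentialSearchDemo, IndexOf) are illustrative and not part of the original slides; the counter exists only to make the best and worst cases visible.

```csharp
using System;

public static class SequentialSearchDemo
{
    // Returns the index of target in items, or -1 if it is absent.
    // comparisons receives the number of element comparisons performed.
    public static int IndexOf(int[] items, int target, out int comparisons)
    {
        comparisons = 0;
        for (int i = 0; i < items.Length; i++)
        {
            comparisons++;
            if (items[i] == target)
            {
                return i;
            }
        }
        return -1;
    }

    public static void Main()
    {
        int[] data = { 7, 3, 9, 4, 1 };

        IndexOf(data, 7, out int best);     // target is the first element
        IndexOf(data, 1, out int worst);    // target is the last element
        IndexOf(data, 42, out int missing); // target is absent

        // Prints: best = 1, worst = 5, missing = 5
        Console.WriteLine($"best = {best}, worst = {worst}, missing = {missing}");
    }
}
```

An unsuccessful search costs the same n comparisons as finding the last element, which is why the worst case is the usual basis for the O(n) estimate.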

- Algorithm complexity is a rough estimate of the number of steps performed by a given computation, depending on the size of the input data
- It is measured through asymptotic notation: O(g), where g is a function of the input data size
- Examples:
  - Linear complexity O(n): all elements are processed once (or a constant number of times)
  - Quadratic complexity O(n^2): each of the elements is processed n times

Asymptotic upper bound: O-notation (Big O notation)
- For a given function g(n), O(g(n)) denotes the set of functions that grow no faster than g(n), up to a constant factor:
  O(g(n)) = { f(n) : there exist positive constants c and n_0 such that f(n) <= c*g(n) for all n >= n_0 }
- Examples:
  - 3*n^2 + n/… ∈ O(n^2)
  - 4*n*log2(3*n+1) + 2*n - 1 ∈ O(n*log n)
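
As a worked illustration of the definition (f is in O(g) when f(n) <= c*g(n) for all n >= n_0), the second example can be bounded step by step. The constants c = 14 and n_0 = 2 below are one valid choice and not the only one.

```latex
\begin{align*}
\log_2(3n+1) &\le \log_2(4n) = 2 + \log_2 n \le 3\log_2 n && (n \ge 2)\\
2n - 1 &\le 2n \le 2n\log_2 n && (n \ge 2)\\
4n\log_2(3n+1) + 2n - 1 &\le 12\,n\log_2 n + 2\,n\log_2 n = 14\,n\log_2 n && (n \ge 2)
\end{align*}
```

The first line uses 3n + 1 <= 4n for n >= 1 and log2(n) >= 1 for n >= 2. The base of the logarithm only changes the constant, so the result is written as O(n*log n).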

Complexity  | Notation              | Description
constant    | O(1)                  | Constant number of operations, not depending on the input data size
logarithmic | O(log n)              | Number of operations proportional to log2(n), where n is the size of the input data
linear      | O(n)                  | Number of operations proportional to the input data size
quadratic   | O(n^2)                | Number of operations proportional to the square of the size of the input data
cubic       | O(n^3)                | Number of operations proportional to the cube of the size of the input data
exponential | O(2^n), O(k^n), O(n!) | Exponential number of operations, fast growing

Approximate running time by complexity class, from the smallest to the largest input size n:
- O(1), O(log(n)), O(n), O(n*log(n)): < 1 s
- O(n^2): < 1 s for small n, rising to about 2 s and then minutes
- O(n^3): < 1 s, then 20 s, 5 hours, 231 days
- O(2^n): < 1 s, then 260 days, then hangs
- O(n!): < 1 s, then hangs
- O(n^n): minutes, then hangs

- Complexity can be expressed as a formula on multiple variables, e.g.
  - An algorithm filling a matrix of size n * m with the natural numbers 1, 2, … runs in O(n*m)
  - DFS traversal of a graph with n vertices and m edges runs in O(n + m)
- Memory consumption should also be considered, for example:
  - Running time O(n), memory requirement O(n^2)
  - For a large enough n this ends in an OutOfMemoryException
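
The O(n + m) bound for DFS can be made concrete with a minimal sketch that counts the work done. It assumes a directed graph stored as adjacency lists; the names DfsDemo and Traverse are hypothetical and the counters are there only for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class DfsDemo
{
    // Recursive depth-first traversal of a graph given as adjacency lists.
    // Returns how many vertices were visited and how many adjacency
    // entries (edges) were examined along the way.
    public static (int Vertices, int EdgeChecks) Traverse(List<int>[] graph)
    {
        var visited = new bool[graph.Length];
        int vertices = 0;
        int edgeChecks = 0;

        void Visit(int v)
        {
            visited[v] = true;
            vertices++;
            foreach (int next in graph[v])
            {
                edgeChecks++;
                if (!visited[next])
                {
                    Visit(next);
                }
            }
        }

        // Start from every unvisited vertex so disconnected parts are covered.
        for (int start = 0; start < graph.Length; start++)
        {
            if (!visited[start])
            {
                Visit(start);
            }
        }

        return (vertices, edgeChecks);
    }

    public static void Main()
    {
        // Directed graph with n = 4 vertices and m = 4 edges:
        // 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 3
        var graph = new List<int>[]
        {
            new List<int> { 1, 2 },
            new List<int> { 2 },
            new List<int> { 3 },
            new List<int>()
        };

        var result = Traverse(graph);
        // Prints: vertices = 4, edge checks = 4
        Console.WriteLine($"vertices = {result.Vertices}, edge checks = {result.EdgeChecks}");
    }
}
```

Each vertex is visited once and each adjacency entry is examined once, so the total work is proportional to n + m. For an undirected graph stored with both directions, the edge count would be 2m, which is still O(n + m).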

- A polynomial-time algorithm is one whose worst-case time complexity W(n) is bounded above by a polynomial function p(n) of its input size: W(n) ∈ O(p(n))
- Examples of worst-case time complexity:
  - Polynomial time: log n, 2n, 3n^3 + 4n, 2 * n log n
  - Non-polynomial time: 2^n, 3^n, n! (note that n^k for a fixed constant k is polynomial)
- Non-polynomial algorithms are impractical for large input data sets

Examples

Runs in O(n), where n is the size of the array. The number of elementary steps is ~ n.

    int FindMaxElement(int[] array)
    {
        int max = array[0];
        for (int i = 0; i < array.Length; i++)
        {
            if (array[i] > max)
            {
                max = array[i];
            }
        }
        return max;
    }

Runs in O(n^2), where n is the size of the array. The number of elementary steps is ~ n*(n-1) / 2.

    long FindInversions(int[] array)
    {
        long inversions = 0;
        for (int i = 0; i < array.Length; i++)
            for (int j = i + 1; j < array.Length; j++)  // advance j, not i
                if (array[i] > array[j])
                    inversions++;
        return inversions;
    }

Runs in cubic time O(n^3). The number of elementary steps is ~ n^3.

    decimal Sum3(int n)
    {
        decimal sum = 0;
        for (int a = 0; a < n; a++)
            for (int b = 0; b < n; b++)
                for (int c = 0; c < n; c++)
                    sum += a * b * c;
        return sum;
    }

Runs in quadratic time O(n*m). The number of elementary steps is ~ n*m.

    long SumMN(int n, int m)
    {
        long sum = 0;
        for (int x = 0; x < n; x++)
            for (int y = 0; y < m; y++)
                sum += x * y;
        return sum;
    }

Runs in quadratic time O(n*m). The number of elementary steps is ~ n*m + min(m,n)*n.

    long SumMN(int n, int m)
    {
        long sum = 0;
        for (int x = 0; x < n; x++)
            for (int y = 0; y < m; y++)
                if (x == y)
                    for (int i = 0; i < n; i++)
                        sum += i * x * y;
        return sum;
    }

Runs in exponential time O(2^n). The number of elementary steps is ~ 2^n.

    decimal Calculation(int n)
    {
        decimal result = 0;
        for (int i = 0; i < (1 << n); i++)
            result += i;
        return result;
    }

Runs in linear time O(n). The number of elementary steps is ~ n.

    decimal Factorial(int n)
    {
        if (n == 0)
            return 1;
        else
            return n * Factorial(n - 1);
    }

Runs in exponential time O(2^n). The number of elementary steps is ~ Fib(n+1), where Fib(k) is the k-th Fibonacci number.

    decimal Fibonacci(int n)
    {
        if (n == 0)
            return 1;
        else if (n == 1)
            return 1;
        else
            return Fibonacci(n - 1) + Fibonacci(n - 2);
    }
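
To see the exponential growth directly, the sketch below runs the same recursion with a call counter added. The class name FibonacciCallCounter and the CountCalls helper are illustrative additions, not part of the original slides.

```csharp
using System;

public static class FibonacciCallCounter
{
    private static long calls;

    // Same recursion as the Fibonacci function above, plus a call counter.
    private static decimal Fibonacci(int n)
    {
        calls++;
        if (n == 0 || n == 1)
        {
            return 1;
        }
        return Fibonacci(n - 1) + Fibonacci(n - 2);
    }

    // Returns how many times Fibonacci is invoked while computing Fibonacci(n).
    public static long CountCalls(int n)
    {
        calls = 0;
        Fibonacci(n);
        return calls;
    }

    public static void Main()
    {
        foreach (int n in new[] { 10, 20, 30 })
        {
            Console.WriteLine($"n = {n}: {CountCalls(n)} calls");
        }
    }
}
```

With Fib(0) = Fib(1) = 1, the number of calls works out to 2*Fib(n) - 1, so it grows at the same exponential rate as the Fibonacci numbers themselves: roughly a factor of 1.6 for every increment of n.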

Examples

Data Structure                                | Add      | Find     | Delete   | Get-by-index
Array (T[])                                   | O(n)     | O(n)     | O(n)     | O(1)
Linked list (LinkedList<T>)                   | O(1)     | O(n)     | O(n)     | O(n)
Resizable array list (List<T>)                | O(1)     | O(n)     | O(n)     | O(1)
Stack (Stack<T>)                              | O(1)     | -        | O(1)     | -
Queue (Queue<T>)                              | O(1)     | -        | O(1)     | -
Hash table (Dictionary<K,V>)                  | O(1)     | O(1)     | O(1)     | -
Tree-based dictionary (SortedDictionary<K,V>) | O(log n) | O(log n) | O(log n) | -
Hash table based set (HashSet<T>)             | O(1)     | O(1)     | O(1)     | -
Tree based set (SortedSet<T>)                 | O(log n) | O(log n) | O(log n) | -

- Arrays (T[])
  - Use when a fixed number of elements should be processed by index
- Resizable array lists (List<T>)
  - Use when elements should be added and processed by index
- Linked lists (LinkedList<T>)
  - Use when elements should be added at both ends of the list
  - Otherwise use a resizable array list (List<T>)

- Stacks (Stack<T>)
  - Use to implement LIFO (last-in-first-out) behavior
  - List<T> could also work well
- Queues (Queue<T>)
  - Use to implement FIFO (first-in-first-out) behavior
  - LinkedList<T> could also work well
- Hash table based dictionary (Dictionary<K,V>)
  - Use when key-value pairs should be added fast and searched fast by key
  - Elements in a hash table have no particular order

- Balanced search tree based dictionary (SortedDictionary<K,V>)
  - Use when key-value pairs should be added fast, searched fast by key and enumerated sorted by key
- Hash table based set (HashSet<T>)
  - Use to keep a group of unique values, and to add values and check membership fast
  - Elements are in no particular order
- Search tree based set (SortedSet<T>)
  - Use to keep a group of ordered unique values
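
A short sketch of these choices in practice: a HashSet<T> for fast membership checks and a SortedDictionary<K,V> when the keys must come out in order. The word-counting scenario and the names WordCountDemo and CountWords are assumptions made for illustration only.

```csharp
using System;
using System.Collections.Generic;

public static class WordCountDemo
{
    // Counts occurrences of each word. The SortedDictionary keeps its keys
    // ordered, so enumeration is sorted by word.
    public static SortedDictionary<string, int> CountWords(IEnumerable<string> words)
    {
        var counts = new SortedDictionary<string, int>(StringComparer.Ordinal);
        foreach (string word in words)
        {
            // current is 0 when the key is not present yet
            counts.TryGetValue(word, out int current);
            counts[word] = current + 1;
        }
        return counts;
    }

    public static void Main()
    {
        string[] words = { "tree", "hash", "list", "hash", "tree", "hash" };

        // HashSet: fast membership test, no ordering guarantee.
        var unique = new HashSet<string>(words);
        Console.WriteLine(unique.Contains("list")); // True

        // SortedDictionary: prints hash: 3, list: 1, tree: 2 (one per line).
        foreach (KeyValuePair<string, int> pair in CountWords(words))
        {
            Console.WriteLine($"{pair.Key}: {pair.Value}");
        }
    }
}
```

If the sorted output were not needed, a Dictionary<K,V> would do the same counting with O(1) expected time per update instead of O(log n).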

- Algorithm complexity is a rough estimate of the number of steps performed by a given computation
- Complexity can be logarithmic, linear, n log n, quadratic, cubic, exponential, etc.
- It allows estimating the speed of given code before its execution
- Different data structures have different efficiency for different operations
- The fastest add / find / delete structure is the hash table: O(1) on average for all these operations

Questions?

1. A text file students.txt holds information about students and their courses in the following format:

    Kiril | Ivanov | C#
    Stefka | Nikolova | SQL
    Stela | Mineva | Java
    Milena | Petrova | C#
    Ivan | Grigorov | C#
    Ivan | Kolev | SQL

   Using SortedDictionary<K,V>, print the courses in alphabetical order and, for each of them, print the students ordered by family name and then by first name:

    C#: Ivan Grigorov, Kiril Ivanov, Milena Petrova
    Java: Stela Mineva
    SQL: Ivan Kolev, Stefka Nikolova

2. A large trade company has millions of articles, each described by barcode, vendor, title and price. Implement a data structure to store them that allows fast retrieval of all articles in a given price range [x…y]. Hint: use OrderedMultiDictionary from Wintellect's Power Collections for .NET.
3. Implement a data structure PriorityQueue that provides a fast way to execute the following operations: add element; extract the smallest element.
4. Implement a class BiDictionary that allows adding triples {key1, key2, value} and fast search by key1, key2 or by both key1 and key2. Note: multiple values can be stored for a given key.

5. A text file phones.txt holds information about people, their town and phone number:

    Mimi Shmatkata | Plovdiv |
    Kireto | Varna |
    Daniela Ivanova Petrova | Karnobat |
    Bat Gancho | Sofia |

   Duplicates can occur in people's names, towns and phone numbers. Write a program to execute a sequence of commands from a file commands.txt:
   - find(name): display all matching records by given name (first, middle, last or nickname)
   - find(name, town): display all matching records by given name and town