MotivationLocating the k largest subsequences: Main ideasResults Problem definitions Problem instance ( k=5 ) Bibliography -87422-52 34 9 -50 41 1 -43.

Slides:



Advertisements
Similar presentations
College of Information Technology & Design
Advertisements

Introduction to Computer Science 2 Lecture 7: Extended binary trees
PARTITIONAL CLUSTERING
CSE332: Data Abstractions Lecture 14: Beyond Comparison Sorting Dan Grossman Spring 2010.
© The McGraw-Hill Companies, Inc., Chapter 2 The Complexity of Algorithms and the Lower Bounds of Problems.
Transform & Conquer Replacing One Problem With Another Saadiq Moolla.
Greedy Algorithms Basic idea Connection to dynamic programming Proof Techniques.
A Linear Time Algorithm for the k Maximum Sums Problem By Gerth S. Brodal and Allan G. Jørgensen.
Heapsort By: Steven Huang. What is a Heapsort? Heapsort is a comparison-based sorting algorithm to create a sorted array (or list) Part of the selection.
1 6.3 Binary Heap - Other Heap Operations There is no way to find any particular key without a linear scan through the entire heap. However, if we know.
1 Sorting Problem: Given a sequence of elements, find a permutation such that the resulting sequence is sorted in some order. We have already seen: –Insertion.
Heapsort Idea: two phases: 1. Construction of the heap 2. Output of the heap For ordering number in an ascending sequence: use a Heap with reverse.
Optimal Merging Of Runs
Chapter 9 Graph algorithms. Sample Graph Problems Path problems. Connectedness problems. Spanning tree problems.
2 -1 Chapter 2 The Complexity of Algorithms and the Lower Bounds of Problems.
CSC 2300 Data Structures & Algorithms March 27, 2007 Chapter 7. Sorting.
2 -1 Analysis of algorithms Best case: easiest Worst case Average case: hardest.
Chapter 9 Graph algorithms Lec 21 Dec 1, Sample Graph Problems Path problems. Connectedness problems. Spanning tree problems.
Chapter 3: The Efficiency of Algorithms Invitation to Computer Science, C++ Version, Fourth Edition.
2 -1 Chapter 2 The Complexity of Algorithms and the Lower Bounds of Problems.
Efficient algorithms for the scaled indexing problem Biing-Feng Wang, Jyh-Jye Lin, and Shan-Chyun Ku Journal of Algorithms 52 (2004) 82–100 Presenter:
The Complexity of Algorithms and the Lower Bounds of Problems
Heaps and heapsort COMP171 Fall 2005 Part 2. Sorting III / Slide 2 Heap: array implementation Is it a good idea to store arbitrary.
The Design and Analysis of Algorithms
Dynamic Programming Introduction to Algorithms Dynamic Programming CSE 680 Prof. Roger Crawfis.
C o n f i d e n t i a l Developed By Nitendra NextHome Subject Name: Data Structure Using C Title: Overview of Data Structure.
Lecture 2 We have given O(n 3 ), O(n 2 ), O(nlogn) algorithms for the max sub-range problem. This time, a linear time algorithm! The idea is as follows:
CSCE350 Algorithms and Data Structure Lecture 17 Jianjun Hu Department of Computer Science and Engineering University of South Carolina
Order Statistics. Order statistics Given an input of n values and an integer i, we wish to find the i’th largest value. There are i-1 elements smaller.
The Lower Bounds of Problems
1 Introduction to Data Structures. 2 Course Name : Data Structure (CSI 221) Course Teacher : Md. Zakir Hossain Lecturer, Dept. of Computer Science Stamford.
Computer Sciences Department1. Sorting algorithm 3 Chapter 6 3Computer Sciences Department Sorting algorithm 1  insertion sort Sorting algorithm 2.
Multiple Sequence Alignments Craig A. Struble, Ph.D. Department of Mathematics, Statistics, and Computer Science Marquette University.
Data Structure Introduction.
Algorithm Analysis CS 400/600 – Data Structures. Algorithm Analysis2 Abstract Data Types Abstract Data Type (ADT): a definition for a data type solely.
Chapter 18: Searching and Sorting Algorithms. Objectives In this chapter, you will: Learn the various search algorithms Implement sequential and binary.
Approximate Nearest Neighbors: Towards Removing the Curse of Dimensionality Piotr Indyk, Rajeev Motwani The 30 th annual ACM symposium on theory of computing.
1 CSC 421: Algorithm Design & Analysis Spring 2014 Complexity & lower bounds  brute force  decision trees  adversary arguments  problem reduction.
Computer Science/Ch. Algorithmic Foundation of CS 4-1 Chapter 4 Chapter 4 Algorithmic Foundation of Computer Science.
Lecture 9COMPSCI.220.FS.T Lower Bound for Sorting Complexity Each algorithm that sorts by comparing only pairs of elements must use at least 
1. Searching The basic characteristics of any searching algorithm is that searching should be efficient, it should have less number of computations involved.
ProblemData StructuresLower Bound Preprocess a set of N 3-dimensional points into an I/O-efficient data structure, such that all points inside an axis.
Heap Sort Uses a heap, which is a tree-based data type Steps involved: Turn the array into a heap. Delete the root from the heap and insert into the array,
Sorting Lower Bounds n Beating Them. Recap Divide and Conquer –Know how to break a problem into smaller problems, such that –Given a solution to the smaller.
Introduction to NP Instructor: Neelima Gupta 1.
BITS Pilani Pilani Campus Data Structure and Algorithms Design Dr. Maheswari Karthikeyan Lecture1.
19 March More on Sorting CSE 2011 Winter 2011.
The Range Mode ProblemCell Probe Lower Bounds for the Range Mode ProblemThe Range k-frequency Problem Preprocess an array A of n elements into a space.
Advanced Sorting 7 2  9 4   2   4   7
Sorting With Priority Queue In-place Extra O(N) space
Decision Trees DEFINITION: DECISION TREE A decision tree is a tree in which the internal nodes represent actions, the arcs represent outcomes of an action,
Applied Discrete Mathematics Week 2: Functions and Sequences
Andreas Klappenecker [partially based on the slides of Prof. Welch]
Tries 07/28/16 11:04 Text Compression
Priority Queues An abstract data type (ADT) Similar to a queue
Data Structures and Algorithms
Priority Queues Chuan-Ming Liu
13 Text Processing Hongfei Yan June 1, 2016.
Optimal Merging Of Runs
(2,4) Trees 11/15/2018 9:25 AM Sorting Lower Bound Sorting Lower Bound.
The Complexity of Algorithms and the Lower Bounds of Problems
Optimal Merging Of Runs
(2,4) Trees 12/4/2018 1:20 PM Sorting Lower Bound Sorting Lower Bound.
Comparative RNA Structural Analysis
ITEC 2620M Introduction to Data Structures
Priority Queues An abstract data type (ADT) Similar to a queue
Chapter 8: Overview Comparison sorts: algorithms that sort sequences by comparing the value of elements Prove that the number of comparison required to.
Sorting We have actually seen already two efficient ways to sort:
Algorithm Course Algorithms Lecture 3 Sorting Algorithm-1
Presentation transcript:

MotivationLocating the k largest subsequences: Main ideasResults Problem definitions Problem instance ( k=5 ) Bibliography Frederickson’s algorithm finds the red nodes in O(k) time (no particular order) Q4Q4 2-52= = = = Q5Q H5H5 H4H4 H3H3 H2H2 H1H1   An optimal algorithm finding the subsequence A[i,…,j] maximizing  j t=i A[t] in O(n) time is described in [1].   In [2] we design an optimal O(n+k) time algorithm using O(k) space. This algorithm can be used to solve the 2-dimensional version of the problem in O(n 3 +k) time using O(n+k) space and in general the d-dimensional problem in O(n 2d-1 +k) time and O(n d-1 +k) space.   In [3] we generalize this algorithm to find the subsequences inducing the k largest sums among all subsequences of length between l and u.   In [3] we show an optimal O(n log k/n) bound for selecting the subsequence inducing the k ’th largest sum, by providing an algorithm with this running time and by proving a matching lower bound.   We also generalize the selection algorithm to select the k ’th largest subsequence among all subsequences of length between l and u, in O(n log k/n) time. Remark that in this case k≤(u-l+1)n. [1]Bentley, J.: Programming Pearls: algorithm design techniques. Commun. ACM 27(9)(1984) [2]Brodal, G. S. and Jørgensen, A. G.: A linear time algorithm for the k maximal sums problem. Proc. 32 nd International Symposium on Mathematical Foundations of Computer Science. [3]Brodal, G. S. and Jørgensen, A. G.: Sum selection in arrays. In submission. Given a Sequence A[1,…,n] of Numbers   Find a subsequence A[i,….,j] maximizing  j t=i A[t]. Given a Sequence A[1,…,n] of Numbers and Integer k   Find k subsequences maximizing  j t=i A[t].   Find a k ’th largest subsequence. Given a Sequence A[1,...,n] of Numbers, Integers l, u and k   Find k subsequences maximizing  j t=i A[t] among all subsequences of length at least l and at most u.   Find a k ’th largest subsequence among all subsequences of length at least l and at most u. Finding interesting parts of sequences is a problem appearing repeatedly in data analysis. Locating G+C rich regions of DNA sequences or finding tandem repeats in chromosome data are such problems. In addition to Bioinformatics, similar and/or identical problems appear in Pattern Matching, Image Processing, and Data Mining. Insert all n(n+1)/2 sums in a heap ordered binary tree (values increase towards root). Find the k largest using Frederickson’s heap selection algorithm. Represent the O(n 2 ) sums implicitly in a heap ordered binary tree using O(n) space. A representation of Q i can be build by adding the same value to all elements in Q i -1 and adding one new element. This allows for efficient construction of each set. Represent each set Q i using a heap ordered binary tree H i and join them. Use Frederickson’s algorithm and find the k+n-1 largest and discard the ∞ values. Input Sequence A : Subsequences A[i,…,j] : The k-1 largest are green, the k ’th largest is red. Locating Interesting Subsequences MADALGO – Center for Massive Data Algorithmics, a Center of the Danish National Research Foundation ∞ ∞∞ ∞ Allan Grønlund Jørgensen University of Aarhus