Heaviest Segments in a Number Sequence Kun-Mao Chao ( 趙坤茂 ) Department of Computer Science and Information Engineering National Taiwan University, Taiwan.

Presentation transcript:

Heaviest Segments in a Number Sequence Kun-Mao Chao ( 趙坤茂 ) Department of Computer Science and Information Engineering National Taiwan University, Taiwan WWW:

2 Maximum-sum segment Given a sequence of real numbers a_1 a_2 … a_n, find a consecutive subsequence with the maximum sum. 9 –3 1 7 – –4 2 –7 6 – For each position, we can compute the maximum-sum interval ending at that position in O(n) time. Therefore, a naive algorithm runs in O(n^2) time.
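The naive approach described on this slide can be sketched as follows. This is a minimal Python illustration rather than code from the slides; the function name `max_sum_segment_naive` and the sample input are assumptions made for the example.

```python
def max_sum_segment_naive(a):
    """Return (best_sum, start, end) over all non-empty segments a[start..end].

    For each end position j we scan leftward while keeping a running sum,
    so each position costs O(n) and the whole search costs O(n^2).
    Indices are 0-based and inclusive.
    """
    if not a:
        raise ValueError("sequence must be non-empty")
    best = (a[0], 0, 0)
    for j in range(len(a)):
        running = 0
        for i in range(j, -1, -1):  # extend the segment leftward from j
            running += a[i]
            if running > best[0]:
                best = (running, i, j)
    return best


# Hypothetical input, not the sequence shown on the slide.
print(max_sum_segment_naive([2, -5, 3, 4, -1]))
```

The quadratic scan is mainly useful as a reference implementation for checking faster methods.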

3 Maximum-sum segment (The recurrence relation) Define S(i) to be the maximum sum of the segments ending at position i. Then S(i) = a_i + max{S(i-1), 0}, with S(0) = 0: if S(i-1) < 0, concatenating a_i with its previous segment gives a smaller sum than a_i itself.
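The recurrence implied by this slide, S(i) = a_i + max(S(i-1), 0), can be evaluated in a single left-to-right pass. The sketch below is an illustration under that reading; the function name `max_sum_ending_at`, the 0-based indexing, and the sample input are assumptions, not part of the slides.

```python
def max_sum_ending_at(a):
    """Return the list S where S[i] is the maximum sum of a segment ending at i.

    Uses S[i] = a[i] + max(S[i-1], 0): a negative running sum is dropped,
    because extending it would only make the segment ending at i smaller.
    """
    S = []
    for i, x in enumerate(a):
        prev = S[i - 1] if i > 0 else 0
        S.append(x + max(prev, 0))
    return S


# Hypothetical input, not the sequence shown on the slide.
table = max_sum_ending_at([2, -5, 3, 4, -1])
print(table, max(table))
```

The largest entry of the table is the maximum segment sum, so the whole computation takes O(n) time.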

4 Maximum-sum segment (Tabular computation) [Table: the example sequence with its S(i) values computed from left to right; the largest S(i) is the maximum sum.]

5 Maximum-sum interval (Traceback) [Table: the example sequence with its S(i) values; the maximum-sum segment is traced back from the position with the largest S(i).]
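One way to carry out the traceback is sketched below: locate the position with the largest S(i), then walk left for as long as the preceding S value is positive, since those are exactly the steps where the recurrence extended the segment. The function name `max_sum_segment`, the 0-based indices, the tie-breaking choices, and the sample input are assumptions for illustration.

```python
def max_sum_segment(a):
    """Return (best_sum, start, end) for a maximum-sum segment of a (0-based, inclusive)."""
    if not a:
        raise ValueError("sequence must be non-empty")
    # Forward pass: S[i] is the maximum sum of a segment ending at i.
    S = []
    for i, x in enumerate(a):
        prev = S[i - 1] if i > 0 else 0
        S.append(x + max(prev, 0))
    # The best segment ends where S is largest (first such position on ties).
    end = max(range(len(a)), key=S.__getitem__)
    # Traceback: S[start] included S[start-1] only when S[start-1] was positive.
    start = end
    while start > 0 and S[start - 1] > 0:
        start -= 1
    return S[end], start, end


# Hypothetical input, not the sequence shown on the slide.
print(max_sum_segment([2, -5, 3, 4, -1]))
```

Stopping when the preceding value is zero rather than negative returns the shortest of several equally heavy segments; the opposite choice would be equally valid.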

6 Computing segment sum in O(1) time? Input: a sequence of real numbers a_1 a_2 … a_n. Query: the sum of a_i a_(i+1) … a_j.

7 Computing segment sum in O(1) time
prefix-sum(i) = a_1 + a_2 + … + a_i, with prefix-sum(0) = 0
–all n prefix sums are computable in O(n) time.
sum(i, j) = prefix-sum(j) – prefix-sum(i-1)

8 Computing segment average in O(1) time
prefix-sum(i) = a_1 + a_2 + … + a_i, with prefix-sum(0) = 0
–all n prefix sums are computable in O(n) time.
sum(i, j) = prefix-sum(j) – prefix-sum(i-1)
density(i, j) = sum(i, j) / (j-i+1)
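The prefix-sum idea from the last two slides can be sketched as follows. The helper names `build_prefix_sums`, `segment_sum`, and `density` are assumptions chosen to mirror the slide's notation; positions are 1-based and inclusive as on the slides, with an extra entry prefix[0] = 0 so that segments starting at position 1 need no special case.

```python
from itertools import accumulate


def build_prefix_sums(a):
    """Return prefix where prefix[i] = a_1 + ... + a_i and prefix[0] = 0 (O(n) time)."""
    return [0] + list(accumulate(a))


def segment_sum(prefix, i, j):
    """Sum of a_i..a_j (1-based, inclusive) in O(1) time."""
    return prefix[j] - prefix[i - 1]


def density(prefix, i, j):
    """Average of a_i..a_j (1-based, inclusive) in O(1) time."""
    return segment_sum(prefix, i, j) / (j - i + 1)


# Hypothetical input, not a sequence from the slides.
prefix = build_prefix_sums([2, -5, 3, 4, -1])
print(segment_sum(prefix, 3, 4), density(prefix, 3, 4))
```

After the one-time O(n) preprocessing, any number of sum or density queries can be answered with one subtraction (and one division) each.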

9 Maximum-average segment The maximum element is the answer, since no segment can have an average larger than its largest element. It can be found in O(n) time.

10 Maximum average segments Define A(i) to be the maximum average of the segments ending at position i. How to compute A(i) efficiently?

11 Left-Skew Decomposition
Partition S into substrings S_1, S_2, …, S_k such that
–each S_i is a left-skew substring of S: the average of any suffix is always less than or equal to the average of the remaining prefix.
–density(S_1) < density(S_2) < … < density(S_k)
Compute A(i) in linear time

12 Left-Skew Decomposition Increasingly left-skew decomposition (O(n) time)
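A possible way to maintain an increasingly left-skew decomposition while scanning left to right, and to read off A(i) from it, is sketched below. It rests on two facts that follow from the definition: joining a block with a following block of no greater density yields another left-skew block, and the last block of the decomposition of a prefix is a maximum-average segment ending at the prefix's last position. The stack-based formulation, the function name `max_average_ending_at`, and the sample input are assumptions for illustration, not the slides' own procedure.

```python
def max_average_ending_at(a):
    """Return A where A[i] is the maximum average of a segment ending at i (0-based).

    The stack holds (sum, length) pairs for the blocks of the increasingly
    left-skew decomposition of the prefix read so far. Every element is pushed
    once and merged at most once, so the total time is O(n).
    """
    stack = []
    A = []
    for x in a:
        s, w = x, 1
        # Merge while the previous block is at least as dense as the current one.
        # Densities are compared by cross-multiplication to avoid division.
        while stack and stack[-1][0] * w >= s * stack[-1][1]:
            ps, pw = stack.pop()
            s, w = s + ps, w + pw
        stack.append((s, w))
        A.append(s / w)  # density of the last block
    return A


# Hypothetical input, not a sequence from the slides.
print(max_average_ending_at([1, 5, 3]))
```

For the sample input, the segments ending at the last position have averages 3, 4, and 3, so the last entry of the result is 4.0.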

13 Right-Skew Decomposition
Partition S into substrings S_1, S_2, …, S_k such that
–each S_i is a right-skew substring of S: the average of any prefix is always less than or equal to the average of the remaining suffix.
–density(S_1) > density(S_2) > … > density(S_k)
[Lin, Jiang, Chao]
–Unique
–Computable in linear time.

14 Right-Skew Decomposition Decreasingly right-skew decomposition (O(n) time)

15 Right-Skew pointers p[ ]
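The slide shows only the array name p[ ], so the following sketch assumes one common reading: p[i] is the right end of the first block in the decreasingly right-skew decomposition of the suffix starting at position i. Under that assumption the pointers can be filled in by a right-to-left scan that merges a block with its right neighbour whenever the neighbour is at least as dense. The function names and the sample input are hypothetical.

```python
def right_skew_pointers(a):
    """Return p where a[i..p[i]] is the first right-skew block of the suffix a[i..] (0-based).

    s[i] and w[i] hold the sum and length of that block. Each merge removes
    one block boundary for good, so the total work is O(n).
    """
    n = len(a)
    p, s, w = [0] * n, [0] * n, [0] * n
    for i in range(n - 1, -1, -1):
        p[i], s[i], w[i] = i, a[i], 1
        while p[i] + 1 < n:
            nxt = p[i] + 1
            # Merge when density(current block) <= density(next block);
            # cross-multiplication avoids division.
            if s[i] * w[nxt] <= s[nxt] * w[i]:
                s[i] += s[nxt]
                w[i] += w[nxt]
                p[i] = p[nxt]
            else:
                break
    return p


def right_skew_decomposition(a):
    """Split a into its decreasingly right-skew blocks by following the pointers."""
    p = right_skew_pointers(a)
    blocks, i = [], 0
    while i < len(a):
        blocks.append(a[i:p[i] + 1])
        i = p[i] + 1
    return blocks


# Hypothetical input, not a sequence from the slides.
print(right_skew_decomposition([5, 1, 3, 2, 0, 4]))
```

For the sample input the blocks are [5] and [1, 3, 2, 0, 4], with densities 5 and 2.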

16 C+G rich regions Locate a region with a high C+G ratio. Example sequence: ATGACTCGAGCTCGTCA. Measure: average C+G ratio.
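One way to turn the C+G question into a number-sequence problem is to score each C or G as 1 and every other base as 0, so that the average of a segment is its C+G ratio. The sketch below applies that encoding to the sequence on the slide and, purely for illustration, searches windows of one fixed width; the fixed width, the function names `cg_indicator` and `best_cg_window`, and the tie-breaking rule are assumptions, not the slides' method.

```python
from itertools import accumulate


def cg_indicator(seq):
    """Map each base to 1 if it is C or G, else 0."""
    return [1 if base in "CG" else 0 for base in seq.upper()]


def best_cg_window(seq, width):
    """Return (start, ratio) of the first window of the given width with the highest C+G ratio."""
    bits = cg_indicator(seq)
    if not 1 <= width <= len(bits):
        raise ValueError("width must be between 1 and len(seq)")
    prefix = [0] + list(accumulate(bits))  # prefix[i] = number of C/G in seq[:i]
    starts = range(len(bits) - width + 1)
    best = max(starts, key=lambda i: prefix[i + width] - prefix[i])
    return best, (prefix[best + width] - prefix[best]) / width


seq = "ATGACTCGAGCTCGTCA"  # sequence from the slide
print(cg_indicator(seq))
print(best_cg_window(seq, 5))
```

Each window's count comes from two prefix-sum lookups, so scanning all windows of one width takes O(n) time.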

17 Defining scores for alignment columns
infocon [Stojanovic et al., 1999]
–Each column is assigned a score that measures its information content, based on the frequencies of the letters both within the column and within the alignment.
CGGATCAT-GGA
CTTAACATTGAA
GAGAACATAGTA