1 Convolution and Its Applications to Sequence Analysis
Student: Bo-Hung Wu
Advisors: Professor Herng-Yow Chen and R. C. T. Lee
Department of Computer Science & Information Engineering
National Chi Nan University
2 The Definition of Convolution in the Continuous Case
Reference: Lecture notes, "Introduction to communication", R. C. T. Lee et al.
[Example figure]
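For reference, the standard textbook definition of the convolution of two continuous functions is given below. The symbols $f$, $g$, $t$ and $\tau$ are the conventional ones and are assumed here; they are not taken from the slide.

```latex
(f * g)(t) = \int_{-\infty}^{\infty} f(\tau)\, g(t - \tau)\, d\tau
```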
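4 Exact String-Matching Problem
Input. A text string $T = T_1 T_2 \ldots T_n$ and a pattern string $P = P_1 P_2 \ldots P_m$, where $T_i, P_i \in \Sigma$ (the alphabet) and $m \le n$.
Output. All locations $i$ in $T$ where $T_i T_{i+1} T_{i+2} \ldots T_{i+m-1} = P_1 P_2 \ldots P_m$.
String matching is closely related to convolution: both slide one sequence along the other and combine the aligned elements at each shift.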
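As a baseline, the problem statement can be sketched as a direct sliding-window search. The function name `find_exact_matches` is hypothetical, and the positions it returns are 0-based, whereas the slide numbers positions from 1.

```python
def find_exact_matches(text, pattern):
    """Return every 0-based start position where pattern occurs in text."""
    n, m = len(text), len(pattern)
    matches = []
    # Slide a window of length m over the text and compare it to the pattern.
    for i in range(n - m + 1):
        if text[i:i + m] == pattern:
            matches.append(i)
    return matches


print(find_exact_matches("abcabcab", "abc"))  # [0, 3]
```

This takes O(nm) character comparisons in the worst case; it is only meant to make the input and output of the problem concrete.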
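5 Convolution in the Discrete Case
Definition: Let $X = \langle x_0, x_1, \ldots, x_n \rangle$ and $Y = \langle y_0, y_1, \ldots, y_m \rangle$ be two given vectors, $x_i, y_i \in D$. Let $\otimes$ and $\oplus$ be two given functions, where $\otimes : D \times D \to E$ and $\oplus : E \times E \to E$. Then the convolution of $X$ and $Y$ with respect to $\otimes$ and $\oplus$ is $Z = \langle z_0, z_1, \ldots, z_{m+n} \rangle$, where
$$z_k = \bigoplus_{i+j=k} (x_i \otimes y_j) \quad \text{for } k = 0, 1, \ldots, m+n.$$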
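A minimal sketch of this kind of generalized convolution, assuming it takes two caller-supplied functions: one applied to each pair of elements whose indices sum to k, and one that combines the results for that k. The names `generalized_convolution`, `pair_op` and `combine_op` are hypothetical.

```python
from functools import reduce
import operator


def generalized_convolution(x, y, pair_op, combine_op):
    """Return z where z[k] combines pair_op(x[i], y[j]) over all i + j == k."""
    if not x or not y:
        return []
    z = []
    for k in range(len(x) + len(y) - 1):
        # Collect the pairwise results for every valid (i, j) with i + j == k.
        terms = [pair_op(x[i], y[k - i])
                 for i in range(len(x))
                 if 0 <= k - i < len(y)]
        # Fold the terms together with the combining function.
        z.append(reduce(combine_op, terms))
    return z


# With multiplication and addition this is the ordinary discrete convolution.
print(generalized_convolution([1, 2, 3], [4, 5], operator.mul, operator.add))
# [4, 13, 22, 15]
```

Passing different functions changes what the convolution computes without changing the loop structure, which is what makes it reusable across sequence problems.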
6 Consider the exact string-matching problem. How can we use convolution to solve it? [FP74]
First, we reverse Y.
Second, we define the two functions of the convolution as follows: [function definitions shown as a figure]
Note that this convolution performs the same comparisons as the sliding-window approach. [FP74]
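The reduction can be sketched as follows. The function definitions on the slide are not available here, so this sketch assumes one common choice: the pairwise function returns 1 when two characters are equal and 0 otherwise, and the combining function is addition. It also assumes that Y is the pattern and X is the text. The name `match_by_convolution` is hypothetical.

```python
def match_by_convolution(text, pattern):
    """Return 0-based start positions of exact matches of pattern in text."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    reversed_pattern = pattern[::-1]
    # z[k] counts the pairs (i, j) with i + j == k whose characters are equal.
    # Assumed functions: pairwise = equality indicator, combining = addition.
    z = [0] * (n + m - 1)
    for i, a in enumerate(text):
        for j, b in enumerate(reversed_pattern):
            if a == b:
                z[i + j] += 1
    # A window starting at s aligns with index k = s + m - 1 of z;
    # it is an exact match when all m aligned characters agree.
    return [s for s in range(n - m + 1) if z[s + m - 1] == m]


print(match_by_convolution("abcabcab", "abc"))  # [0, 3]
```

Reversing the pattern is what turns the constant-sum index condition of a convolution into the constant-offset alignment of a sliding window. This direct double loop is quadratic; it shows the correspondence and is not a fast implementation.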
7 Applying Convolution to Sequence Analysis
(1) The common substring problem with k mismatches allowed
(2) The problem of common substrings with k mismatches allowed among multiple sequences
(3) Determining the similarity of two DNA sequences
(4) Searching in a DNA sequence database
(5) Finding repeating groups in a DNA sequence
(6) An aid for detecting transposition
(7) An aid for detecting insertion/deletion
(8) An aid for detecting the overlapping of segments resulting from shotgun operations
(9) The corresponding pair-wise nucleotides in a DNA sequence
(10) An aid for looking for similar regions in a DNA sequence with a distance constraint
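To illustrate how mismatches can be tolerated, here is a sketch of a simpler relative of item (1): locating a pattern in a text with at most k mismatches. It is not the common-substring problem itself. It assumes a match-counting convolution (equality indicator combined by addition), and the name `k_mismatch_positions` is hypothetical.

```python
def k_mismatch_positions(text, pattern, k):
    """Return 0-based starts where pattern fits text with at most k mismatches."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    # Match-counting convolution of the text with the reversed pattern.
    z = [0] * (n + m - 1)
    for i, a in enumerate(text):
        for j, b in enumerate(reversed(pattern)):
            if a == b:
                z[i + j] += 1
    # The window starting at s has z[s + m - 1] matching characters,
    # hence m - z[s + m - 1] mismatches.
    return [s for s in range(n - m + 1) if m - z[s + m - 1] <= k]


print(k_mismatch_positions("acgtacga", "acgt", 1))  # [0, 4]
```

Setting k to 0 recovers exact matching, so the same convolution output serves both problems.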
8 The Corresponding Pair-wise Nucleotides in a DNA Sequence
Substitution rule: A → T, T → A, C → G, G → C
Example: S = "acttgacgtgaac"
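A minimal sketch of how the substitution rule might be combined with a convolution. The slide's actual procedure is not available here, so the pairing count below is an assumption: it applies the rule to S and then counts, for each index sum k, the position pairs (i, j) with i + j = k whose nucleotides are complementary. The names `complement` and `pairing_counts` are hypothetical.

```python
COMPLEMENT = {"a": "t", "t": "a", "c": "g", "g": "c"}


def complement(seq):
    """Apply the substitution rule A<->T, C<->G to every nucleotide."""
    return "".join(COMPLEMENT[ch] for ch in seq.lower())


def pairing_counts(seq):
    """Return z where z[k] counts ordered pairs (i, j), i + j == k, that pair up."""
    seq = seq.lower()
    n = len(seq)
    if n == 0:
        return []
    comp = complement(seq)
    z = [0] * (2 * n - 1)
    for i in range(n):
        for j in range(n):
            # seq[i] pairs with seq[j] when the complement of one equals the other.
            # Each unordered pair is counted twice, as (i, j) and as (j, i).
            if comp[i] == seq[j]:
                z[i + j] += 1
    return z


print(complement("acttgacgtgaac"))  # tgaactgcacttg
```

A large entry in the result marks an index sum around which many nucleotides face their complements.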
9 Experiments
We applied convolution to DNA sequences and to English compositions to measure their similarity. The experiments used the following DNA sequences as input data. (The clustering was known in advance, for evaluation.)
C1 (0-25): Hepatitis B virus
C2 (26-162): Human mitochondrion
C3 (163-1041): Other viruses
11 Experiment: Comparison of English Compositions
We applied convolution to two English compositions to detect whether they are similar.
13 Conclusion and Future Work
We have shown that several sequence-analysis applications that we identified can be solved by means of convolution.
Convolution can be used as a negative-answer filter.
On the practical side, we ran experiments, and the results confirm that this approach is feasible.
By choosing appropriate operations as the functions in the convolution, we can solve more problems related to sequence analysis. For example, we hope to apply convolution to help solve protein structure comparison.
14 Thank you.