1
Algorithms for Biological Sequence Analysis
Kun-Mao Chao (趙坤茂)
Department of Computer Science and Information Engineering
National Taiwan University, Taiwan
Date: Feb. 22, 2011
WWW:
2
About this course
Course: Algorithms for biological sequence analysis
Some basic knowledge of algorithm development and program design is required.
We will focus on sequence-related algorithmic problems.
Genomic sequences are our main target.
The oldest language
The largest program
Spring semester, 2011
9: :10 Tuesday, 107 CSIE Building
3 credits
Web site:
3
Coursework
Homework assignments and class participation (10%)
Two midterm exams (70%; 35% each): April 12, 2011 and May 24, 2011 (both tentative)
Oral presentation of selected papers (20%)
4
Outline
Part I: Sequence Homology
  Introduction to basic algorithmic strategies
  Pairwise sequence alignment
  Multiple sequence alignment
  Chaining algorithms for genomic sequence analysis
  Suboptimal alignment
  Comparative genomics
  Compressed / constrained sequence comparison
  Hidden Markov models (the Viterbi algorithm and others)
Part II: Sequence Composition
  Maximum-sum and maximum-density segments
  SNP and haplotype data analysis
  Approximate gapped palindromes
  Genome annotation
  Other advanced topics
5
A Brief History of Genetics
1859 Charles Darwin published “The Origin of Species.”
1865 Genes are particulate factors. [Gregor Mendel]
1869 Discovery of nucleic acid. [Friedrich Miescher]
1903 Chromosomes are hereditary units. [Walter Sutton]
1910 Genes lie on chromosomes. [Thomas Hunt Morgan]
1913 Chromosomes are linear arrays of genes. [Alfred Sturtevant]
6
A Brief History of Genetics (cont’d)
1931 Recombination occurs by crossing over. [Harriet Creighton and Barbara McClintock]
1944 DNA is the genetic material. [Oswald Avery, Colin MacLeod, and Maclyn McCarty]
1953 DNA is a double helix. [James Watson and Francis Crick]
Genetic code is triplet. [Marshall Nirenberg, Har Gobind Khorana, Sydney Brenner, and Francis Crick]
1977 DNA was sequenced for the first time. [Fred Sanger, Walter Gilbert, and Allan Maxam]
21st century: Many genomes completely sequenced
7
Milestones of Bioinformatics
1962 Pauling's theory of molecular evolution
1965 Margaret Dayhoff's Atlas of Protein Sequences
1970 Needleman-Wunsch algorithm
1977 DNA sequencing and software to analyze it (Staden)
1981 Smith-Waterman algorithm developed
1981 The concept of a sequence motif (Doolittle)
1982 GenBank Release 3 made public
1982 Phage lambda genome sequenced
8
Milestones of Bioinformatics (cont’d)
1983 Sequence database searching algorithm (Wilbur-Lipman)
1985 FASTP/FASTN: fast sequence similarity searching
1988 National Center for Biotechnology Information (NCBI) created at NIH/NLM
1988 EMBnet network for database distribution
1990 BLAST: fast sequence similarity searching
1991 EST: expressed sequence tag sequencing
1993 Sanger Centre, Hinxton, UK
1994 EMBL European Bioinformatics Institute, Hinxton, UK
9
Milestones of Bioinformatics (cont’d)
1995 First bacterial genomes completely sequenced
1996 Yeast genome completely sequenced
1997 PSI-BLAST
1998 Worm (multicellular) genome completely sequenced
1999 Fly genome completely sequenced
10
Milestones of Bioinformatics (cont’d)
Human Genome Project ( )
Mouse 2002
Rat 2004
Chimpanzee 2005
Completed Genomes
11
Chimpanzee Genome
12
The Primate Family Tree
Source: Nature
13
A Sequence Analysis Book Published by Springer in 2009 (ISBN 978-1848003194)
Sequence Comparison: Theory and Methods by Kun-Mao Chao and Louxin Zhang