Algorithms for Biological Sequence Analysis Kun-Mao Chao ( 趙坤茂 ) Department of Computer Science and Information Engineering National Taiwan University,

1 Algorithms for Biological Sequence Analysis Kun-Mao Chao ( 趙坤茂 ) Department of Computer Science and Information Engineering National Taiwan University, Taiwan Date: September 19, 2006 E-mail: WWW:

2 About this course
Course: Algorithms for biological sequence analysis
We will focus on sequence-related algorithmic problems. Genomic sequences are our main target.
–The oldest language
–The largest program
Fall semester, 2006
Tuesday 10:20 – 13:10, 111 CSIE Building
3 credits
Web site:

3 Coursework:
Homework assignments and class participation (15%)
Two midterm exams (30% each):
–November 7, 2006 (tentatively)
–December 19, 2006 (tentatively)
Oral presentation of selected papers (25%)

4 Outline
Part I: Sequence Homology
–Introduction to genomes
–Dynamic programming strategy revisited
–Pairwise sequence alignment
–Multiple sequence alignment
–Chaining algorithms for genomic sequence analysis
–Suboptimal alignment
–Comparative genomics
–Hidden Markov models (the Viterbi algorithm et al.)
Part II: Sequence Composition
–Maximum-sum and maximum-density segments
–SNP and haplotype data analysis
–Genome annotation
–Other advanced topics

5 A Brief History of Genetics
1859 Darwin publishes The Origin of Species
1865 Genes are particulate factors
1871 Discovery of nucleic acid
1903 Chromosomes are hereditary units
1910 Genes lie on chromosomes
1913 Chromosomes are linear arrays of genes
1931 Recombination occurs by crossing over

6 A Brief History of Genetics (cont’d)
1944 DNA is the genetic material
1945 A gene codes for protein
1951 First protein sequence
1953 DNA is a double helix
1961 Genetic code is triplet
1977 Eukaryotic genes are interrupted
1977 DNA can be sequenced
21st Century: Many genomes completely sequenced

7 Milestones of Bioinformatics
1962 Pauling's theory of molecular evolution
1965 Margaret Dayhoff's Atlas of Protein Sequences
1970 Needleman-Wunsch algorithm
1977 DNA sequencing and software to analyze it (Staden)
1981 Smith-Waterman algorithm developed
1981 The concept of a sequence motif (Doolittle)
1982 GenBank Release 3 made public
1982 Phage lambda genome sequenced

8 Milestones of Bioinformatics (cont’d)
1983 Sequence database searching algorithm (Wilbur-Lipman)
1985 FASTP/FASTN: fast sequence similarity searching
1988 National Center for Biotechnology Information (NCBI) created at NIH/NLM
1988 EMBnet network for database distribution
1990 BLAST: fast sequence similarity searching
1991 EST: expressed sequence tag sequencing
1993 Sanger Centre, Hinxton, UK
1994 EMBL European Bioinformatics Institute, Hinxton, UK

9 Milestones of Bioinformatics (cont’d)
1995 First bacterial genomes completely sequenced
1996 Yeast genome completely sequenced
1997 PSI-BLAST
1998 Worm (multicellular) genome completely sequenced
1999 Fly genome completely sequenced

10 Milestones of Bioinformatics (cont’d)
Human Genome Project (1990-2003)
Mouse 2002
Rat 2004
Chimpanzee 2005
Completed Genomes

11 Chimpanzee Genome

12 The Primate Family Tree
Source: Nature

13 Source: My niece’s email

14 Source: My niece’s email

15 Source: My niece’s email

16 Count every "F" in the following text:
FINISHED FILES ARE THE RE
SULT OF YEARS OF SCIENTI
FIC STUDY COMBINED WITH THE EXPERIENCE OF YEARS...
Source: My niece’s email

17 Olny srmat poelpe can raed tihs. cdnuolt blveiee taht I cluod aulaclty uesdnatnrd waht I was rdanieg. The phaonmneal pweor of the hmuan mnid, aoccdrnig to a rscheearch at Cmabrigde Uinervtisy, it deosn't mttaer in waht oredr the ltteers in a wrod are, the olny iprmoatnt tihng is taht the frist and lsat ltteer be in the rghit pclae. The rset can be a taotl mses and you can sitll raed it wouthit a porbelm.
Source: My niece’s email

18 “Discovery is to see what everyone else has seen, but think what no one else has thought.”
Albert Szent-Györgyi (The Nobel Prize in Physiology or Medicine, 1937)
“By inventing elegant software tools, we can help biologists see and think.”
“Invention  Discovery”
Kun-Mao Chao

19 Source: My niece’s email
