Algorithms for Biological Sequence Analysis

Algorithms for Biological Sequence Analysis
Kun-Mao Chao (趙坤茂)
Department of Computer Science and Information Engineering
National Taiwan University, Taiwan
Date: Feb. 22, 2011
WWW: http://www.csie.ntu.edu.tw/~kmchao

About this course
Course: Algorithms for Biological Sequence Analysis
Some basic knowledge of algorithm development and program design is required.
We will focus on sequence-related algorithmic problems.
Genomic sequences are our main target:
  The oldest language
  The largest program
Spring semester, 2011
9:10 - 12:10, Tuesday, 107 CSIE Building
3 credits
Web site: http://www.csie.ntu.edu.tw/~kmchao/seq11spr

Coursework:
Homework assignments and class participation (10%)
Two midterm exams (70%; 35% each):
  April 12, 2011 (tentative)
  May 24, 2011 (tentative)
Oral presentation of selected papers (20%)

Outline
Part I: Sequence Homology
  Introduction to basic algorithmic strategies
  Pairwise sequence alignment
  Multiple sequence alignment
  Chaining algorithms for genomic sequence analysis
  Suboptimal alignment
  Comparative genomics
  Compressed / constrained sequence comparison
  Hidden Markov models (the Viterbi algorithm et al.)
Part II: Sequence Composition
  Maximum-sum and maximum-density segments
  SNP and haplotype data analysis
  Approximate gapped palindrome
  Genome annotation
  Other advanced topics

A Brief History of Genetics
1859 Charles Darwin published “On the Origin of Species.”
1865 Genes are particulate factors. [Gregor Mendel]
1869 Discovery of nucleic acid [Friedrich Miescher]
1903 Chromosomes are hereditary units. [Walter Sutton]
1910 Genes lie on chromosomes. [Thomas Hunt Morgan]
1913 Chromosomes are linear arrays of genes. [Alfred Sturtevant]

A Brief History of Genetics (cont’d)
1931 Recombination occurs by crossing over. [Harriet Creighton and Barbara McClintock]
1944 DNA is the genetic material. [Oswald Avery, Colin MacLeod and Maclyn McCarty]
1953 DNA is a double helix. [James Watson and Francis Crick]
1961-1967 The genetic code is a triplet code. [Marshall Nirenberg, Har Gobind Khorana, Sydney Brenner and Francis Crick]
1977 DNA was sequenced for the first time. [Fred Sanger, Walter Gilbert, and Allan Maxam]
21st century: Many genomes completely sequenced

Milestones of Bioinformatics
1962 Pauling's theory of molecular evolution
1965 Margaret Dayhoff's Atlas of Protein Sequence and Structure
1970 Needleman-Wunsch algorithm
1977 DNA sequencing and software to analyze it (Staden)
1981 Smith-Waterman algorithm developed
1981 The concept of a sequence motif (Doolittle)
1982 GenBank Release 3 made public
1982 Phage lambda genome sequenced

Milestones of Bioinformatics (cont’d)
1983 Sequence database searching algorithm (Wilbur-Lipman)
1985 FASTP/FASTN: fast sequence similarity searching
1988 National Center for Biotechnology Information (NCBI) created at NIH/NLM
1988 EMBnet network for database distribution
1990 BLAST: fast sequence similarity searching
1991 EST: expressed sequence tag sequencing
1993 Sanger Centre, Hinxton, UK
1994 EMBL European Bioinformatics Institute, Hinxton, UK

Milestones of Bioinformatics (cont’d)
1995 First bacterial genomes completely sequenced
1996 Yeast genome completely sequenced
1997 PSI-BLAST
1998 Worm (multicellular) genome completely sequenced
1999 Fly genome completely sequenced

Milestones of Bioinformatics (cont’d)
Completed genomes:
  Human Genome Project (1990-2003)
  Mouse: 2002
  Rat: 2004
  Chimpanzee: 2005

Chimpanzee Genome

The Primate Family Tree (Source: Nature)

A Sequence Analysis Book
Sequence Comparison: Theory and Methods, by Kun-Mao Chao and Louxin Zhang
Published by Springer in 2009 (ISBN 978-1848003194)