Integrating Genomes
D. R. Zerbino, B. Paten, D. Haussler, Science 336, 179 (2012)
Teacher: Professor Chao, Kun-Mao
Speaker: Ho, Bin-Shenq
June 4, 2012


Outline
- Overview
- Obtaining Genomic Sequences
- Modeling Evolution of Genotype
- From Genotype to Phenotype
- Looking Ahead to Applications
- Conclusion

Overview
- Specialization in computational genomics
- Integration of genetic, molecular, and phenotypic information
- Impact on diverse fields of science
- New window into the story of life
- Draws on population genetics, phylogenetics, and human disease genetics, plus graph theory, signal processing, statistics, and computer science

Milestones
- First genome sequences: 1970s
- Bacteriophage MS2 RNA, 3,569 nucleotides long: 1976
- Computational genomics: 1980s (Smith and Waterman; Stormo et al.)
- 16-fold improvement in computational power under Moore’s law
- 10,000-fold improvement in sequencing performance in the past 8 years
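The two growth figures on this slide are easier to compare as doubling times. The sketch below assumes steady exponential growth and assumes both figures cover the same 8-year window; the function name is illustrative only.

```python
import math

def doubling_time_months(fold_change, years):
    """Months needed to double, assuming steady exponential growth."""
    doublings = math.log2(fold_change)
    return years * 12 / doublings

# Figures quoted on the slide; the shared 8-year window is an assumption.
compute = doubling_time_months(16, 8)        # 16-fold is exactly 4 doublings
sequencing = doubling_time_months(10_000, 8)

print(f"computing power doubles every {compute:.1f} months")
print(f"sequencing performance doubles every {sequencing:.1f} months")
```

Under these assumptions computing power doubles about every two years, while sequencing performance doubles in well under a year.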

Computational Genomics
- Genomic data, evolution, molecular phenotype, organismal phenotype
- DNA sequence evolving in time (history)
- Chromatin piece interacting with other molecules (mechanism)
- Gene product acting in cellular pathways affecting organisms (function)

Obtaining Genomic Sequences
- Genome assembly, given sufficient read redundancy
- Large redundant regions (repeats)
  → complex networks of read-to-read overlaps, not all of which reflect actual overlaps in the genome
  → determining which overlaps are legitimate and which are spurious
  → NP-hard problem
  → undetermined, error-prone, costly-to-finish regions
- Newer sequencing technologies with longer reads
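The effect of repeats on the read-to-read overlap network can be seen in a toy example. This is a minimal sketch, not an assembler: the genome, the reads, and the function names are all made up for illustration, and overlaps are exact suffix-prefix matches with no sequencing errors.

```python
from itertools import permutations

def overlap(a, b, min_len):
    """Length of the longest suffix of a matching a prefix of b, or 0 if below min_len."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a[-k:] == b[:k]:
            return k
    return 0

def overlap_graph(reads, min_len):
    """Directed overlap graph: (read_a, read_b) -> overlap length."""
    edges = {}
    for a, b in permutations(reads, 2):
        k = overlap(a, b, min_len)
        if k:
            edges[(a, b)] = k
    return edges

# Toy genome (made up): the 7-base repeat GATTACA occurs twice.
genome = "CCGATTACATTGGATTACAGG"
reads = ["CCGATTACA", "GATTACATTG", "TTGGATTACA", "GATTACAGG"]

edges = overlap_graph(reads, min_len=3)
for (a, b), k in sorted(edges.items()):
    print(f"{a} -> {b} (overlap {k})")
```

The true layout is the chain of reads in the order listed, but the graph also contains CCGATTACA -> GATTACAGG and TTGGATTACA -> GATTACATTG. Both come from the repeat, and their overlap length is the same as that of the genuine 7-base overlaps, so the overlap lengths alone cannot separate legitimate edges from spurious ones.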

Obtaining Genomic Sequences
- Reference-based assembly
- Tends to bias results toward the reference genome
- Newer sequencing technologies with longer reads

Modeling Evolution of Genotype
- Diversity of Genomes
- Alignment
- Phylogenetic analysis

Diversity of Genomes
- Every genome is the result of a 3.8-billion-year evolutionary journey from the origin of life
- Mostly shared and partly unique
- Single-base change: substitution, SNP
- Indel: insertion, deletion
- Tandem duplication
- Recombination
- Transposition
- Rearrangement: inversion, segmental deletion, segmental duplication, fusion, fission, translocation
- Whole genome duplication
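Several of the event types listed above can be written as simple string operations, which may help make the vocabulary concrete. This is only a sketch on a made-up sequence; the function names are illustrative, positions are 0-based with an exclusive end, and an inversion is modeled as replacing a segment with its reverse complement.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def substitute(seq, pos, base):
    """Single-base change at pos."""
    return seq[:pos] + base + seq[pos + 1:]

def insert(seq, pos, bases):
    """Insertion of bases before position pos."""
    return seq[:pos] + bases + seq[pos:]

def delete(seq, start, end):
    """Deletion of seq[start:end]."""
    return seq[:start] + seq[end:]

def tandem_duplicate(seq, start, end):
    """Copy seq[start:end] immediately after itself."""
    return seq[:end] + seq[start:end] + seq[end:]

def invert(seq, start, end):
    """Replace seq[start:end] with its reverse complement."""
    segment = seq[start:end].translate(COMPLEMENT)[::-1]
    return seq[:start] + segment + seq[end:]

seq = "ACGTTGCA"  # made-up example sequence
print(substitute(seq, 2, "T"))
print(insert(seq, 4, "GG"))
print(delete(seq, 1, 3))
print(tandem_duplicate(seq, 2, 5))
print(invert(seq, 1, 4))
```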

Diversity of Genomes
- Germline selection → evolution
- Somatic selection → cancer / immunity

Assembly and Alignment
Fig. 1. Assembly and alignment.

Alignment
- Alignment assumes derivation from a suitably recent common ancestor
- Shows what was conserved or changed during evolution from the common ancestor: substitutions, indels, segment order, copy number
- Local alignment: for conserved functional regions of more distantly related genomes
- Global / genome alignment: for genomes from closely related species
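The difference between local and global alignment can be sketched with one dynamic-programming routine. The code below follows the general Smith-Waterman (local) and Needleman-Wunsch (global) recurrences with a linear gap penalty; the scoring values and the two sequences are arbitrary choices for illustration, not taken from the paper.

```python
def alignment_score(a, b, local, match=2, mismatch=-1, gap=-2):
    """Best alignment score of a and b; local=True lets the alignment start and end anywhere."""
    # First row: zeros for local alignment, cumulative gap penalties for global.
    prev = [0 if local else j * gap for j in range(len(b) + 1)]
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0 if local else i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cell = max(diag, prev[j] + gap, curr[j - 1] + gap)
            if local:
                cell = max(cell, 0)  # a local alignment may restart at any cell
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best if local else prev[-1]

# Made-up sequences sharing one conserved block (GATTACA) in unrelated flanks.
a = "CCCCGATTACACCCC"
b = "TTGATTACATT"
local_score = alignment_score(a, b, local=True)
global_score = alignment_score(a, b, local=False)
print("local:", local_score, "global:", global_score)
```

The local score rewards only the shared block, while the global score also pays for the mismatched and unaligned flanks, which is why local alignment suits conserved regions in otherwise diverged genomes.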

Phylogenetic Analysis
- A single tree provides an explicit order of gene descent through shared ancestry
- Finding the optimal phylogeny under probabilistic or parsimony models of substitutions and indels is NP-hard
- Complicated by homologous recombination
- Goal: a tractable unified theory of genome evolution, with stochastic processes jointly describing the diversification events of genomes
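To see where the hardness comes from, it helps to separate two tasks. Scoring one given tree under parsimony is cheap (Fitch's algorithm, sketched below for a single site), whereas the number of candidate trees to search grows very quickly with the number of leaves. The species names and character states are invented for illustration, and trees are written as nested tuples.

```python
def fitch(tree, leaf_states):
    """Return (candidate ancestral states, minimum substitutions) for a rooted binary tree.

    tree is either a leaf name (str) or a (left, right) tuple.
    """
    if isinstance(tree, str):
        return {leaf_states[tree]}, 0
    left_set, left_cost = fitch(tree[0], leaf_states)
    right_set, right_cost = fitch(tree[1], leaf_states)
    common = left_set & right_set
    if common:
        return common, left_cost + right_cost
    return left_set | right_set, left_cost + right_cost + 1

def num_rooted_trees(n_leaves):
    """Number of rooted binary trees on n labeled leaves: (2n - 3)!!"""
    count = 1
    for k in range(3, 2 * n_leaves - 2, 2):
        count *= k
    return count

# One alignment column (made-up states) scored on two candidate topologies.
site = {"human": "A", "chimp": "A", "mouse": "G", "rat": "G"}
tree_1 = (("human", "chimp"), ("mouse", "rat"))
tree_2 = (("human", "mouse"), ("chimp", "rat"))
print(fitch(tree_1, site)[1], fitch(tree_2, site)[1])
print(num_rooted_trees(4), num_rooted_trees(10))
```

Each call to fitch is linear in the size of the tree; the difficulty lies in choosing among the candidate topologies, whose count rises from 15 for four leaves to tens of millions for ten.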

From Genotype to Phenotype
Fig. 2. The dynamic processes that affect and are affected by the genome.

Genomes, Mechanisms, Functions
- Active molecules of the cell, including proteins, messenger RNAs, and other functional RNAs
- Epigenetic mechanisms regulating RNA and protein production and function
- Gene regulatory networks
- Protein signaling cascades
- Metabolic pathways
- Regulatory network motifs

From Genotype to Phenotype
- Exploring the unfolding history and diversity of life
- Deriving experimental data from expanded cell culture resources for diverse species and tissues, and from newer single-cell assay methodologies
- Correlating specific segregating variants with phenotypic traits or diseases
- Identifying causal variants by complete genome analysis in related as well as unrelated cases and controls, combined with better prediction of the possible effects of genome variants
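Correlating a variant with a trait can, in its simplest form, be a test on a 2x2 table of allele counts in cases and controls. The sketch below uses a plain chi-square test with one degree of freedom and an odds ratio; the counts are invented for illustration, the function name is hypothetical, and the code assumes every cell of the table is nonzero.

```python
import math

def allelic_association(case_alt, case_ref, control_alt, control_ref):
    """Chi-square statistic (1 df), p-value, and odds ratio for a 2x2 allele-count table."""
    table = [[case_alt, case_ref], [control_alt, control_ref]]
    total = sum(map(sum, table))
    row = [sum(r) for r in table]
    col = [table[0][c] + table[1][c] for c in range(2)]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row[i] * col[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    odds_ratio = (case_alt * control_ref) / (case_ref * control_alt)
    return chi2, p_value, odds_ratio

# Invented allele counts: 60 of 200 case alleles vs 40 of 200 control alleles carry the variant.
chi2, p_value, odds_ratio = allelic_association(60, 140, 40, 160)
print(f"chi2={chi2:.2f} p={p_value:.3f} OR={odds_ratio:.2f}")
```

A single-variant test like this only measures correlation; as the slide notes, identifying causal variants requires complete genome analysis and better prediction of variant effects.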

From Genotype to Phenotype
- Constructing models of molecular phenotypes involving epigenetic state, RNA expression, and (inferred) protein levels through hidden Markov models, factor graphs, Bayesian networks, and Markov random fields
- Incorporating biological knowledge into classification and regression methods (e.g., general linear models, neural networks, and support vector machines)
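As one concrete instance of the model families named above, a hidden Markov model can segment a genome into hidden states from an observed signal. The sketch below is a standard Viterbi decoder in log space; the two states, the "high"/"low" signal, and all probabilities are invented for illustration and are not taken from the paper.

```python
import math

def viterbi(observations, states, start_p, trans_p, emit_p):
    """Most probable hidden-state path for a discrete HMM, computed in log space."""
    scores = {s: math.log(start_p[s]) + math.log(emit_p[s][observations[0]])
              for s in states}
    back = []
    for obs in observations[1:]:
        new_scores, pointers = {}, {}
        for s in states:
            # Best predecessor state for arriving in s at this position.
            prev = max(states, key=lambda p: scores[p] + math.log(trans_p[p][s]))
            new_scores[s] = (scores[prev] + math.log(trans_p[prev][s])
                             + math.log(emit_p[s][obs]))
            pointers[s] = prev
        scores = new_scores
        back.append(pointers)
    # Trace back from the best final state.
    state = max(states, key=lambda s: scores[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]

# Hypothetical two-state model of chromatin activity along consecutive genomic bins.
states = ["active", "silent"]
start_p = {"active": 0.5, "silent": 0.5}
trans_p = {"active": {"active": 0.9, "silent": 0.1},
           "silent": {"active": 0.1, "silent": 0.9}}
emit_p = {"active": {"high": 0.8, "low": 0.2},
          "silent": {"high": 0.2, "low": 0.8}}
signal = ["high", "high", "low", "low", "low"]
path = viterbi(signal, states, start_p, trans_p, emit_p)
print(path)
```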

Looking Ahead to Applications
- Genome data growing collectively from petabytes (10^15 bytes) today to exabytes (10^18 bytes) tomorrow
- Cancer diagnosis and treatment
- Immunology
- Stem cell therapy
- Agriculture
- Human prehistory study

Conclusion
- The challenge: obtaining maximum information from every sequencing experiment
- Borrowing and tying together advances from a spectrum of research fields into foundational mathematical models
- Balancing model comprehensiveness against computational efficiency
- Models to be shaped by increasing knowledge of biology

Thank You For Your Attention