Comparative genomics of 29 eutherian mammals


Kerstin Lindblad-Toh1,2, Manuel Garber1*, Or Zuk1*, Michael F. Lin1,3*, Brian J. Parker4*, Stefan Washietl3*, Pouya Kheradpour1,3*, Jason Ernst1,3*, Gregory Jordan5*, Evan Mauceli1*, Lucas D. Ward1,3*, Craig B. Lowe6,7,8*, Alisha Holloway9*, Michele Clamp1,10*, Sante Gnerre1*, Broad Institute Sequencing Platform Whole Genome Assembly Team, Kim C. Worley14, Christie L. Kovar14, Donna M. Muzny14, Richard A. Gibbs14, Baylor College of Medicine Human Genome Sequencing Center, Wesley C. Warren15, Elaine R. Mardis15, George M. Weinstock14,15, Richard K. Wilson15, Washington University Genome Center, Ewan Birney5, Elliott Margulies16, Javier Herrero5, Eric Green17, David Haussler6,8, Adam Siepel12, Nick Goldman5, Katherine S. Pollard9,18, Jakob S. Pedersen4,19, Eric S. Lander1, Manolis Kellis1,3

Author Affiliations:
1 Broad Institute of Harvard and Massachusetts Institute of Technology (MIT)
2 Science for Life Laboratory, Uppsala University
3 MIT Computer Science and Artificial Intelligence Laboratory
4 The Bioinformatics Centre, Department of Biology, University of Copenhagen
5 EMBL-EBI, Wellcome Trust Genome Campus
6 Center for Biomolecular Science and Engineering, University of California
7 Department of Developmental Biology, Stanford University
8 Howard Hughes Medical Institute
9 Gladstone Institutes, University of California
10 BioTeam Inc.
14 Human Genome Sequencing Center, Baylor College of Medicine
15 Washington University School of Medicine in Saint Louis
16 Genome Informatics Section, National Human Genome Research Institute
17 NISC Comparative Sequencing Program, NHGRI
18 Institute for Human Genetics, and Division of Biostatistics, UC San Francisco
19 Department of Molecular Medicine (MOMA), Aarhus University Hospital
Funding: This work was supported by the National Human Genome Research Institute (NHGRI), including grant U54 HG003273 (R.A.G.); National Institute of General Medical Sciences (NIGMS) grant #GM82901 (Pollard lab); the European Science Foundation (EURYI award to K.L-T.); National Science Foundation (NSF) postdoctoral fellowship award 0905968 (J.E.); NSF CAREER 0644282, NIH R01 HG004037 and the Sloan Foundation (M.K.); an Erwin Schrödinger Fellowship of the Austrian Fonds zur Förderung der Wissenschaftlichen Forschung (SW); the Gates Cambridge Trust (GJ); Novo Nordisk Foundation (BJP and JW); a Statistics Network Fellowship, Department of Mathematical Sciences, University of Copenhagen (BJP); the David and Lucile Packard Foundation (AS); the Danish Council for Independent Research | Medical Sciences (JSP); Lundbeck Foundation (JSP).

Sequencing strategy for human genome annotation
- Broad, Baylor, WashU
- Many closely related species. Maximize total branch length
- 4 → 29 mammals
- 1 → 4 substs / site
- 50 → 10 bp detection

Evolutionary signatures for genome annotation
- Protein-coding genes: Codon Substitution Frequencies; Reading Frame Conservation
- RNA structures: Compensatory changes; Silent G-U substitutions
- microRNAs: Shape of conservation profile; Structural features (loops, pairs); Relationship with 3'UTR motifs
- Regulatory motifs: Mutations preserve consensus; Increased Branch Length Score; Genome-wide conservation

Nucleotide-level measures of evolutionary constraint
- Position-specific constraint measures bias in substitution patterns
- Coverage depth higher in functional regions
- Evidence of selection against deletions in functional regions
- Detection of evolutionarily constrained elements
- Codon-specific measures of positive selection
- Gene-wide vs. punctate regions of positive selection in exons
- Chromatin states shed light on intergenic functions

Detection power. Each entry reads: estimated / kmers detectable at 5% FDR / base pairs detectable at 5% FDR.
Column labels on the slide: π log-odds (12mers), π log-odds (50mers), ω (12mers)
- 29 mammals: 7.1/1.5/4.6, 6.8/1.8/4.1, 5.7/1.1/3.8, 5.7/1.8/3.0
- Human, Mouse, Rat, Dog (HMRD): 4.2/0.0/0.0, 5.3/0.1/0.3, 4.5/0.0/0.0, 5.1/0.6/1.7

Mammalian constraint matches human SNPs
- Human SNPs match mammalian-wide twofold constraint
- Genome-wide agreement in selection vs. polymorphisms
- Position-specific constraint in human population
- Reduced heterozygosity is evidence of human selection, even in the absence of species-level conservation

Figure labels: Protein-coding genes; RNA structures; Regulatory motifs; Promoter elements (TSS, Motif; TSS, Bound; TSS, Motif); rhombomere 4 expression (Lampe et al., NAR 2008); rhombomere 2 expression (Tümpel, PNAS 2008).
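The sequencing-strategy panel ties total branch length (roughly 1 versus 4 substitutions per site) to the size of the smallest detectable element. A minimal sketch of the intuition follows, assuming a toy model that is not taken from the slides: substitutions at a neutral site follow a Poisson process, and sites evolve independently. The function names are hypothetical. This toy calculation is not expected to reproduce the slide's 50 bp and 10 bp figures, which come from the authors' own statistical analysis.

```python
import math


def p_conserved_by_chance(branch_length: float, k: int = 1) -> float:
    """Probability that k neutral sites all show zero substitutions.

    Toy assumption: substitutions per site are Poisson with mean equal to
    the total branch length (substitutions per site), and sites are
    independent, so P(no change in k sites) = exp(-branch_length * k).
    """
    if branch_length < 0 or k < 1:
        raise ValueError("branch_length must be >= 0 and k must be >= 1")
    return math.exp(-branch_length * k)


def min_window(branch_length: float, alpha: float = 0.05) -> int:
    """Smallest window size k whose chance-conservation probability is <= alpha."""
    if branch_length <= 0 or not (0 < alpha < 1):
        raise ValueError("branch_length must be > 0 and alpha in (0, 1)")
    return max(1, math.ceil(-math.log(alpha) / branch_length))


if __name__ == "__main__":
    for t in (1.0, 4.0):
        print(f"T={t}: P(single site conserved by chance) = "
              f"{p_conserved_by_chance(t):.4f}, "
              f"P(12-mer conserved by chance) = {p_conserved_by_chance(t, 12):.2e}")
```

Under this toy model, a longer total branch length makes chance conservation of a short window much less likely, which is the qualitative reason that adding species allows shorter constrained elements to be detected.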