Kerstin Lindblad-Toh et al.

Presentation transcript:

A high-resolution map of human evolutionary constraint using 29 mammals
Kerstin Lindblad-Toh et al.
Presentation by: Keara Flores, Gustavo Diaz Cruz

Background
The human genome was sequenced in 2003, and researchers learned that only about 1.5% of it codes for proteins.
In 2005, Lindblad-Toh and colleagues published the domestic dog genome together with a SNP map, which allows comparative analysis of mammalian genomes.
Mouse, rat, and dog were sequenced early on and compared with the human genome.
Assisted assembly followed in 2009.

Background
Comparison of the four genomes (human, mouse, rat, and dog: HMRD) indicated that roughly 5% of the sequence is constrained (conserved) across them.
This led to a need for more mammals to be sequenced.
*Constrained: a conserved region that resists change and shows low allelic variation.

Background
Assisted assembly: improving a low-coverage de novo assembly by using genomes from related species.
Next-generation sequencing (NGS): faster and more affordable. It emerged around 2008; Illumina launched its first analyzer in 2006.
The 2x genomes used in the paper were produced over several years.
For this study, the sequencing methods were kept the same even though NGS was faster and more affordable, which kept the data set consistent.

Why sequence 29 mammals?
It had become more feasible.
The HMRD comparison could not identify small regions of constraint.
Discover class-, family-, or genus-specific mutations and sequences.
Possibly find new functional elements in the human genome.
Identify mutations that may cause disease.

Methods
Sequencing of 20 genomes, plus 9 previously sequenced genomes.
2x coverage came from paired-end reads (~1.8x) and fosmids (~0.2x).
Constraint estimation used two methods:
SiPhy-ω: rate-based method.
SiPhy-π: biased-substitution method.
Assisted assembly method: start from a de novo assembly, then use related genomes so that reads can be aligned.

Method: Assisted Assembly
De novo contigs are extended through alignment to a reference genome (green and red reads); normally these contigs cannot be joined.
A scaffold can be mapped to the genome via the green reads, which act as anchor portions.
A stringent test screens for misassembly.
How completely one can reconstruct a genome sequence from whole-genome shotgun (WGS) reads depends on the depth of sequence coverage generated [1]. Additionally, longer reads and better base quality provide more information and therefore allow any assembler to perform better, resulting in bigger contigs/scaffolds and improved assembly quality. Low coverage (either global or local) makes the assembly problem much harder, since it affects the ability both to distinguish true from false read-read alignments and to build a list of confirmed non-chimeric read pair links. Since an important step of the assembly process is to generate a set of read-read alignments, errors introduced in this step will have a major effect on the final product.
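
To make the link between coverage depth and completeness concrete, here is a minimal sketch based on the standard Lander-Waterman (Poisson) approximation. It is not taken from the slides or the paper; the function names are hypothetical, and the model assumes reads land uniformly at random on the genome.

```python
import math


def coverage_depth(num_reads: int, read_length: int, genome_size: int) -> float:
    """Average depth: total sequenced bases divided by genome size."""
    return num_reads * read_length / genome_size


def expected_fraction_covered(coverage: float) -> float:
    """Expected fraction of bases hit by at least one read.

    Assumption: read starts follow a Poisson process (Lander-Waterman),
    so a base is missed with probability exp(-coverage).
    """
    if coverage < 0:
        raise ValueError("coverage must be non-negative")
    return 1.0 - math.exp(-coverage)


if __name__ == "__main__":
    # 0.2x and 1.8x are the fosmid and paired-end contributions on the
    # Methods slide; 2.0x is their sum. 7.0x is only a comparison point.
    for depth in (0.2, 1.8, 2.0, 7.0):
        frac = expected_fraction_covered(depth)
        print(f"{depth:>4.1f}x coverage -> {frac:.1%} of bases expected covered")
```

Under this simplified model a 2x genome leaves a noticeable share of bases unsequenced, which is the gap that assisted assembly tries to bridge with help from related genomes.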

Evolutionary depth and genomic features
The frequency of genomic features varies with the number of species being compared.
The more functionally important an element is, the higher the frequency/density of that component.
AR: ancestral repeats, which evolve neutrally.
Branch length: shows the relatedness between species (placement on the evolutionary tree).
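
Why more species (more total branch length) sharpens the detection of constraint can be illustrated with a small calculation. This is a sketch under a simple neutral model, not the SiPhy method used in the paper: it assumes substitutions at a neutral site follow a Poisson process and that sites are independent. The function name and the two branch-length values are hypothetical placeholders chosen only for illustration.

```python
import math


def prob_conserved_by_chance(total_branch_length: float, window_bp: int = 1) -> float:
    """Chance that a neutral window shows no substitution anywhere on the tree.

    total_branch_length: summed branch length in substitutions per site.
    window_bp: number of adjacent sites, assumed independent.
    """
    if total_branch_length < 0 or window_bp < 1:
        raise ValueError("branch length must be >= 0 and window_bp >= 1")
    return math.exp(-total_branch_length * window_bp)


if __name__ == "__main__":
    # Hypothetical branch lengths: a shallow tree and a deep tree.
    shallow_tree, deep_tree = 0.7, 4.5
    for window in (1, 12):
        p_shallow = prob_conserved_by_chance(shallow_tree, window)
        p_deep = prob_conserved_by_chance(deep_tree, window)
        print(f"{window:>2} bp: shallow {p_shallow:.3g}, deep {p_deep:.3g}")
```

The deeper the tree, the less likely a short window stays unchanged by chance alone, so short conserved windows become statistically distinguishable from neutral sequence.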

Detection of constrained elements
a) Comparison of constrained elements in 29 mammals vs. the HMRD model and the Siepel vertebrate set.
b) Comparison of 29 mammals vs. HMRD. Blue: overlap. Grey: differences.
Siepel vertebrate set: human, mouse, rat, chicken, and fugu.
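
The overlap/difference comparison described for panel b can be sketched as an interval computation. This is an illustration only, not the authors' pipeline: elements are assumed to be half-open (start, end) coordinates on a single chromosome, and the function names and example coordinates are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open: start inclusive, end exclusive


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Sort intervals and fuse any that overlap or touch."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def overlap_bp(set_a: List[Interval], set_b: List[Interval]) -> int:
    """Count bases shared by two element sets using a two-pointer sweep."""
    a, b = merge_intervals(set_a), merge_intervals(set_b)
    i = j = total = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo < hi:
            total += hi - lo
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return total


if __name__ == "__main__":
    # Hypothetical coordinates standing in for two element sets.
    many_species = [(0, 10), (20, 30), (40, 45)]
    few_species = [(5, 25)]
    shared = overlap_bp(many_species, few_species)
    total = sum(end - start for start, end in merge_intervals(many_species))
    print(f"shared: {shared} bp; unique to first set: {total - shared} bp")
```

The shared count corresponds to the blue (overlap) portion and the remainder to the grey (difference) portion in the slide's figure.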

Uncovering of binding sites
a) The HMRD model reveals 1 constrained element.
b) The 29-mammal comparison reveals 5 new constrained elements.
c) Four of the elements: TF binding site for NPAS4.
d) Constraint could not be determined until other mammals were included in the comparison, which provided new evidence for constraint. HMRD showed no constraint because of high variation between the rat and dog sequences.

Constraint and SNPs
Constraint is observed in nucleotide bases across the mammalian genome.

Candidate structural elements
Comparison of hairpin D of MAT2A.
c) Predicted secondary structure of hairpin D.
d) Hairpin D used as a reference to align 5 other hairpin sequences in MAT2A.
e) Comparison of hairpin D across multiple vertebrates.
Structural representation of constraint in RNA in the MAT2A gene. Note the similarities among closely related vertebrates.

Degrees of Constraint in Promoters
High constraint: developmental regions.
Medium constraint: metabolic processes.
Low constraint: sensory and immune response.
Constraint is related to the importance of the function and the requirement for genetic variation.

Rapid evolution in humans may lead to disease
Insertions in non-human species are shown as yellow triangles.
Human-specific substitutions are shown in red/orange.
Rapid evolution involves 5 polymorphic sites in humans.
A high mutation rate suggests divergence in humans from the ancestral (chimpanzee) state.
Rapidly evolving sites in the 5' UTR of FGF13 are potential candidates for Börjeson-Forssman-Lehmann syndrome (BFLS) and other diseases.

Significance
Identified regions under constraint (3.5 million) that the HMRD model alone could not identify, including newly found exons.
Discovery of functional elements specific to Eutheria (placental mammals).
Possibility of finding disease-associated variants through constrained regions.
Identification of accelerated evolution in the primate and human lineages.
Predictive capability for finding functional elements.

Conclusions
More species means better resolution of constrained elements.
Comparative analysis within the same clade reveals previously undetected functional elements, including functional innovations.
This is important for discovering regulatory elements.
Constrained elements can help focus disease studies.

Looking Forward / Further Reading
The Genome 10K Project: a way forward. https://www.ncbi.nlm.nih.gov/pubmed/25689317
Bird 10K Project: http://b10k.genomics.cn/progress.html

References
Lindblad-Toh, K. et al. Genome sequence, comparative analysis and haplotype structure of the domestic dog. Nature 438, 803–819 (2005).
Gnerre, S., Lander, E. S., Lindblad-Toh, K. & Jaffe, D. B. Assisted assembly: how to improve a de novo genome assembly by using related species. Genome Biol. 10, R88 (2009). https://genomebiology.biomedcentral.com/articles/10.1186/gb-2009-10-8-r88
Illumina timeline: https://www.illumina.com/technology/next-generation-sequencing/solexa-technology.html
PAML: http://web.mit.edu/6.891/www/lab/paml.html
How BAC end sequencing works: http://www.genomenewsnetwork.org/articles/06_00/sequence_primer.shtml