Kerstin Lindblad-Toh et al.

1 Kerstin Lindblad-Toh et al.
A high-resolution map of human evolutionary constraint using 29 mammals
Kerstin Lindblad-Toh et al.
Presentation by: Keara Flores, Gustavo Diaz Cruz

2 Background
The human genome was sequenced in 2003, and researchers learned that about 1.5% of it codes for proteins.
In 2005, Lindblad-Toh and colleagues published the domestic dog genome along with a SNP map, which allows comparative analysis of mammalian genomes.
Mouse, rat, and dog were sequenced early on and compared with the human genome.
Assisted assembly: 2009

3 Background
Comparison of the 4 genomes (HMRD: human, mouse, rat, dog) revealed about 5% overlap in constrained/conserved sequence.
This led to a need for more mammals to be sequenced.
*Constrained = a conserved region that is resistant to change and shows low allelic variation.

4 Background
Assisted assembly: improving a low-coverage de novo assembly by using genomes from related species.
Next-generation sequencing (NGS): faster and more affordable. It emerged around 2008; Illumina launched its first analyzer in 2006.
The 2x genomes used in the paper were produced over several years.
For the purposes of this experiment, the sequencing methods were kept the same for consistency, even though NGS was faster and more affordable.

5 Why sequence 29 mammals?
More feasible at this time
HMRD comparison could not identify small regions of constraint
Discover class-, family-, or genus-specific mutations and sequences
Possibility of finding new functional elements in the human genome
Identify mutations that may cause disease
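
A rough way to see why a four-species comparison cannot resolve small constrained regions: if substitutions at a neutral site are modelled as a Poisson process, the chance that a short neutral element shows no change anywhere on the tree shrinks exponentially with the tree's total branch length. The sketch below assumes that simple model and independent sites. The branch lengths (0.5 and 4.0 substitutions per site) and the function name are hypothetical illustrations, not values from the slides or the paper.

```python
import math


def prob_unchanged_by_chance(element_length_bp: int, total_branch_length: float) -> float:
    """Probability that a neutral element shows no substitution on any branch.

    Assumption: substitutions per site follow a Poisson distribution whose mean
    is the total branch length of the tree (substitutions per site), and sites
    evolve independently. Both are simplifications.
    """
    return math.exp(-element_length_bp * total_branch_length)


if __name__ == "__main__":
    # Hypothetical total branch lengths, chosen only to contrast a shallow
    # tree (few species) with a deep tree (many species).
    trees = {"few species": 0.5, "many species": 4.0}
    for label, branch_length in trees.items():
        for length_bp in (12, 50):
            p = prob_unchanged_by_chance(length_bp, branch_length)
            print(f"{label:12s} {length_bp:3d} bp: P(unchanged by chance) = {p:.2e}")
```

With a deeper tree, a short element that is perfectly conserved is far less likely to be a chance event, so shorter constrained elements become distinguishable from neutral background.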

6 Methods
Sequencing of 20 genomes + 9 previously sequenced
2x coverage from paired-end reads (~1.8x) and fosmids (~0.2x)
Constraint estimation using:
  SiPhy-ω: rate-based method
  SiPhy-π: biased-substitution method
Assisted assembly method:
  De novo assembly
  Reads can be aligned using related genomes
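
To put the 2x figure in perspective, the sketch below computes sequencing depth from read counts and applies the idealized Lander-Waterman estimate that a fraction 1 - e^(-c) of bases is covered at depth c. It assumes reads land uniformly at random, which real libraries only approximate. The function names are illustrative; the ~1.8x and ~0.2x split is taken from the slide.

```python
import math


def expected_coverage(num_reads: int, read_length_bp: int, genome_size_bp: int) -> float:
    """Average sequencing depth: total sequenced bases divided by genome size."""
    return num_reads * read_length_bp / genome_size_bp


def fraction_covered(coverage: float) -> float:
    """Expected fraction of bases covered by at least one read.

    Assumption: reads start at uniformly random positions (Lander-Waterman
    style Poisson model), so the uncovered fraction is exp(-coverage).
    """
    return 1.0 - math.exp(-coverage)


if __name__ == "__main__":
    paired_end_depth = 1.8  # from the slide (~1.8x)
    fosmid_depth = 0.2      # from the slide (~0.2x)
    total_depth = paired_end_depth + fosmid_depth
    print(f"total depth: {total_depth:.1f}x")
    print(f"expected fraction of bases covered: {fraction_covered(total_depth):.3f}")
```

Under this model a 2x genome leaves a noticeable share of bases unsequenced, which is one reason assembly at this depth benefits from help from a related genome.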

7 Method: Assisted Assembly
Extension of a de novo contig through alignment to a reference genome (green and red reads); normally these could not be joined.
The scaffold can be mapped to the genome via the green reads.
Mapping of the scaffold to the genome via anchor portions.
Stringent test for misassembly.
How completely one can reconstruct a genome sequence from whole-genome shotgun (WGS) reads depends on the depth of sequence coverage generated [1]. Additionally, longer reads and better base quality provide more information and therefore allow any assembler to do a better job, producing bigger contigs/scaffolds and a higher-quality assembly. Low coverage (either global or local) makes the assembly problem much harder, since it limits the ability both to distinguish true from false read-read alignments and to build a list of confirmed non-chimeric read-pair links. Since an important step of the assembly process is to generate a set of read-read alignments, errors introduced in this step will have a major effect on the final product.
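
The idea of using a related genome to join pieces that the reads alone cannot connect can be illustrated with a toy example. This is not the assisted assembly algorithm from Gnerre et al.; it is a deliberately simplified sketch that assumes each contig already has a single trusted placement on a reference, and it omits the stringent misassembly test entirely. The `Placement` type, `chain_contigs` function, and `max_gap` threshold are all hypothetical.

```python
from typing import List, NamedTuple


class Placement(NamedTuple):
    """Hypothetical record: where a de novo contig aligns on a related genome."""
    contig: str
    ref_start: int
    ref_end: int


def chain_contigs(placements: List[Placement], max_gap: int = 1000) -> List[List[str]]:
    """Order contigs by reference position and chain near neighbours.

    Contigs whose placements lie within max_gap bp of the current chain's
    end are put in the same chain; a larger gap starts a new chain.
    """
    chains: List[List[str]] = []
    chain_end = 0
    for p in sorted(placements, key=lambda pl: pl.ref_start):
        if chains and p.ref_start - chain_end <= max_gap:
            chains[-1].append(p.contig)
            chain_end = max(chain_end, p.ref_end)
        else:
            chains.append([p.contig])
            chain_end = p.ref_end
    return chains


placements = [
    Placement("c3", 5200, 6100),
    Placement("c1", 0, 900),
    Placement("c2", 1100, 2000),
]

if __name__ == "__main__":
    print(chain_contigs(placements))
```

A real method has to guard against rearrangements between the two species, which is why the slide stresses a stringent test for misassembly before any join is accepted.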

8 Evolutionary depth and genomic features
The frequency of genomic features varies with the number of species being compared.
The more functionally important an element is, the higher the frequency/density of that component.
AR (ancestral repeats): neutrally evolving repeats
Branch length: shows the relatedness between species (placement on the evolutionary tree)

9 Detection of constrained elements
a) Comparison of constrained elements in 29 mammals vs. the HMRD model + Siepel vertebrates
b) Comparison between 29 mammals and HMRD
Blue: overlap
Grey: differences
Siepel vertebrates: human, mouse, rat, chicken, and fugu

10 Uncovering of binding sites
a) HMRD model: reveals 1 constrained element
b) 29 mammals: reveals 5 new constrained elements
c) 4 of the elements: TF binding site for NPAS4
d) Could not be determined until other mammals were included in the comparison, which provided new evidence for constraint. HMRD showed no constraint due to high variation between the rat and dog models.

11 Constraint and SNPs
Constraint is observed at the level of nucleotide bases across the mammalian genome.

12 Candidate structural elements
Comparison of hairpin D of MAT2A.
c) Predicted secondary structure of hairpin D
d) Hairpin D used as reference to align 5 other hairpin sequences in MAT2A
e) Comparison of hairpin D across multiple vertebrates
Structural representation of constraint in RNA in the MAT2A gene. Note the similarities in closely related vertebrates.

13 Degrees of Constraint in Promoters
High constraint: developmental regions
Medium: metabolic processes
Low: sensory/immune response
Constraint is related to importance of function and the requirement for genetic variation.

14 Rapid evolution in humans may lead to disease
Insertions in non-human species are shown as yellow triangles
Human-specific substitutions are shown in red/orange
Rapid evolution
Involvement of 5 polymorphic sites in humans.
A high mutation rate suggests divergence in humans from the ancestral (chimpanzee) state.
Rapidly evolving regions in the 5' UTR of FGF13 are potential candidates for BFLS (Börjeson-Forssman-Lehmann syndrome) and other diseases.

15 Significance
Identified regions under constraint (3.5 million) that the HMRD model alone could not identify
Discovery of functional elements specific to Eutheria (placental mammals)
Possibility of finding disease-associated variants through constrained regions
Identified accelerated evolution in primate and human lineages
Predictive capability for finding functional elements
Newly found exons and newly found regions under constraint that were unidentifiable with the HMRD data set

16 Conclusions
More species = better resolution of constrained elements
Comparative analysis within the same clade reveals previously undetected functional elements, including functional innovations
Important for discovering regulatory elements
Constrained elements can help focus disease studies

17 Looking Forward / Further Reading
The Genome 10K Project: a way forward.
Bird 10K Project

19 Reference
Lindblad-Toh, K. et al. Genome sequence, comparative analysis and haplotype structure of the domestic dog. Nature 438, 803–819 (2005).
Gnerre, S., Lander, E. S., Lindblad-Toh, K. & Jaffe, D. B. Assisted assembly: how to improve a de novo genome assembly by using related species. Genome Biol. 10, R88 (2009).
Illumina timeline:
PAML- How BAC end sequencing works
Assisted assembly paper

