Bioinformatics Genome anatomy Comparisons of some eukaryotic genomes Alignment of long genomic sequences Comparative genomics Oxford Grid Reconstruction.

1 Bioinformatics, Lecture 16. Genome anatomy; comparisons of some eukaryotic genomes; alignment of long genomic sequences; comparative genomics; Oxford Grid; reconstruction of the ancestral mammalian karyotype.

2 Genome anatomy The anatomy of different genomes, particularly of groups as remote as eukaryotes and prokaryotes, differs very significantly. This includes genome size: there is a roughly thousand-fold difference between eukaryotes and prokaryotes and a 30-fold difference among eukaryotes. The number of genes per genome varies less: ~30,000–35,000 in humans and ~1,500–2,000 in bacterial genomes. Eukaryotic genomes are full of simple repeats, numerous types of transposable elements and other sequences. Prokaryotes have few repeats and transposable elements, and their genomes consist mainly of genes.

3 Homologous sequences: orthologous and paralogous Genes or sequences are orthologous if they trace back to a single ancestral gene through speciation rather than gene duplication, so that they represent the same genetic element in a number of species. Paralogous genes and sequences are always the result of a duplication. Orthologs and paralogs can be very similar, and discriminating between them is not always simple.
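One common heuristic for separating likely orthologs from paralogs is the reciprocal best hit: two genes are paired only if each is the other's top-scoring match across the two genomes. The slide does not name this method, so the sketch below is an illustration rather than the lecture's own procedure. The gene names and similarity scores are invented toy values.

```python
def best_hits(scores):
    """Map each query gene to its single highest-scoring subject gene.

    scores: dict {(query, subject): similarity_score}
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (gene_a, gene_b) pairs that are each other's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Hypothetical scores: h1/h2 are paralogs in species A, m1/m2 in species B.
a_vs_b = {("h1", "m1"): 250, ("h1", "m2"): 120,
          ("h2", "m2"): 240, ("h2", "m1"): 118}
b_vs_a = {("m1", "h1"): 250, ("m1", "h2"): 118,
          ("m2", "h2"): 240, ("m2", "h1"): 120}

pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(pairs)
```

The heuristic can fail after lineage-specific duplications or gene loss, which is one reason discrimination is not always simple.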

4 Taxonomic breakdown of homologous mouse proteins according to taxonomic range

5 Comparisons of some eukaryotic genomes

6

7 Comparative genomics Comparisons of very different genomes, while useful for general purposes, do not allow more detailed analysis. In contrast, comparison of genomes belonging to a relatively similar group, such as mammals, reveals some evolutionary trends. Comparative genome analysis of related species provides a powerful and general approach to identifying functional elements without previous knowledge of function. Reconstruction of an ancestral genome for a group like mammals is within reach.

8 Alignment of long genomic sequences Alignment of long sequences is an essential part of comparative genomics. PipMaker is one of a few novel programs that compute alignments of similar regions in two DNA sequences. The resulting alignments are summarized with a "percent identity plot", or "pip" for short. MultiPipMaker allows the user to see relationships among more than two sequences: all pairwise alignments with the first sequence are computed and then returned as interleaved pips. MultiPipMaker can also be asked to compute a true multiple alignment of the input sequences and return a nucleotide-level view of the results.
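The idea behind a percent identity plot can be sketched in a few lines. Assuming a pairwise alignment is already available as two equal-length gapped strings, each gap-free stretch is reduced to its coordinates in the first sequence and its percent identity, which is the kind of data a pip would draw as a horizontal bar. The function name and the toy alignment are illustrative and do not reflect PipMaker's actual code or input format.

```python
def gap_free_segments(aligned_a, aligned_b):
    """Return (start_a, end_a, percent_identity) for each gap-free run.

    Coordinates are 0-based, end-exclusive positions in the ungapped
    first sequence. '-' marks a gap in either aligned string.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    segments = []
    pos_a = 0                 # current position in ungapped sequence A
    run_start = 0
    length = matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            if length:        # a gap closes the current gap-free run
                segments.append((run_start, pos_a, 100.0 * matches / length))
                length = matches = 0
            if x != "-":      # A still advances when the gap is in B
                pos_a += 1
            continue
        if length == 0:
            run_start = pos_a
        length += 1
        matches += x == y
        pos_a += 1
    if length:
        segments.append((run_start, pos_a, 100.0 * matches / length))
    return segments


# Toy alignment, not real ApoE sequence.
segments = gap_free_segments("ACGTAC--GTAC", "ACGAACTTG-AC")
for start, end, identity in segments:
    print(f"{start:>3}-{end:<3} {identity:5.1f}%")
```

A plot would then place each segment at its position along the first sequence, with percent identity on the vertical axis.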

9 Alignment of long genomic sequences: PipMaker

10 Comparison of genome regions Dotplot of the human-mouse comparison of the ApoE region. Note the yellow for exonic sequences and red for the upstream regulatory region.
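For readers unfamiliar with dot plots, here is a minimal sketch of how one can be computed: a dot is placed at (i, j) whenever a short word starting at position i in one sequence exactly matches the word starting at position j in the other. The word length and the toy sequences are arbitrary assumptions. Tools used on real genomic regions work from scored local alignments rather than exact words.

```python
def dot_plot(seq_x, seq_y, word=3):
    """Return (i, j) pairs where seq_x[i:i+word] == seq_y[j:j+word]."""
    index = {}
    for j in range(len(seq_y) - word + 1):
        index.setdefault(seq_y[j:j + word], []).append(j)
    dots = []
    for i in range(len(seq_x) - word + 1):
        for j in index.get(seq_x[i:i + word], []):
            dots.append((i, j))
    return dots


def render(dots, len_x, len_y):
    """Draw the dots as text: x runs left to right, y runs top to bottom."""
    grid = [["." for _ in range(len_x)] for _ in range(len_y)]
    for i, j in dots:
        grid[j][i] = "*"
    return "\n".join("".join(row) for row in grid)


dots = dot_plot("ACGTACG", "TACGT", word=3)
print(render(dots, len_x=7, len_y=5))
```

Conserved stretches show up as diagonal runs of dots, and inverted segments would appear as anti-diagonals if reverse-complement words were also indexed.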

11 Identification of functional genomic sequences PipMaker representation of the same region on a scale of 50% to 100% conservation.

12 Identification of functional genomic sequences Little is known about the actual fraction of the mammalian genome that is functional, but recent estimates based on sequence conservation patterns suggest that it is at least 5%. Given that the protein-coding fraction of the genome is about 1.5%, there is a lot of room for the identification of additional functional elements. Sequence conservation does not reveal the total fraction of the functional genome but simply the fraction of the genome that has remained functional within the group of species compared. It is expected therefore that an additional fraction will be species-specific or at least lineage-specific, and not conserved across large evolutionary distances such as human and mouse or across all vertebrate lineages.

13 Conserved non-coding sequences PipMaker alignment of a gene-poor region of human chromosome 21. Blocks in red indicate regions of the human genome that are at least 100 bp long and at least 70% identical between human and mouse (Conserved Non-Genic sequences: CNGs).
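The two thresholds on this slide translate directly into a filter over aligned blocks. The sketch below assumes each block is given as human coordinates plus a percent identity, and assumes that "non-genic" is decided by excluding blocks that overlap a supplied list of genic intervals. Both the data layout and the example coordinates are hypothetical.

```python
from typing import NamedTuple


class Block(NamedTuple):
    start: int       # 0-based start on the human sequence
    end: int         # end-exclusive
    identity: float  # percent identity, human vs mouse


def overlaps(block, intervals):
    """True if the block overlaps any (start, end) interval."""
    return any(block.start < end and start < block.end
               for start, end in intervals)


def conserved_non_genic(blocks, genic_intervals,
                        min_len=100, min_identity=70.0):
    """Keep blocks of >= min_len bp and >= min_identity % outside genes."""
    return [b for b in blocks
            if b.end - b.start >= min_len
            and b.identity >= min_identity
            and not overlaps(b, genic_intervals)]


# Hypothetical aligned blocks and one hypothetical genic interval.
blocks = [Block(1000, 1150, 82.0),   # passes both thresholds
          Block(2000, 2060, 95.0),   # too short
          Block(3000, 3200, 64.5),   # too divergent
          Block(5000, 5300, 88.0)]   # overlaps the genic interval
cngs = conserved_non_genic(blocks, genic_intervals=[(4900, 6000)])
print(cngs)
```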

14 Databases and programs for genome comparisons Several databases present genome features such as genes, expressed sequence tags (ESTs), repeats, computational predictions and other information in the context of genome conservation. These include the UCSC browser (genome.ucsc.edu), Ensembl (www.ensembl.org) and NCBI (www.ncbi.nlm.nih.gov). In these databases one can find information about the level of conservation of genes, and also of non-genic regions, between a number of species.

15 Oxford Grid Each cell in the Oxford Grid represents a comparison of two chromosomes, one from each of the selected species. The number of orthologies appears inside each colored cell and the color indicates a range in the number: Grey (1), Blue (2-10), Green (11-25), Orange (26-50), Yellow (50+). Clicking on a colored cell will retrieve orthology details. Clicking on a mouse chromosome (blue numbers next to grid frame) will retrieve a comparative map showing all orthologies between the selected mouse chromosome and the comparison species displayed on the grid. Total orthologies observed between mouse and human genomes: 12,435, total mapped in both species: 12,290
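The grid described here is essentially a table of counts with a color scale. As a minimal sketch, assume the orthologies are available as (mouse chromosome, human chromosome) pairs. The pairs below are invented for illustration. The slide lists 50 under both Orange (26-50) and Yellow (50+), so the code assumes that a count of exactly 50 is orange.

```python
from collections import Counter


def oxford_grid(ortholog_pairs):
    """Count orthologies for each (mouse_chr, human_chr) cell."""
    return Counter(ortholog_pairs)


def cell_color(count):
    """Map a cell count to the color ranges listed on the slide."""
    if count <= 0:
        return None          # empty cell, no color
    if count == 1:
        return "grey"
    if count <= 10:
        return "blue"
    if count <= 25:
        return "green"
    if count <= 50:          # assumption: exactly 50 counts as orange
        return "orange"
    return "yellow"


# Hypothetical orthology pairs: (mouse chromosome, human chromosome).
pairs = [("1", "2"), ("1", "2"), ("1", "1"), ("2", "20"),
         ("2", "20"), ("2", "20"), ("11", "17")]
grid = oxford_grid(pairs)
for (mouse_chr, human_chr), count in sorted(grid.items()):
    print(mouse_chr, human_chr, count, cell_color(count))
```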

16 Oxford Grid: comparison of mouse and human genomes

17 Comparative maps: mouse chromosome 1 linkage map versus human. Figure labels: mouse chromosome 1; human orthologues located on different chromosomes.

18 Whole chromosome paints from the tammar wallaby were hybridized to chromosomes of the swamp wallaby, which has the lowest chromosome number recorded in marsupials (2n=10 in females and 11 in males). The complex origin of some chromosomes through fusions can be seen in the picture.

19 Ancestral chromosome pool (numbers correspond to human homologs): 1pq, 1q, 2pter-q13, 2q13-qter, 3, 4, 5, 6, 7pq, 7q, 8p, 8q, 10p, 10q, 11, 12pq, 12q, 13, 14, 15, 16p, 16q, 17, 18, 19p, 19q, 20, 21, 22q, 22qdist, X and Y
Whole chromosomes: 3, 5, 6, 9, 11, 13, 14, 15, 17, 18, 20, 21, X, Y
Large segments: 1q, 1pq, 2pter-q13, 2q13-qter, 7pq, 7q, 8p, 8q, 10p, 10q, 12pq, 12q, 16p, 16q, 19p, 19q, 22q, 22qdist
Neighboring segments: 3/21, 4/8p, 7q/16p, 14/15, 16q/19q, 12pq/22q, 12q/22qdist
Suggested ancestral eutherian karyotypes:
1. Chowdhary et al. 1998: 2n=48
2. Murphy et al. 2001: 2n=50
3. Yang et al. 2003: 2n=44
4. Richard et al. 2003: 2n=50
5. Fronicke et al. 2003: 2n=48
The likely ancestral mammalian (eutherian) karyotype represented as homologues of human chromosomes and the three major conserved components it comprises. The conserved components, viz., whole human chromosomes, large segments of human chromosomes and combinations of neighboring segments of human chromosomes, are commonly seen as conserved blocks in other evolutionarily diverged species.

20 Reconstruction of the putative ancestral eutherian karyotype. It was assumed that the ancestor had 23 pairs of chromosomes, and human chromosomes were superimposed on them.

21 Segments and blocks >300 kb in size with conserved synteny in human are superimposed on the mouse genome. Each colour corresponds to a particular human chromosome. The 342 segments are separated from each other by thin, white lines within the 217 blocks of consistent colour.

22 Dot plots of conserved syntenic segments in three human and three mouse chromosomes. For each of three human (a–c) and mouse (d–f) chromosomes, the positions of orthologous landmarks are plotted along the x axis and the corresponding position of the landmark on chromosomes in the other genome is plotted on the y axis. Different chromosomes in the corresponding genome are differentiated with distinct colours. In a remarkable example of conserved synteny, human chromosome 20 (a) consists of just three segments from mouse chromosome 2 (d), with only one small segment altered in order. Human chromosome 17 (b) also shares segments with only one mouse chromosome (11) (e), but the 16 segments are extensively rearranged. However, most of the mouse and human chromosomes consist of multiple segments from multiple chromosomes, as shown for human chromosome 2 (c) and mouse chromosome 12 (f). Circled areas and arrows denote matching segments in mouse and human.
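The distinction between segments and blocks can be illustrated with a greedy scan over orthologous landmarks. The working definitions here are assumptions for the sketch: a segment is a run of landmarks on one chromosome of the other genome whose positions are consistently increasing or decreasing, and a block is a run on one chromosome regardless of order. The landmark data are invented. Real analyses also apply size thresholds and tolerate noisy landmarks, which this sketch does not.

```python
def syntenic_segments(landmarks):
    """Split landmarks into runs on one chromosome with a consistent order.

    landmarks: list of (other_chr, other_pos), already sorted by position
    on the reference chromosome.
    """
    segments, current, direction = [], [], 0
    for chrom, pos in landmarks:
        if current:
            prev_chrom, prev_pos = current[-1]
            step = (pos > prev_pos) - (pos < prev_pos)   # +1, -1 or 0
            if chrom == prev_chrom and step != 0 and direction in (0, step):
                direction = step
            else:                      # chromosome or order changed
                segments.append(current)
                current, direction = [], 0
        current.append((chrom, pos))
    if current:
        segments.append(current)
    return segments


def syntenic_blocks(landmarks):
    """Group consecutive landmarks that share a chromosome, ignoring order."""
    blocks = []
    for chrom, pos in landmarks:
        if blocks and blocks[-1][0][0] == chrom:
            blocks[-1].append((chrom, pos))
        else:
            blocks.append([(chrom, pos)])
    return blocks


# Hypothetical landmarks along one human chromosome: (mouse chr, mouse pos).
landmarks = [("2", 10), ("2", 20), ("2", 30), ("2", 60),
             ("2", 50), ("2", 40), ("11", 5), ("11", 6)]
segments = syntenic_segments(landmarks)
blocks = syntenic_blocks(landmarks)
print(len(segments), "segments in", len(blocks), "blocks")
```

Because the scan is greedy, the landmark at a breakpoint (position 60 here) is assigned to the preceding run, so segment boundaries are approximate.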

23

