Figure 1. P. knowlesi (top): six-frame translation showing SNAP-generated gene models (blue); contigs are depicted in alternating brown and orange. P. falciparum (bottom)

Presentation transcript:

Figure 1. P. knowlesi (top): six-frame translation showing SNAP-generated gene models (blue); contigs are depicted in alternating brown and orange. P. falciparum (bottom): as for P. knowlesi. Near-vertical red bars joining the sequences represent tblastx hits above a score threshold of 135 bits. Conservation of gene order, and to a lesser extent of exon organisation, is apparent. Yellow near-vertical bars show a break in conservation of synteny. A putative orthologue of a proposed lysophospholipase is duplicated in P. knowlesi but is in single copy in P. falciparum.

Assembly of scaffolds of P. knowlesi chromosomes for genomic comparison with P. falciparum 3D7

A. E. Berry 1, E. Adlam, S. Banda, M. A. Rajandream 1, M. Berriman 1.
1 Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, UK.

Introduction - Status of sequencing projects

Since publication of the genome of Plasmodium falciparum 3D7 in October 2002, the sequencing and analysis effort has continued with a long-term view focused on:

Human malaria: Plasmodium falciparum, Ghanaian isolate and IT strain.
Disease models: P. knowlesi (macaque), P. chabaudi (rodent), P. berghei (rodent), P. gallinaceum (avian), P. reichenowi (chimpanzee).
Biology and evolution of Plasmodium species and other apicomplexa.

Table 1. Current status of Plasmodium projects at the PSU.

Comparison of the Ghanaian clinical isolate with 3D7 is in its preliminary stages. This analysis provides an exciting opportunity to analyse the genome of a pathogen in relation to the laboratory-adapted 3D7. P. knowlesi is now at 8X coverage and entering the finishing phase. This has enabled a preliminary comparison with 3D7 and an analysis of 5 SICAvar genes (Schizont-infected cell agglutination variant antigens).

2 Analysing Plasmodium spp. genomes using 3D7 as a reference

3D7 is an important reference in the analysis of other Plasmodium spp. genomes.
Contigs can be arranged into pseudochromosomes by comparing them to 3D7 with tblastx and ordering them relative to it. This approach assumes that, since the organisms are closely related, regions of conserved gene order between them will be evident. Such regions of conserved synteny are present throughout the comparisons (Figure 1).

Figure 1. An example of regions of conserved synteny between P. falciparum and P. knowlesi. (Panel labels: P. falciparum chr7; P. knowlesi.)

However, a significant drawback of this approach is that synteny is assumed. For example, the possibility remains that a locus may show conservation of synteny with a locus on falciparum chr1 at the local level, but is in fact present on a different chromosome which is not analogous to falciparum chr1 (Figure 2).

3 Read pair information to confirm linkage of ordered contigs for accurate scaffolds of P. knowlesi chromosomes

Consideration of both sequencing and BAC end read pair information can distinguish cases where contigs that are assumed to be linked, because of their ordering against P. falciparum, are not physically linked. This is indicated by the presence of unpaired reads (Figure 2). Integrating read pair information will result in scaffolds which more accurately reflect P. knowlesi chromosomes. More confidence can then be placed in predicted breakpoints in conserved synteny, which may give insight into the molecular basis of observed phenotypes. This process will also support finishing, aid accurate gene prediction and thus speed the release of genome datasets.

Figure 2. Read pair information provides evidence for physical linkage. Hypothetical contigs of P. knowlesi (light blue and red horizontal boxes) are shown ordered against 3D7 (dark blue horizontal boxes) using tblastx. Blast hits are shown by red blocks. Matched read pairs are denoted by inward-facing black and orange arrows joined by a dotted line.
Orange matched read pairs span the boundary of two ordered contigs, providing evidence for their linkage. Unmatched read pairs are denoted by red, green, orange and violet arrows and accumulate at boundaries that are not linked. Read pair evidence can be used to map contigs, in this case suggesting that contig A and contig B should be interchanged, which would result in the read pairs becoming matched.

4 Results

An ordering process has been applied to an 8X PHRAP assembly of P. knowlesi comprising 2766 contigs (median 1.7 kb). The process first removed any contigs below 5 kb, leaving a set of 890 contigs (median 5.8 kb), which were ordered into 14 metachromosomes (Figure 3). Z % of the remaining contigs were ordered.

SNAP and projector gene prediction analysis has resulted in a set of 5186 predicted proteins. These will be presented in GeneDB (Hertz-Fowler et al.). Manual review and annotation of the SNAP/projector gene predictions is in progress; 286 have been reviewed thus far.

Figure 3. ACT view of pseudochromosome 7 compared to 3D7 chr7 (panel labels: Pkn chr7; 3D7 chr7). Ordering of contigs to generate Pkn pseudochromosome 7. Blast hits are shown by red lines joining the two sequences. 3D7 genes are shown on the six-frame translation of 3D7 chr7. Note that it is not possible to order contigs onto the subtelomeric regions or onto the internal VAR gene array.

5 Results (continued): Preliminary annotation of Schizont-Infected Cell Agglutination variant antigens (SICAvar) in P. knowlesi

SICAvar antigens have been shown to play an important role in virulence. The SICA agglutination assay demonstrated that recrudescing parasitemic waves were associated with variant phenotypes (Brown and Brown). The proteins responsible for agglutination of infected erythrocytes were later characterised as the SICAvar antigens (Howard et al., 1983). SICAvar antigens are analogous to P. falciparum erythrocyte membrane protein-1 (PfEMP1) (Leech et al., 1984; Howard et al., 1988).
A first analysis of P. knowlesi contigs has revealed four full-length SICAvar antigens.

Figure 4. ACT view of a BLASTn comparison of four contigs encoding SICAvar antigens.

Figure 5. The ACT comparison shows four SICAvar genes (red boxes). The first and second have 10 and 7 exons respectively, the third is truncated by the end of the contig, and the fourth has 12 exons. Blast hits (high-scoring pairs) between the genes are denoted by red or blue lines. The hits shown have a minimum nucleotide identity of 80%. A blue line indicates that the hit is inverted. The green region denotes the 2 kb immediately upstream of the start position.

Figure 6. Ordering places P. knowlesi contig 4778 at the right-hand telomere.

This analysis, although not conclusive, supports the hypothesis that SICAvar genes are located close to the telomeres. The right-hand end of contig 4778 has heptameric repeats resembling the telomeric heptad of 3D7 (arrow 1). Regions in the 3' UTR are similar to regions in REP20 (arrow 2). Regions within the introns of the SICAvar genes shown have similarity to regions of VAR introns and/or to regions flanking exon/intron boundaries.

Future comparison of P. knowlesi and P. falciparum telomeric/subtelomeric regions should shed light on the analogy between SICAvar and VAR genes, and on the mechanisms which generate their antigenic diversity and control their expression throughout the life cycle.
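
The observation that one end of contig 4778 carries heptameric repeats resembling the 3D7 telomeric heptad suggests a simple screen for telomere-proximal contigs. The following is a minimal sketch only, not the method used for the poster. It assumes the commonly reported P. falciparum consensus GGGTT(T/C)A for the heptad, which the poster does not state. The function names, the window size and the minimum copy number are hypothetical choices made for illustration.

```python
import re

# Assumed consensus for the P. falciparum telomeric heptad, GGGTT(T/C)A.
# The poster does not give the motif, so it is kept as a parameter.
HEPTAD = "GGGTT[TC]A"

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def longest_tandem_run(seq: str, motif: str = HEPTAD) -> int:
    """Return the largest number of back-to-back motif copies found in seq."""
    run = re.compile(f"(?:{motif})+", re.IGNORECASE)
    best = 0
    for match in run.finditer(seq):
        copies = len(re.findall(motif, match.group(), re.IGNORECASE))
        best = max(best, copies)
    return best


def telomeric_ends(seq: str, window: int = 500, min_copies: int = 3,
                   motif: str = HEPTAD) -> dict:
    """Report which contig ends carry a tandem run of the motif.

    Both strands of each end window are checked, because the repeat
    may appear as its reverse complement depending on contig orientation.
    """
    ends = {"left": seq[:window], "right": seq[-window:]}
    hits = {}
    for name, end in ends.items():
        copies = max(longest_tandem_run(end, motif),
                     longest_tandem_run(reverse_complement(end), motif))
        if copies >= min_copies:
            hits[name] = copies
    return hits


# Toy contig (invented sequence): unique filler followed by four heptads.
toy_contig = "ACGT" * 50 + "GGGTTTA" + "GGGTTCA" + "GGGTTTA" + "GGGTTTA"
print(telomeric_ends(toy_contig, window=50))
```

A contig flagged at only one end, as in the toy example, is a candidate for placement at a chromosome end. A hit of this kind is suggestive rather than conclusive and would still need support from read pair or ordering evidence.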