Annotation of Sarcocystis neurona scaffolds Nigel Austin Turgay Ibrikci Liliana Lopez Kleine Marton Megyeri Caribbean Training Programme on Bioinformatics.


Annotation of Sarcocystis neurona scaffolds Nigel Austin Turgay Ibrikci Liliana Lopez Kleine Marton Megyeri Caribbean Training Programme on Bioinformatics January 2010

Sarcocystis neurona
Genus: Sarcocystis - parasitic protozoa that occur as sarcocysts in the muscle of mammals, birds, and reptiles.
In humans: asymptomatic.
Sarcocystis neurona causes equine protozoal myeloencephalitis (EPM).

S. neurona & Related Apicomplexa
Sarcocystis neurona, Eimeria, Neospora, Toxoplasma

Life Cycle of S. neurona

About Data
Data kindly supplied by Dr. Jessica Kissinger, who very recently acquired the genome sequence.
First analysis: 120,000 bp in 4 scaffolds.
Second analysis: 400,000 bp in 4 scaffolds.

Objectives
To annotate novel DNA sequences of S. neurona.
Detection of coding sequences by:
– comparison with other sequences in databases
NB: No reference genome or other information was available, since the sequences were novel.

Strategy for Scaffolds
BLASTX against the nr database: search of the translated sequence against protein databases.
TBLASTX against the est database: search of the translated sequence against translated nucleotide sequence databases.
Comparison in ACT with the most closely related organisms (Toxoplasma gondii and Neospora caninum).
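The "translated sequence" used by BLASTX and TBLASTX can be illustrated with a small six-frame translation sketch. This is not how the BLAST programs are implemented; it only shows the idea that a DNA query is conceptually translated in three frames on each strand before being compared at the protein level. It assumes the standard genetic code (NCBI translation table 1), and the function names are hypothetical.

```python
# Illustrative sketch only: shows the six-frame translation idea behind
# translated searches, assuming the standard genetic code (table 1).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Build the codon table from the conventional TCAG ordering.
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons; unknown codons (e.g. with N) become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Map frame labels (+1, +2, +3, -1, -2, -3) to protein strings."""
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames
```

For example, `six_frame_translations("ATGGCCTAA")["+1"]` gives `"MA*"`, where `*` marks a stop codon.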

Results – BLAST Search

Results BLAST

BLAST    DB   Start  End  Similarity  E-value   Subject
BLASTX   nr   …      …    …           …E-16     Conserved hypothetical protein, Toxoplasma gondii
BLASTX   nr   …      …    …           …E-44     Conserved hypothetical protein, Plasmodium falciparum
BLASTX   nr   "      "    44          2.00E-42  Conserved hypothetical protein, Plasmodium vivax
BLASTX   nr   "      "    41          1.00E-37  Conserved hypothetical protein, Plasmodium berghei
BLASTX   nr   "      "    40          1.00E-37  Conserved hypothetical protein, Cryptosporidium muris
BLASTX   nr   "      "    43          1.00E-22  Conserved hypothetical protein, Cryptosporidium parvum
BLASTX   nr   …      …    …           …E-33     Putative lectin domain protein, Toxoplasma gondii
BLASTX   nr   …      …    …           …E-18     Transcript GF18541, Drosophila melanogaster
BLASTX   nr   "      "    66          6.00E-17  Putative acylphosphatase, Aedes aegypti
BLASTX   nr   "      "    69          4.00E-16  Putative acylphosphatase, Toxoplasma gondii
TBLASTX  est  …      …    …           …E-08     Xenopus mRNA (cDNA library)
TBLASTX  est  "      "    51          1.00E-07  Cyprinus carpio mRNA (cDNA library)
TBLASTX  est  …      …    …           …E-10     T. gondii mRNA (cDNA library)
TBLASTX  est  …      …    …           …E-33     T. gondii mRNA (cDNA library)
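A hit table like this can also be screened programmatically. The sketch below assumes the hits were exported in the 12-column tabular format of the BLAST+ command-line tools (`-outfmt 6`); the slides do not say how the searches were run, so that format, the E-value cutoff, and the sample rows are all assumptions made for illustration.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
FIELDS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def significant_hits(tabular_text, max_evalue=1e-10):
    """Return hits with E-value <= max_evalue, best (lowest) E-value first.

    The cutoff of 1e-10 is an arbitrary illustrative default.
    """
    hits = []
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(FIELDS, row))
        hit["evalue"] = float(hit["evalue"])
        hit["pident"] = float(hit["pident"])
        if hit["evalue"] <= max_evalue:
            hits.append(hit)
    return sorted(hits, key=lambda h: h["evalue"])


# Hypothetical rows, not taken from the presentation.
EXAMPLE = (
    "scaffold1\tsubjectA\t44.0\t200\t100\t2\t1000\t1600\t10\t210\t2e-42\t170\n"
    "scaffold1\tsubjectB\t51.0\t60\t29\t0\t5000\t5180\t1\t60\t1e-07\t55\n"
)
```

With the hypothetical rows above, `significant_hits(EXAMPLE)` keeps only the `subjectA` hit, since the second row's E-value exceeds the cutoff.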

ACT Results
Match of region with a conserved gene in Neospora caninum and Toxoplasma gondii.
Figure label: Neospora caninum scaffolds

Hmmm... No genes in 400,000 bp of DNA???
And then... expertise, experience: he was able to locate a gene.

Gene Discovered!
Match of region with a conserved gene in Neospora caninum.

Discovered Gene - Gene1
The discovered gene was extended at both the 5' and 3' ends.
Start and stop codons were identified.
The protein sequence was determined.
BLAST: hypothetical protein with high similarity to one found in Neospora and Toxoplasma.
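Extending a matched region to its start and stop codons can be sketched as follows. In the presentation this was done interactively, so the function below is only a simplified illustration: it assumes a single uninterrupted reading frame on the forward strand (no introns), and the function name and coordinates are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def extend_to_orf(seq, start, end):
    """Extend an in-frame match seq[start:end] to a full open reading frame.

    Coordinates are 0-based and half-open; `start` defines the frame.
    Returns (orf_start, orf_end): orf_start is the most upstream in-frame ATG
    found before an upstream stop codon (None if none is found), and orf_end
    is the position just after the first downstream in-frame stop codon
    (None if none is found).
    """
    seq = seq.upper()

    # 3' extension: step codon by codon until the first in-frame stop.
    orf_end = None
    pos = end - ((end - start) % 3)  # align to a codon boundary
    while pos + 3 <= len(seq):
        if seq[pos:pos + 3] in STOP_CODONS:
            orf_end = pos + 3
            break
        pos += 3

    # 5' extension: step upstream until an in-frame stop, remembering the
    # most upstream ATG seen on the way.
    orf_start = None
    pos = start
    while pos >= 0:
        codon = seq[pos:pos + 3]
        if codon in STOP_CODONS:
            break
        if codon == "ATG":
            orf_start = pos
        pos -= 3

    return orf_start, orf_end
```

For a toy sequence such as `"TAGATGAAACCCGGGTTTTAACC"` with a match at positions 9 to 15, the sketch returns `(3, 21)`, spanning `ATG...TAA`. Choosing the most upstream ATG is a design choice for the illustration; the true start codon usually needs supporting evidence such as the ACT comparison.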

Gene Comparison
Match of region with a conserved gene in Neospora caninum and Toxoplasma gondii.
Neighbouring genes are not present in the scaffold.

Results – UniProt Search
Performed with Gene1

Further Protein Info
Characterize our protein product:
– Membrane protein? Regions of high hydrophobicity
– Domains and motifs
– Secondary structures

Hydrophobicity Graph
No transmembrane motifs present.
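A hydrophobicity graph of this kind is typically a sliding-window average over a per-residue hydropathy scale. The slides do not state which scale or tool was used, so the sketch below assumes the Kyte-Doolittle scale with a 19-residue window and a cutoff of 1.6, which are commonly quoted settings for spotting candidate transmembrane segments; the function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values (assumed scale; the slides name none).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein, window=19):
    """Return the mean hydropathy of every full window along the protein.

    Residues missing from the scale (e.g. 'X' or '*') are scored as 0.0.
    """
    scores = [KYTE_DOOLITTLE.get(aa, 0.0) for aa in protein.upper()]
    if len(scores) < window:
        return []
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def has_tm_candidate(protein, window=19, threshold=1.6):
    """True if any window average exceeds the threshold."""
    return any(v > threshold for v in hydropathy_profile(protein, window))
```

A protein whose profile never rises above the cutoff would be reported here as having no candidate transmembrane segment. Dedicated predictors use more elaborate models, so this is a first-pass screen only.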

Domains & Motifs

Conclusion
Various BLAST searches may assist in locating orthologous genes in other genomes.
ACT is a very useful tool for gene discovery and annotation (along with experience and expertise).
One gene (Gene1) was found in 400 kb of DNA; the scaffolds are perhaps in a gene-poor region of the genome.
Gene1 is perhaps orthologous with a gene in Toxoplasma and Neospora.
Hypothetical gene: no function has been ascribed to it.

Thank You!!!