NCBI Vector-Parasite Genomic Related Databases
Chuong Huynh
NIH/NLM/NCBI
São Paulo, Brazil
July 12, 2004

2 NCBI Did you read?
– Paper #1: Getting the Most Out of Bioinformatics Resources (Jessica Kissinger and David Roos)
– Paper #2: Parasite Genome Databases and Web-based Resources (Christiane Hertz-Fowler and Neil Hall)
– Nucleic Acids Research Database Issue (free issue)
– Nucleic Acids Research Web Server Issue (free issue)

3 NCBI Participants
– Tse-tse fly: 1
– Malaria: 4
– Tryp: 3
– Toxoplasma: 1
– Leishmania: 2
– Anopheles: 1
– TB: 3
– Dengue: 1
– No specific organism: 3

4 NCBI GOLD – Genomes OnLine Database
– Resource for complete and ongoing genome projects around the world

5 NCBI General DNA/Protein Databases
Primary DNA and Protein Databases
– EMBL/DDBJ/GenBank: nucleotide sequence submission/search/retrieval
– TrEMBL: protein sequence database of translated nucleotide sequences
– SWISS-PROT: manually annotated protein database
– TrEMBL/SWISS-PROT/PIR → UniProt
– InterPro: resource integrating protein signature databases
– CDD: conserved domain database

6 NCBI GMOD
Generic Model Organism Database Construction Set
– A joint effort by the model organism system databases WormBase, FlyBase, MGI, SGD, Gramene, Rat Genome Database, EcoCyc, and TAIR to develop reusable components suitable for creating new community databases of biology
– Based on Perl
– E.g., GiardiaDB

7 NCBI GUS-Based Genome Databases
Genomics Unified Schema
– Existing database schema for functional genomics and user interface
– Includes: GeneDB, PlasmoDB, RAD, Allgenes, TcruziDB, CryptoDB, ToxoDB
– Consistency: no need to reinvent the wheel; reusable components
– Somewhat consistent user interface; pre-canned queries via, e.g., scroll-down forms

8 NCBI GUS-Based Database

9 NCBI Databases for Vectors

10 NCBI Mosquito Resources
1. AnoBase - the Anopheles database
2. Anopheles Gene Plot CD-ROM. You should have this, but a local copy is available on the course website.
3.
4. taxid=7165
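The taxid=7165 entry above is the NCBI Taxonomy identifier for Anopheles gambiae. As a minimal sketch, and assuming the public NCBI E-utilities `esearch` endpoint rather than whatever link the original slide carried, a search restricted to that taxon could be built as follows. The function name `build_taxid_query` and its defaults are hypothetical; the code only constructs the URL and does not contact NCBI.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities search endpoint (an assumption here;
# the slide's own link is not preserved in the transcript).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_taxid_query(taxid, db="nucleotide", retmax=20):
    """Return an esearch URL restricted to one NCBI Taxonomy ID."""
    params = {
        "db": db,                          # Entrez database to search
        "term": f"txid{taxid}[Organism]",  # limit results to this taxon
        "retmax": retmax,                  # maximum number of IDs returned
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    # 7165 is the taxonomy ID listed on the slide.
    print(build_taxid_query(7165))
```

Changing `db` (for example to `protein`) would point the same taxon filter at a different Entrez database.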

11 NCBI AnoBase

12 NCBI Ensembl - Anopheles

13 NCBI Mosquito Genome WWW Server

14 NCBI Tse-tse (Glossina) Fly

15 NCBI Databases for “Parasites”

16 NCBI Nematode Genomics

17 NCBI Schistosoma mansoni

18 NCBI Leishmania major (Friedlin)

19 NCBI Trypanosoma brucei

20 NCBI Trypanosoma cruzi

21 NCBI Malaria - Plasmodium
– Malaria Genome Sequencing Project Consortium Database

22 NCBI PlasmoDB Feature
– Searching PlasmoDB via text search; find genes by location, sequence features, path assignment, or phylogenetic profile of orthologs; search and browse organellar genomes; boolean operators to combine searches; sequence search; search gene expression profiles; search pathways and cellular location
– Browse sequence features such as AT content, tandem repeats, homology to other species of Plasmodium, EST similarities, and BLAST hits
– Browse annotated features with links to detailed features and analysis
– Provides gene features such as protein features and database cross-references with RefSeq and GenBank; provides access to automatically generated links to orthologous genes from several species
– Bulk download, in various formats, of annotated and predicted gene sequences, translations, whole genomic sequences, and EST libraries
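Combining searches with boolean operators amounts to set operations on the gene lists that each search returns. The sketch below illustrates the idea only; it is not PlasmoDB's implementation, and the function name `combine` and the gene identifiers are hypothetical placeholders.

```python
def combine(left, right, op):
    """Combine two search results (sets of gene IDs) with a boolean operator."""
    if op == "AND":
        return left & right   # genes found by both searches
    if op == "OR":
        return left | right   # genes found by either search
    if op == "NOT":
        return left - right   # genes in the first search but not the second
    raise ValueError(f"unknown operator: {op}")


if __name__ == "__main__":
    # Placeholder results standing in for two independent searches.
    by_location = {"gene_A", "gene_B", "gene_C"}
    by_expression = {"gene_B", "gene_C", "gene_D"}

    print(sorted(combine(by_location, by_expression, "AND")))
    print(sorted(combine(by_location, by_expression, "NOT")))
```

Because each result is itself a set, combined results can be fed back into `combine` to build longer queries step by step.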

23 NCBI Toxoplasma gondii

24 NCBI Mycobacterium tuberculosis

25 NCBI Giardia lamblia
– 11.7 Mbp genome
– 355 contigs as of July 2004 (143 supercontigs)
– 95.93% coverage
– GiardiaDB: uses the Generic Model Organism Database (GMOD)

26 NCBI GiardiaDB

27 NCBI M. tuberculosis

28 NCBI Microbial Genomes

29 NCBI Viruses

31 NCBI Entamoeba
– E. histolytica: 7X assembly, Feb 2003
– E. terrapinae
– E. invadens: 2.8X coverage, March 2003
– E. moshkovskii

32 NCBI NCBI SNP Database Schema

