Generic Database: What should a genome database do?


1 Generic Database

2 What should a genome database do?
- Search
- Browse
- Collect
- Download results
- Multiple format
- Genome Browser
- Information: genomic, proteomic, literature
- Interact with other databases
- Generic
- Usable by everyone

3 GeneDB – An Overview
Aim: to provide a database to house the data from the many sequencing projects that the Sanger Institute has been involved in.
The database had to be:
- Generic: flexible enough to handle sequence from diverse organisms
- Curatable: capable of being manually edited by annotators and curators
- Intuitive and user friendly
- Capable of housing new data types; easily expandable
- Searchable: allows users complete flexibility in searching, selecting and downloading whatever information they want
- Interactive: community feedback

4 GeneDB November 2004 - Datasets (www.genedb.org)
Total number of organisms: 26
Number of protozoa: 12

Kinetoplastids
Species                    Genome size   Status                  Curated
Leishmania major           33600         In Finishing            Yes
Leishmania infantum        33600         280k reads, 5X          Yes
Leishmania braziliensis    ~33600        361k reads, 5X          Yes
Trypanosoma b. brucei      35000         In Finishing            Yes
Trypanosoma b. gambiense   ~30000        188k reads, ~5X         Yes
Trypanosoma vivax          30000         300k reads, ~6X         Yes
Trypanosoma congolense     ~30000        262k reads, ~5X         Yes
Trypanosoma cruzi          ~41000        In Finishing, 19X       No?

5 www.genedb.org

6

7
a) Basic information: on the selected gene
b) Location: the chromosome number, coordinates, gene length and a graphical map
c) Curated and/or automatic annotation
d) Predicted peptide properties: statistics on the predicted protein, known or predicted domains and motifs

8
e) Gene Ontology: annotation using the GO controlled vocabulary
f) Database cross references: linked to other public databases
g) Curated orthologs: database links to manually selected orthologous genes
h) Similarity information and the respective database links
i) Swiss-Prot annotations: for this protein, and keywords
j) Contact: feedback forms for curators and technical queries

9 Orthologs and Paralogs in GeneDB
- Tri-tryp orthologs: predicted by clustering and reciprocal BLAST
- Paralogs or families: predicted using BLASTP and TribeMCL
  - 4 BLAST e-value cutoffs
TribeMCL: Enright A.J., Van Dongen S., Ouzounis C.A.; Nucleic Acids Res. 30(7):1575-1584 (2002)
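
To illustrate the reciprocal BLAST step named on slide 9, here is a minimal sketch of reciprocal-best-hit pairing. It is not GeneDB's pipeline: the clustering step and TribeMCL are not covered, the function names are hypothetical, and the input is assumed to be (query, subject, bit score) tuples already parsed from two all-against-all BLAST runs. The gene IDs are made up.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject.

    `hits` is an iterable of (query_id, subject_id, bitscore) tuples,
    an assumed pre-parsed form of tabular BLAST output.
    """
    best = {}
    for query, subject, bitscore in hits:
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs where a's best hit is b and b's best hit is a."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Hypothetical gene IDs and scores, for illustration only.
a_vs_b = [("geneA1", "geneB1", 250.0),
          ("geneA1", "geneB2", 90.0),
          ("geneA2", "geneB2", 120.0)]
b_vs_a = [("geneB1", "geneA1", 245.0),
          ("geneB2", "geneA1", 95.0),
          ("geneB2", "geneA2", 80.0)]

pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
```

In this toy input only geneA1 and geneB1 pick each other as best hit, so geneA2 is left without a predicted ortholog.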

10 Help

11

12

13 (http://godatabase.org/cgi-bin/go.cgi?query=GO%3A0006166)

14 Sequence viewer and annotation tool

15

16 How to access data:
- keyword searching
- sequence searching / motif search
- complex querying
- browsable catalogues: product, domain
- browsable contig/chromosome maps
- GO (Gene Ontology): AmiGO across species

17

18 Searching GeneDB
- Simple Query
- Sequence search analysis
- Browse Catalogues

19 Chromosome/contig maps

20 OMNIBLAST
- Searches multiple datasets over multiple organisms
- Uses more than one BLAST algorithm if appropriate
- Produces an intermediate results page listing a summary of the top 5 hits of all searches
- If a protein sequence is used, also displays the predicted Pfam protein families found
- The full BLAST search result can be accessed from the intermediate page
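
One way to picture "more than one BLAST algorithm if appropriate" is to choose programs from the query type and the types of the target datasets. The sketch below is an assumption about how such a choice could be made, not OMNIBLAST's code. The program names are the standard BLAST family; the function names and the 90% nucleotide-letter threshold are hypothetical.

```python
# Standard BLAST programs for each (query type, database type) combination.
BLAST_PROGRAMS = {
    ("nucleotide", "nucleotide"): ["blastn", "tblastx"],
    ("nucleotide", "protein"): ["blastx"],
    ("protein", "protein"): ["blastp"],
    ("protein", "nucleotide"): ["tblastn"],
}


def guess_sequence_type(seq, threshold=0.9):
    """Guess 'nucleotide' or 'protein' from letter composition (a heuristic)."""
    letters = [c for c in seq.upper() if c.isalpha()]
    if not letters:
        raise ValueError("empty sequence")
    nucleotide_like = sum(c in "ACGTUN" for c in letters)
    return "nucleotide" if nucleotide_like / len(letters) >= threshold else "protein"


def programs_for(query, database_types):
    """List the BLAST programs that apply to a query over several datasets."""
    query_type = guess_sequence_type(query)
    programs = []
    for db_type in database_types:
        for program in BLAST_PROGRAMS[(query_type, db_type)]:
            if program not in programs:  # keep order, drop duplicates
                programs.append(program)
    return programs


protein_plan = programs_for("MKTLLVAGSW", ["protein", "nucleotide"])
dna_plan = programs_for("ATGCGTACGTTAGC", ["nucleotide", "protein"])
```

A protein query searched against both protein and nucleotide datasets would run blastp and tblastn here, which matches the idea of one submission fanning out into several searches.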

21

22

23 Complex querying

24 Complex querying with boolean search tool

25 Cross species search for nucleoside transporter
- By name or ID
- By product
- By protein domain
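
The boolean search tool from slides 23-25 can be thought of as set operations over the gene lists returned by individual searches. The following is a sketch under that assumption only. The `combine` function, the history list and all gene IDs are hypothetical and do not reflect GeneDB's actual implementation.

```python
def combine(op, left, right):
    """Combine two sets of gene IDs with a boolean operator."""
    if op == "AND":
        return left & right
    if op == "OR":
        return left | right
    if op == "NOT":  # genes in `left` that are not in `right`
        return left - right
    raise ValueError(f"unknown operator: {op}")


# Hypothetical results of three simple searches for 'nucleoside transporter'.
by_name = {"gene0001", "gene0002"}
by_product = {"gene0002", "gene0003", "gene0004"}
by_domain = {"gene0003", "gene0004", "gene0005"}

history = []  # each entry: (description, resulting gene set)

step1 = combine("OR", by_name, by_product)
history.append(("name OR product", step1))

step2 = combine("AND", step1, by_domain)
history.append(("(name OR product) AND domain", step2))

step3 = combine("NOT", step1, by_domain)
history.append(("(name OR product) NOT domain", step3))
```

Keeping each intermediate set in a history list mirrors the idea that any earlier query result can be reused or downloaded later.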

26 AmiGO – local Gene Ontology (GO) browser

27

28 Proteomics Tool
- Select the dataset
- Select restriction enzyme
- Enter peptide mass data
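
The three steps on slide 28 correspond to a peptide mass search: digest each protein in the dataset in silico, compute peptide masses, and compare them with the entered masses. Below is a minimal sketch, assuming the selected enzyme is a protease such as trypsin (cleaving after K or R, but not before P) and that the entered values are neutral monoisotopic masses. The function names and the 0.5 Da tolerance are hypothetical. This is not the GeneDB tool itself.

```python
# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free termini


def tryptic_digest(protein):
    """Split after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        nxt = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and nxt != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of a peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def match_masses(protein, observed, tolerance=0.5):
    """Return (peptide, theoretical mass, observed mass) for masses within tolerance."""
    matches = []
    for peptide in tryptic_digest(protein):
        mass = peptide_mass(peptide)
        for obs in observed:
            if abs(mass - obs) <= tolerance:
                matches.append((peptide, round(mass, 4), obs))
    return matches


fragments = tryptic_digest("AKRPGRM")
hits = match_masses("GKAR", [203.1])
```

A real search would also allow for missed cleavages, residue modifications and charged ions. These are left out to keep the sketch short.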

29 Protein motif search

30

31 Data downloads
- Any search result that gives a list
- History of any boolean queries

32 Contiguous sequence
- Generate download list by adding to gene basket

33 Leishmania major stats; Trypanosoma brucei stats

34 Gene Naming

35

36 GeneDB reference guide
Papers:
- Trends in Parasitology, 2002, 18(10):465-67
- January 2004 issue of Nucleic Acids Research
Feedback forms for technical and biological queries
More information: http://www.genedb.org/

37

