Published by Caren Higgins. Modified over 9 years ago.
1
Web Apollo and the VectorBase user community
Gloria I. Giraldo-Calderón
March 31, 2015
2
Outline
●Gene annotation
o Gene automatic annotations
o Gene manual annotation and metadata
o Basics: A good vs. a bad gene model
o Why do we need gene manual annotations and gene metadata?
●Why did we replace the Community Manual Annotation (CAP) with Web Apollo (WA)?
o Offline vs. online
o Advantages vs. disadvantages
●How do we interact with WA developers and outreach representatives?
●How do we get the community to submit data?
3
Gene annotation
4
VectorBase gene “automatic” annotations
-Scaffolds or supercontigs (gap: 100 Ns)
-Mapping (optional; not possible with bioinformatics, must be experimental)
-Gene prediction: evidence based (BLAST), ab initio (SNAP), experimental evidence (ESTs, RNAseq, protein or peptide sequencing)
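A minimal sketch of how the gaps in a scaffold could be located, assuming the “gap 100 Ns” label means that gaps between contigs are written as runs of 100 N characters. The function names and the toy sequence are hypothetical and are not part of any VectorBase tool.

```python
import re

# Assumption: a gap is a run of at least `min_len` N characters (100 on the slide).
def find_gaps(scaffold: str, min_len: int = 100) -> list:
    """Return (start, end) offsets (0-based, end exclusive) of each gap."""
    pattern = re.compile(r"[Nn]{%d,}" % min_len)
    return [(m.start(), m.end()) for m in pattern.finditer(scaffold)]

def split_contigs(scaffold: str, min_len: int = 100) -> list:
    """Split a scaffold into its contigs at every gap."""
    pieces = re.split(r"[Nn]{%d,}" % min_len, scaffold)
    return [p for p in pieces if p]  # drop empty pieces at the scaffold ends

# Toy scaffold: two very short "contigs" joined by a 100-N gap.
toy_scaffold = "ACGT" + "N" * 100 + "GGCC"
gaps = find_gaps(toy_scaffold)
contigs = split_contigs(toy_scaffold)
```

A gene model that spans such a gap is a natural candidate for manual review, since the sequence inside the gap is unknown.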
5
Gene “manual” annotation and metadata
6
Gene manual annotation and metadata
7
Metadata
-VectorBase gene ID (e.g., AGAP000002)
-Organism (species) (e.g., Anopheles gambiae)
-Symbol (e.g., para)
-Synonym (e.g., kdr, VSC)
-Description (e.g., voltage-gated sodium channel)
-Comments/notes (e.g., truncated gene, other part on scaffold xxx)
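The metadata fields above could be held in a small record like the one sketched here. The class name, field names, and the `missing_fields` helper are hypothetical, not VectorBase's schema. The example values come from the slide's separate “e.g.” entries and are combined only for illustration; they are not asserted to describe one real gene record.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class GeneMetadata:
    """Hypothetical container for the metadata fields listed on the slide."""
    gene_id: str                      # VectorBase gene ID
    organism: str                     # species name
    symbol: str = ""
    synonyms: List[str] = field(default_factory=list)
    description: str = ""
    comments: str = ""

    def missing_fields(self) -> List[str]:
        """Names of the optional fields that are still empty."""
        optional = ("symbol", "synonyms", "description", "comments")
        return [name for name in optional if not getattr(self, name)]

# Illustrative record assembled from the slide's example values.
record = GeneMetadata(
    gene_id="AGAP000002",
    organism="Anopheles gambiae",
    symbol="para",
    synonyms=["kdr", "VSC"],
    description="voltage-gated sodium channel",
)
todo = record.missing_fields()
```

A check like `missing_fields` gives a quick view of which genes still lack community-supplied metadata.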
8
Why do we need gene manual annotations and gene metadata?
9
Genome Browser: Gene Page
10
Why do we need gene manual annotations and gene metadata?
For downstream analyses of gene(s), gene families or genomes, such as:
-Homologs and Phylogenetics
-Ontology
-Variation (e.g., Single Nucleotide Polymorphisms, SNPs)
11
Homologs and Phylogenetics
-wrong assignment of orthologs and paralogs
-gene alignment ---> tree
-wrong inference of evolutionary relationships between genes or species
-branches with a wrong length, which could give a misleading picture of lineage changes over time (the longer the branch, the larger the amount of change)
-wrong estimates of the ancestral and derived states of genes or species
-wrong taxonomic interpretations
12
Ontology
-GO: biological process (ion transport, sodium ion transport, transmembrane transport)
-GO: molecular function (ion channel activity, voltage-gated sodium channel activity, calcium ion binding)
-GO: cellular component (voltage-gated sodium channel, membrane)
13
Variation (e.g., Single Nucleotide Polymorphisms, SNPs)
SNP L1014F: ...TTA... ---> ...TTT... (Leucine ---> Phenylalanine)
Hypothetical example:
-A user is interested in gene “x”
-They download this gene from VB
-They start their analyses
-They find/report the presence/absence of the SNP
-If the gene of interest is not correctly annotated (e.g., missing an exon or part of an exon), the results are going to be wrong
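The hypothetical example above can be sketched with a toy coding sequence. Everything here is made up for illustration: the sequences are five codons long, the affected residue is position 4 rather than 1014, and the codon table holds only the codons the toy needs. The one thing taken from the slide is the codon change TTA (Leucine) to TTT (Phenylalanine).

```python
# Partial codon table: only the codons used in this toy example.
CODON_TABLE = {"ATG": "M", "GGA": "G", "GCT": "A", "TTA": "L", "TTT": "F"}

def codon_at(cds: str, residue: int) -> str:
    """Return the codon for a 1-based amino-acid position."""
    start = (residue - 1) * 3
    return cds[start:start + 3]

def call_substitution(ref_cds: str, sample_cds: str, residue: int) -> str:
    """Describe the amino-acid change at `residue`, e.g. 'L4F'."""
    ref_aa = CODON_TABLE[codon_at(ref_cds, residue)]
    alt_aa = CODON_TABLE[codon_at(sample_cds, residue)]
    return f"{ref_aa}{residue}{alt_aa}"

def drop_codon(cds: str, residue: int) -> str:
    """Simulate a gene model that is missing part of an exon (one codon)."""
    return cds[:3 * (residue - 1)] + cds[3 * residue:]

REF = "ATGGGAGCTTTAGGA"     # M G A L G
SAMPLE = "ATGGGAGCTTTTGGA"  # M G A F G (TTA -> TTT at residue 4)

# Correct gene model: the substitution is found at residue 4.
correct_call = call_substitution(REF, SAMPLE, 4)

# Gene model missing codon 2: residue 4 now points at a different codon,
# so the same analysis reports no change at the expected position.
wrong_call = call_substitution(drop_codon(REF, 2), drop_codon(SAMPLE, 2), 4)
```

With the complete model the change is reported as `L4F`; with the truncated model the same position reads `G4G`, so the SNP would be reported as absent even though it is present in the sample.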
15
-The size of the genomes
-The phylogenetic distance among genomes
Number of genomes (genome size):
-VB: 37 (110 Mbp – 3,000 Mbp)
-EuPathDB: 186 (2 Mbp – 193 Mbp)
-PATRIC: 3,481 Bacteria & 186 Archaea (10 kbp – 14 Mbp)
-ViPR: 546,381 & IRD: 365,618 (few kbp – 250 kbp)
16
Why did we replace the Community Manual Annotation (CAP) with Web Apollo?
17
Offline vs. online curation
Community Manual Annotation (CAP) vs. Web Apollo
[Screenshot labels: gene models, RNAseq, User-created Annotations]
18
Advantages & Disadvantages
Community Manual Annotation (CAP)
-People had to use Artemis or (Desktop) Apollo, which requires downloading scaffolds or supercontigs from VB
-VB gene updates can take 2 months or more, so more than one person may end up working on the same gene
-Most of the time, our internal GFF3 validator found issues with submitted data files
Web Apollo
-Is web-based, which allows easier collaboration
-There is not, however, a clear way to indicate/know when a user is “still working” or “done” with an annotation
-New annotations, though, are instantly visible to all WA users
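The slides do not describe what the internal GFF3 validator checked. As an illustration of the kind of structural problems a submitted file can have, here is a minimal sketch that tests a single line against a few rules of the GFF3 format: nine tab-separated columns, integer coordinates with 1 <= start <= end, a valid strand, and a phase for CDS features. The function name and the sample lines are hypothetical.

```python
VALID_STRANDS = {"+", "-", ".", "?"}
VALID_PHASES = {"0", "1", "2", "."}

def check_gff3_line(line: str) -> list:
    """Return a list of problems found in one GFF3 line (empty if none)."""
    if line.startswith("#") or not line.strip():
        return []  # comments, directives and blank lines are not features
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        return [f"expected 9 tab-separated columns, found {len(cols)}"]
    seqid, source, ftype, start, end, score, strand, phase, attrs = cols
    problems = []
    try:
        if int(start) < 1 or int(start) > int(end):
            problems.append("start must be >= 1 and <= end")
    except ValueError:
        problems.append("start and end must be integers")
    if strand not in VALID_STRANDS:
        problems.append(f"invalid strand: {strand!r}")
    if phase not in VALID_PHASES:
        problems.append(f"invalid phase: {phase!r}")
    elif ftype == "CDS" and phase == ".":
        problems.append("CDS features require a phase of 0, 1, or 2")
    return problems

# Hypothetical feature lines, not taken from any real submission.
good_line = "scaffold_1\tWebApollo\tCDS\t100\t200\t.\t+\t0\tID=cds1;Parent=mRNA1"
bad_line = "scaffold_1\tWebApollo\tCDS\t300\t200\t.\t+\t.\tID=cds1"
good_report = check_gff3_line(good_line)
bad_report = check_gff3_line(bad_line)
```

A real validator would also check attribute syntax and parent/child relationships between features; this sketch covers only per-line structure.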
19
How do we interact with Web Apollo developers and outreach representatives?
-Developers:
○Monthly WA developers open conference call
○Email
-Outreach:
○Meetings, workshops and conferences
○Email or phone
We are also subscribed to their user email list (help desk).
20
How do we get the community to submit data?
-First invitation comes from genome leaders directly (genome paper)
-Users send emails to our help desk (info@vectorbase.org)
-During outreach events, such as workshops, meetings and conferences
-Social media posts (Facebook and Twitter)
-Help content: Tutorial page
22
Genome group manual annotation efforts
-Workshops
-Annotation jamborees
-Webinars
-Independent work
23
Help content: Tutorial page
-Decision tree
-FAQs
-Web Apollo resources: user guide, slides with speaker notes, sample exercises
-Documentation about available tracks
-Video tutorial (Intro, ~50 min) and a video clip (Intron/exon boundaries, ~2:45 min)
24
User submission stats and importation of data to VectorBase
To be continued by Daniel Lawson...