Web Apollo and the VectorBase user community Gloria I. Giraldo-Calderón March 31, 2015.


1 Web Apollo and the VectorBase user community Gloria I. Giraldo-Calderón March 31, 2015

2 Outline
●Gene annotation
  o Gene automatic annotations
  o Gene manual annotation and metadata
  o Basics: A good vs a bad gene model
  o Why do we need gene manual annotations and gene metadata?
●Why did we replace the Community Manual Annotation (CAP) with Web Apollo (WA)?
  o Offline vs. online
  o Advantages vs disadvantages
●How do we interact with WA developers and outreach representatives?
●How do we get the community to submit data?

3 Gene annotation

4 VectorBase gene “automatic” annotations
[Figure labels: gap, 100 Ns]
-Scaffolds or Supercontigs mapping (optional; not possible with bioinformatics, must be experimental)
-Gene prediction: evidence based (BLAST), ab initio (SNAP), experimental evidence (ESTs, RNAseq, protein or peptide sequencing)
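The slide labels a gap of 100 Ns inside a scaffold. As a rough illustration of how such gaps can be located in a sequence, here is a minimal Python sketch. The function name `find_gaps` and the toy sequence are assumptions made for this example, not part of any VectorBase tool.

```python
import re

def find_gaps(scaffold, min_len=1):
    """Return (start, end, length) for each run of N, 0-based, end exclusive."""
    gaps = []
    for match in re.finditer(r"[Nn]+", scaffold):
        length = match.end() - match.start()
        if length >= min_len:
            gaps.append((match.start(), match.end(), length))
    return gaps

# Toy scaffold (made-up sequence): two contigs joined by a 100 N gap.
scaffold = "ACGTACGT" + "N" * 100 + "TTGACCA"
print(find_gaps(scaffold))  # [(8, 108, 100)]
```

Gene models that span or sit next to such a gap are the ones most likely to be truncated and to need a manual comment.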

5 Gene “manual” annotation and metadata

6 Gene manual annotation and metadata

7 Metadata
-VectorBase gene ID (e.g., AGAP000002)
-Organism (species) (e.g., Anopheles gambiae)
-Symbol (e.g., para)
-Synonym (e.g., kdr, VSC)
-Description (e.g., voltage-gated sodium channel)
-Comments/notes (e.g., truncated gene, other part on scaffold xxx)
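The metadata fields listed on this slide can be pictured as one record per gene. The Python sketch below is only an illustration: the class name `GeneMetadata`, its field names and the helper `missing_fields` are hypothetical, and the values are the slide's per-field examples combined for demonstration, so they do not necessarily describe one real gene.

```python
from dataclasses import dataclass, field

@dataclass
class GeneMetadata:
    gene_id: str                       # VectorBase gene ID
    organism: str                      # species name
    symbol: str = ""
    synonyms: list = field(default_factory=list)
    description: str = ""
    comments: str = ""

def missing_fields(record):
    """List the optional metadata fields that are still empty."""
    optional = ("symbol", "synonyms", "description", "comments")
    return [name for name in optional if not getattr(record, name)]

# Values are the slide's separate examples, combined for illustration only.
record = GeneMetadata(
    gene_id="AGAP000002",
    organism="Anopheles gambiae",
    symbol="para",
    synonyms=["kdr", "VSC"],
    description="voltage-gated sodium channel",
)
print(missing_fields(record))  # ['comments']
```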

8 Why do we need gene manual annotations and gene metadata?

9 Genome Browser: Gene Page

10 Why do we need gene manual annotations and gene metadata?
For downstream analyses of gene(s), gene families or genomes such as:
-Homologs and Phylogenetics
-Ontology
-Variation (e.g., Single Nucleotide Polymorphisms, SNPs)

11 Homologs and Phylogenetics
-wrong assignment of orthologs and paralogs
-gene alignment ---> tree
-wrong inference of evolutionary relationships between genes or species
-branches with a wrong length, which could give a misleading picture of lineage changes over time (the longer the branch, the larger the amount of change)
-wrong estimates of the ancestral and derived states of genes or species
-wrong taxonomic interpretations

12 Ontology
-GO: biological process (ion transport, sodium ion transport, transmembrane transport)
-GO: molecular function (ion channel activity, voltage-gated sodium channel activity, calcium ion binding)
-GO: cellular component (voltage-gated sodium channel, membrane)

13 Variation (e.g., Single Nucleotide Polymorphisms, SNPs)
... T T A ... ---> ... T T T ...
SNP L1014F: Leucine ---> Phenylalanine
Hypothetical example:
-A user is interested in gene “x”
-They download this gene from VB
-They start their analyses
-They find and report the presence/absence of the SNP
-If the gene of interest is not correctly annotated, e.g., missing an exon or part of an exon, the results are going to be wrong
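The codon change on this slide (TTA to TTT, Leucine to Phenylalanine at residue 1014) can be checked with a few lines of Python. This is a minimal sketch using the standard genetic code. The helper names `translate` and `protein_change` and the toy exon sequences are assumptions made for illustration.

```python
# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(cds):
    """Translate a coding sequence, ignoring any trailing partial codon."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3].upper()] for i in range(0, usable, 3))

def protein_change(ref_codon, alt_codon, residue_number):
    """Describe a codon substitution in the usual form, e.g. L1014F."""
    ref = CODON_TABLE[ref_codon.upper()]
    alt = CODON_TABLE[alt_codon.upper()]
    return f"{ref}{residue_number}{alt}"

print(protein_change("TTA", "TTT", 1014))  # L1014F

# Toy gene model (made-up exons): dropping exon 1 shifts residue numbering.
exon1, exon2 = "ATGGCT", "TTAGGA"
print(translate(exon1 + exon2))  # MALG: the Leucine is residue 3
print(translate(exon2))          # LG: the same Leucine is now residue 1
```

The last two lines show why a gene model that is missing an exon gives wrong results: the residue number a user looks up no longer points at the same codon.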

15
-The size of the genomes
-The phylogenetic distance among genomes
Number of genomes (genome size):
-VB: 37 (110 Mbp – 3,000 Mbp)
-EuPathDB: 186 (2 Mbp – 193 Mbp)
-PATRIC: 3,481 Bacteria & 186 Archaea (10 kbp – 14 Mbp)
-ViPR: 546,381 & IRD: 365,618 (few kbp – 250 kbp)

16 Why did we replace the Community Manual Annotation (CAP) with Web Apollo?

17 Offline vs. online curation
Community Manual Annotation (CAP) vs. Web Apollo
[Screenshot labels: gene models, RNAseq, User-created Annotations]

18 Advantages & Disadvantages
Community Manual Annotation (CAP)
-People had to use Artemis or (Desktop) Apollo, which requires downloading scaffolds or supercontigs from VB
-VB gene updates can take 2 months or more, so more than one person may end up working on the same gene
-Most of the time, our internal GFF3 validator found issues with the submitted data files
Web Apollo
-Is web-based, which allows easier collaboration
-There is, however, no clear way to indicate or know whether a user is “still working” on or “done” with an annotation
-New annotations, though, are instantly visible to all WA users
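To give a feel for the kind of issues a GFF3 validator can catch in submitted files, here is a minimal Python sketch of a few structural checks taken from the GFF3 format (nine tab-separated columns, start not greater than end, a valid strand, a phase on CDS features). It is not VectorBase's internal validator. The function name `check_gff3_line` and the sample lines are assumptions made for illustration.

```python
def check_gff3_line(line):
    """Return a list of problems found in one GFF3 feature line (empty if none)."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        return [f"expected 9 tab-separated columns, found {len(cols)}"]
    seqid, source, ftype, start, end, score, strand, phase, attrs = cols
    problems = []
    if not (start.isdecimal() and end.isdecimal()):
        problems.append("start and end must be positive integers")
    elif int(start) < 1 or int(start) > int(end):
        problems.append("start must be >= 1 and <= end")
    if strand not in {"+", "-", ".", "?"}:
        problems.append(f"invalid strand {strand!r}")
    if ftype == "CDS" and phase not in {"0", "1", "2"}:
        problems.append("CDS features need a phase of 0, 1 or 2")
    return problems

# Made-up feature lines for illustration.
good = "scaffold_1\texample\tCDS\t100\t250\t.\t+\t0\tID=cds1"
bad = "scaffold_1\texample\texon\t300\t250\t.\t+\t.\tID=exon1"
print(check_gff3_line(good))  # []
print(check_gff3_line(bad))   # ['start must be >= 1 and <= end']
```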

19 How do we interact with Web Apollo developers and outreach representatives?
-Developers:
  ○Monthly WA developers open conference call
  ○email
-Outreach:
  ○Meetings, workshops and conferences
  ○email or phone
We are also subscribed to their user email list (help desk).

20 How do we get the community to submit data?
-First invitation comes from genome leaders directly (genome paper)
-Users send emails to our help desk (info@vectorbase.org)
-During outreach events, such as workshops, meetings and conferences
-Social media posts (Facebook and Twitter)
-Help content: Tutorial page

22 Genome group manual annotation efforts
-Workshops
-Annotation jamborees
-Webinars
-Independent work

23 Help content: Tutorial page
-Decision tree
-FAQs
-Web Apollo resources: user guide, slides with speaker notes, sample exercises
-Documentation about available tracks
-Video tutorial (Intro, ~50 min) and a video clip (Intron/exon boundaries, ~2:45 min)

24 User submission stats and importation of data to VectorBase
To be continued by Daniel Lawson...

