Annotator Interface GUS 3.0 Workshop June 18-21, 2002.

1 Annotator Interface GUS 3.0 Workshop June 18-21, 2002

2 Outline
Current annotation efforts
Motivation for new annotation tool
Requirements for new annotation tool
Thoughts on design and implementation
Future plans

3 Current Annotation Efforts

4 Overview of Current Efforts
Automated annotation has been applied to the DoTS transcripts
  Predicted gene ownership (clustering of assemblies)
  BlastX against NR
    Automated assignment of descriptions based on similarity
  BlastX against ProDom and RPS-Blast against CDD
    Predicted GO Functions
  Framefinder
    Predicted Protein Sequences
  Blat alignments
  EPCR, Index Words, etc.
Manual annotation efforts have focused on validating the automated annotation and adding additional information at the central dogma level
Manual annotation of the gene index utilizes an annotation tool, the GUS Annotator Interface, which directly updates the GUSdev database

5 DoTS RNA transcripts
The assembly of sequences generates a consensus sequence or DoTS transcript
Pipeline diagram:
  Incoming sequences (EST/mRNA): GenBank, dbEST sequences
  Make quality (remove vector, polyA, NNNs) -> “Quality” sequences
  Block with RepeatMasker -> blocked sequences
  Blastn to cluster sequences -> “Unassembled” clusters
  Assemble sequences with CAP4 -> CAP4 assemblies (generate consensus sequences) -> DoTS consensus sequences
  BLASTn DoTS consensus sequences (98% identity, 150 bp) -> Gene cluster (RNAs in the gene)
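The grouping of consensus sequences into gene clusters can be illustrated with a minimal sketch. It assumes single-linkage grouping over pairwise hits, which is an assumption: the slide gives only the thresholds (98% identity, 150 bp). The function name, the hit tuple layout, and the sample ids are hypothetical and are not part of GUS or DoTS.

```python
def cluster_by_similarity(ids, hits, min_identity=98.0, min_length=150):
    """Group sequence ids into clusters (single-linkage, an assumed strategy).

    ids:  every sequence id that may appear in a hit
    hits: tuples of (query_id, subject_id, percent_identity, alignment_length)
    """
    parent = {i: i for i in ids}

    def find(x):
        # Walk up to the cluster representative, shortening the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for query, subject, identity, length in hits:
        # Only hits that pass both thresholds link two sequences together.
        if identity >= min_identity and length >= min_length:
            parent[find(query)] = find(subject)

    clusters = {}
    for i in ids:
        clusters.setdefault(find(i), []).append(i)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical example data: the third hit fails the identity threshold.
example_ids = ["A", "B", "C", "D"]
example_hits = [("A", "B", 99.0, 200), ("B", "C", 98.5, 150), ("C", "D", 97.0, 500)]
example_clusters = cluster_by_similarity(example_ids, example_hits)
```

With single linkage, A and C end up in the same cluster through B even though they have no direct hit; a stricter linkage rule would split them.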

6 Current Efforts: Gene Annotation (1)
Generate DoTS transcripts
Task 1: Validation of Gene Membership
Diagram (column headings are GUS tables):
Gene    RNA    RNAInstance  RNAFeature  Assembly
Gene_A  RNA_1  Instance_1   Feature_1   Assembly_1
Gene_A  RNA_2  Instance_2   Feature_2   Assembly_2
Gene_A  RNA_3  Instance_3   Feature_3   Assembly_3
Gene_A  RNA_4  Instance_4   Feature_4   Assembly_4
Gene_A  RNA_5  Instance_5   Feature_5   Assembly_5

7 Current Efforts: Gene Annotation (2)
Generate DoTS transcripts
Diagram (column headings are GUS tables):
Gene    RNA    RNAInstance  RNAFeature  Assembly
Gene_A  RNA_1  Instance_1   Feature_1   Assembly_1
Gene_A  RNA_2  Instance_2   Feature_2   Assembly_2
Gene_A  RNA_3  Instance_3   Feature_3   Assembly_3
Gene_B  RNA_4  Instance_4   Feature_4   Assembly_4
Gene_B  RNA_5  Instance_5   Feature_5   Assembly_5
Removing RNAs from the cluster results in the creation of a new Gene
An entry is made in the MergeSplit table for tracking purposes
Similar process followed when an RNA is added to a Gene
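The split described on this slide can be sketched in a few lines. This is an in-memory illustration only, not the Annotator Interface code: the class name, the method, and the fields of the tracking record (old_id, new_id, is_merge) are hypothetical stand-ins for whatever the real MergeSplit table stores.

```python
from dataclasses import dataclass, field


@dataclass
class GeneIndex:
    genes: dict = field(default_factory=dict)        # gene id -> set of RNA ids
    merge_split: list = field(default_factory=list)  # hypothetical tracking records

    def split(self, old_gene, rnas_to_remove, new_gene):
        """Move RNAs out of old_gene into a newly created gene and log the split."""
        moved = set(rnas_to_remove)
        if not moved <= self.genes[old_gene]:
            raise ValueError("every RNA to remove must belong to the old gene")
        if new_gene in self.genes:
            raise ValueError("the new gene id is already in use")
        self.genes[old_gene] = self.genes[old_gene] - moved
        self.genes[new_gene] = moved
        # One tracking entry per split, mirroring the MergeSplit idea on the slide.
        self.merge_split.append(
            {"old_id": old_gene, "new_id": new_gene, "is_merge": False}
        )


# Example using the ids from the diagram.
index = GeneIndex({"Gene_A": {"RNA_1", "RNA_2", "RNA_3", "RNA_4", "RNA_5"}})
index.split("Gene_A", ["RNA_4", "RNA_5"], "Gene_B")
```

Keeping the tracking record next to the membership change means that a gene's history can be followed later without comparing snapshots of the gene table.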

8 Current Efforts: Gene Annotation (3)
Task 2: Assign Reference RNA
  will be annotated further
  RNA table
Task 3: Assign Approved Gene Name/Symbol
  Gene Table
  Evidence: Comment (specifies database link)
Task 4: Assign Gene Description
  Evidence: Comment
Task 5: Associate known Gene synonyms
  GeneSynonym table

9 Current Efforts: RNA Annotation
Annotation of “Reference Sequence”
Task 1: Assign/Confirm Description of assembly
  RNA table
Task 2: Confirm/Add/Delete GO Functions
  ProteinGOFunction (in GUSdev; GO tables have been re-designed in GUS3.0)
  Evidence: Comments or Similarity (ProDom, CDD-Pfam, CDD-Smart, or NR)

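10 Current Annotator Interface Architecture
Architecture diagram:
  Hosts: Erebus, Zeus
  Annotator Interface (JavaServlet)
    queries GUSdev via JDBC (Query Only)
    writes “XML” file
    executes AnnotatorInterface Submitter GA-Plugin
  AnnotatorInterface Submitter GA-Plugin
    reads “XML” file
    updates GUSdev through the Perl Object Layer via DBI (Insert/Update/Delete)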

11 Current Annotator Interface

12 Current Gene Annotation
Validate Cluster and Assign Reference RNA/Assembly

13 Current Gene Annotation (cont.)
Assign Gene Name/Symbol
Assign Gene Description
Assign Gene Synonym(s)
Evidence

14 Current RNA (and Protein) Annotation
RNA Description
Evidence
GO Functions

15 Allgenes Display of Gene Annotation

16 Allgenes Display of RNA Annotation
RNA Description (Confirmed or manually added GO Functions)

17 Status of Current Annotation (as of June 20, 2002)
1289 manually reviewed genes
  1003 with gene name
  697 with gene synonyms
  1046 with description
6146 manually reviewed RNAs/DoTS assemblies
949 ‘proteins’ with reviewed GO function

18 Motivation for new tool
Want to annotate using genomic sequence
  Create “curated” gene models specifying structure
Increase structure of annotation in GUS
Annotation of proteins
Redefinition of annotation tasks
Current interface not designed for this purpose

19 Some Other Annotation Tools
Artemis
  Developed and used at Sanger
  Reads and writes flat files
  Supports rich set of annotations
  Save as EMBL format
Apollo
  Combined effort including members from Sanger and Berkeley
  Flat files (CORBA access to ENSEMBL)
  2 versions, currently being merged
    Sanger: annotation viewer
    Berkeley: focus on editing
No Existing Tool To Meet All of Our Needs

20 Requirements At a High Level

21 Requirements: Graphical View
Provide alignment of features on genomic sequence
  could potentially display any feature type currently stored in GUS3.0
  features can be selected and used to generate “curated” features
  similar to display and functionality in Apollo
Toggle (or configure) the display of each feature type
Zoom to sequence level, with links to functionality relevant to the highlighted feature
Also support creation of features “from scratch” based on literature, etc.
Detail editors provide ability to change endpoints, etc.

22 Gene Annotation
Create curated gene model
  specify gene boundaries
  specify location of exons (and thus introns)
    5' exon boundary (putative transcription start site)
    3' exon boundary (include polyadenylation signal)
  automatic creation of Gene entry
  merge with existing gene instances through GeneInstance table
  tables/views affected: GeneFeature, ExonFeature, GeneInstance, Gene, MergeSplit
  evidence: features used to create model, PubMed ID
  should be as easy as clicking on existing features and saying “make curated” (then can modify endpoints, etc. if needed)
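Specifying exon locations determines the introns and the gene boundaries, which a short sketch can make concrete. It assumes 1-based inclusive genomic coordinates on a single strand; the function names and sample coordinates are hypothetical and do not reflect the GUS schema or any planned interface code.

```python
def introns_from_exons(exons):
    """Return the intron intervals lying between consecutive exons.

    exons: list of (start, end) pairs, 1-based inclusive, in any order.
    """
    ordered = sorted(exons)
    introns = []
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        if next_start <= prev_end:
            raise ValueError("exons overlap")
        # Directly adjacent exons leave no intron between them.
        if next_start > prev_end + 1:
            introns.append((prev_end + 1, next_start - 1))
    return introns


def gene_boundaries(exons):
    """Return (start, end) of the gene as spanned by its exons."""
    return min(start for start, _ in exons), max(end for _, end in exons)


# Hypothetical curated gene model with three exons.
model_exons = [(100, 200), (301, 400), (501, 650)]
model_introns = introns_from_exons(model_exons)
model_span = gene_boundaries(model_exons)
```

Deriving introns instead of storing them separately keeps the model consistent when an annotator moves an exon endpoint in a detail editor.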

23 Gene Annotation (2)
Assign (HUGO or MGI approved) abbreviated gene name/symbol
  Gene Table
  Evidence: ExternalDatabaseLink
Assign full gene name (MGI or HUGO full gene name)
Assign abbreviated gene name/symbol synonyms (non-approved gene symbols)
  GeneSynonym Table
Assign full gene name aliases
  GeneAlias Table

24 Gene Annotation (3)
Assign gene category (e.g. non-coding)
  Gene Table
  Evidence: ExternalDatabaseLink/Literature Reference, Similarity (e.g. to known non-coding RNA)
Confirm/assign gene chromosomal location
  GeneChromosomalLocation
  RH mapping data
  Alignments/Features
OMIM Link assignment (verification if computationally determined)
  ExternalDatabaseLink

25 RNA Annotation (1)
Create “curated RNAs”
  Define RNA transcript forms of gene (create RNAs)
    Using exons defined by curated gene
    5' and 3' UTRs
  Automatic creation of RNA entry
  Merge existing RNA instances
  Tables affected: RNAFeature, UTRFeature, RNAInstance, RNA
  Evidence: Features used to create
Assign RNA categories to created RNAs (e.g. alternative form)
  RNARNACategory Table
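Defining the 5' and 3' UTRs of a curated RNA from its exons can be sketched as follows. The sketch assumes a plus-strand transcript, 1-based inclusive coordinates, and a known coding region; the function name, its parameters, and the sample coordinates are hypothetical, and nothing here reflects how UTRFeature rows are actually populated.

```python
def utr_segments(exons, cds_start, cds_end):
    """Split exon intervals into 5' UTR and 3' UTR pieces (plus strand assumed).

    exons:     list of (start, end) pairs, 1-based inclusive
    cds_start: genomic position of the first coding base
    cds_end:   genomic position of the last coding base
    """
    five_prime, three_prime = [], []
    for start, end in sorted(exons):
        if start < cds_start:
            # Part of this exon lies upstream of the coding region.
            five_prime.append((start, min(end, cds_start - 1)))
        if end > cds_end:
            # Part of this exon lies downstream of the coding region.
            three_prime.append((max(start, cds_end + 1), end))
    return five_prime, three_prime


# Hypothetical transcript: three exons, coding region from 150 to 560.
transcript_exons = [(100, 200), (301, 400), (501, 650)]
utr5, utr3 = utr_segments(transcript_exons, 150, 560)
```

A minus-strand transcript would need the two result lists swapped, since the 5' end then sits at the higher genomic coordinate.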

26 RNA Annotation
Assign (or confirm computed) RNA description
  RNA table
  Evidence: Gene from which it is derived
Anatomy expression assignment(s)
  RNAAnatomy, RNAAnatomyLOE
  Evidence:
    ExternalDatabaseLink/Literature references
    Assembly anatomy percent from DoTS
    RAD experiments
Assign GO terms to curated RNA (non-coding RNAs, e.g. small RNA involved in splicing)
  GOTermAssociation, GOTermAssociationEvid
  Evidence: ExternalDatabaseLink, Literature References
Computational analysis performed on curated RNA sequences
  Annotation workflow
  Framefinder translation, GO terms, Similarities, etc.

27 Requirements: Protein Annotation
Confirm/assign GO Function
  GOTermAssociation, GOTermAssociationEvid
  Evidence: ExternalDatabaseLink and/or Literature References
Confirm/assign GO Biological Process
Confirm/assign GO Cellular Component
Assign protein name
  Protein Table
  Evidence: ExternalDatabaseLink, Literature Ref, Similarities
Assign protein name synonyms

28 Protein Annotation (2)
Assign protein category (post-translational modifications)
  ProteinProteinCategory
  Evidence: ExternalDatabaseLink, Literature References
Protein-protein interactions assigned
  Interaction, InteractionInteractionLOE
  Evidence: PubMed ID, etc.
Protein pathway assignments
  PathwayInteraction (for newly created interactions)
  Still under consideration: what is the best way to link with an existing pathway? For example, a Pathway is represented in DoTS, and we want to say that this curated Protein is really the same as a protein in a pathway.
Assign post-translational modification category
Assign interactions involving this protein
Assign pathway protein is known to be involved in
Assign protein family
Ability to modify and/or delete curated protein
Evidence will be associated with all annotation

29 Potential New Architecture
Java Application
  more control over graphical interface
  faster than web-based
Java Object Layer
  simplifies handling of complex updates

30 Next Steps / Open Issues
Completion of Java Object Layer
Decision regarding BioJava wrappers
  What exactly will this give us to aid in interface development (e.g. FeatureRenderer, etc.)?
Discussion on layout of interface
  Joan’s input after experimentation with other tools
Depending on the above:
  Client-side portion which communicates with remote GUS Server
  Interface implementation
