1
Comprehensive Microbial Resource
www.tigr.org/CMR
Bioinformatics Visualization Workshop
Owen White
May 30, 2002
2
Curation, Genome Annotation: Michelle Gwinn, Bob Dodson, Bob DeBoy, James Kolonay, Bill Nelson, Ramana Madupu, Sean Daugherty, Maureen Beanan, Scott Durkin, Lauren Brinkac
Bioinformatics Engineers: Jeremy Peterson, Lowell Umayam, Samuel Angiuoli
TIGRFAMs/Groups: Dan Haft, Jeremy Selengut, Maria Ermolaeva (Operons/Terminators), Erik Ferlanti (All vs. All)
Faculty: Jonathan Eisen (DNA repair), Ian Paulsen (transporters), Steven Salzberg
Collaborators: Swiss-Prot, Monica Riley, the open source crowd, Art Delcher (Glimmer)
3
Retrieval
Caudal Fins: Heterocercal, Forked, Lunate, Emarginate, Truncate, Rounded, Pointed
http://web.pdx.edu/~bowersn/bi399/lecture2.html
4
Caudal Fins, Dorsal Spines, Dorsal Rays
Retrieval across data types.
5
Typical annotation datatypes
clone_info: Tracks information about the parent nucleotide assembly, including its annotation status, the institution from which the sequence was derived, and whether it is part of a larger assembly such as a chromosome.
asm_feature: Stores all major features of the parent assembly, including annotated genes, predicted genes, repetitive elements, splice sites, and all underlying components of a gene (models, transcript exons, and cds exons).
phys_ev: Records an attribute for each gene component in the asm_feature table. For example, each predicted and annotated gene has a model and multiple exons stored in the asm_feature table. Linking a feature to phys_ev identifies the type of feature present, i.e. glimmer, genscan+, genemarkHMM, or working (annotation). This becomes important when a single feature in the asm_feature table is shared by multiple model types.
feat_link: Key to how gene models are represented in the database. All parent and child relationships are defined here.
evidence: The main repository for all sequence database search results. It also retains gene model attributes such as the best BLAST match and all Pfam matches.
ident: Stores attributes for the highest element of the gene component hierarchy, the transcriptional unit. Gene names, loci, EC symbols, and other attributes are available.
role_link: Holds the role category assignments for each gene. Roles include ‘transcription’, ‘DNA synthesis’, ‘translation’, ‘DNA repair’, ‘amino acid metabolism’, etc.
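The relationships among asm_feature, feat_link, phys_ev, and ident can be sketched with a small in-memory SQLite database. This is only an illustration, not the actual schema: the column names (feat_name, feat_type, parent_feat, child_feat, ev_type, com_name) and every row are assumptions made up for the example. It shows a transcriptional unit with two models, one of which shares an exon with the other, and how joining through feat_link and phys_ev recovers the model type.

```python
import sqlite3


def build_demo_db():
    """Create a toy database; all column names and rows are hypothetical."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE asm_feature (feat_name TEXT PRIMARY KEY, feat_type TEXT);
        CREATE TABLE feat_link (parent_feat TEXT, child_feat TEXT);
        CREATE TABLE phys_ev (feat_name TEXT, ev_type TEXT);
        CREATE TABLE ident (feat_name TEXT, com_name TEXT);
        """
    )
    conn.executemany(
        "INSERT INTO asm_feature VALUES (?, ?)",
        [
            ("tu1", "TU"),
            ("model1", "model"),
            ("model2", "model"),
            ("exon1", "exon"),
            ("exon2", "exon"),
        ],
    )
    # Parent/child edges: TU -> models -> exons. exon1 is shared by both models.
    conn.executemany(
        "INSERT INTO feat_link VALUES (?, ?)",
        [
            ("tu1", "model1"),
            ("tu1", "model2"),
            ("model1", "exon1"),
            ("model1", "exon2"),
            ("model2", "exon1"),
        ],
    )
    # phys_ev says which kind of model each one is.
    conn.executemany(
        "INSERT INTO phys_ev VALUES (?, ?)",
        [("model1", "working"), ("model2", "glimmer")],
    )
    conn.execute("INSERT INTO ident VALUES (?, ?)", ("tu1", "demo gene"))
    conn.commit()
    return conn


def models_for_tu(conn, tu_name):
    """Return (model name, model type) pairs for one transcriptional unit."""
    return conn.execute(
        """
        SELECT m.feat_name, p.ev_type
        FROM feat_link AS l
        JOIN asm_feature AS m ON m.feat_name = l.child_feat
        JOIN phys_ev AS p ON p.feat_name = m.feat_name
        WHERE l.parent_feat = ? AND m.feat_type = 'model'
        ORDER BY m.feat_name
        """,
        (tu_name,),
    ).fetchall()


def exons_for_model(conn, model_name):
    """Return the exon names linked as children of one model."""
    rows = conn.execute(
        """
        SELECT e.feat_name
        FROM feat_link AS l
        JOIN asm_feature AS e ON e.feat_name = l.child_feat
        WHERE l.parent_feat = ? AND e.feat_type = 'exon'
        ORDER BY e.feat_name
        """,
        (model_name,),
    ).fetchall()
    return [name for (name,) in rows]


if __name__ == "__main__":
    db = build_demo_db()
    print(models_for_tu(db, "tu1"))
    print(exons_for_model(db, "model2"))
```

Because exon1 appears under both models, the model type has to come from phys_ev on the model rather than from the exon itself, which is the situation the phys_ev description refers to.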
6
Omniome Content, Genes
Total # of genes: 132,998 from the worldwide effort (43,311 from TIGR projects).
36,274 with genetic names.
15,098 genes placed into 5,451 paralogous families.
413 rRNAs.
1,311 tRNAs.
49 sRNAs.
293 IS elements.
7
Omniome Content
Evidence: 1,073 distinct EC#s, assigned to 17,308 genes
Rows of allVall data: 3,996,851
Rows of HMM TIGRFAM data: 91,550
Rows of HMM Pfam data: 131,963
Rows of COG data: 149,940
Rows of Interpro data: 175,760
Rows of Prosite data: 53,132
Rows of BER data: 91,899
11
TIGRFAM Matrix
14
The Genome Browser: Linear Display of DNA Molecules
15
Genome vs. Genome Protein Hits
16
MUMmer: The Whole Genome Alignment Tool
18
Role Category Graph
24
Multi-Genome Query Tool
Query across all genomes based on different properties:
MW, pI, membrane spanning regions
Taxon, Paralogous families, TIGRFAMs, Role Category
Best Match to: organism, locus, kingdom, etc.
“Genes with >5 membrane spanning regions and MW 36,000-51,000 Da.”
“E. coli genes with best match to Archaeoglobus involved in DNA metabolism.”
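The first example query can be expressed as a simple filter over gene records. The sketch below is not the CMR implementation: the Gene record, its field names, and the sample loci are all invented for illustration, and a real tool would presumably run the query against the database instead of an in-memory list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    """Hypothetical gene record; field names are assumptions for this sketch."""
    locus: str
    organism: str
    mol_weight: float      # molecular weight in daltons
    membrane_spans: int    # predicted membrane-spanning regions


def select_genes(genes, min_spans_exclusive=5, mw_low=36_000, mw_high=51_000):
    """Keep genes with more than `min_spans_exclusive` membrane-spanning
    regions and a molecular weight within [mw_low, mw_high]."""
    return [
        g
        for g in genes
        if g.membrane_spans > min_spans_exclusive
        and mw_low <= g.mol_weight <= mw_high
    ]


# Made-up records spanning two organisms.
DEMO_GENES = [
    Gene("DEMO_0001", "Organism A", 42_500.0, 8),   # matches both criteria
    Gene("DEMO_0002", "Organism A", 42_500.0, 5),   # exactly 5 spans, excluded
    Gene("DEMO_0003", "Organism B", 60_000.0, 12),  # too heavy
    Gene("DEMO_0004", "Organism B", 36_000.0, 6),   # on the lower MW bound
]

if __name__ == "__main__":
    for gene in select_genes(DEMO_GENES):
        print(gene.locus, gene.organism)
```

The same pattern extends to the other properties listed on the slide (pI, taxon, role category) by adding fields and predicates.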
25
Pseudo-Restriction Digest and Linear Depiction of Cuts
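An in-silico restriction digest of a linear sequence can be sketched in a few lines. This is a minimal illustration, not the CMR tool itself: the function names are made up, the only built-in enzyme is EcoRI (recognition site GAATTC, cutting after the first G), and only one strand of a linear molecule is scanned, which is enough for a palindromic site such as this one.

```python
def cut_positions(seq, site="GAATTC", offset=1):
    """Return 0-based cut positions: each is the index of the first base
    after the cut. Defaults model EcoRI (G^AATTC)."""
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + offset)
        # Step forward one base so overlapping sites are also found.
        start = seq.find(site, start + 1)
    return positions


def fragment_lengths(seq_len, cuts):
    """Lengths of the fragments produced by cutting a linear molecule."""
    bounds = [0] + sorted(cuts) + [seq_len]
    return [b - a for a, b in zip(bounds, bounds[1:])]


def depict(seq_len, cuts, width=60):
    """Draw the molecule as a line of '-' with '|' at scaled cut positions."""
    line = ["-"] * width
    for c in cuts:
        line[min(width - 1, c * width // seq_len)] = "|"
    return "".join(line)


if __name__ == "__main__":
    demo = "AAAGAATTCTTTTGAATTCGG"  # invented sequence with two EcoRI sites
    cuts = cut_positions(demo)
    print(cuts)
    print(fragment_lengths(len(demo), cuts))
    print(depict(len(demo), cuts, width=len(demo)))
```

A circular chromosome would need the first and last fragments merged, which this sketch does not attempt.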
28
Position effect: