BUSINESS SENSITIVE 1 SAAW - Sequence Annotation and Analysis Workshop Boyu Yang and Gene Godbold Battelle Memorial Institute, Charlottesville Operations.


1001 Research Park Blvd, Charlottesville, VA

Abstract

The Sequence Annotation and Analysis Workshop (SAAW) is a semi-automated system that supports sequence annotation and analysis. It consists of a Java client application, a set of web services and a relational database.

The Java client lets a curator examine a sequence manually at the single-residue level, with the annotated features graphically aligned and displayed. Several sequence analysis tools are integrated into the system as web services that support semi-automated annotation: Blast, PHI-Blast, RPS-Blast, MSA (Clustal-W), InterPro, InterProMatch, IDMap, UniProt, Pfam/CDD, PDB, Prints and GO. Jmol is integrated as a visualization plug-in for displaying 3D structures.

Some basic functions are hardcoded in the client: drawing dot and hydropathy plots, finding binding sites for various proteins, finding stem-loops, finding transcription terminators, finding ORFs, finding restriction enzyme cutting sites, showing residue-count statistics, finding subsequences by pattern and translating nucleotide sequences into protein sequences.

The user interfaces are customizable and pluggable. A rich set of visualization plug-ins has been developed to support (1) displaying BLAST results, (2) aligning multiple sequences, (3) displaying 3D structures and (4) showing domain/family information. An automatic annotation pipeline assigns features from UniProt, InterPro/InterProMatch, Pfam/CDD and Prints, so that curators do not have to annotate features from these data sources manually. The web services tier provides functions for accessing the relational database and for running computation-intensive jobs.
All sequences and annotation data are stored in a MySQL database.

Front End User Interface

The front-end user interface is a Java application. It has three major views: editor view, graph view and table view (figures 1, 2 and 3). The editor view is for showing, editing, searching and annotating a sequence. The graph view shows the sequence and its annotations in graphical form. The table view manages all sequences held in memory.

Figure 1 Editor View
Figure 2 Graph View

Embedded Basic Functions

Basic functions embedded directly in the client include finding protein binding sites on a nucleotide sequence, finding stem-loops in RNA, finding transcription terminators, identifying restriction enzyme cutting sites in a nucleotide sequence, finding subsequences by pattern and translating nucleotide sequences into proteins. Figure 4 demonstrates finding a subsequence by pattern. Figure 5 shows all embedded functions.

Figure 4 Finding a sequence pattern
Figure 5 Embedded basic functions

Creating Features

Features can be created from either the editor view or the graph view. A feature is defined by its type (e.g., a domain), its start and end positions, a note that describes the feature and a note source that gives the source reference. A feature does not have to be a continuous region; it may consist of several pieces spread across the sequence (figure 6).

Integrated Web Services

Many applications that involve intensive computation or access to large amounts of data are implemented as web services, including Blast, InterPro and Pfam (listed in the Tools menu, figure 7). All web services are accessed through a universal user interface; figure 8 shows this interface for the Blast web service. The interface is constructed at runtime from the WSDL of the web service.
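To make the embedded functions and the feature definition described above more concrete, here is a minimal Python sketch of pattern search, translation and a feature record that may span several discontinuous pieces. It is an illustration only and not SAAW's code (the SAAW client is written in Java). The names `find_pattern`, `translate` and `Feature`, the 1-based inclusive coordinates and the use of regular expressions for patterns are all assumptions made for this example.

```python
import re
from dataclasses import dataclass
from itertools import product
from typing import List, Tuple

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def find_pattern(sequence: str, pattern: str) -> List[Tuple[int, int]]:
    """Return 1-based inclusive (start, end) spans of every match.

    A lookahead is used so that overlapping matches are reported too.
    Coordinates are an assumption; the poster does not state SAAW's convention.
    """
    lookahead = re.compile("(?=(" + pattern + "))", re.IGNORECASE)
    return [(m.start(1) + 1, m.end(1)) for m in lookahead.finditer(sequence)]


def translate(nucleotides: str) -> str:
    """Translate reading frame 1; stop codons become '*', unknown codons 'X'."""
    seq = nucleotides.upper().replace("U", "T")
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


@dataclass
class Feature:
    """Hypothetical feature record: type, positions, note and note source."""
    feature_type: str
    segments: List[Tuple[int, int]]  # one or more 1-based inclusive pieces
    note: str = ""
    note_source: str = ""

    @property
    def start(self) -> int:
        return min(s for s, _ in self.segments)

    @property
    def end(self) -> int:
        return max(e for _, e in self.segments)

    def extract(self, sequence: str) -> str:
        """Concatenate the residues covered by each piece, in sequence order."""
        return "".join(sequence[s - 1:e] for s, e in sorted(self.segments))


if __name__ == "__main__":
    dna = "GAATTCATGGCCTAAGAATTC"
    sites = find_pattern(dna, "GAATTC")  # EcoRI recognition sequence
    feature = Feature("restriction site", sites,
                      note="EcoRI recognition sequence", note_source="manual")
    print(sites, feature.start, feature.end, feature.extract(dna))
    print(translate(dna[6:15]))
```

Storing a feature as a list of segments keeps the continuous case (one segment) and the discontinuous case (several segments) in the same structure, with the overall start and end derived from the pieces.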
Figure 3 Table View
Figure 6 Window for creating a feature
Figure 7 Menu for all integrated web services
Figure 8 Universal web service interface

Discussion

Biological data curators at Battelle have used the Sequence Annotation and Analysis Workshop (SAAW) to annotate hundreds of sequences manually and thousands of sequences automatically. The system is easy to learn. The graph view helps curators see all annotated features at once. The integrated tools help verify that entered features are valid, by cross-checking them against BLAST results or by aligning the sequence with other annotated sequences. The automatic annotation pipeline, which integrates InterPro, UniProt, Pfam, Prints and other sources, lets curators focus on features that are not available from public resources, which makes annotation much easier. The 3D structure display provides another way to examine annotated features.

SAAW is designed as a three-tier system: the Java client can be extended by adding new visual plug-ins, the web services tier loosely couples the back-end database and the client, and the universal web services interface simplifies system integration.
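The poster states that the universal web service interface is constructed at runtime from the WSDL of each service. The Python sketch below shows one way such an interface could derive its input fields: it reads the operations and input message parts from a WSDL 1.1 document. It is not SAAW's implementation. The sample WSDL, the `urn:example:blast` namespace, the `runBlast` operation and the function names are all hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Tuple

# WSDL 1.1 namespace; the sample document below is invented for illustration.
WSDL_NS = {"wsdl": "http://schemas.xmlsoap.org/wsdl/"}

SAMPLE_WSDL = """
<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
             xmlns:xsd="http://www.w3.org/2001/XMLSchema"
             xmlns:tns="urn:example:blast"
             targetNamespace="urn:example:blast">
  <message name="runBlastRequest">
    <part name="sequence" type="xsd:string"/>
    <part name="database" type="xsd:string"/>
    <part name="evalue" type="xsd:double"/>
  </message>
  <message name="runBlastResponse">
    <part name="report" type="xsd:string"/>
  </message>
  <portType name="BlastPortType">
    <operation name="runBlast">
      <input message="tns:runBlastRequest"/>
      <output message="tns:runBlastResponse"/>
    </operation>
  </portType>
</definitions>
"""


def describe_operations(wsdl_text: str) -> Dict[str, List[Tuple[str, str]]]:
    """Map each operation name to the (name, type) parts of its input message."""
    root = ET.fromstring(wsdl_text.strip())
    messages = {
        msg.get("name"): [
            (part.get("name"), part.get("type"))
            for part in msg.findall("wsdl:part", WSDL_NS)
        ]
        for msg in root.findall("wsdl:message", WSDL_NS)
    }
    operations = {}
    for op in root.findall("wsdl:portType/wsdl:operation", WSDL_NS):
        op_input = op.find("wsdl:input", WSDL_NS)
        if op_input is None:
            continue  # skip operations that take no input
        # Drop the namespace prefix ("tns:") to look up the message by name.
        message_name = op_input.get("message", "").split(":")[-1]
        operations[op.get("name")] = messages.get(message_name, [])
    return operations


def form_fields(operations: Dict[str, List[Tuple[str, str]]], name: str) -> List[str]:
    """Produce one label per input parameter, e.g. for a generated input form."""
    return [f"{part} ({xsd_type.split(':')[-1]})" for part, xsd_type in operations[name]]


if __name__ == "__main__":
    ops = describe_operations(SAMPLE_WSDL)
    for label in form_fields(ops, "runBlast"):
        print(label)
```

Because the field list comes from the service description instead of hand-written forms, a new service could in principle appear in the client without new user interface code, which is consistent with the integration benefit the discussion describes.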