IGR-ANNOT: A Multiagent System for InterGenic Regions Annotation Sandro Camargo, João Valiati, Luis Otávio Álvares, Paulo Engel, Sergio Ceroni.

Introduction The exponential growth of genomic data has made computational tools for analyzing these data indispensable. Sequencing a new genome does not, by itself, answer every question about the organism. Progress is more likely to come from comparing the genomes of different organisms.

Introduction There are many tools and techniques for comparing complete genomes and coding regions, but few techniques for comparing the non-coding regions of DNA, which contain regulatory elements. Many of the differences between species may be attributed to changes in the regulation of transcription and translation, and both are often regulated by elements that lie in intergenic regions.

InterGenic Regions An intergenic region is defined as the sequence between the translational stop of one gene and the translational start of the next gene. Obtaining the intergenic regions of an organism requires:
– the complete genome of the organism (the nucleotide sequence);
– information about its coding regions (start and stop positions, orientation, and name).
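
The definition above can be sketched in a few lines of Python. This is a minimal illustration, not the IGR-ANNOT implementation (which the slides say was written in Perl): the `Gene` record, the function name, and the sample coordinates are all hypothetical, and the sketch assumes a linear sequence with 1-based inclusive coordinates.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    """Hypothetical record for one coding region."""
    name: str
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strand: str  # "+" (forward) or "-" (reverse)


def intergenic_regions(genes):
    """Return (prev_gene, next_gene, start, end) for each gap between genes.

    Genes are visited in order of start position. Overlapping, nested, or
    directly adjacent genes leave no gap and produce no region.
    """
    ordered = sorted(genes, key=lambda g: g.start)
    if not ordered:
        return []
    regions = []
    prev = ordered[0]  # gene reaching furthest to the right so far
    for nxt in ordered[1:]:
        if nxt.start > prev.end + 1:
            regions.append((prev, nxt, prev.end + 1, nxt.start - 1))
        if nxt.end > prev.end:
            prev = nxt
    return regions


# Made-up coordinates: geneB and geneC overlap, so only one gap exists.
genes = [
    Gene("geneA", 100, 400, "+"),
    Gene("geneB", 550, 900, "-"),
    Gene("geneC", 880, 1200, "+"),
]
regions = intergenic_regions(genes)
```

With these sample genes, the only region found lies between geneA and geneB. A circular genome would also need the gap between the last and the first gene, which this sketch does not handle.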

InterGenic Regions We chose to work with GenBank files because they contain all the information needed to identify coding regions, and the corresponding information about intergenic regions can be inferred from it.

InterGenic Regions The GenBank feature table uses a tabular format in which each entry consists of the following items:
– Feature Key: a single word or abbreviation indicating the functional group;
– Location: instructions for finding the feature;
– Qualifiers: auxiliary information about the feature.

InterGenic Regions An example of a feature in the feature table:
Key    Location/Qualifiers
CDS    /product="alcohol dehydrogenase"
       /gene="adhI"
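
To make the three items concrete, the sketch below splits one feature-table entry into its key, location, and qualifiers. It is an illustrative parser only: the function name is hypothetical, the location `23..400` is a made-up value rather than one taken from the slides, and qualifiers that span several lines or carry no value are not handled.

```python
import re


def parse_feature(lines):
    """Parse one feature-table entry into (key, location, qualifiers).

    The first line holds the feature key and the location; each following
    line holds one qualifier of the form /name=value.
    """
    key, location = lines[0].split()
    qualifiers = {}
    for line in lines[1:]:
        match = re.match(r'\s*/(\w+)=(.*)', line)
        if match:
            # Drop surrounding whitespace and the enclosing double quotes.
            qualifiers[match.group(1)] = match.group(2).strip().strip('"')
    return key, location, qualifiers


# Illustrative entry; the location is an assumed value.
entry = [
    'CDS             23..400',
    '                /product="alcohol dehydrogenase"',
    '                /gene="adhI"',
]
key, location, qualifiers = parse_feature(entry)
```

For annotation work, the gene name and the location are the parts that matter: together with the strand, they are what the intergenic regions are derived from.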

InterGenic Regions Intergenic regions are named according to the convention IGR-O-G1-G2, where O is one of F, R, B, or X, depending on the orientations of the previous and next genes, and G1 and G2 are the names of the coding regions for which the intergenic region contains regulatory information.
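
A name following this convention could be assembled as in the sketch below. The slides do not say which pair of orientations corresponds to which letter, so the mapping in `ORIENTATION_CODES` is purely an assumption made for illustration and can be replaced through the `codes` parameter. The function and gene names are hypothetical as well.

```python
# ASSUMPTION: the source does not define the letters. This mapping is a guess:
# F = both genes forward, R = both reverse, B and X = the two mixed cases.
ORIENTATION_CODES = {
    ("+", "+"): "F",
    ("-", "-"): "R",
    ("-", "+"): "B",
    ("+", "-"): "X",
}


def igr_name(prev_name, prev_strand, next_name, next_strand,
             codes=ORIENTATION_CODES):
    """Build an IGR-O-G1-G2 name from the two flanking genes."""
    code = codes[(prev_strand, next_strand)]
    return f"IGR-{code}-{prev_name}-{next_name}"


# Hypothetical genes, used only to show the shape of the resulting name.
name = igr_name("geneA", "-", "geneB", "+")
```

Keeping the mapping in a dictionary, separate from the function, means the letters can be corrected in one place once the actual rule is known.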

InterGenic Regions Intergenic regions are written in the GenBank file format using the feature key misc_feature. According to the GenBank file format description, this feature key is used to annotate regions of biological interest that cannot be described by any other feature key.
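
The sketch below shows roughly what such an entry looks like when written out. It follows the usual GenBank flat-file layout, with the feature key starting in column 6 and the location and qualifiers in column 22. Storing the region name in a `/note` qualifier is an assumption, since the slides do not say which qualifier IGR-ANNOT uses, and the function name and sample values are hypothetical.

```python
def misc_feature_lines(start, end, name):
    """Format one intergenic region as a GenBank misc_feature entry.

    ASSUMPTION: the region name goes into a /note qualifier; the source
    does not specify which qualifier the tool actually writes.
    """
    location = f"{start}..{end}"
    return [
        # 5 leading spaces, then the key padded to 16 characters,
        # so that the location begins in column 22.
        " " * 5 + "misc_feature".ljust(16) + location,
        # Qualifier lines are indented to column 22 as well.
        " " * 21 + f'/note="{name}"',
    ]


# Made-up coordinates and name.
lines = misc_feature_lines(401, 549, "IGR-F-geneA-geneB")
print("\n".join(lines))
```

Because the output is an ordinary feature-table entry, a GenBank-aware viewer can display the annotated regions alongside the coding regions.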

IGR-ANNOT Engineering Process The multiagent approach is particularly attractive for this problem because:
– the information content is heterogeneous;
– the information can be distributed;
– much of the annotation work for each gene can be done by different laboratories, each using its own methodology for annotating gene information.
We used MaSE and agentTool to model the agents.

IGR-ANNOT Engineering Process
– User Interface Agent (UIA)
– File Reader Agents (FRA)
– Gene Agents (GA)
– InterGenic Regions Agents (IGRA)
– File Writer Agents (FWA)

IGR-ANNOT Engineering Process

To implement this architecture we used the Perl language, so the system can run on any platform where Perl is available. Perl has many features, such as its string manipulation facilities, that make it well suited to working with DNA sequences, and complete packages for implementing multiagent systems are available for it.

Results Discussion We have used IGR-ANNOT extensively to create intergenic region annotations for several genomes of the Mycoplasmataceae family. To obtain a graphical view of the annotation created by our tool, we used the Artemis tool. The following figures show the Mycoplasma hyopneumoniae 232 genome.

Results Discussion

Len1  Len2  %Idy  Mhy                                  Mhy
…     …     …,34  IGR-F-MP04451_oppB-1                 IGR-R-oppB
…     …     …,42  IGR-F-MP0611_MHP0054                 IGR-F-mhp
…     …     …,26  IGR-X-MP07135_rpsO-MP01224_MHP0106   IGR-X-mhp275-rps
…     …     …,99  IGR-X-MP09826_MHP0309-MP03567_baiH   IGR-X-mhp321-baiH

Results Discussion
Len1  Len2  %Idy  Mhy                                  Mhy
…     …     …,02  IGR-R-MP03198_MHP0344                IGR-R-mhp
…     …     …,49  IGR-B-MP18658_MHP0508-MP05045_pdhC   IGR-B-mhp502-aceF
…     …     …,49  IGR-B-MP07145_deoC-MP12669_gyrA      IGR-B-deoC-gyrA
…     …     …,69  IGR-F-MP02519_lgt                    IGR-R-lgt

Conclusions The system is now in successful use by biologists at UFRGS. The output of IGR-ANNOT provides an easy way to compare intergenic regions among different organisms. Despite the positive results obtained so far on genomes of the Mycoplasmataceae family, further tests will be performed, mainly on more complex genomes.

Future Work Create an environment for comparing intergenic regions. IGR-ANNOT will be made publicly available to other biologists over the web at … in the software section.