Cancer Bioinformatics Infrastructure Objects (caBIO) Architecting the Future of Genomics Himanso Sahni & Scott Gustafson December.

Slides:



Advertisements
Similar presentations
Overview Environment for Internet database connectivity
Advertisements

A Prototype Implementation of a Framework for Organising Virtual Exhibitions over the Web Ali Elbekai, Nick Rossiter School of Computing, Engineering and.
Web Service Ahmed Gamal Ahmed Nile University Bioinformatics Group
General introduction to Web services and an implementation example
Apache Struts Technology
Virtual Ticketing Agents using Web Services and J2EE Advisor: Dr. Chung-E-Wang Date: 05/06/03 Naveen Repala.
1 Introduction to XML. XML eXtensible implies that users define tag content Markup implies it is a coded document Language implies it is a metalanguage.
G O B E Y O N D C O N V E N T I O N WORF: Developing DB2 UDB based Web Services on a Websphere Application Server Kris Van Thillo, ABIS Training & Consulting.
Presentation 7 part 2: SOAP & WSDL. Ingeniørhøjskolen i Århus Slide 2 Outline Building blocks in Web Services SOA SOAP WSDL (UDDI)
Sapana Mehta (CS-6V81) Overview Of J2EE & JBoss Sapana Mehta.
DCS Architecture Bob Krzaczek. Key Design Requirement Distilled from the DCS Mission statement and the results of the Conceptual Design Review (June 1999):
1 HyCon Framework Overview Frank Allan Hansen and Bent Guldbjerg Christensen ! Run this presentation in presentation mode to watch animations.
Java Server Team 8. Overview What is a Java Server? History Architecture Advantages Disadvantages Current Technologies Conclusion.
Application Architectures Vijayan Sugumaran Department of DIS Oakland University.
M.Sc. Course, Dept. of Informatics and Telecommunications, University of Athens S.Hadjiefthymiades “Web Application Servers” Basics on WAS WAS are necessary.
Web Service What exactly are Web Services? To put it quite simply, they are yet another distributed computing technology (like CORBA, RMI, EJB, etc.).
Web-based Portal for Discovery, Retrieval and Visualization of Earth Science Datasets in Grid Environment Zhenping (Jane) Liu.
UNIT-V The MVC architecture and Struts Framework.
INTRODUCTION TO WEB DATABASE PROGRAMMING
M. Taimoor Khan * Java Server Pages (JSP) is a server-side programming technology that enables the creation of dynamic,
Aurora: A Conceptual Model for Web-content Adaptation to Support the Universal Accessibility of Web-based Services Anita W. Huang, Neel Sundaresan Presented.
Web Services Mohamed Fahmy Dr. Sherif Aly Hussein.
16-1 The World Wide Web The Web An infrastructure of distributed information combined with software that uses networks as a vehicle to exchange that information.
CPS120: Introduction to Computer Science The World Wide Web Nell Dale John Lewis.
MMHCC Informatics Providing Innovative and Integrative Informatics Solutions Johnita Beasley (SAIC) Dana Zhang (SAIC) Sharon Settnek (SAIC)
Jaeki Song ISQS6337 JAVA Lecture 16 Other Issues in Java.
The Semantic Web Service Shuying Wang Outline Semantic Web vision Core technologies XML, RDF, Ontology, Agent… Web services DAML-S.
11/16/2012ISC329 Isabelle Bichindaritz1 Web Database Application Development.
Chapter 17 - Deploying Java Applications on the Web1 Chapter 17 Deploying Java Applications on the Web.
Introduction to MDA (Model Driven Architecture) CYT.
第十四章 J2EE 入门 Introduction What is J2EE ?
1 HKU CSIS DB Seminar: HKU CSIS DB Seminar: Web Services Oriented Data Processing and Integration Speaker: Eric Lo.
® IBM Software Group © 2007 IBM Corporation J2EE Web Component Introduction
Web Services Kanda Runapongsa Dept. of Computer Engineering Khon Kaen University.
Interfacing Registry Systems December 2000.
Web Services based e-Commerce System Sandy Liu Jodrey School of Computer Science Acadia University July, 2002.
Source: Peter Eeles, Kelli Houston, and Wojtek Kozaczynsky, Building J2EE Applicationa with the Rational Unified Process, Addison Wesley, 2003 Prepared.
WEB BASED DATA TRANSFORMATION USING XML, JAVA Group members: Darius Balarashti & Matt Smith.
11 CORE Architecture Mauro Bruno, Monica Scannapieco, Carlo Vaccari, Giulia Vaste Antonino Virgillito, Diego Zardetto (Istat)
XML Web Services Architecture Siddharth Ruchandani CS 6362 – SW Architecture & Design Summer /11/05.
Systems Analysis and Design in a Changing World, 3rd Edition
1 MSCS 237 Overview of web technologies (A specific type of distributed systems)
Web Services Presented By : Noam Ben Haim. Agenda Introduction What is a web service Basic Architecture Extended Architecture WS Stacks.
SPOREs Specialized Programs of Research Excellence Ryan Landy Qinyan Pan -SAIC 2003 NCICB Jamboree.
User Profiling using Semantic Web Group members: Ashwin Somaiah Asha Stephen Charlie Sudharshan Reddy.
CS562 Advanced Java and Internet Application Introduction to the Computer Warehouse Web Application. Java Server Pages (JSP) Technology. By Team Alpha.
© FPT SOFTWARE – TRAINING MATERIAL – Internal use 04e-BM/NS/HDCV/FSOFT v2/3 JSP Application Models.
Providing web services to mobile users: The architecture design of an m-service portal Minder Chen - Dongsong Zhang - Lina Zhou Presented by: Juan M. Cubillos.
The Semantic Web. What is the Semantic Web? The Semantic Web is an extension of the current Web in which information is given well-defined meaning, enabling.
A Portrait of the Semantic Web in Action Jeff Heflin and James Hendler IEEE Intelligent Systems December 6, 2010 Hyewon Lim.
INFSO-RI Enabling Grids for E-sciencE Web Services Mike Mineter National e-Science Centre, Edinburgh.
Rendering Syndicated Library Content in an Institutional Portal: Integrating MyLibrary into uPortal John Fereira: Cornell University Eric Lease Morgan:
Java Programming: Advanced Topics 1 Building Web Applications Chapter 13.
EGEE is a project funded by the European Union under contract IST Introduction to Web Services 3 – 4 June
Copyright 2007, Information Builders. Slide 1 iWay Web Services and WebFOCUS Consumption Michael Florkowski Information Builders.
Apache Struts Technology A MVC Framework for Java Web Applications.
1 LM 6 Database Applications Dr. Lei Li. Learning Objectives Explain three components of a client-server system Describe differences between a 2-tiered.
XML and Distributed Applications By Quddus Chong Presentation for CS551 – Fall 2001.
By Jeremy Burdette & Daniel Gottlieb. It is an architecture It is not a technology May not fit all businesses “Service” doesn’t mean Web Service It is.
I Copyright © 2004, Oracle. All rights reserved. Introduction.
6/28/ A global mesh of interconnected networks (internetworks) meets these human communication needs. Some of these interconnected networks are.
Added Value to XForms by Web Services Supporting XML Protocols Elina Vartiainen Timo-Pekka Viljamaa T Research Seminar on Digital Media Autumn.
The Holmes Platform and Applications
Java Web Services Orca Knowledge Center – Web Service key concepts.
J2EE Platform Overview (Application Architecture)
WEB SERVICES.
Director’s Challenge IT Overview
Cancer Bioinformatics Infrastructure Objects (caBIO)
Evaluating Compuware OptimalJ as an MDA tool
Presentation transcript:

Cancer Bioinformatics Infrastructure Objects (caBIO) Architecting the Future of Genomics Himanso Sahni & Scott Gustafson December 7, 2001 Science Applications International Corporation (SAIC)

caBIO Team caBIO was developed as “open source” technology by the National Cancer Institute Center for Bioinformatics (NCICB) Software developers teamed with scientists from the NCICB Cancer Genome Anatomy Project (CGAP) to define caBIO requirements and use cases, and model the genomics domain

Purpose “To provide an overview of caBIO, its current role in genomics, and future transitions towards intelligent genomics”

The Genomic Past n Little or no data integration n Lack of standards n Little or no software code re-use n Duplication of efforts n Duplication of data n Lack of common vocabulary n Maintenance nightmares

The Genomic Challenge Code Re-Use INTELLIGENCE! Data Integration Standards Increased Productivity Common Vocabulary Ease of Maintenance Normalized Data

caBIO Agenda n caBIO Overview n Development Process n Architecture n Data Sources n Usage n Benefits n Future

caBIO Overview n caBIO is a standards based set of genomic components n caBIO objects simulate the behavior of actual genomic components such as genes, chromosomes, sequences, libraries, clones, ontologies, etc. n caBIO objects provide access to a variety of genomic data sources including GenBank, Unigene, LocusLink, Homologene, Ensemble, GoldenPath, and NCICB’s CGAP (Cancer Genome Anatomy Project) data repositories n caBIO is “open source”

Development Process n Iterative Approach –An iterative software design and development approach leveraging concepts from RUP and XP was used –Initial Vision and High Level Analysis form the overall goals of the project –Iterative functional design and implementation processes allows for rapid and segmented development of the application

Development Process n High Level Use Cases –Initial high level Use Cases were developed for gene components –Additional high level Use Cases were added as additional functional areas are mapped (e.g. pathways, therapies, microarray objects, mouse models) –Detailed Use Cases are derived from working with domain experts in requirements gathering.

Development Process b Detailed Use Cases Detailed Use Cases were derived from teaming with domain experts during requirements gatheringDetailed Use Cases were derived from teaming with domain experts during requirements gathering Detailed Use Cases includeDetailed Use Cases include Characteristic InformationCharacteristic Information Main Success ScenarioMain Success Scenario ExtensionsExtensions Related InformationRelated Information

Development Process n Class Diagrams –Class Diagrams were developed using the Unified Modeling Language (UML) –The class diagram is continually updated as new objects and relationships are defined

Development Process b Objects Each object consists of many attributes, operations, and associationsEach object consists of many attributes, operations, and associations For example a gene object has many attributes such as symbols and keywords. A gene also has operations including getName and getSequnces, and associations which indicate that a gene “has” sequences and “has a ” locationFor example a gene object has many attributes such as symbols and keywords. A gene also has operations including getName and getSequnces, and associations which indicate that a gene “has” sequences and “has a ” location

Development Process n Sequence Diagrams –Additional methods are discovered as the objects are further explored modeling the use cases

Development Process n Object Development –Java Code is written to implement the Object Model –Documentation (API) is built from the Java Code

Development Process n Application Development –Applications leveraging caBIO can be developed using multiple programming languages

Development Process n Example Application –caBIO was used to add intelligence to BioCarta pathways Gene, pathway, and expression objects were used –Users can select a pathway, and view pathway components that are mutated or expressed –Users can drill down to detailed gene information – prot.nci.nih.govhttp://cmap- prot.nci.nih.gov

Architecture External Java Apps ClientsPresentation LayerObject Layer Data Sources Browsers Other Apps HTML /HTTP XML/ HTTP RDF Internal Java Apps Web Server Servlet Container JSPs Servlets UI Bean XML Builder XSLT Engine SOAP Engine XML Docs DTDs XSL Style Sheet RMI URLs Flat Files Databases Genes Chromosomes Libraries Tissues Clusters Object Managers JDBC HTTP FTP Agents SOAP Data Access Objects Sequences Diseases Other Domain Objects

Architecture n The core infrastructure exhibits an n-tiered architecture with client interfaces, server components, back-end objects and data sources n Clients (browsers, applications) can receive information (HTML and XML) from back-end objects over HTTP –Client applications can also communicate with back-end objects via Java RMI (Java applications) –Non-Java based applications will communicate via SOAP n Server components communicate with back-end objects via Java RMI n Back-end objects communicate directly with data sources (database, URLs, flat files) n A UDDI registry will be configured to advertise services –RDF is currently used to advertise services to crawlers and agents

Architecture n Client Technology Industry Standard Web Browsers - Netscape 4+ and IE 4+ Java Applications – Applications that implement the Java programming language, an object-oriented language which provides portability and many other features Non-Java Applications - Applications usually written in Perl, C, etc. –Non-Java applications use SOAP clients to interface with caBIO »SOAP::Lite for Perl – A collection of Perl modules which provides a simple and lightweight interface to the Simple Object Access Protocol (SOAP) Agents - Software programs that perform a function for a user in a trusted fashion, can learn or be taught its function, and can perform actions for the user without permission RDF – The Resource Description Framework is a foundation that advertises Web services via the Web. RDF is used to describe the content and services available at a particular Web site. (passive) UDDI – Universal Description, Discovery and Integration is a foundation to enable businesses to discover and transact business with each other using preferred applications (active)

Architecture n Presentation Layer Technology –Jakarta Tomcat - Servlet+JSP Engine which is a subproject of the Jakarta Project –JSPs - Java Server Pages are web pages with Java embedded in the HTML to incorporate dynamic content in the page –Java Servlets – Server-side Java programs that web servers can run to generate content in response to client requests –Java Beans – Reusable software components that work with Java –XML – The Extensible Markup Language is a universal format for structured data on the Web XSL/XSLT – The Extensible Stylesheet Language is a language for expressing stylesheets. XSL Transformations (XSLT) is a language for transforming XML documents XLink – The XML Linking Language allows elements to be inserted into XML documents in order to create and describe links between resources DOM – The Document Object Model is a platform and language independent interface that allows programs to dynamically access content, structure, and document style

Architecture n Presentation Layer Technology(cont.) –SOAP – The Simple Object Access Protocol is a lightweight XML based protocol for the exchange of information in a decentralized, distributed environment. It consists of an envelope that describes the message and a framework for message transport Apache SOAP – An implementation of the W3C SOAP specification. The Apache SOAP provides a server-side infrastructure for deploying, managing, and running SOAP enabled services.

n Presentation Layer –The Mediator Servlet manages the user session, forwards requests to the JSP, and returns HTML –JSPs forward requests to the UI Bean, and returns XML or HTML to the client –The UI Bean receives the a Domain Object and converts the object to XML or HTML –The DOM Writer performs the Conversion to XML while the Conversion Servlet represents the Domain Object as a Document Object (DOM) The conversion is performed by calling the Domain Objects toXML() method Architecture XML Builder HTML /HTTP XML/ HTTP RDF RMI SOAP Mediator Servlet Conversion Servlet JSPs DOM Writer UI Bean Manager Proxies Domain Objects

Architecture n Object Layer Technology –Java Applications/Objects – Applications/Objects that implement the Java programming language, an object- oriented language which provides portability and many other features –Java RMI – Remote Method Invocation is distributed computing in Java. It is a simple technology for object communication that allows remote objects to act as local Java objects –JDBC – Java Database Connectivity is a Java API for executing SQL statements and connecting to databases –SOAP – Simple Object Access Protocol extends object interoperability to other programming languages (Perl, Python, C++) –DAS – The Distributed Annotation System is an emergent system for retrieving genomic annotations from a variety of data sources (GoldenPath, Ensembl)

Object Managers n Object Layer –Object Managers are created to implement complex scientific concepts –The Role Model object verifies user permissions to the data if applicable –The Remote Manager interface abstracts the RMI layer Allows RMI to be easily replaced by EJB or other communication venues –Domain Objects are serialized and XML enabled Architecture RMI Role Model Remote Interface Manager GeneManager SequenceManager LibraryManager Other Managers Data Access Objects Domain Objects

n Data Access Objects –Provides the object relational mapping –Database independent persistence –Allows for the introduction of new data sources –Abstracts persistence details from the domain objects –Allows for federated database topology LibraryPersistence GenePersistence TissuePersistence SequencePersistence Other ClonePersistence ProteinPersistence Architecture Data Sources URLs Flat Files Databases JDBC HTTP FTP Data Access Objects

Data Sources n CGAP (NCI Cancer Genome Anatomy Project) –Provides gene expression profiles of normal, pre- cancer, and cancer cells n GenBank (NCBI) –Sequence submission software n Unigene (NCBI) –Partitions sequences into unique gene-oriented clusters n LocusLink (NCBI) –Provides an interface to curated sequence and descriptive info about genetic loci

Data Sources n Homologene (NCBI) –Provides curated orthologs and homologs for UniGene and LocusLink genes n GoldenPath (UCSC) –Contains a working draft of the human genome

Usage n Java developers use the caBIO jar file for immediate access to genomic object attributes and methods –caBIO objects can easily be extended to support customized objects n Non-Java and Java developers can use SOAP to access genomic object attributes and methods over HTTP n Developers can obtain detailed descriptions of available services via the RDF site –URIs of desired services can be obtained via RDF n Developers can host a caBIO server or leverage the existing NCICB caBIO server

Benefits n Provides an abstraction layer that allows developers to access genomic information using a standardized tool set without concerns for implementation details n Permits access to allow developers to obtain the information they need from a variety of data sources without data management n Manages the display of large volumes of data to assist in load balancing n Provides an effective mechanism for comparing similar objects that rely on diverse data sources n Facilitates information sharing without managing linkages between multiple data sources

Future n The Semantic Web –Concept “The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation." -- Tim Berners- Lee, James Hendler, Ora Lassila (W3C) –Implementation Agent Based Technology may be used to create programs that implement semantic web concepts n Natural Language Processing (NLP) –NLP involves understanding natural language and extracting information –NLP can be used as a tool to help automate tasks, refine searches and improve genomic algorithms

Questions?