TAMBIS: Transparent Access to Multiple Biological Information Sources
Why TAMBIS?
TAMBIS aims to provide transparent information retrieval and filtering from biological information sources. This will be achieved through a homogenising layer placed on top of the sources. The layer uses a mediator and wrappers to create the illusion of a single data source.
Mediators
The mediator is an information broker. It uses a conceptual knowledge base of biology to describe a universal model, help users form queries, and translate the mediator’s model to the sources’ model.
Wrappers
Wrappers create the illusion of a common query language for each information resource. This insulates the mediator from differences in source access methods. The current wrapper language is CPL (Collection Programming Language).
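To make the mediator and wrapper split concrete, here is a minimal Python sketch. It is not TAMBIS code: the class names `Wrapper`, `InMemoryWrapper` and `Mediator`, the function name `entries-by-organism` and the canned data are all hypothetical, and the real wrappers speak CPL rather than Python. The sketch only shows the idea that the mediator talks to every source through one interface.

```python
from __future__ import annotations

from abc import ABC, abstractmethod


class Wrapper(ABC):
    """Hypothetical common interface that every source wrapper exposes."""

    @abstractmethod
    def run(self, function: str, argument: str) -> list[str]:
        """Run one named retrieval function against the wrapped source."""


class InMemoryWrapper(Wrapper):
    """Stand-in for a real source: answers from a dictionary of canned results."""

    def __init__(self, canned: dict[tuple[str, str], list[str]]):
        self._canned = canned

    def run(self, function: str, argument: str) -> list[str]:
        return self._canned.get((function, argument), [])


class Mediator:
    """Routes requests to wrappers without knowing how each source is accessed."""

    def __init__(self, wrappers: dict[str, Wrapper]):
        self._wrappers = wrappers

    def ask(self, source: str, function: str, argument: str) -> list[str]:
        return self._wrappers[source].run(function, argument)


# Invented example data; no real database is contacted.
protein_source = InMemoryWrapper({("entries-by-organism", "guppy"): ["P1", "P2"]})
mediator = Mediator({"proteins": protein_source})
result = mediator.ask("proteins", "entries-by-organism", "guppy")
```

Swapping `InMemoryWrapper` for a wrapper around a different access method would leave `Mediator` unchanged, which is the insulation the slide describes.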
Architecture: Terminology Server
[Architecture diagram components: Biological Terminology Server, Query Formulation Dialogues, Services KB, Query Transformation, Wrapper Service]
The Terminology Server provides services for reasoning about concept models, answering questions such as “What can I say about Proteins?” and “What are the parents of concept X?” It communicates with other modules through a well-defined interface.
[Diagram labels: Terminology Server, Biology Concept Model, Linguistic Model]
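The two example questions can be sketched as lookups over a toy concept model. This is an illustration under assumptions, not the Terminology Server interface: the function names `parents_of`, `ancestors_of` and `what_can_i_say_about` are hypothetical, and the parent links and role table below are invented for the example (the concept and role names themselves appear elsewhere in these slides).

```python
from __future__ import annotations

# Toy concept model. The kind-of links and the role table are assumptions.
PARENTS = {
    "Motif": ["SequenceComponent"],
    "SequenceComponent": [],
    "Protein": [],
}
ROLES = {
    "Protein": ["hasOrganismSource"],
    "SequenceComponent": ["isComponentOf", "hasFunction"],
}


def parents_of(concept: str) -> list[str]:
    """Answer 'What are the parents of concept X?'."""
    return list(PARENTS.get(concept, []))


def ancestors_of(concept: str) -> list[str]:
    """All concepts reachable by following parent links upwards."""
    seen: list[str] = []
    stack = parents_of(concept)
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.append(current)
            stack.extend(parents_of(current))
    return seen


def what_can_i_say_about(concept: str) -> list[str]:
    """Answer 'What can I say about X?': roles on X plus roles inherited from ancestors."""
    roles: list[str] = []
    for candidate in [concept, *ancestors_of(concept)]:
        for role in ROLES.get(candidate, []):
            if role not in roles:
                roles.append(role)
    return roles
```

In this toy model, asking about `Motif` returns the roles declared on `SequenceComponent`, because they are inherited through the parent link.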
Architecture: Query Formulation Dialogues
The user interacts with Query Formulation Dialogues, expressing queries in terms of the biological model. The dialogues are driven by the content of the model, guiding the user towards sensible queries. The query is then passed to the transformation process, which may require further user input to refine and instantiate the query.
Architecture: Services KB
The Services Knowledge Base links the biological ontology with the sources and their schemas. This information is used by the transformation process to determine which source should be used.
[Diagram labels: Services KB; Source Combination Model (SCM); Concepts SSM mapping (BISSMap); Source and Services Model (SSM)]
Architecture: Query Transformation
Query Transformation takes the conceptual, source-independent queries and rewrites them into executable query plans. To do this it requires knowledge about the biological sources and the services they offer. Information about particular user preferences (say, favourite databases or analysis methods) may also be incorporated by the query planner. The query plans are then passed to the wrappers.
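As a rough illustration of source selection with user preferences, the sketch below picks one service per conceptual request from a small services table. Everything here is hypothetical: the `Service` record, the source and function names, and the "preferred sources first" rule are assumptions made for the example, not the TAMBIS planner.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    source: str    # hypothetical source name
    function: str  # hypothetical function offered by that source
    answers: str   # the concept-level request this service can answer


# Invented services knowledge base.
SERVICES_KB = [
    Service("source_a", "proteins_by_organism", "Protein/hasOrganismSource"),
    Service("source_b", "proteins_by_organism", "Protein/hasOrganismSource"),
    Service("source_c", "motifs_in_protein", "Motif/isComponentOf"),
]


def plan(requests: list[str], preferred_sources: tuple[str, ...] = ()) -> list[Service]:
    """Choose one service per request, favouring the user's preferred sources."""
    steps = []
    for request in requests:
        candidates = [s for s in SERVICES_KB if s.answers == request]
        if not candidates:
            raise LookupError(f"no source offers {request!r}")
        # Stable sort: preferred sources come first, otherwise KB order is kept.
        candidates.sort(key=lambda s: s.source not in preferred_sources)
        steps.append(candidates[0])
    return steps


requests = ["Protein/hasOrganismSource", "Motif/isComponentOf"]
default_plan = plan(requests)
preferred_plan = plan(requests, preferred_sources=("source_b",))
```

With no preferences the first matching source in the table is used; naming `source_b` as a favourite changes only the step where two sources compete.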
Architecture: Wrapper Service
The Wrapper Service coordinates the execution of the query and sends each component to the appropriate source. Results are collected and returned to the user.
[Diagram labels: Wrapper Service; Query Execution Coordinator; three Wrapper Clients]
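One simple way to picture the coordination step is a loop that sends each plan component to its wrapper client and feeds the results into the next component. The sketch below assumes a purely sequential plan and models each wrapper client as a plain function; the source names and data are invented, and the actual coordinator may work quite differently.

```python
from __future__ import annotations


def execute(plan: list[str], clients: dict, seed: str) -> list[str]:
    """Run the plan steps in order; every result of one step is an input to the next."""
    current = [seed]
    for source in plan:
        client = clients[source]
        next_values = []
        for value in current:
            next_values.extend(client(value))
        current = next_values
    return current


# Hypothetical wrapper clients backed by canned data.
clients = {
    "protein_source": lambda organism: {"guppy": ["P1", "P2"]}.get(organism, []),
    "motif_source": lambda protein: {"P1": ["M1"], "P2": ["M2", "M3"]}.get(protein, []),
}

results = execute(["protein_source", "motif_source"], clients, "guppy")
```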
Modelling Biology with DLs
The Biological Concept Model is built using a Description Logic (DL). Primitive concepts are atomic terms, e.g. Protein or Motif. Roles denote binary relationships between concepts, e.g. hasOrganismSource, isComponentOf. Term constructors associate concepts and roles to define composite concepts, e.g. Motif which isComponentOf Protein. Concepts are both definitions that form the model and queries on the model: the same language is used for both.
Modelling Biology with DLs
Primitive concepts are placed by the modeller into a subsumption (or kind-of) hierarchy. Composite concepts are automatically classified in the hierarchy based on the description of the concept.
[Diagram: example hierarchy containing the concepts SequenceComponent; Motif; SequenceComponent which hasFunction Hydrolase; SequenceComponent which isComponentOf Protein; Motif which hasFunction Hydrolase; Motif which isComponentOf Protein]
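Automatic classification can be illustrated with a much simplified structural subsumption test. This is a toy, not the description logic reasoner used by TAMBIS: it handles only "base which role filler" descriptions, and the link "Motif is a kind of SequenceComponent" is an assumption read off the example hierarchy. The helper names (`concept`, `subsumes`, `direct_parents`) are hypothetical.

```python
from __future__ import annotations

# Asserted kind-of links between primitive concepts (child -> parent). Assumed.
PRIMITIVE_PARENT = {"Motif": "SequenceComponent"}


def primitive_subsumes(general, specific):
    """True if `general` equals `specific` or is one of its asserted ancestors."""
    while specific is not None:
        if specific == general:
            return True
        specific = PRIMITIVE_PARENT.get(specific)
    return False


def concept(base, **restrictions):
    """Build 'base which role filler ...' as a (base, restrictions) pair."""
    return (base, frozenset(restrictions.items()))


def subsumes(general, specific):
    """`general` subsumes `specific` if its base and every restriction are at least as general."""
    g_base, g_restrictions = general
    s_base, s_restrictions = specific
    if not primitive_subsumes(g_base, s_base):
        return False
    return all(
        any(role == s_role and primitive_subsumes(filler, s_filler)
            for s_role, s_filler in s_restrictions)
        for role, filler in g_restrictions
    )


def direct_parents(new, known):
    """Most specific known concepts that subsume `new`."""
    subsumers = [k for k in known if k != new and subsumes(k, new)]
    return [k for k in subsumers
            if not any(other != k and subsumes(k, other) for other in subsumers)]


sequence_component = concept("SequenceComponent")
motif = concept("Motif")
component_of_protein = concept("SequenceComponent", isComponentOf="Protein")
hydrolase_component = concept("SequenceComponent", hasFunction="Hydrolase")
known = [sequence_component, motif, component_of_protein, hydrolase_component]

new_concept = concept("Motif", isComponentOf="Protein")
parents = direct_parents(new_concept, known)
```

Here `Motif which isComponentOf Protein` is placed under both `Motif` and `SequenceComponent which isComponentOf Protein` purely from its description, without the modeller asserting either link.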
Modelling Biology with DLs
The combination of concepts with roles is tightly controlled. We use these controls together with the classification to check the coherence of a concept. Two concepts may be related via a role only if a sanction permits it; composite concepts can’t be formed without sanctioned permission.
✓ Motif isComponentOf Protein
✗ NucleicAcidComponent isComponentOf Protein
Sanctions ensure that only semantically valid compositions are formed; a large number of compositions can be inferred from a sparse model. They also allow us to answer questions like “what can I say about this concept?”
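A sanction check can be sketched as a lookup that also walks up the kind-of hierarchy, which is how a sparse set of sanctions can license many compositions. The sketch is illustrative only: the `KIND_OF` link (Enzyme is a kind of Protein), the single stored sanction, and the function names are assumptions, and the real sanctioning mechanism is richer than this.

```python
from __future__ import annotations

# Assumed toy hierarchy (child -> parent) and a single stored sanction.
KIND_OF = {"Enzyme": "Protein"}
SANCTIONS = {("Motif", "isComponentOf", "Protein")}


def self_and_ancestors(concept: str) -> list[str]:
    """The concept followed by its chain of kind-of ancestors."""
    chain = []
    current = concept
    while current is not None:
        chain.append(current)
        current = KIND_OF.get(current)
    return chain


def is_sanctioned(subject: str, role: str, filler: str) -> bool:
    """A composition is allowed if a sanction exists for it or for more general concepts."""
    return any(
        (s, role, f) in SANCTIONS
        for s in self_and_ancestors(subject)
        for f in self_and_ancestors(filler)
    )


def compose(subject: str, role: str, filler: str) -> str:
    """Form a composite concept, refusing unsanctioned combinations."""
    if not is_sanctioned(subject, role, filler):
        raise ValueError(f"{subject} {role} {filler} is not sanctioned")
    return f"{subject} which {role} {filler}"
```

With one stored sanction, `compose("Motif", "isComponentOf", "Enzyme")` is accepted by inheritance, while `compose("NucleicAcidComponent", "isComponentOf", "Protein")` raises `ValueError`.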
TAMBIS in action: Query Interface
[Screenshot labels: graphical presentation of the query; hierarchical view of parent concepts]
TAMBIS in action: Query Interface
Query expression: Motif which isComponentOf (Protein which hasOrganismSource PoeciliaReticulata)
Rewrite to CPL: {motif1 | \protein1 <- get-sp-entry-by-os("POECILIA+RETICULATA"), \motif1 <- do-prosite-scan-by-entry-rec(protein1)}
[Diagram labels: query expression; rewrite to CPL; accept query; evaluate query]
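The rewrite shown on this slide can be mimicked with a small rule table that maps each (concept, role) pair to a CPL function and emits one generator per nesting level, innermost first. Only the two CPL function names and the example query come from the slide; the table, the traversal, the variable naming and the organism formatting rule are assumptions, so this is a sketch of the idea rather than the TAMBIS rewriter.

```python
from __future__ import annotations

import re

# (concept, role) -> CPL function name. The function names are from the slide;
# the table itself is an assumption.
RULES = {
    ("Protein", "hasOrganismSource"): "get-sp-entry-by-os",
    ("Motif", "isComponentOf"): "do-prosite-scan-by-entry-rec",
}


def organism_literal(name: str) -> str:
    """PoeciliaReticulata -> "POECILIA+RETICULATA" (format assumed from the slide's example)."""
    words = re.findall(r"[A-Z][a-z]*", name)
    return '"' + "+".join(word.upper() for word in words) + '"'


def to_cpl(query: tuple) -> str:
    """query is (concept, role, filler); filler is a nested query or an organism name."""
    generators = []

    def visit(node):
        concept, role, filler = node
        # Inner queries are rewritten first so their variable can be passed outwards.
        argument = visit(filler) if isinstance(filler, tuple) else organism_literal(filler)
        variable = concept.lower() + "1"
        generators.append(f"\\{variable} <- {RULES[(concept, role)]}({argument})")
        return variable

    result = visit(query)
    return "{" + result + " | " + ", ".join(generators) + "}"


query = ("Motif", "isComponentOf", ("Protein", "hasOrganismSource", "PoeciliaReticulata"))
cpl = to_cpl(query)
```

For the example query, `cpl` holds the same comprehension as the slide, with straight quotes around the organism name.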
TAMBIS is a collaboration between the departments of Computer Science and Biological Science at the University of Manchester, funded by EPSRC and Zeneca Pharmaceuticals.
To find out more
Carole Goble, Computer Science, University of Manchester, Oxford Road, Manchester M13 9PL, UK. Phone: Fax:
Andy Brass, Biochemistry and Molecular Biology, University of Manchester, Oxford Road, Manchester M13 9PL, UK. Phone: /5096 Fax:
Norman Paton, Computer Science, University of Manchester, Oxford Road, Manchester M13 9PL, UK. Phone: Fax: