Dictionaries and Ontologies in Structural Biology.

1 Dictionaries and Ontologies in Structural Biology

2 Scope of Ontology
PDB Exchange Dictionary
Metadata
  • Experimental information
  • Molecular description
  • Structural description
Coordinates
  • Macromolecule
  • Ligands
  • Solvent

3 History of Project
1990  mmCIF project begins
1992  NDB serves as testbed
1998  PDB adopts mmCIF as core data representation
2001  PDB Exchange Dictionary incorporates X-ray, NMR and cryoEM
2003  Direct translation of mmCIF data and dictionaries into XML (PDBML)

4 Challenges in Creating an Ontology
• Appropriate coverage and level of detail
• Acquiring and organizing expert input
• Getting consensus
• Evolution with the science
• Creating a rigorous syntax that can be translated (e.g. mmCIF -> XML)
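The last point, a syntax rigorous enough to be translated mechanically, can be illustrated with a minimal sketch. It assumes a simplified mapping in which each category becomes a wrapper element, each row becomes a record element keyed by an attribute, and every other item becomes a child element. The function name `category_to_xml`, the key item `id`, and the sample row are assumptions made for illustration; real PDBML also uses XML namespaces and a schema, which are omitted here.

```python
import xml.etree.ElementTree as ET


def category_to_xml(category, key_item, rows):
    """Translate rows of one category (table) into a small XML tree.

    Hypothetical, simplified mapping: not the actual PDBML translator.
    """
    wrapper = ET.Element(f"{category}Category")
    for row in rows:
        # The key item becomes an attribute of the record element.
        record = ET.SubElement(wrapper, category, {key_item: row[key_item]})
        for item, value in row.items():
            if item == key_item:
                continue
            # Every non-key item becomes a child element holding its value.
            ET.SubElement(record, item).text = value
    return wrapper


# Illustrative row only; the key item name "id" is an assumption.
rows = [{"id": "1", "type": "KODAK SO163 FILM"}]
xml_text = ET.tostring(category_to_xml("em_detector", "id", rows), encoding="unicode")
print(xml_text)
```

Because the dictionary already states which items belong to which category, a translator of this kind needs no per-category special cases.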

5 mmCIF (PDB Exchange) is an Ontology
Relationships among data items are explicit

6 Features of Dictionary
• Data Items
    • Definitions
    • Examples
    • Data types
    • Ranges or enumerations
• Simple organization
    • Tables and columns (categories)
    • Related data item sets (subcategories)
    • Chapters (category groups)
• Associations
    • Parent-child relationships
    • Interdependencies/exclusivity
• Methods
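To make the feature list concrete, the following is a minimal sketch of how one data item definition might be held in memory and used to check a value. The class `ItemDefinition`, its field names, and the `_demo.*` items are hypothetical and are not part of any PDB software; only the kinds of constraint (mandatory flag, data type, range, enumeration, parent link) come from the slide.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ItemDefinition:
    """Hypothetical in-memory form of one dictionary data item."""
    name: str                                  # e.g. '_em_detector.type'
    category_id: str                           # table (category) the item belongs to
    mandatory: bool = False
    type_code: str = "line"
    enumeration: Tuple[str, ...] = ()          # allowed values, if any
    value_range: Optional[Tuple[float, float]] = None  # inclusive bounds
    parent_name: Optional[str] = None          # parent item for parent-child links

    def check(self, value: Optional[str]) -> List[str]:
        """Return a list of problems found for one value (empty if none)."""
        problems = []
        if value is None:
            if self.mandatory:
                problems.append(f"{self.name}: mandatory item is missing")
            return problems
        if self.enumeration and value not in self.enumeration:
            problems.append(f"{self.name}: {value!r} is not an allowed value")
        if self.value_range is not None:
            try:
                number = float(value)
            except ValueError:
                problems.append(f"{self.name}: {value!r} is not a number")
            else:
                low, high = self.value_range
                if not low <= number <= high:
                    problems.append(f"{self.name}: {value!r} is out of range")
        return problems


# Made-up item used only to exercise the checks.
temperature = ItemDefinition("_demo.temperature", "demo",
                             mandatory=True, value_range=(0.0, 100.0))
print(temperature.check("150"))
```

Keeping the constraints as data rather than code is what lets one generic validator serve every category in the dictionary.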

7 Dictionary Definition Example
save__em_detector.type
    _item_description.description
;              The detector type used for recording images.
               Usually film or CCD camera.
;
    _item.name                 '_em_detector.type'
    _item.category_id          em_detector
    _item.mandatory_code       no
    _item_type.code            line
    loop_
    _item_enumeration.value
        'KODAK SO163 FILM'
        'GATAN 673'
        'GATAN 676'
        'TVIPS TEMCAM F224'
        'TVIPS FASTSCAN F114'
        PROSCAN
        AMT
save_
Slide callouts: Controlled vocabulary, Data type, Schema, Semantics
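As a rough illustration of how the controlled vocabulary in this definition could be used, the sketch below pulls the enumeration values out of the save frame and tests a candidate value against them. It is a simplification under stated assumptions: the semicolon-delimited description is left out so that Python's `shlex` can tokenize the text, the helper `enumeration_values` is hypothetical, and values that themselves begin with an underscore would be misread as tags. A real application would use a validating CIF parser.

```python
import shlex

# The save frame from the slide, minus the multi-line description block.
FRAME = """\
save__em_detector.type
    _item.name                 '_em_detector.type'
    _item.category_id          em_detector
    _item.mandatory_code       no
    _item_type.code            line
    loop_
    _item_enumeration.value
        'KODAK SO163 FILM'
        'GATAN 673'
        'GATAN 676'
        'TVIPS TEMCAM F224'
        'TVIPS FASTSCAN F114'
        PROSCAN
        AMT
save_
"""


def enumeration_values(frame_text, tag="_item_enumeration.value"):
    """Collect the values of a single-column loop that follows `tag`."""
    tokens = shlex.split(frame_text)   # honours the single-quoted strings
    if tag not in tokens:
        return []
    values = []
    for token in tokens[tokens.index(tag) + 1:]:
        # Stop at the next keyword or tag; everything before it is a value.
        if token in ("loop_", "save_") or token.startswith("_"):
            break
        values.append(token)
    return values


allowed = enumeration_values(FRAME)
print("GATAN 673" in allowed)
print("POLAROID" in allowed)
```

Note that the unquoted tokens PROSCAN and AMT are read as two separate values, which is how whitespace-separated unquoted strings behave in CIF syntax.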

8 Dictionary Definition Example
save__struct_biol.id
    _item_description.description
;              The value of _struct_biol.id must uniquely identify a record
               in the STRUCT_BIOL list. Note that this item need not be a
               number; it can be any unique identifier.
;
    _item.name                 '_struct_biol.id'
    _item.category_id          struct_biol
    _item.mandatory_code       yes
    _item_type.code            line
    loop_
    _item_linked.child_name
    _item_linked.parent_name
        '_struct_biol_gen.biol_id'       '_struct_biol.id'
        '_struct_biol_keywords.biol_id'  '_struct_biol.id'
        '_struct_biol_view.biol_id'      '_struct_biol.id'
        '_struct_ref.biol_id'            '_struct_biol.id'
save_
Slide callouts: Parent-child (foreign key) relationships, Data type, Schema, Semantics
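The parent-child links in this definition behave like foreign keys, so a data file can be checked for child values that have no matching parent. The sketch below shows one way to do that. The link pairs are taken from the slide; the function `dangling_references` and the sample data are hypothetical. The CIF placeholders `?` (unknown) and `.` (not applicable) are skipped.

```python
# (child item, parent item) pairs as listed in the _item_linked loop.
LINKS = [
    ("_struct_biol_gen.biol_id", "_struct_biol.id"),
    ("_struct_biol_keywords.biol_id", "_struct_biol.id"),
    ("_struct_biol_view.biol_id", "_struct_biol.id"),
    ("_struct_ref.biol_id", "_struct_biol.id"),
]


def dangling_references(data, links):
    """Return (child item, value) pairs whose value is absent from the parent.

    `data` maps an item name to the list of values in that column.
    """
    problems = []
    for child, parent in links:
        parent_values = set(data.get(parent, []))
        for value in data.get(child, []):
            if value in ("?", "."):
                continue  # CIF placeholders carry no reference
            if value not in parent_values:
                problems.append((child, value))
    return problems


# Made-up column values for illustration only.
data = {
    "_struct_biol.id": ["1", "2"],
    "_struct_biol_gen.biol_id": ["1", "1", "2"],
    "_struct_biol_keywords.biol_id": ["2", "3", "?"],
}
print(dangling_references(data, LINKS))
```

Here the value `3` in `_struct_biol_keywords.biol_id` is reported because no `_struct_biol.id` record carries that identifier.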

9 Molecular Description
• Macromolecular sequence
• Macromolecular source
• Detailed chemical descriptions of monomers
• Detailed chemical descriptions of ligands and solvent

10 Molecular Hierarchy
[Diagram: Molecular Description, with nodes Non-polymer Chemical Details, Molecular Component Dictionary, Biological Source, and Macromolecular Polymer Sequence]

11 Structural Description
• Coordinates of the experimental subunit
• Symmetry operations required to build functional assemblies
• Structural annotation
    • Secondary structure
    • Hydrogen bonding classification
    • Base pairs and base pair steps
    • Backbone torsions and base morphology
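The second point, building a functional assembly from the experimental subunit by symmetry operations, amounts to applying a rotation matrix and a translation vector to every coordinate. The sketch below shows that arithmetic for a two-fold rotation about the z axis. The function `apply_operation`, the coordinates, and the choice of operation are assumptions made for illustration and are not taken from any deposited entry.

```python
def apply_operation(coords, rotation, translation):
    """Apply x' = R x + t to each (x, y, z) coordinate."""
    return [
        tuple(
            sum(rotation[i][j] * xyz[j] for j in range(3)) + translation[i]
            for i in range(3)
        )
        for xyz in coords
    ]


# Two-fold rotation about z: (x, y, z) -> (-x, -y, z), with no translation.
TWOFOLD_Z = [[-1, 0, 0],
             [0, -1, 0],
             [0, 0, 1]]
NO_SHIFT = (0, 0, 0)

# Made-up subunit coordinates for illustration.
subunit = [(1.0, 2.0, 3.0), (4.0, 0.5, -1.0)]

# The assembly is the original subunit plus its symmetry-related copy.
assembly = subunit + apply_operation(subunit, TWOFOLD_Z, NO_SHIFT)
print(len(assembly))
```

Storing the operations alongside the subunit coordinates keeps the file compact while still letting software regenerate the full assembly.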

12 Structural Hierarchy
[Diagram labels: Molecular Description, Experimental Subunits, Functional Units, Atomic Coordinates, Secondary Structure, Hydrogen Bonding, Base Pairs, Base Pair Steps, Backbone Torsions, Base Morphology]

13 Connection between Molecular and Structure Descriptions
• Macromolecular sequences are explicitly aligned to experimentally determined chemical sequences
• Monomers, ligands and solvent matched with chemical descriptions in the PDB molecular components dictionary
[Diagram labels: Molecular Description, Structural Description]

14 Relationships with other Resources
• Sequence database correspondences
    • Domain/family annotation
    • Functional annotation (GO/EC/OMIM)
• Structural database correspondences
    • SCOP/CATH/RNAML structural classifications
    • Functional annotation
• Citation and related literature

15 Supporting Software Tools
Dictionaries, Data Files and Databases
• Validating Parsers for Files and Dictionaries (CIFPARSE)
• Dictionary access and presentation tools (CIFOBJ)
• File format translation tools (MAXIT, CIFTr)
• PDB Validation Suite
• Data acquisition and editor tool (ADIT)
• Database Builder, Loader (mmCIFLOADER)
• XML translation tool
• Data extraction and merging tools (PDB_EXTRACT)

16 Availability
http://sw-tools.pdb.org/
• WWW and CDROM Distribution
• Source and Binary Distributions
• Open Source License
• Supported on Linux, IRIX, ALPHA, SUNOS, and Mac OSX

17 Structure Related Data Dictionaries
[Diagram labels: DDL2, mmCIF, RNAML, Ligand data, NMR, Cryo-EM, Modeling, Crystallization, Symmetry, Image data, BioSync, Protein Production]

18 Access
• RCSB Protein Data Bank Site: http://www.pdb.org/
• RCSB/PDB Beta Data Site: http://pdbbeta.rcsb.org/
• RCSB/PDB Dictionary Resource Site: http://mmcif.pdb.org/
• RCSB/PDB Deposition Site: http://deposit.pdb.org/
• PDBML site: http://pdbml.pdb.org/
• RCSB/PDB Software Download Site: http://sw-tools.pdb.org/

