Published by Lee Bradford. Modified over 9 years ago.
1
Research Collaboratory for Structural Bioinformatics http://openmms.sdsc.edu
Macromolecular Structure Middleware
OpenMMS
An Ontology Driven Architecture
2
Overview
- The mmCIF Ontology
- OpenMMS Toolkit
- Macromolecular Structure (MMS) Metamodel
- Parser, XML
- SQL / CORBA Servers and Clients
- CORBA
- UML and the future...
3
How do we “Enable” Science?
- Promote well-defined Macromolecular Structure (MMS) Specifications
- Distribution: Open Interfaces
  - Now: flat files, W3 browsing and searching
  - Future: XML, SQL, CORBA
4
Why OpenMMS?
- Allow programmers to more easily create efficient, high-performance, and robust applications.
- A Java-only toolkit that creates XML, CORBA, and relational DB representations of the mmCIF macromolecular structure data.
- Source code is publicly available, so users can easily modify the metamodel or create an entirely new one.
5
What Do We Mean by an Ontology Driven Architecture?
What do we mean by an Ontology?
A bridge between our world of natural language and the world of machines.
6
mmCIF Dictionary and Data Files
- Based on the ontology for macromolecular structure defined by the International Union of Crystallography
- Replaces the older 80-column PDB files
- mmCIF Dictionary contains over 140 Category and 1600 Item definitions
- Open, extensible
- Provides a well-defined reference standard for data distribution
7
OpenMMS Toolkit Data Flow
[Diagram: mmCIF Data Files (Reference Standard); mmCIF Parsers; XML Files; Relational Database; CORBA Server; Applications]
8
Metamodel Information Flow
[Diagram: mmCIF Dictionary; mmCIF Ontology; Metamodel Framework; Metamodel; CORBA IDL, SQL Schema, XML DTD, Java Data Loaders; JDBC Loaders]
9
What can OpenMMS do?
- The PDBase program will load any or all PDB files into any SQL-92 compatible database (Oracle, MySQL, Sybase...)
- Translate any PDB file into an XML file.
- Contains two CORBA servers:
  - The Reference server will cache and serve data read from PDB flat files.
  - The DB server will cache and serve data read from a SQL database (very quickly...)
- All source code is written in Java and publicly available.
10
Some Advantages of Using an Ontology Driven Architecture
- Scales to very large ontologies
- More reliable and maintainable code
- Transfer between representations
- Scientific correctness of representation
- Help in maintaining backward compatibility
11
How does one actually represent an ontology? (OpenMMS Internal Metamodel Overview)
[Diagram: Root; Module; Interface; Field; Struct; Field; Visitor Abstract Class; Visitor Subclass]
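The node and visitor names in the metamodel overview suggest the classic Visitor pattern: one model of the ontology, many generators walking it. The following is a minimal sketch under that assumption. Every class and method name here (`Node`, `Struct`, `Field`, `IdlVisitor`, `SqlVisitor`, `MetamodelDemo`) is hypothetical and is not taken from the OpenMMS source, and the type mapping to SQL is deliberately simplistic.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical metamodel nodes; illustrative names, not the OpenMMS API.
abstract class Node {
    final String name;
    Node(String name) { this.name = name; }
    abstract void accept(Visitor v);
}

class Field extends Node {
    final String type; // abstract type name such as "string" or "float"
    Field(String name, String type) { super(name); this.type = type; }
    void accept(Visitor v) { v.visitField(this); }
}

class Struct extends Node {
    final List<Field> fields = new ArrayList<>();
    Struct(String name) { super(name); }
    Struct add(Field f) { fields.add(f); return this; }
    void accept(Visitor v) { v.visitStruct(this); }
}

// Abstract visitor: one callback per node kind.
abstract class Visitor {
    abstract void visitStruct(Struct s);
    abstract void visitField(Field f);
}

// Visitor subclass that emits an IDL-like struct declaration.
class IdlVisitor extends Visitor {
    final StringBuilder out = new StringBuilder();
    void visitStruct(Struct s) {
        out.append("struct ").append(s.name).append(" {\n");
        for (Field f : s.fields) f.accept(this);
        out.append("};\n");
    }
    void visitField(Field f) {
        out.append("  ").append(f.type).append(" ").append(f.name).append(";\n");
    }
}

// Visitor subclass that emits a SQL CREATE TABLE statement from the same model.
class SqlVisitor extends Visitor {
    final StringBuilder out = new StringBuilder();
    private boolean first;
    void visitStruct(Struct s) {
        out.append("CREATE TABLE ").append(s.name).append(" (");
        first = true;
        for (Field f : s.fields) f.accept(this);
        out.append(");");
    }
    void visitField(Field f) {
        if (!first) out.append(", ");
        first = false;
        // Assumed, simplistic type mapping for illustration only.
        String sqlType = f.type.equals("float") ? "REAL" : "VARCHAR(80)";
        out.append(f.name).append(" ").append(sqlType);
    }
}

class MetamodelDemo {
    static Struct sample() {
        return new Struct("atom_site")
            .add(new Field("id", "string"))
            .add(new Field("occupancy", "float"));
    }
    static String toIdl(Struct s) { IdlVisitor v = new IdlVisitor(); s.accept(v); return v.out.toString(); }
    static String toSql(Struct s) { SqlVisitor v = new SqlVisitor(); s.accept(v); return v.out.toString(); }

    public static void main(String[] args) {
        System.out.print(toIdl(sample()));
        System.out.println(toSql(sample()));
    }
}
```

Adding another target representation (for example an XML DTD) would then mean adding one more visitor subclass, without touching the node classes.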
12
mmCIF Parsers
- General-purpose, low-level access to data
- Parsers available in many languages
- OpenMMS toolkit includes a Java parser
  - Uses the “Builder” design pattern
  - An application subclasses the Abstract Builder class and stores data into its own data structures
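The Builder arrangement described on this slide can be sketched roughly as follows: the parser only reports what it reads, and an application-specific subclass decides how to store it. This is an illustration, not the OpenMMS parser. `CifBuilder`, `TinyCifParser`, `MapBuilder` and `BuilderDemo` are hypothetical names, the toy parser handles only simple `_category.item value` lines (no `loop_`, quoting, or multi-line values), and the sample values are made up.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical abstract builder: the parser calls these hooks as it reads the file.
abstract class CifBuilder {
    abstract void startCategory(String category);
    abstract void item(String category, String name, String value);
}

// Toy parser for simple "_category.item value" lines only.
class TinyCifParser {
    static void parse(String text, CifBuilder builder) {
        String current = null;
        for (String line : text.split("\n")) {
            line = line.trim();
            if (!line.startsWith("_")) continue;      // skip data_ headers, comments, blanks
            String[] parts = line.split("\\s+", 2);
            if (parts.length < 2) continue;           // no value on this line
            int dot = parts[0].indexOf('.');
            if (dot < 0) continue;                    // not a category.item name
            String category = parts[0].substring(1, dot);
            String name = parts[0].substring(dot + 1);
            if (!category.equals(current)) {          // first item of a new category
                current = category;
                builder.startCategory(category);
            }
            builder.item(category, name, parts[1].trim());
        }
    }
}

// Application-side subclass: stores items in its own data structure (a flat map here).
class MapBuilder extends CifBuilder {
    final Map<String, String> values = new LinkedHashMap<>();
    int categories = 0;
    void startCategory(String category) { categories++; }
    void item(String category, String name, String value) {
        values.put(category + "." + name, value);
    }
}

class BuilderDemo {
    static MapBuilder run(String text) {
        MapBuilder b = new MapBuilder();
        TinyCifParser.parse(text, b);
        return b;
    }

    public static void main(String[] args) {
        // Made-up sample data for illustration.
        String text = "data_demo\n_entry.id demo\n_cell.length_a 10.0\n_cell.length_b 20.0\n";
        MapBuilder b = run(text);
        System.out.println(b.categories + " categories: " + b.values);
    }
}
```

An application that only needs coordinates could supply a builder that ignores every category except the atom records, which is the main appeal of this design.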
13
MMS in XML
- Large flat files (open and close tags)
- Tables can be grouped by rows or columns
- XML from SQL query
  - Many requests from Web browsers don’t really need or want all the data
  - Software available from DB vendors and ISVs for creating XML files from SQL result sets
  - Smaller files load faster
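To make "grouped by rows or columns" concrete, here is a small sketch that serializes the same two-column table both ways. The element names (`row`, `v`) and the class `XmlTableDemo` are invented for illustration and are not the tags OpenMMS emits. The sketch also assumes values contain no characters that need XML escaping.

```java
class XmlTableDemo {
    // Two columns and two records, taken from the atom id / element symbol pairs in the deck's example output.
    static final String[] COLUMNS = {"id", "type_symbol"};
    static final String[][] ROWS = {{"1", "N"}, {"2", "C"}};

    // Row grouping: one <row> element per record, one child element per column.
    static String byRows(String table, String[] cols, String[][] rows) {
        StringBuilder sb = new StringBuilder("<" + table + ">");
        for (String[] row : rows) {
            sb.append("<row>");
            for (int c = 0; c < cols.length; c++) {
                sb.append("<").append(cols[c]).append(">")
                  .append(row[c])
                  .append("</").append(cols[c]).append(">");
            }
            sb.append("</row>");
        }
        return sb.append("</").append(table).append(">").toString();
    }

    // Column grouping: one element per column, holding that column's values in record order.
    static String byColumns(String table, String[] cols, String[][] rows) {
        StringBuilder sb = new StringBuilder("<" + table + ">");
        for (int c = 0; c < cols.length; c++) {
            sb.append("<").append(cols[c]).append(">");
            for (String[] row : rows) {
                sb.append("<v>").append(row[c]).append("</v>");
            }
            sb.append("</").append(cols[c]).append(">");
        }
        return sb.append("</").append(table).append(">").toString();
    }

    public static void main(String[] args) {
        System.out.println(byRows("atom_site", COLUMNS, ROWS));
        System.out.println(byColumns("atom_site", COLUMNS, ROWS));
    }
}
```

Row grouping keeps each record together, which suits record-at-a-time consumers; column grouping repeats each column name only once, which can reduce tag overhead for long tables.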
14
Relational DB Expression
- SQL-92 compatible
- Schemas for all the standard DB vendors
- Fast and flexible keyword searches
- PDBase loader allows structures to be selectively loaded
- Oracle instance tested
  - 14,556 structures
  - 16 GB, 88 million atom records
15
A very high-level (and very rough) classification of communication
- Person-to-Person communication
  - email
- Person-to-Machine communication
  - HTTP/HTML
- Machine-to-Machine communication
  - CORBA, SQL, .NET, SOAP
- Not Communications -> Data Formats
  - XML, mmCIF (STAR), many more …
16
What is CORBA?
Common Object Request Broker Architecture
Defines a family of open software interface specifications for distributed object computing.
http://www.omg.org
17
What is an Object?
“A Data Structure with an Attitude”
Programs = Algorithms + Data Structures
Object Oriented Programming Principle: Partition the parts of algorithms with the data structures they use
18
Side View of a Distributed Application
[Diagram: Client (e.g. a Java Applet); Middleware; IDL; Network: Internet (TCP/IP); Middleware; Server (e.g. Mainframe Computer)]
19
The “Hourglass” view of the Internet
[Diagram labels: Applications; OO High-Level Interface; HTTP, CORBA, .NET; Reliable Bitstream; TCP, RTP, ...; Unreliable Datagrams; IP; (ATM, Ethernet, V.90, SONET...); Copper, Glass, Radio Spectrum]
20
Where is CORBA?
- Inside every Java Runtime Environment.
- Commonly used in middle tier and backend (e.g. database) connections.
- Open source and commercial implementations available
- Usually buried deep inside the software
  - Difficult or impossible to tell when it is being used
21
What is Distributed Object Computing?
- Extends the benefits of object-oriented technology across process and machine boundaries to encompass entire networks.
- Attempts to make remote objects appear to programmers as if they were local objects in the same process. This is called location transparency.
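Location transparency can be illustrated without any network at all. In the sketch below, client code is written against an interface and cannot tell whether it holds the real object or a stub that forwards each call. The stub is built with the standard `java.lang.reflect.Proxy`; the in-memory `wireLog` merely stands in for the marshalling and transport a real ORB would perform. `EntryService`, `LocalEntryService` and `LocationTransparencyDemo` are hypothetical names.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical service interface, shared by client and server.
interface EntryService {
    String title(String entryId);
}

// Server-side implementation of the interface.
class LocalEntryService implements EntryService {
    public String title(String entryId) { return "entry " + entryId; }
}

class LocationTransparencyDemo {
    // Records the "messages" that a real ORB would marshal and send over the network.
    static final List<String> wireLog = new ArrayList<>();

    // Builds a client-side stub that forwards every call to the target object.
    static EntryService stubFor(EntryService target) {
        InvocationHandler handler = (proxy, method, args) -> {
            wireLog.add(method.getName() + Arrays.toString(args)); // stand-in for marshalling
            return method.invoke(target, args);                    // stand-in for remote dispatch
        };
        return (EntryService) Proxy.newProxyInstance(
            EntryService.class.getClassLoader(),
            new Class<?>[] { EntryService.class },
            handler);
    }

    // Client code depends only on the interface, never on where the object lives.
    static String describe(EntryService service) {
        return service.title("4hhb").toUpperCase();
    }

    public static void main(String[] args) {
        EntryService local = new LocalEntryService();
        EntryService remoteLike = stubFor(local);
        System.out.println(describe(local));
        System.out.println(describe(remoteLike));
        System.out.println(wireLog);
    }
}
```

The `describe` method is identical for both calls, which is the point: only the wiring code that obtains the reference knows whether the object is local.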
22
Advantages of Distributed Object Computing
- Easier (and faster) for programmers to create distributed applications
- Increases reliability
- Increases maintainability
- Increases portability
- Increases extensibility
23
The Alphabet Soup
- OMG = Object Management Group: a consortium of 800+ companies, founded in 1989.
- IDL = Interface Definition Language
24
Boundaries, Interfaces
- The key is to focus on boundaries, interfaces, how things fit together
- Not on the internal details of how they’re built; assume that will be diverse & changing
- Shape of boundary is defined in IDL
25
Boundaries, Interfaces
- The Interface to an object can be distributed over a network
- The glue that binds parts together is the ORB
- Shape of boundary is defined in IDL
26
CORBA Independence
- Open standard for distributed object-oriented design
- Independent of hardware platform
- Independent of operating system
- Independent of programming language
- Independent of object location
27
Object Request Broker
- ORBs mediate between objects and things that use them (clients)
[Diagram: Client; IDL; Object Request Broker; IDL; Object]
28
Terminology
- IIOP: the Internet Inter-ORB Protocol, defined in the specification as a vendor-independent, wire-level network protocol on top of TCP/IP. This allows ORB implementations from different vendors to interoperate.
29
ORBs: Medium for Integration
[Diagram: ORB; Java, Perl, C++, C, Ada, Java, VB, ActiveX; CORBA / IIOP (Internet Inter-ORB Protocol)]
30
CORBA Facilities: Industry Standards in Vertical Markets
- Manufacturing
- Finance
- Life Sciences Research
- C4I
- Many others...
31
Using CORBA to access Macromolecular Structure Data
- No parsing of flat files
- Direct access to binary data structures
- Strongly typed data
- Granularity of access
- Indices and presence flags pre-computed
- Highest performance
32
OMG/LSR Macromolecular Structure Adoption Process
- August 1999: RFP issued
- March 2000: Initial Submission
- September 2000: Revised Submission
- February 2001: Adopted Spec by the OMG
- 4Q 2001: OpenMMS LSR/MMS 1.0 compliant implementation source code publicly available
- February 2002: Approved as a Formal OMG Available Specification
33
Using the CORBA MMS Server
An excerpt from a legacy PDB-formatted file, ATOM records (4hhb.ent)
...
ATOM 6 CG1 VAL A 1 7.009 20.127 5.418 6.00 61.79...
ATOM 7 CG2 VAL A 1 5.246 18.533 5.681 6.00 80.12...
ATOM 8 N LEU A 2 9.096 18.040 3.857 7.00 26.44...
ATOM 9 CA LEU A 2 10.600 17.889 4.283 6.00 26.32...
ATOM 10 C LEU A 2 11.265 19.184 5.297 6.00 32.96...
ATOM 11 O LEU A 2 10.813 20.177 4.647 8.00 31.90...
ATOM 12 CB LEU A 2 11.099 18.007 2.815 6.00 29.23...
ATOM 13 CG LEU A 2 11.322 16.956 1.934 6.00 37.71...
ATOM 14 CD1 LEU A 2 11.468 15.596 2.337 6.00 39.10...
ATOM 15 CD2 LEU A 2 11.423 17.268.300 6.00 37.47...
...
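For contrast with the CORBA access shown later in the deck, this is roughly what a client has to do when it reads such records itself. The sketch is only an illustration: real PDB files use fixed column positions, whereas this code splits on whitespace because the excerpt above has had its spacing collapsed. The class `AtomLine` and its field names are hypothetical.

```java
// Hypothetical holder for the leading fields of one ATOM record.
class AtomLine {
    final int serial;
    final String atomName;
    final String residueName;
    final String chainId;
    final int residueNumber;
    final double x, y, z;

    AtomLine(int serial, String atomName, String residueName, String chainId,
             int residueNumber, double x, double y, double z) {
        this.serial = serial;
        this.atomName = atomName;
        this.residueName = residueName;
        this.chainId = chainId;
        this.residueNumber = residueNumber;
        this.x = x;
        this.y = y;
        this.z = z;
    }

    // Assumption: fields are whitespace-separated, as in the slide excerpt.
    // A production parser would slice fixed columns instead.
    static AtomLine parse(String line) {
        String[] t = line.trim().split("\\s+");
        if (t.length < 9 || !t[0].equals("ATOM")) {
            throw new IllegalArgumentException("not an ATOM record: " + line);
        }
        return new AtomLine(
            Integer.parseInt(t[1]), t[2], t[3], t[4],
            Integer.parseInt(t[5]),
            Double.parseDouble(t[6]),
            Double.parseDouble(t[7]),
            Double.parseDouble(t[8]));
    }

    public static void main(String[] args) {
        AtomLine a = parse("ATOM 6 CG1 VAL A 1 7.009 20.127 5.418 6.00 61.79");
        System.out.println(a.serial + " " + a.atomName + " " + a.residueName
            + " (" + a.x + ", " + a.y + ", " + a.z + ")");
    }
}
```

Every client that works from flat files needs some version of this parsing and validation code, which is the duplication a typed server interface is meant to remove.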
34
LSR/MMS “ATOM Record”
DsLSRMacromolecularStructure.idl excerpt:
struct AtomSite {
    string id;
    IndexId type_symbol;
    AtomIndex label;
    IndexId label_entity;
    VectorXYZ cartn;
    float occupancy;
    float b_iso_or_equiv;
};
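On the Java side, the OMG IDL-to-Java mapping turns an IDL struct into a final class with public fields. The sketch below is a hand-written approximation of that shape, not the generated OpenMMS code. The members of `IndexId` and `VectorXYZ` are inferred from how the deck's example code uses them (`type_symbol.id`, `cartn.x`), the member types are assumptions, and the contents of `AtomIndex` are not shown in the slides, so it is left empty here.

```java
// Assumed shape: only the `id` member is visible in the deck's example code.
final class IndexId {
    public String id = "";
}

// Assumed shape: x, y, z as floats, matching the printed output format in the deck.
final class VectorXYZ {
    public float x, y, z;
}

// Members are not shown in the slides; left empty on purpose.
final class AtomIndex {
}

// Approximate Java view of the IDL struct: a final class with public fields.
final class AtomSite {
    public String id = "";
    public IndexId type_symbol = new IndexId();
    public AtomIndex label = new AtomIndex();
    public IndexId label_entity = new IndexId();
    public VectorXYZ cartn = new VectorXYZ();
    public float occupancy;
    public float b_iso_or_equiv;
}

class AtomSiteDemo {
    // Formats one atom the same way as the deck's example loop.
    static String format(AtomSite a) {
        return a.id + " " + a.type_symbol.id
            + " (" + a.cartn.x + ", " + a.cartn.y + ", " + a.cartn.z + ")";
    }

    // Builds a sample value by hand; a real client would receive it from the server.
    static AtomSite sample() {
        AtomSite a = new AtomSite();
        a.id = "1";
        a.type_symbol.id = "N";
        a.cartn.x = 11.065f;
        a.cartn.y = 7.352f;
        a.cartn.z = 9.598f;
        return a;
    }

    public static void main(String[] args) {
        System.out.println(format(sample()));
    }
}
```

Because the fields are typed, a client reads `cartn.x` as a number directly instead of converting text, which is what the deck means by strongly typed data.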
35
Example Code and Resulting Output
// Fetch one entry by its PDB id, then print each atom's id, element symbol, and coordinates.
Entry e = entryFactory.get_entry_from_id("4hhb");
AtomSite[] a = e.get_atom_site_list();
for (int i = 0; i < a.length; i++) {
    System.out.println(a[i].id + " " + a[i].type_symbol.id + " (" +
        a[i].cartn.x + ", " + a[i].cartn.y + ", " + a[i].cartn.z + ")");
}
produces:
1 N (11.065, 7.352, 9.598)
2 C (12.436, 7.764, 9.902)
3 C (12.883, 7.09, 11.208)
4 O (12.088, 7.0, 12.147)
5 C (12.611, 9.264, 10.06)
...
36
What are the alternatives to CORBA?
- TCP/IP sockets: byte stream
- DCOM, COM+, OLE, .NET (Microsoft only)
  - DCOM-CORBA bridges are available from several vendors
- SOAP (Simple Object Access Protocol)
  - XML based
37
Unified Modeling Language – UML
What do all those arrows and boxes mean?
- Schematic language for defining software
- Graphical representations
- UML = Things, Relations and Diagrams
- 9 types of diagrams
- The most commonly used diagram is the “Class Diagram”
38
UML Class Diagram Example
[Class diagram:
 EntryFactory with methods get_version(), get_entry_id_list(), get_entry_modification_dates(), native_formats_supported(), get_native_entry_representation();
 EntryIdList, multiplicity * to EntryId; Identifier;
 ModificationDateList, multiplicity * to ModificationDate with attributes Entry_id : EntryId and date : TimeBase::TimeT]
39
UML Class Diagram Basics
[Diagram: a class box with three compartments]
- Class_Name: underlined for class instances, italics for abstract classes
- Variables: var1: Type, var2: Type
- Methods: method1(), method2(), method3()
Details may be omitted if not important
40
UML Relationships
[Diagram: line styles for each relationship, with multiplicity labels *, *, 0..1]
- Dependency
- Association
- Generalization (Inheritance)
- Aggregation
41
UML Example
[Class diagram:
 EntryFactory with methods get_version(), get_entry_id_list(), get_entry_modification_dates(), native_formats_supported(), get_native_entry_representation();
 EntryIdList, multiplicity * to EntryId; Identifier;
 ModificationDateList, multiplicity * to ModificationDate with attributes Entry_id : EntryId and Date : TimeBase::TimeT]
42
XMI: XML Metadata Interchange
- UML is a graphical representation; need some way to exchange UML models between applications
- XMI is used to store and transmit UML models
- XML based
- Defines XML tags for classes, relationships between classes, etc.
43
OMG MDA
- Platform Independent Models (PIMs) that define the interface are defined in UML
- The PIMs are translated to Platform Specific Models (PSMs) such as CORBA, SOAP, .NET or XML Schemas
- The CORBA servers and clients may be the same, but now the interface is defined in UML and the IDL is then generated from the UML
44
MDA Platform Independent to Platform Dependent Translation
[Diagram: UML translated to CORBA, SOAP, XML, .NET]
45
Thanks and Acknowledgments
- Phil Bourne
- John Westbrook
- David Benton
- Karl Konnerth
- Lynn TenEyck