Published by Dominick Day. Modified over 9 years ago.
1
The Red Pill
Roger Sayle, Geoff Skillman, Matthew Stahl, Robert Tolbert
OpenEye Scientific Software
2
Integration
The process of computing an integral; the inverse of differentiation.
3
Integration
The organization of the psychological or social traits and tendencies of a personality into a harmonious whole.
4
Data Integration
Merge (data) into a [harmonious] whole
- Chaining data generation
- Extensible data storage
5
OEChem
- Programming toolkit
- Python/C++ APIs
- Public API
- Precise handling of chemistry
- Multiple models of chemistry
- Aromaticity
- Atom types
- Valence models
- Query semantics
6
Perception
- Kekule form
- Aromaticity (Daylight, Tripos, Merck, MDL, OpenEye)
- Atom types
- Topological symmetry
- Stereochemistry (tetrahedral, cis/trans)
- Partial charges
- Biomonomer recognition
- Bond orders from coordinates
7
Aromaticity Models
[Table: aromaticity (Yes, No, Yes/No, or N/A) of example structures under the OpenEye, Daylight, MMFF, Tripos, and MDL models]
8
Data Integration
Merge (data) into a [harmonious] whole
- Chaining data generation
- Extensible data storage
9
Chaining Data Generation
[Diagram: Software A, Software B, Data]
- Challenging in a heterogeneous software environment
- Lossless data conversion
- Feature perception
10
Extensible Data Storage
[Diagram: Source Data]
11
Question
How often do people (mis)use SD files for attaching data to molecules?
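For context on the question above, the sketch below shows how data is conventionally attached to a molecule in an SD file: each item is a `> <TAG>` header line, one or more lines of text, and a blank line, with `$$$$` ending the record. The function name `parse_sd_data` and the sample record are hypothetical illustrations, not part of any toolkit. Every value comes back as an untyped string, which is one reason SD data fields are easy to misuse.

```python
def parse_sd_data(record):
    """Collect the '> <TAG>' data items that follow 'M  END' in one SD record."""
    data = {}
    tag, values, in_data = None, [], False
    for line in record.splitlines():
        if not in_data:
            # Skip the molfile block; data items start after 'M  END'.
            in_data = line.startswith("M  END")
            continue
        if line.startswith("$$$$"):
            break  # end of this record
        if tag is None:
            if line.startswith(">"):
                # Header line such as '> <IC50>': the tag sits between < and >.
                start = line.find("<")
                end = line.find(">", start + 1)
                if start != -1 and end != -1:
                    tag = line[start + 1:end]
                    values = []
        elif line.strip() == "":
            # A blank line terminates the current data item.
            data[tag] = "\n".join(values)
            tag = None
        else:
            values.append(line)
    if tag is not None:
        # Tolerate a final item that is not followed by a blank line.
        data[tag] = "\n".join(values)
    return data


# Hypothetical record: an empty molfile block followed by two data items.
SAMPLE_RECORD = "\n".join([
    "example",
    "  hypothetical",
    "",
    "  0  0  0  0  0  0  0  0  0  0999 V2000",
    "M  END",
    "> <IC50>",
    "12.5",
    "",
    "> <Comment>",
    "line one",
    "line two",
    "",
    "$$$$",
])

if __name__ == "__main__":
    print(parse_sd_data(SAMPLE_RECORD))
```

Note that the numeric-looking `IC50` value is returned as the string `"12.5"`; any type, unit, or structure has to be agreed on outside the file.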
12
Extensible Data Structures
Python:
atom.SetStringData("Spam", "Eggs")
atom.GetStringData("Spam")
C++:
class Foo {};
Foo foo;
mol.SetData("VeryNiceData", foo);
mol.GetData<Foo>("VeryNiceData");
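The calls on this slide need the toolkit itself to run, so here is a minimal pure-Python sketch of the underlying idea: an object that carries arbitrary named data. The class `TaggedObject` is a hypothetical stand-in that borrows the slide's method names. Its behavior is an assumption made for illustration (for example, it returns an empty string for a missing tag) and not a description of how OEChem implements these calls.

```python
class TaggedObject:
    """Hypothetical stand-in for an atom or molecule that carries named data."""

    def __init__(self):
        self._data = {}  # tag name -> stored value

    def SetStringData(self, tag, value):
        # Assumption for this sketch: string data must really be a str.
        if not isinstance(value, str):
            raise TypeError("string data must be a str")
        self._data[tag] = value
        return True

    def GetStringData(self, tag):
        # Assumption for this sketch: a missing tag yields an empty string.
        value = self._data.get(tag, "")
        return value if isinstance(value, str) else ""

    def SetData(self, tag, value):
        # Generic data: any Python object can be attached under a tag.
        self._data[tag] = value
        return True

    def GetData(self, tag):
        return self._data[tag]

    def HasData(self, tag):
        return tag in self._data


class Foo:
    """Arbitrary user-defined payload, as in the C++ half of the slide."""


if __name__ == "__main__":
    atom = TaggedObject()
    atom.SetStringData("Spam", "Eggs")
    print(atom.GetStringData("Spam"))

    mol = TaggedObject()
    foo = Foo()
    mol.SetData("VeryNiceData", foo)
    print(mol.GetData("VeryNiceData") is foo)
```

The design point is that the set of tags is open-ended: new kinds of data can be attached without changing the class that carries them.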
13
Chemical EXchange
- An interchange language to enable components to communicate
- Model similar to Unix pipes and single-purpose commands
- CEX stream contains objects (molecule, message)
- Extensible named property/value pairs
- Each component in the CEX pipeline can read some objects and properties from the input stream and add new ones to the output stream
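The pipeline model described above can be sketched with Python generators. This is an illustration of the idea only, not the CEX format or its API. The object layout (a dict with `kind` and `props`) and the component names `source`, `count_heavy_atoms`, and `flag_small` are assumptions made for the example. Each component reads only the properties it understands, adds new ones, and passes everything else through unchanged.

```python
def source():
    """Emit a small stream of objects: two molecules and one message."""
    yield {"kind": "molecule",
           "props": {"name": "ethanol", "elements": ["C", "C", "O"] + ["H"] * 6}}
    yield {"kind": "message", "props": {"text": "end of batch 1"}}
    yield {"kind": "molecule",
           "props": {"name": "benzene", "elements": ["C"] * 6 + ["H"] * 6}}


def count_heavy_atoms(stream):
    """Read 'elements' from molecules and add a 'heavy_atoms' property."""
    for obj in stream:
        if obj["kind"] == "molecule" and "elements" in obj["props"]:
            elements = obj["props"]["elements"]
            obj["props"]["heavy_atoms"] = sum(1 for e in elements if e != "H")
        yield obj  # messages and unknown objects pass through untouched


def flag_small(stream, limit=5):
    """Read 'heavy_atoms' (added upstream) and add an 'is_small' property."""
    for obj in stream:
        if obj["kind"] == "molecule" and "heavy_atoms" in obj["props"]:
            obj["props"]["is_small"] = obj["props"]["heavy_atoms"] < limit
        yield obj


if __name__ == "__main__":
    # Components chain like single-purpose commands joined by Unix pipes.
    for obj in flag_small(count_heavy_atoms(source())):
        print(obj["kind"], obj["props"])
```

Because `flag_small` depends only on the `heavy_atoms` property and not on which component produced it, either stage can be replaced independently.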
14
OEBinary V2
- Extensible tag/data format
- Hierarchical
- Persistent objects (automatic for POD types)
- Dynamic data parsing
- Efficient storage of conformers
- Ideal for storage as BLOB
- Lossless data storage possible
- Definition publicly available
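To make "extensible tag/data format" and "hierarchical" concrete, here is a minimal sketch of a generic tag/length/value encoding. It is not the OEBinary V2 layout. The header structure (a 2-byte tag and a 4-byte length, little-endian), the tag numbers, and the helper names are all assumptions chosen for illustration. Nesting comes from letting one record's payload contain further records, and extensibility comes from the length field, which lets a reader skip tags it does not recognize.

```python
import struct

# Assumed header for this sketch: 2-byte tag id, 4-byte payload length.
HEADER = struct.Struct("<HI")

# Hypothetical tag numbers.
TAG_MOLECULE, TAG_NAME, TAG_ENERGY, TAG_FUTURE = 1, 2, 3, 99


def pack_record(tag, payload):
    """Prefix a payload with its tag and length."""
    return HEADER.pack(tag, len(payload)) + payload


def iter_records(buf):
    """Yield (tag, payload) pairs from a buffer of concatenated records."""
    offset = 0
    while offset < len(buf):
        tag, length = HEADER.unpack_from(buf, offset)
        offset += HEADER.size
        yield tag, buf[offset:offset + length]
        offset += length


def read_molecule(payload):
    """Decode the child records this reader knows about; skip the rest."""
    mol = {}
    for tag, data in iter_records(payload):
        if tag == TAG_NAME:
            mol["name"] = data.decode("utf-8")
        elif tag == TAG_ENERGY:
            # A plain-old-data value stored as a fixed-size binary double.
            (mol["energy"],) = struct.unpack("<d", data)
        # Any other tag is ignored; its length already moved us past it.
    return mol


# A molecule record whose payload holds three child records, one of which
# uses a tag this reader has never heard of.
blob = pack_record(
    TAG_MOLECULE,
    pack_record(TAG_NAME, b"ethanol")
    + pack_record(TAG_ENERGY, struct.pack("<d", -1.5))
    + pack_record(TAG_FUTURE, b"\x00\x01"),
)

if __name__ == "__main__":
    for tag, payload in iter_records(blob):
        if tag == TAG_MOLECULE:
            print(read_molecule(payload))
```

Storing the double in binary rather than as formatted text is what makes the round trip lossless in this sketch, and the whole `blob` is a single byte string that could be stored as a database BLOB.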
15
Conclusions
- I have no idea what 'data integration' really means
- OEChem maintains the integrity of chemical data
- Extensible persistent data structures likely facilitate data integration
- OEChem provides extensible persistent data structures
- OEChem likely facilitates data integration
16
Acknowledgments
- Geoff Skillman
- Bob Tolbert
- Roger Sayle
- AstraZeneca Pharmaceuticals
- Vertex Pharmaceuticals