1
© S.J. Coles 2005 ACS 2005, San Diego
Furthering Chemoinformatics through ‘Crystalloinformatics’
Simon J. Coles
EPSRC National Crystallography Service, School of Chemistry, University of Southampton
2
Data – Information – Knowledge Cycle
(Diagram labels: Experiment, Prediction, Model, Properties)
3
Leveraging eScience
(Diagram labels: X-Ray e-Lab, Analysis, Properties e-Lab, Simulation, Video, Diffractometer, Grid Middleware, Structures Database)
4
Data ‘Acquisition’ and ‘Workup’
1) Application for an allocation
2) Secure access to NCS Grid resources
3) Sample submission
4) Monitoring sample status
5) Data collection
6) Raw data download
7) Automated structure solution
5
Application
6
Security
NCS RA KEYSTORE Applicant identity independently verified by NCS Panel award access to NCS CLIENT CSR NCS RA signs key pair NCS RA public key NCS RA exports signed certificate Passcode & signed PFX Signed certificate imported into browser
7
Sample Submission
8
Status Monitoring
NCS CLIENT
9
Data Collection
Diffraction Unit Cell Success Strategy Data Collection Data Process System Y PreScans Yes BruNo Mount BruNo Unmount Setup via GUI Sample Tray No
10
Data Collection
Metadata capture
11
Data Collection
12
Automatic Structure Solution
Background process designed to adopt the ‘Human Approach’, using refinement indicators and structural knowledge
Incorporates all ‘Q peaks’ above a cut-off as C atoms
Rejects atoms on the basis of thermal parameters, adjusts atom types accordingly & iterates
Hybridisation & hydrogens assigned from connectivity & difference map peaks, then fixed
Usual crystallographic validation performed, introducing ‘chemical validation’
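The assign, reject, retype and iterate loop on this slide can be sketched roughly as below. This is a toy illustration only, not the NCS software: the `Atom` type, the thresholds, the C to N to O promotion order and the `make_toy_refine` stand-in for a real least-squares refinement are all hypothetical. It relies on two general tendencies: a site modelled with too light an element tends to refine to an unusually small thermal parameter, and a spurious atom to an unusually large one.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Atom:
    label: str
    element: str
    u_iso: float = 0.05  # isotropic thermal parameter, set by refinement


# Hypothetical thresholds and promotion order, for illustration only.
PEAK_CUTOFF = 1.0   # minimum Q-peak height accepted as an atom
U_REJECT = 0.15     # larger than this: treat the site as spurious
U_TOO_SMALL = 0.02  # smaller than this: site probably holds a heavier element
HEAVIER = {"C": "N", "N": "O"}


def initial_model(q_peaks: Dict[str, float]) -> List[Atom]:
    """Take every Q peak above the cut-off into the model as a carbon atom."""
    return [Atom(label, "C") for label, height in q_peaks.items()
            if height >= PEAK_CUTOFF]


def solve(q_peaks: Dict[str, float],
          refine: Callable[[List[Atom]], None],
          max_cycles: int = 10) -> List[Atom]:
    """Refine, reject on thermal parameters, adjust atom types, iterate."""
    atoms = initial_model(q_peaks)
    for _ in range(max_cycles):
        refine(atoms)  # updates u_iso in place
        kept = [a for a in atoms if a.u_iso <= U_REJECT]
        changed = len(kept) != len(atoms)
        for atom in kept:
            if atom.u_iso < U_TOO_SMALL and atom.element in HEAVIER:
                atom.element = HEAVIER[atom.element]
                changed = True
        atoms = kept
        if not changed:  # model is stable, stop iterating
            break
    return atoms


def make_toy_refine(truth: Dict[str, Optional[str]]) -> Callable[[List[Atom]], None]:
    """Hypothetical stand-in for refinement, driven by a known 'true' model."""
    order = ["C", "N", "O"]

    def refine(atoms: List[Atom]) -> None:
        for atom in atoms:
            true_element = truth.get(atom.label)
            if true_element is None:
                atom.u_iso = 0.30   # noise peak: blows up
            elif order.index(atom.element) < order.index(true_element):
                atom.u_iso = 0.01   # modelled too light: shrinks
            else:
                atom.u_iso = 0.05   # sensible value

    return refine


peaks = {"Q1": 5.0, "Q2": 4.0, "Q3": 1.2, "Q4": 0.4}
model = solve(peaks, make_toy_refine({"Q1": "O", "Q2": "C", "Q3": None}))
```

In this toy run Q4 never enters the model, Q3 is rejected after the first cycle, and Q1 is promoted stepwise until its thermal parameter looks reasonable.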
13
Data Overload & the Publication Problem
25,000,000 2,000,000 300,000
14
Current Publishing Protocols
Aims, intellectual ideas, conclusions
Inferences, interpretation, derived results
Raw & underlying data
15
The Open Archive Solution?
16
Separating Data from Interpretations
Underlying data
Intellect & Interpretation
17
The Open Archive Solution for Data
(Diagram labels, in extraction order)
Research & e-Science workflows
Aggregator services: national, commercial
Repositories: institutional, e-prints, subject, data, learning objects
Data curation: databases & databanks
Validation
Harvesting metadata
Data creation / capture / gathering: laboratory experiments, Grids, fieldwork, surveys, media
Deposit / self-archiving
Peer-reviewed publications: journals, conference proceedings
Publication
Validation
Data analysis, transformation, mining, modelling
Searching, harvesting, embedding
Presentation services: subject, media-specific, data, commercial portals
Resource discovery, linking, embedding
Linking
18
Workflow
RAW DATA, DERIVED DATA, RESULTS DATA
Initialisation: mount new sample on diffractometer & set up data collection
Collection: collect data
Processing: process and correct images
Solution: solve structures
Refinement: refine structure
CIF: produce CIF (Crystallographic Information File)
Validation: chemical & crystallographic checks
Report: generate Crystal Structure Report
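One way to picture this workflow is as an ordered list of stages, each tagged with the kind of data it produces. The sketch below is illustrative only. The stage names and tasks are taken from the slide, but the assignment of stages to raw, derived and results data is an assumption (the flattened slide text does not show the grouping), and `Stage`, `DataKind`, `next_stage` and `stages_of` are hypothetical names.

```python
from enum import Enum
from typing import List, NamedTuple, Optional


class DataKind(Enum):
    RAW = "raw data"
    DERIVED = "derived data"
    RESULTS = "results data"


class Stage(NamedTuple):
    name: str
    task: str
    kind: DataKind  # ASSUMED grouping, not stated explicitly in the source


WORKFLOW: List[Stage] = [
    Stage("Initialisation", "mount new sample on diffractometer & set up data collection", DataKind.RAW),
    Stage("Collection", "collect data", DataKind.RAW),
    Stage("Processing", "process and correct images", DataKind.DERIVED),
    Stage("Solution", "solve structures", DataKind.DERIVED),
    Stage("Refinement", "refine structure", DataKind.DERIVED),
    Stage("CIF", "produce CIF (Crystallographic Information File)", DataKind.RESULTS),
    Stage("Validation", "chemical & crystallographic checks", DataKind.RESULTS),
    Stage("Report", "generate Crystal Structure Report", DataKind.RESULTS),
]


def next_stage(current: str) -> Optional[Stage]:
    """Return the stage that follows `current`, or None after the last one."""
    names = [stage.name for stage in WORKFLOW]
    position = names.index(current)  # raises ValueError for an unknown stage
    return WORKFLOW[position + 1] if position + 1 < len(WORKFLOW) else None


def stages_of(kind: DataKind) -> List[str]:
    """List the stage names assumed to produce the given kind of data."""
    return [stage.name for stage in WORKFLOW if stage.kind is kind]
```

Keeping the stages in a single ordered structure means every file deposited in an archive entry can be labelled with the stage that produced it.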
19
Simple Deposition
Metadata ‘attached’
20
An Archive Entry
ecrystals.chem.soton.ac.uk
21
Access to ALL underlying data
22
Metadata Publication
Using simple Dublin Core: Crystal structure; Title (Systematic IUPAC Name); Authors; Affiliation; Creation Date
Additional chemical information through Qualified Dublin Core: Empirical formula; International Chemical Identifier (InChI); Compound Class; Keywords
Specifies which ‘datasets’ are present in an entry
DOI
Rights
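As a rough illustration of the simple Dublin Core part of such a record, the sketch below builds an `oai_dc` XML fragment with Python's standard library. The two namespace URIs are the standard ones for unqualified Dublin Core in OAI-PMH. Everything else is an assumption: the mapping of slide fields onto elements (for example, affiliation onto `dc:publisher`) is a guess rather than the archive's actual schema, all values are placeholders, and the Qualified Dublin Core chemical fields are left out because the slide does not name the elements used for them.

```python
import xml.etree.ElementTree as ET
from typing import List

OAI_DC = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("oai_dc", OAI_DC)
ET.register_namespace("dc", DC)


def build_record(title: str, authors: List[str], affiliation: str,
                 created: str, rights: str) -> ET.Element:
    """Build a simple Dublin Core record (field mapping is an assumption)."""
    root = ET.Element(f"{{{OAI_DC}}}dc")

    def add(tag: str, text: str) -> None:
        ET.SubElement(root, f"{{{DC}}}{tag}").text = text

    add("type", "Crystal structure")
    add("title", title)
    for author in authors:
        add("creator", author)
    add("publisher", affiliation)  # assumed home for 'Affiliation'
    add("date", created)
    add("rights", rights)
    return root


# Placeholder values only; none of these come from a real archive entry.
record = build_record(
    title="(systematic IUPAC name goes here)",
    authors=["Author, A.", "Author, B."],
    affiliation="Example University",
    created="2005-01-01",
    rights="(rights statement goes here)",
)
record_xml = ET.tostring(record, encoding="unicode")
```

Because the record uses only unqualified Dublin Core, a generic harvester can read it without knowing anything about crystallography.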
23
Harvesting & Aggregating: Google
24
OAI Harvesting & Aggregating
OAIster: Generic
25
OAI Harvesting & Aggregating
eBank: Subject Specific
26
OAI Harvesting & Aggregating
PSIgate: Service Provider
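Aggregators like those on the three slides above typically collect records through OAI-PMH requests. The sketch below shows the general shape of that exchange, assuming a repository that exposes unqualified Dublin Core. `ListRecords` and `metadataPrefix=oai_dc` are standard OAI-PMH, but the base URL, the identifiers and the sample response are invented for illustration, and no network call is made.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Optional
from urllib.parse import urlencode

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url: str, metadata_prefix: str = "oai_dc") -> str:
    """Build an OAI-PMH ListRecords request URL for a repository."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def titles_by_identifier(response_xml: str) -> Dict[Optional[str], Optional[str]]:
    """Map each harvested record identifier to its dc:title."""
    root = ET.fromstring(response_xml)
    titles = {}
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=NS)
        titles[identifier] = record.findtext(".//dc:title", namespaces=NS)
    return titles


# Hypothetical repository endpoint and response, for illustration only.
url = list_records_url("https://archive.example.org/oai")

SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:archive.example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example structure A</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
""".strip()

harvested = titles_by_identifier(SAMPLE_RESPONSE)
```

A generic aggregator and a subject-specific one could both start from the same harvested records and differ only in how they index and present them.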
27
‘Value Added’ Studies
Courtesy: Thomas Gelbrich
28
‘Value Added’ Studies
29
‘Value Added’ Studies
(Figure labels, in extraction order: X~I X~CF3 CH3~CF3 (X = CF3, I, Br, Cl, F, H) Br~Br (iii) I~Cl I~Br (ii) I~I (iii) CN~Br (i) CN~CN C2 C3 I-Dimer C1 Br~Br (ii) Br~Br (i) I~Br CF3~Cl 1D 1D 0D 2D 3D)
30
‘Value Added’ Studies
Five structures based on C1 stacks
31
Thanks
NCS: Mike Hursthouse, Mark Light, Peter Horton, Ann Bingham
CombeChem: Jeremy Frey, Sam Peppe, Paul Walker
IT Innovation: Mike Surridge, Ken Meacham, Steve Taylor, Darren Marvin
ECS: Dave de Roure, Hugo Mills, Graham Smith, Les Carr, Chris Gutteridge
eBank / UKOLN / PSIgate: Liz Lyon, Rachel Heery, Monica Duke, Michael Day, Andy Powell, John Blundon-Ellis
££££($$$$)’s
32
Take-Home Message
“The internet wasn't created for mockery! It was created so scientists from different universities could share datasets....”
Simpson, H. The Simpsons (2005), Eds. Groening, M., Brooks, J.L. & Simon, S., Series 16, Episode 8, Original air date (US) 06-Feb-2005.
http://www.tvtome.com/tvtome/servlet/GuidePageServlet/showid-146/epid-346864/