Published by Alvin Richardson. Modified over 9 years ago.
1
BioSimGRID and BioSimGRID ’lite’ - Towards a worldwide repository for biomolecular simulation www.biosimgrid.org Philip C Biggin http://indigo1.biop.ox.ac.uk phil@biop.ox.ac.uk
2
Overview Introduction - Motivation - Consortium - Case studies – added value from comparisons Design - Architecture - Data schema How to use - Deposition - Analysis - Worldwide application The Future - Towards computational systems biology
3
Current Paradigm for MD Simulations Target selection: literature based; interesting protein/problem System preparation: highly interactive; slow; idiosyncratic Simulation: diversity of protocols Analysis: highly interactive; slow; idiosyncratic Dissemination: traditional – papers, posters, talks Archival: ‘archive’ data … and then mislay the tape! No third party involvement
4
Integrating Simulations and Structural Biology of Proteins Novel structure (RCSB) Sequence alignment Biomedically relevant homologue(s) Homology model(s) MD simulations Biomolecular simulation database Comparative analysis Evaluation/refinement of model Biological and pharmacological simulation & modelling e.g. drug discovery bacterial K channel mammalian K channel dynamics in membrane drug docking calculations Interaction site dynamics bioinformatics & structural biology BioSimGRID drug discovery
5
Consortium York Nottingham Oxford RAL Southampton London Bristol Oxford: Mark Sansom, Paul Jeffreys, Bing Wu, Kaihsu Tai Southampton: Jon Essex, Simon Cox, Stuart Murdock, Muan Hong Ng, Hans Fangohr, Steven Johnston London: David Moss Nottingham: Charlie Laughton York: Leo Caves Bristol: Adrian Mulholland
6
Comparative Simulations: Drug Receptors Why? – increase significance of results Sampling – long simulations and multiple simulations Sampling via biology – exploiting evolution Biology emerges from comparisons… e.g. mammalian receptor vs. bacterial binding protein Rat GluR2 EC fragment Major receptor in mammalian brains – drug target MD simulations with/without bound ligands Analyse inter-domain motions glutamate D1 D2
7
GluR2 – Flexibility & Gating… Flexibility depends on ligand occupancy & species Gating mechanism – decrease in flexibility on channel activation But … incomplete sampling Need: longer simulations & comparative simulations
[Figure: RMSD (Å) against time (ns) for the empty, +Kai (kainate) and +Glu (glutamate) simulations. Flexibility: empty >> kainate > glutamate, running from “OFF” to “ON”.]
8
GlnBP – A Bacterial Binding Protein GlnBP – bacterial 2-domain periplasmic binding protein Similar fold to mammalian GluR2 X-ray shows ligand binding induces domain closure MD shows ligand binding reduces inter-domain motions - cf. GluR2 simulations + Gln empty Gln bound X-ray structures MD Simulation empty Gln bound
9
Case Study 2.. Acetylcholinesterase Outer-membrane phospholipase OMPLA AChE
10
So how do we compare… Similar active sites or similar motions Different structures Simulated with different MD packages (analysis difficult if not visualization) On different hard drives/tapes/CDs/DVDs. Under different graduate students’ desks Under different postdocs’ beds In different rubbish bins!
11
BioSimGrid = BioSimDB + Toolkits + Integration Answer… Create a worldwide repository of molecular simulations…
12
BioSimGrid Architecture…
Layers: GUI, Service, Middleware, DB/Data
Components: Web Application; Python Application; Apache / Tomcat / SSL / Python; Authentication, Authorisation, Accounting; Data Retrieval Tool; Analysis Tool; HTML Generator; Data Deposition Tool; SQL Editor; Trajectory Query Tool; Video/Img Engine; BioSim Data Engine / Storage Resource Broker; Database; Flat Files
Protocols: HTTP(S), SSH, TCP/IP

                     DB       Flat File
Size/GB              7.5      3.0
Random Access /s     560.8    18.6
Sequential Access    389.0    5.5
13
BioSimDB = PDB (or NDB) for MD enable discovery of new science (cf. genomics/proteomics initiatives) BioSimDB CHARMM AMBER NAMD LAMMPS TINKER GROMACS Cross-software Analysis…
14
It’s a Distributed Database Nobody has enough disk space in one place anyway Distributed and duplicate Any piece of information is stored in at least two sites …for resilience
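The "at least two sites" rule can be sketched as a deterministic placement function. This is only an illustration: the site list, the hash-based choice and the names `replica_sites` and `surviving_copies` are assumptions, and the slides do not say how BioSimGrid actually selects replica sites.

```python
import hashlib

# Hypothetical site names; the slides only show Oxford and Southampton hosts.
SITES = ["oxford", "soton", "bristol", "york"]


def replica_sites(trajectory_id, sites=SITES, copies=2):
    """Pick `copies` distinct sites for a trajectory, deterministically."""
    if copies > len(sites):
        raise ValueError("not enough sites for the requested number of copies")
    digest = hashlib.sha256(trajectory_id.encode("utf-8")).digest()
    start = digest[0] % len(sites)
    # Consecutive positions around the ring are always distinct sites.
    return [sites[(start + i) % len(sites)] for i in range(copies)]


def surviving_copies(trajectory_id, failed_site):
    """Sites that still hold the data if one site becomes unavailable."""
    return [s for s in replica_sites(trajectory_id) if s != failed_site]
```

With two copies on distinct sites, the loss of any single site still leaves at least one copy, which is the resilience property the slide describes.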
15
DB Interface BioSim Data Engine Services DB Engine Database Flat Files F/F Engine F/F Interface oxford.biosimgrid.org soton.biosimgrid.org Cache BioSim Data Engine Services DB Interface DB Engine Database Flat Files F/F Engine F/F Interface Cache SRB Agent SRB Agent SRB Server MCAT IDA SRB Server MCAT IDA Current Architecture
16
Data Schema The hierarchy is like that in the PDB: Chain residue atom coordinate …but also extended in the time dimension: frames
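A rough sketch of that hierarchy in Python, with frames adding the time dimension. All class and field names here are hypothetical; the actual BioSimDB schema is not shown in the slides.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Coordinate = Tuple[float, float, float]


@dataclass
class Atom:
    atom_id: int
    name: str


@dataclass
class Residue:
    name: str
    atoms: List[Atom] = field(default_factory=list)


@dataclass
class Chain:
    chain_id: str
    residues: List[Residue] = field(default_factory=list)


@dataclass
class Frame:
    time_ps: float
    # One coordinate per atom for this time point, keyed by atom_id.
    coordinates: Dict[int, Coordinate] = field(default_factory=dict)


@dataclass
class Trajectory:
    chains: List[Chain] = field(default_factory=list)   # static topology
    frames: List[Frame] = field(default_factory=list)   # time dimension

    def atom_path(self, atom_id: int) -> List[Coordinate]:
        """Positions of one atom across all frames, in frame order."""
        return [frame.coordinates[atom_id] for frame in self.frames]
```

Keeping the topology (chain, residue, atom) separate from the per-frame coordinates means the structural description is stored once, however many frames are deposited.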
17
Metadata.. …is the data about data MD setup, parameters, instantaneous properties, etc. People currently write this in papers People forget something The disciplined way:- …structured schema
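One way to picture the "disciplined way" is a record that is checked for completeness at deposition time, so that nothing is forgotten. The field names and the `validate_metadata` helper below are illustrative assumptions, not the BioSimGrid metadata schema.

```python
# Hypothetical required fields and their expected Python types.
REQUIRED_FIELDS = {
    "package": str,          # e.g. "GROMACS"
    "force_field": str,
    "timestep_fs": float,
    "temperature_K": float,
    "n_frames": int,
}


def validate_metadata(record):
    """Return a list of problems; an empty list means the record is complete."""
    problems = []
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in record:
            problems.append("missing: " + name)
        elif not isinstance(record[name], expected_type):
            problems.append("wrong type: " + name)
    return problems
```

A record that omits, say, the temperature is rejected with an explicit message rather than being silently accepted.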
18
Deposition… Unified deposition for trajectories from any package.
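Unified deposition implies normalising package-specific conventions on the way in. As a small illustration, GROMACS writes coordinates in nanometres while AMBER, CHARMM and NAMD use ångströms, so a depositor might convert everything to one unit. The table and the `to_angstrom` function are a sketch under that assumption; the slides do not describe BioSimGrid's actual importers.

```python
# Scale factors from each package's native length unit to angstroms.
ANGSTROM_PER_UNIT = {
    "GROMACS": 10.0,  # nanometres
    "AMBER": 1.0,     # angstroms
    "CHARMM": 1.0,    # angstroms
    "NAMD": 1.0,      # angstroms
}


def to_angstrom(package, coords):
    """Convert a list of (x, y, z) tuples to angstroms for a known package."""
    try:
        scale = ANGSTROM_PER_UNIT[package.upper()]
    except KeyError:
        raise ValueError("no unit conversion registered for " + package)
    return [tuple(scale * c for c in xyz) for xyz in coords]
```

Once every trajectory is held in the same units, the same analysis code can run on data from any of the packages.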
19
Analysis
20
Analysis tools BioSimDB Toolkit Radius of Gyration Surface and Volume RMSD/RMSF Centre of Mass Inter-atomic distances Distance matrix Internal angles Principal Component Analysis Average structure
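Three of the listed analyses are simple enough to sketch in plain Python. The function names are hypothetical and are not the BioSimDB toolkit API; the RMSD here is computed without prior superposition, so it assumes the two frames are already aligned and list atoms in the same order.

```python
import math


def centre_of_mass(coords, masses):
    """Mass-weighted mean position of a set of atoms."""
    total = sum(masses)
    return tuple(
        sum(m * xyz[i] for m, xyz in zip(masses, coords)) / total
        for i in range(3)
    )


def radius_of_gyration(coords, masses):
    """Mass-weighted root-mean-square distance from the centre of mass."""
    com = centre_of_mass(coords, masses)
    total = sum(masses)
    weighted = sum(
        m * sum((xyz[i] - com[i]) ** 2 for i in range(3))
        for m, xyz in zip(masses, coords)
    )
    return math.sqrt(weighted / total)


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two frames, with no fitting step."""
    if len(coords_a) != len(coords_b):
        raise ValueError("frames must contain the same number of atoms")
    squared = sum(
        (a[i] - b[i]) ** 2
        for a, b in zip(coords_a, coords_b)
        for i in range(3)
    )
    return math.sqrt(squared / len(coords_a))
```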
21
Current Implementation
22
New workflow with BioSimGrid Target selection: literature based; interesting protein/problem Perform simulation (or use someone else’s) Protocols more systematically recorded/checked/confirmed Archive data to BioSimGrid Analyse shared data (either locally or distributed) Dissemination: traditional – papers, posters, talks Store results in BioSimGrid Third parties can analyse data you deposit
23
That’s dandy - but who is this aimed at? Novice and Expert.. Novice (web/GUI) Makes selections Guided through the options Can only do specific things Difficult to make mistakes Expert (employ scripting) Python interpreter Much available Reasonably unrestricted
24
Example sessions
31
Even in script mode the syntax is quite informative:
    FC = FrameCollection('2, 100-200')
    myRMSD = RMSD(FC)
    myRMSD.createPNG()
This gives biochemists with little computational experience a means of analysing computational data and obtaining meaningful results.
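The slides do not define the selection string passed to FrameCollection. Purely as an illustration, the sketch below assumes it is a comma-separated list of single frame numbers and inclusive ranges; the leading 2 could equally be a trajectory identifier. `parse_frame_spec` is a hypothetical helper, not part of the BioSimGrid toolkit.

```python
def parse_frame_spec(spec):
    """Expand a string such as '2, 100-200' into a list of frame numbers.

    Assumed syntax: comma-separated items, each either a single
    non-negative integer or an inclusive 'first-last' range.
    """
    frames = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            first_text, last_text = part.split("-", 1)
            first, last = int(first_text), int(last_text)
            if last < first:
                raise ValueError("range runs backwards: " + part)
            frames.extend(range(first, last + 1))
        else:
            frames.append(int(part))
    return frames
```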
32
Example sessions Viewlet of a session; Demo4.html
33
BioSimGrid ‘Lite’ Light version before final rollout Provides equilibrated lipid bilayer boxes Also provides ontogeny: How the box came about… …metadata …equilibration process (all the frames)
34
Deliverables to Date… Database schema Sample database (with test trajectories) Prototype shared between 2 sites Analysis tools – preliminary versions (about 14 tools) Interface to database for data retrieval Python hosting environment
35
Roadmap Dec 2002 – project started July 2003 – (internal) prototype September 2003 – working prototype (All Hands meeting) November 2003 – test ‘real world’ applications December 2003 – multi-site prototype 2004 – multi-site deposition of data 2005 – open up to additional groups for deposition/testing
36
If you are interested… The team would like to hear from interested parties, especially those with new ideas. Benefits to you: new directions are implemented; toolkit suits your needs; shared development of code; faster and more thorough development. BioSimGrid benefits: larger user community; more work gets done; code is efficient. BioSimGrid and its community are successful.
37
Future Directions in the GRID context 1. HTMD – simulations coupled to structural genomics Diamond light source 2. Computational systems biology – virtual outer membrane HPCx 3. Multiscale biomolecular simulations – from QM/MM to meso-scale modelling GRID-enabled simulations BioSimGrid
38
Structural Genomics & HTMD Overall vision – simulation as an integral component of structural genomics Needs capacity computation – GRID? MD database (distributed) – BioSimGRID synchrotron MD database novel biology… compute GRID
39
Towards a Virtual Outer Membrane (vOM) First step towards computational systems biology – a suitable system Bacterial OMs – 5 or 6 proteins = 90% of protein content Structures or good homology models of proteins are available Complex lipid – outer leaflet is lipopolysaccharide (LPS) Minimum system size ca. 2.5 × 10^6 atoms; simulation times ca. 50 ns cf. current FhuA – 80,000 atoms & 10 ns – need HPCx
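The gap between the two system sizes can be put into numbers with a back-of-envelope estimate. The sketch assumes cost grows linearly with atom count times simulated time, which is a simplification (real MD codes scale somewhat worse with system size); `relative_cost` is a hypothetical helper.

```python
def relative_cost(atoms, time_ns, ref_atoms, ref_time_ns):
    """Cost relative to a reference run, assuming cost ~ atoms x simulated time."""
    return (atoms / ref_atoms) * (time_ns / ref_time_ns)


# vOM target (2.5 x 10^6 atoms, 50 ns) against the FhuA run (80,000 atoms, 10 ns).
vom_vs_fhua = relative_cost(2.5e6, 50, 80_000, 10)
```

Under that assumption the vOM simulation needs roughly 156 times the compute of the FhuA simulation (about 31 times the atoms for 5 times as long), which is why the slide points to HPCx.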
40
Multiscale Biomolecular Simulations Membrane bound enzymes – major drug targets (cf. ibuprofen, anti-depressants, endocannabinoids) Complex multi-scale problem: QM/MM; ligand binding; membrane/protein fluctuations; diffusive motion of substrates/drugs in multiple phases Need for GRID-based integrated simulations QM (Bristol) Drug-binding (Southampton) Protein Motions (Oxford) Drug Diffusion (London)
41
References…
1. K. Tai, S. Murdock, B. Wu, M.H. Ng, S. Johnston, H. Fangohr, S. Cox, P. Jeffreys, J. Essex, M.S.P. Sansom. Org. Biomol. Chem. Under review.
2. M.H. Ng, S. Johnston, S. Murdock, B. Wu, K. Tai, H. Fangohr, S. Cox, J. Essex, M.S.P. Sansom, P. Jeffreys. UK E-Science Programme All Hands Meeting 2004. Accepted.
3. Python website: www.python.org
4. BioSimGrid: www.biosimgrid.org
42
Acknowledgements
Oxford: Professor Mark Sansom, Dr Carmen Domene, Dr Alessandro Grottesi, Dr Andrew Hung, Dr Daniele Bemporad, Dr Shozeb Haider, Dr Kaihsu Tai (curation and integration), Dr George Patargias, Oliver Beckstein, Jennifer Johnston, Syma Khalid, Jorge Pikunic, Pete Bond, Zara Sands, Jonathan Cuthbertson, Sundeep Deol, Jeff Campbell, Yalini Pathy, Loredana Vaccaro, Shiva Amiri, Katherine Cox, Robert d’Rozario, John Holyoake, Samantha Kaye, Anthony Ivetac, Sylvanna Ho
Oxford e-Science Center: Professor Paul Jeffreys, Dr Bing Wu (database management), Matthew Dovey, Ivaylo Kostadinov
Southampton: Dr Stuart Murdock (generic analysis tools), Dr Muan Hong Ng (data retrieval), Dr Hans Fangohr, Steven Johnston, Prof Simon Cox, Dr Jon Essex
Elsewhere: Leo Caves (York), Charles Laughton (Nottingham), David Moss (Birkbeck), Oliver Smart (Birmingham), Adrian Mulholland (Bristol), Marc Baaden (Paris)
BBSRC, DTI, The Wellcome Trust, GSK, EC (TMR), OeSC (EPSRC & DTI), EPSRC, OSC (JIF), MRC
43
More information… team@biosimgrid.org www.biosimgrid.org