High-Throughput Crystallography at Monash Noel Faux Dept of Biochemistry and Molecular Biology Monash University.



Structural Biology Pipeline
Cloning, Expression, Purification, Crystallisation, X-ray diffraction, Determine the structure.
High-throughput robots and technologies: Tecan Freedom Evolution; ÄKTAxpress™; trialling crystal storage and imaging facilities; Australian Synchrotron online in 2007.
Data processing and structure determination (the major bottleneck): target tracking / LIMS; data management; phasing (CCP4/CNS, Grid computing).

The problems
Target tracking / data management: The process of protein structure determination creates a large volume of data. Storage, security, traceability, management and backup of files are ad hoc. Remote access to the files is limited and requires different media formats.
Structure determination: CPU-intensive.

Part of a national project to develop eResearch platforms for the management and analysis of data for research groups in Australia.
Aim: establish common, standardised software and middleware applications that are adaptable to many research capabilities.

Solution
Central repository of files.
Metadata attached to the files.
Worldwide secure access to the files.
Automated collection and annotation of the files from in-house and synchrotron detectors.

The infrastructure
[Diagram labels: Instrument Rep, Kepler, crystal temperature, lab temperature, X-ray image, mounted crystal, streaming video (SV), lab SV, lab still pictures, sensor data, Storage Resource Broker, lab PC, collection PC.]
Monash University ITs Sun Grid: 54 dual 2.3 GHz CPUs, 3.8 GB of memory per node, >10 TB storage capacity, running GridSphere.

Central web portal

Automated X-ray data reduction
Automated processing of the diffraction data.
Investigating the incorporation of Xia2 (Graeme Winter), a new automated data reduction system designed to work from raw diffraction data and a little metadata, and to produce usefully reduced data in a form suitable for immediately starting phasing and structure determination (CCP4, ref. 1).
1. The CCP4 suite: programs for protein crystallography. (1994). Acta Crystallogr. D50,

Divide and Conquer
A large number of CPUs are available across different computer clusters at different locations:
Monash ITs Sun Grid.
VPAC: Brecca (97 dual Xeon 2.8 GHz CPUs, 160 GB total memory, 2 GB per node); Edda (185 Power5 CPUs, 552 GB total memory, 8-16 GB per node).
APAC: 1680 processors, 3.56 terabytes of memory, 100 terabytes of disk.
Personal computers.

DART and CCP4
Aims:
Use the CCP4 interface locally but run the jobs remotely across a distributed system.
Use Nimrod to distribute the CCP4 jobs across the different Grid systems.
Investigate the possibility of incorporating the CCP4 interface into the DART web portal.

Exhaustive Molecular Replacement
The situation: no phasing data, no sequence identity (<20%), no search model. Is there a possible fold homolog?
The approach: an exhaustive Phaser (ref. 2) scan of the PDB, with exhaustive searches using different parameters and search models.
2. A. J. McCoy, R. W. Grosse-Kunstleve, L. C. Storoni and R. J. Read. Likelihood-enhanced fast translation functions. Acta Cryst. (2005). D61,

Exhaustive Molecular Replacement
Protein building blocks are domains.
Use a subset of SCOP as search models in a Phaser calculation. Grid computing will make this possible: ~1000 CPUs = days for a typical run.
SCOP hierarchy: Class, Fold, Superfamily, Families, Domains.
Search at the family level: take the highest-resolution structure, mutate it to poly-alanine, delete loops and turns, then run Phaser.
For families with z-score ≥ 6, search with each of their domain members.
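The two-stage search on this slide can be sketched roughly as follows. This is only an illustration: the `Domain` record, the function names, the identifiers and the z-scores are all made up for the example, the cutoff is read as "z-score of at least 6", and the model preparation (poly-alanine mutation, loop deletion) and the Phaser run itself are not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Domain:
    """Hypothetical record for one SCOP domain (fields are assumptions)."""
    domain_id: str
    family: str
    resolution: float  # in angstroms; a lower value means higher resolution


def pick_representatives(domains):
    """Stage 1: choose the highest-resolution domain in each family."""
    best = {}
    for d in domains:
        current = best.get(d.family)
        if current is None or d.resolution < current.resolution:
            best[d.family] = d
    return best


def second_round(domains, family_z, threshold=6.0):
    """Stage 2: list every domain of the families whose
    representative scored at or above the threshold."""
    hits = {fam for fam, z in family_z.items() if z >= threshold}
    return [d for d in domains if d.family in hits]


# Made-up example data; real input would come from SCOP and Phaser output.
domains = [
    Domain("dom_a1", "fam_A", 1.8),
    Domain("dom_a2", "fam_A", 2.5),
    Domain("dom_b1", "fam_B", 2.0),
]
reps = pick_representatives(domains)
family_z = {"fam_A": 7.2, "fam_B": 4.1}  # hypothetical stage-1 z-scores
followup = second_round(domains, family_z)
```

Searching one representative per family first keeps the initial scan small; only families that look promising are expanded to all of their domain members.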

Exhaustive Molecular Replacement
Database containing: ToDo list, parameters, results.
ITs Sun Grid: 56 dual AMD Opteron CPUs, 3.8 GB of memory per node, >10 TB storage capacity; 160 GB (2 GB per node) total memory.
Each node runs a Perl script that requests a job, launches Phaser, returns the results, and repeats until the list is exhausted.
Will be extended to use Nimrod to gain access to APAC and the Pacific Rim Grid (PRAGMA).

Final Pipeline
Cloning, Expression, Purification, Crystallisation, X-ray diffraction, Determine the structure.
High-throughput robotics and technologies.
Data collection, management, storage, and remote access: DART.
Data processing, exhaustive experimental (e.g., SAD, SIRAS, MIRAS) and MR phasing for final refinement: Xia2, Grid computing, Nimrod, Phaser, AutoSHARP, CCP4, DART.

Acknowledgments
Monash University: Anthony Beitz, Nicholas McPhee, James Whisstock, Ashley Buckle.
James Cook University: Frank Eilert, Tristan King.
DART Team.