12th April 2007
What’s new and Automation developments in CCP4
Ronan Keegan, CCP4, STFC Daresbury Laboratory, U.K.


Quick Overview
  Brief introduction to CCP4
  New programs and features in CCP4
  Upcoming features in version 6.1
  Automation projects
    – MrBUMP: automated Molecular Replacement
    – Other automation projects

What is CCP4?
  Collaborative Computational Project Number 4
  Set up in the late 1970s to support collaboration between researchers working on protein crystallography software in the UK, and to assemble a comprehensive collection of software to satisfy the computational requirements of the relevant UK groups.
  Many functions:
    – Support and distribution of the CCP4 suite of programs for protein crystallography (PX)
    – Education: workshops, university visits, summer schools, study weekend
    – Maintaining the CCP4 bulletin board and website
  Academic users can use the suite for free; commercial users pay a licence fee.

CCP4 Organisational Structure (diagram)
  Steering Committees: WG 1, WG 2, STAB, Exec
  DL CCP4 Group: Project Leader; core developments & activities
  Contributors: Funded Developers, Associated Developers, Occasional Contributors
  Core projects, e.g. CCP4mg, mmdb, PIMS, Automation, BIOXHIT …
  Major programs, e.g. Mosflm, Refmac, Scala, Phaser, Clipper, Coot …
  Lots of other useful software, e.g. PDBExtract

New programs and features in CCP4
  New packages in CCP4 6.0:
    – CCP4mg: molecular graphics
    – Coot: graphical toolkit for model building, model completion and validation
    – Phaser: molecular replacement (version 1.3.3)
    – Chainsaw: MR model preparation
    – Pirate: statistical phase improvement
    – Superpose: secondary structure alignment
    – BP3: heavy atom phasing and refinement
    – Chooch: anomalous scattering factors from raw fluorescence spectra
    – New features in CCP4i

CCP4mg
  The aim is to provide a molecular graphics program that is fully compatible with the CCP4 environment and programs.
  Features:
    – Displays molecules with simple, flexible selection tools and a variety of display styles and colouring schemes.
    – A simple graphical interface to select the atoms to display, the colour scheme and the display style.
    – Surfaces and electrostatic potential calculations.
    – Displays maps with a 'continuous crystal' and real-time update of contouring level.

CCP4mg (continued)
    – Superposes two or more protein structures automatically.
    – Structure analysis: secondary structure, solvent accessible surface area, hydrogen bonds, close contacts.
    – Writes 'snapshot' images and creates movies.
    – Creates POV-Ray input files and PostScript files.
    – Runs on Linux, Windows (2000, NT and XP) and Mac OS X.

Normal Mode Analysis
  CCP4mg can currently perform approximate normal mode calculations using two elastic network models.
    – Only one atom per residue (CA) is considered
    – All force constants are assumed to be the same
    – Gaussian Network and Anisotropic Network methods are employed
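As a rough illustration of the Gaussian Network idea on this slide (one CA node per residue, one shared force constant), the sketch below builds the connectivity (Kirchhoff) matrix for a set of CA coordinates. The 7 Å contact cutoff, the function name and the toy coordinates are assumptions made for illustration; this is not CCP4mg's code. The modes themselves would come from the eigen-decomposition of this matrix, which is left out here.

```python
import math


def kirchhoff_matrix(ca_coords, cutoff=7.0):
    """Build a GNM connectivity (Kirchhoff) matrix from CA coordinates.

    Off-diagonal element (i, j) is -1 when residues i and j lie within
    `cutoff` angstroms of each other, otherwise 0. Each diagonal element
    counts the contacts of that residue. Because every spring shares the
    same force constant, the constant factors out of the matrix.
    """
    n = len(ca_coords)
    gamma = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(ca_coords[i], ca_coords[j]) <= cutoff:
                gamma[i][j] = gamma[j][i] = -1
                gamma[i][i] += 1
                gamma[j][j] += 1
    return gamma


# Toy input (made up): four CA atoms spaced 3.8 A apart along a line.
toy_chain = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
toy_gamma = kirchhoff_matrix(toy_chain)
```

With this cutoff only sequential neighbours are in contact, so the toy matrix is tridiagonal and each row sums to zero.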

Coot
  Coot is for model building, model completion and validation.
  It displays maps and models and allows model manipulations such as idealization, real space refinement, manual rotation/translation, rigid-body fitting, ligand search, solvation, mutations and rotamers; it also provides Ramachandran plots.
  File formats handled: PDB, mmCIF, MTZ, phases (.phs) and others.
  Most of its functions are also accessible for scripting.

[Slide image: Coot]

Phaser
  Phaser is a program for phasing macromolecular crystal structures with maximum likelihood methods.
  The version in CCP4 supports the molecular replacement method. The next version will include the experimental phasing method.
  Features:
    – brute-force rotation and translation searches
    – FFT-based fast rotation and translation searches
    – correction for anisotropic diffraction
    – search for multiple molecules in multiple space groups

Pirate & Superpose
  Pirate:
    – Pirate is a new statistical phase improvement program.
    – It classifies the electron density map by sparseness/denseness and order/disorder, with the aim of obtaining superior results to conventional solvent-mask-based methods without requiring knowledge of the solvent content.
    – Currently available for Linux and Mac OS X.
  Superpose:
    – Superpose aligns two structures by matching graphs built on the protein's secondary-structure elements, followed by an iterative three-dimensional alignment of protein backbone C-alpha atoms.

BP3
  BP3 is a new program for obtaining phase information from S/MIR(AS) and/or S/MAD experiments by multivariate likelihood estimation.
  It refines heavy and/or anomalously scattering atomic parameters along with error parameters to generate phase information.

Chooch
  Program to determine which wavelengths to use for a MAD experiment.
  Determines values of anomalous scattering factors from raw fluorescence spectra and pinpoints the positions of the f'' maximum and the f' minimum.
  Command-line driven, with all options controlled by switches.
  Optional PGPLOT visual output.
  Publication-quality PostScript output generated on request.
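The last step described on this slide, locating the f'' maximum and the f' minimum, can be sketched as below once f' and f'' curves are available. This is only an illustration of that selection step: the function name and the sample numbers are made up, and deriving f' and f'' from a raw fluorescence scan (the harder part of what Chooch does) is not shown.

```python
HC_EV_ANGSTROM = 12398.42  # h*c expressed in eV * angstrom


def pick_mad_energies(energies_ev, f_prime, f_double_prime):
    """Pick the peak (max f'') and inflection (min f') points of a scan.

    Returns a dict mapping each name to (energy in eV, wavelength in A).
    """
    indices = range(len(energies_ev))
    peak = max(indices, key=lambda i: f_double_prime[i])
    inflection = min(indices, key=lambda i: f_prime[i])
    return {
        name: (energies_ev[i], HC_EV_ANGSTROM / energies_ev[i])
        for name, i in (("peak", peak), ("inflection", inflection))
    }


# Made-up sample curves, for illustration only.
energies = [12650.0, 12655.0, 12660.0, 12665.0, 12670.0]
fp = [-7.0, -9.5, -10.8, -8.0, -6.5]
fpp = [0.6, 2.0, 4.1, 5.9, 4.8]
choice = pick_mad_energies(energies, fp, fpp)
```

Real scans are noisy and finely sampled, so a practical tool would smooth or fit the curves before taking the extrema.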

Chainsaw
  Molecular replacement model preparation utility that mutates a template PDB file according to a sequence alignment.
  Features:
    – Examines the sequence alignment between target and template and modifies the template PDB file by pruning non-conserved residues back to the gamma atom.
    – More atoms are preserved than in a polyalanine model, but parts of the model that are unlikely to be present in the crystal structure, and thus would only degrade the signal, are pruned.
  Figure: 1mr6 used as a template for 1tgx (38% sequence identity). From left to right: unmodified template, Chainsaw template, polyalanine template.
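The pruning rule on this slide can be illustrated with a minimal sketch that decides which atoms of one aligned residue pair survive. It assumes standard PDB atom names, treats "pruned back to the gamma atom" as keeping the backbone, CB and any gamma-position atoms, and ignores alignment gaps; the function and constant names are hypothetical and this is not Chainsaw's implementation.

```python
BACKBONE = {"N", "CA", "C", "O"}
# Gamma-position atom names in standard PDB nomenclature.
GAMMA_ATOMS = {"CG", "CG1", "CG2", "OG", "OG1", "SG"}


def atoms_to_keep(template_res, target_res, atom_names, mode="chainsaw"):
    """Return the atom names retained for one aligned residue pair.

    mode="chainsaw":    conserved residues are kept whole; non-conserved
                        residues are pruned back to the gamma atom.
    mode="polyalanine": every residue is pruned back to CB.
    """
    if mode == "polyalanine":
        allowed = BACKBONE | {"CB"}
    elif template_res == target_res:
        return list(atom_names)  # conserved: nothing removed
    else:
        allowed = BACKBONE | {"CB"} | GAMMA_ATOMS
    return [name for name in atom_names if name in allowed]


leucine = ["N", "CA", "C", "O", "CB", "CG", "CD1", "CD2"]
kept_conserved = atoms_to_keep("L", "L", leucine)
kept_mutated = atoms_to_keep("L", "V", leucine)
kept_polyala = atoms_to_keep("L", "V", leucine, mode="polyalanine")
```

For a template leucine aligned to a target valine, the sketch keeps six atoms in Chainsaw mode and five in polyalanine mode, which matches the slide's point that Chainsaw preserves more of the model.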

New features in CCP4i
  Interfaces for new programs:
    – Phaser
    – Pirate/Clipper
    – BP3
    – Chainsaw
    – CCP4mg launcher
    – CRANK
    – Shelx_C/D/E

New features in CCP4i (continued)
  Database search and sort
  Project shortcuts
  Customise job database view
  Help shortcuts

CCP4 6.1 and beyond
  Version 6.1 expected in 6-12 months' time
  New programs for 6.1:
    – Rapper: protein modelling, automated conformer generation
    – Rampage: generates Ramachandran plots for structure validation
    – Buccaneer: chain tracing
    – Pointless: determines space/Laue group from unmerged data
    – Oasis
    – Crunch2
    – Afro
    – Clipper2 libraries
    – Automation scripts: MrBUMP, XIA2

iMosflm
  New, improved graphical user interface for Mosflm.
  More user friendly than the old one.

Updates to popular CCP4 programs
  Acorn
    – Ab initio procedure for the determination of protein structure using atomic resolution data, or data artificially extended to atomic resolution, and for finding sub-structures from anomalous or isomorphous differences.
  Truncate (Uboat)
    – New, improved version written in C++.
    – In the longer term there will be new tests for twinning, anisotropy corrections and the ability to handle unmerged data (useful if radiation damage occurs), but these won't be in the initial release.
  Phaser 2.0/2.1
    – Will include experimental phasing.
  Refmac 5.3/6.0
    – The latest version of Refmac; it will supersede version 5.2.x in the CCP4 6.0.x series.

CCP4 6.1 and beyond
  Plans for CCP4i
    – CCP4i Classic reworked
    – CCP4i Auto: automation scripts
  CCP4i database
    – New database handler
    – Allow for greater flexibility and control of jobs
    – Job/DB viewer program built on top of the DB (more about this later)

CCP4 6.1 and beyond
  Long-term plans
    – Better integration between CCP4i, CCP4mg and Coot
    – More intuitive interfaces to programs
    – More automation

CCP4 Automation
  Reasons
    – Higher throughput at synchrotron beamlines
    – Crystallography is increasingly becoming a tool for researchers in other fields. Not all have the time to learn how to use the complex set of programs for solving structures.
    – Users prefer to concentrate on the biology

MrBUMP - Molecular Replacement with Bulk Model Preparation

Aim of MrBUMP
  Automated framework for Molecular Replacement
  Particular emphasis on generating a variety of search models
  Wraps Phaser, Molrep and Acorn
  Uses a variety of helper applications (e.g. Chainsaw) and bioinformatics tools (e.g. FASTA, Mafft)
  Uses on-line databases (e.g. PDB, SCOP)
  Can make use of computational cluster resources to speed up the processing
  In favourable cases, gives a “one-button” solution
  In unfavourable cases, suggests likely search models for manual investigation

Pipeline
  1. Target MTZ & sequence
  2. Target details
  3. Template search
  4. Model preparation
  5. Molecular replacement & refinement
  6. Check scores and exit, or select the next model
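The final pipeline step, "check scores and exit or select the next model", amounts to a loop with an early exit. A minimal sketch follows, in which `solve` and `is_success` are hypothetical callables standing in for the MR-plus-refinement run and the score check, and the model names and scores are made up; it does not reflect how MrBUMP itself is structured.

```python
def run_pipeline(search_models, solve, is_success):
    """Try each prepared search model in turn and stop at the first success.

    solve(model) returns a score for that model (for example the Rfree
    after MR and refinement); is_success(score) decides whether to exit.
    Returns (winning_model_or_None, list_of_(model, score)_tried).
    """
    tried = []
    for model in search_models:
        score = solve(model)
        tried.append((model, score))
        if is_success(score):
            return model, tried
    # No clear solution: hand back everything tried for manual inspection.
    return None, tried


# Made-up scores for three hypothetical search models.
toy_scores = {"model_A": 0.55, "model_B": 0.33, "model_C": 0.50}
winner, tried = run_pipeline(
    ["model_A", "model_B", "model_C"],
    solve=toy_scores.get,
    is_success=lambda rfree: rfree < 0.35,
)
```

Here the loop stops at `model_B`, so `model_C` is never run.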

Template Search
  Sequence-based search (FASTA)
  Secondary-structure-based search (SSM)
  Domain search (SCOP)
  Identification of possible multimers (PQS & PISA)
  Users can also enter their own templates by ID or from locally held files.

Model Preparation
  Search models can be prepared for MR in several ways:
    – Chainsaw: non-conserved residues are pruned (sequence provided)
    – Molrep: pruning of non-conserved side chains (internal sequence alignment)
    – Polyalanine: all side-chain atoms beyond the CB atom are pruned
    – PDBclip: models are not modified
  An ensemble of the best models is also created for Phaser.

Molecular Replacement & Refinement
  For each search model, MR is done with Molrep or Phaser or both.
  MR programs run mostly with defaults.
  MrBUMP provides: LABIN columns, MW of target, sequence identity of search model, number of copies to search for, number of clashes tolerated.
  Molrep / Phaser are allowed to set resolution limits and weights.
  After MR, models are passed to Refmac for restrained refinement. The outcome is classified as:
    – “success”: final Rfree < 0.35, or final Rfree < 0.5 and dropped by 20%
    – “marginal”: final Rfree < 0.48, or final Rfree < 0.52 and dropped by 5%
    – “failure”: otherwise
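The Rfree thresholds on this slide can be written as a small classifier. The sketch assumes that the tighter pair of thresholds (0.35, or 0.5 with a 20% drop) defines “success” and the looser pair (0.48, or 0.52 with a 5% drop) defines “marginal”, and that "dropped by" means a drop relative to the Rfree before refinement; both readings are assumptions, and the function name is hypothetical.

```python
def classify_solution(initial_rfree, final_rfree):
    """Classify a refined MR solution from its Rfree before and after refinement."""
    # Relative drop in Rfree during refinement (assumed interpretation).
    drop = (initial_rfree - final_rfree) / initial_rfree
    if final_rfree < 0.35 or (final_rfree < 0.5 and drop >= 0.20):
        return "success"
    if final_rfree < 0.48 or (final_rfree < 0.52 and drop >= 0.05):
        return "marginal"
    return "failure"


examples = {
    (0.55, 0.30): classify_solution(0.55, 0.30),
    (0.60, 0.45): classify_solution(0.60, 0.45),
    (0.55, 0.51): classify_solution(0.55, 0.51),
    (0.50, 0.49): classify_solution(0.50, 0.49),
}
```

The checks run from strictest to loosest, so a solution that satisfies both sets of thresholds is reported as a success.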

MrBUMP and cluster computing
  MrBUMP is usually run on a desktop from CCP4i or the command line.
  However, MrBUMP can take advantage of a compute cluster to farm out the Molecular Replacement jobs.
  Currently Sun Grid Engine enabled clusters are supported, but support will be added for other types of queuing system (e.g. LSF, Condor) if there is enough demand.
  Job control: all nodes terminate when one finds a solution.
  Current (known) cluster installations at Daresbury, Diamond and the University of Dundee.
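The job-control rule, "all nodes terminate when one finds a solution", can be illustrated on a single machine with a thread pool and a shared stop flag. This is a stand-in only: local threads replace cluster nodes, `run_job` is a hypothetical callable that polls the flag, and nothing here reflects how MrBUMP talks to Sun Grid Engine.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, as_completed


def farm_out(jobs, run_job, max_workers=4):
    """Run jobs in parallel and stop the rest once one reports success.

    run_job(job, stop) returns True on success. It is expected to check
    stop.is_set() (or wait on it) and return early when asked to.
    """
    stop = threading.Event()
    winner = None
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(run_job, job, stop): job for job in jobs}
        for future in as_completed(futures):
            if future.cancelled():
                continue  # job was dropped before it ever started
            if future.result() and winner is None:
                winner = futures[future]
                stop.set()  # ask running jobs to finish early
                for other in futures:
                    other.cancel()  # drop jobs still waiting in the queue
    return winner


def toy_job(job, stop):
    """Made-up job: only model_B 'solves'; the others wait to be stopped."""
    if job == "model_B":
        return True
    stop.wait(timeout=5.0)  # stands in for a long MR run that polls the flag
    return False


first_solution = farm_out(["model_A", "model_B", "model_C"], toy_job)
```

On a real queuing system the stop flag would be replaced by whatever job-deletion mechanism that system provides.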

MrBUMP on the Grid
  Currently under development
  Large parameter space searches
  Submit many jobs to U.K. computational grid resources using recently developed e-Science tools (MCS, AgentX, Rcommands, SRB)
  Goals:
    – To improve the performance/success rate of the method
    – Possibly extract useful biological information
    – Make a grid-enabled version available to users

MrBUMP Output
  Currently produces a long log file listing search results, model preparation steps, summaries from each MR and refinement job, and relevant references for the programs used.
  Not ideal: there is a lot of information to trawl through.
  A summary of results is now provided at the end of the log file.
  Future versions will provide results in a marked-up web page format for more clarity.

MrBUMP Output – CCP4i dbviewer

MrBUMP pre-release
  Beta version first released in Jan ’06 (current version is 0.3.3)
  Currently supported on Linux and Mac OS X; a Windows version will be available when MrBUMP is included in the suite.
  Will be included in the next release of CCP4 (version 6.1)
  MrBUMP paper to be published in Acta Cryst. D in April ’07
  First citations in Obiero et al., Acta Cryst. (2006). F62, ; El Omari et al., Acta Cryst. (2006). F62,

New features
  Run Acorn after refinement for phase improvement (high resolution data).
  Support for searching in enantiomorphic space groups.
  Users can now specify template models by PDB ID or add local PDB files.
  “Generate models only” option.
  XML output.
  Additional multiple alignment programs supported: T-Coffee and ProbCons.

Future versions
  Improvements to multimeric search models (using PISA)
  Supplement multiple alignment with additional sequences and/or structural information
  Model completion and/or re-building
  Target complexes
  Improved output presentation

Conclusions
  The test cases and examples demonstrated the utility of trying a range of search models, a protocol that can only be attempted adequately by automation.
  MrBUMP is not meant to compete with careful analysis of the data and model by an experienced crystallographer.
  However, it may succeed in difficult cases by finding a combination of models and protocols that would not otherwise have been tried.
  In more straightforward cases the advantage is simply one of convenience.

CCP4 Automation - BALBES
  Authors: Garib Murshudov, Alexei Vagin, Fei Long (YSBL)
  Built around Molrep (MR and model preparation), Refmac and Sfcheck
  Model preparation uses a custom database derived from the PDB
  The best model is derived from the database and used in Molrep.
  Protocols:
    – Simple molecular replacement
    – Domains iterated with refinement
    – Use of tertiary structure if available
    – Completion of MR using phased MR and refinement
  Released early 2007

XIA2: Automated Data Reduction
  xia2 is a new automated data reduction system designed to work from raw diffraction data and a little metadata, and to produce usefully reduced data in a form suitable for immediately starting phasing and structure solution.
  A pre-release version is currently available.

XIA2
  Requires image data plus an input specification script with target and experiment data:
    – Sequence
    – Number of heavy atoms
    – Wavelength
    – Location of image data
  Example input:
    BEGIN PROJECT TM1553
    BEGIN CRYSTAL
    BEGIN AA_SEQUENCE
    MHKMWPSDSNDHRVTRRNVIIFSSLLLGSLAILLALLLIRTKDQYYELRDFALGTSVRIV
    VSSQKINPRTIAEAILEDMKRITYKFSFTDERSVVKKINDHPNEWVEVDEETYSLIKAAC
    AFAELTDGAFDPTVGRLLELWGFTGNYENLRVPSREEIEEALKHTGYKNVLFDDKNMRVM
    VKNGVKIDLGGIAKGYALDRARQIALSFDENATGFVEAGGDVRIIGPKFGKYPWVIGVKD
    PRGDDVIDYIYLKSGAVATSGDYERYFVVDGVRYHHILDPSTGYPARGVWSVTIIAEDAT
    TADALSTAGFVMAGKDWRKVVLDFPNMGAHLLIVLEGGAIERSETFKLFERE
    END AA_SEQUENCE
    BEGIN HA_INFO
    ATOM SE
    NUMBER_PER_MONOMER 5
    END HA_INFO
    BEGIN WAVELENGTH INFL
    WAVELENGTH
    F'
    F'' 5.8
    END WAVELENGTH INFL
    BEGIN WAVELENGTH LREM
    WAVELENGTH
    F' -2.5
    F'' 0.5
    END WAVELENGTH LREM
    BEGIN SWEEP INFL
    WAVELENGTH INFL
    BEAM
    IMAGE 13185_2_E1_001.img
    DIRECTORY /data/jcsg/als1/8.2.1/ /collection/TM1553/13185/
    END SWEEP
    BEGIN SWEEP LREM
    WAVELENGTH LREM
    BEAM
    IMAGE 13185_2_E2_001.img
    DIRECTORY /data/jcsg/als1/8.2.1/ /collection/TM1553/13185/
    END SWEEP
    END CRYSTAL
    END PROJECT TM1553
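The input specification on this slide is built from nested BEGIN/END blocks. A minimal sketch of reading such a structure into nested dictionaries is given below; it assumes one keyword per line and that each END names the same block type as its BEGIN. The function name and the returned layout are hypothetical, and this is not xia2's own parser.

```python
def parse_blocks(text):
    """Parse nested BEGIN/END blocks into a tree of dictionaries.

    Each block becomes {"type", "name", "lines", "children"}; lines that
    are neither BEGIN nor END are stored verbatim in the enclosing block.
    """
    root = {"type": "ROOT", "name": "", "lines": [], "children": []}
    stack = [root]
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        words = line.split()
        if words[0] == "BEGIN" and len(words) > 1:
            block = {"type": words[1], "name": " ".join(words[2:]),
                     "lines": [], "children": []}
            stack[-1]["children"].append(block)
            stack.append(block)
        elif words[0] == "END" and len(words) > 1:
            if len(stack) == 1 or stack[-1]["type"] != words[1]:
                raise ValueError("unmatched END: " + line)
            stack.pop()
        else:
            stack[-1]["lines"].append(line)
    if len(stack) != 1:
        raise ValueError("missing END for block " + stack[-1]["type"])
    return root


# Shortened fragment of the slide's example.
sample = """
BEGIN PROJECT TM1553
BEGIN HA_INFO
ATOM SE
NUMBER_PER_MONOMER 5
END HA_INFO
END PROJECT TM1553
"""
tree = parse_blocks(sample)
project = tree["children"][0]
ha_info = project["children"][0]
```

Keeping the content lines verbatim means that keywords whose values are absent are preserved as written rather than guessed.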

Through your favourite phasing pipeline…

CCP4 Automation - HAPPy: Heavy Atom Phasing in Python
  What it is:
    – Automated experimental phasing pipeline
    – Replaces and expands on the capabilities of the CHART package
  What it will do:
    – Take integrated and merged experimental data amplitudes (post-TRUNCATE), de-twinned and consistently indexed.
    – Determine the heavy atom structure and phase probabilities.
    – Optimize the density map to give an interpretable map.
    – Build the structure.
  The first release will handle SAD data only; MAD, MIR and MIRAS modes will follow later.

Acknowledgements
  Core Group (Daresbury):
    – Martyn Winn, Charles Ballard, Peter Briggs, Francois Remacle, Norman Stein, Wendy Yang, Maeri Howard
  CCP4mg (York):
    – Liz Potterton, Stuart McNicholas
  Coot (Oxford & York):
    – Paul Emsley, Kevin Cowtan
  Program Developers (York, Cambridge, Diamond & Leiden University):
    – Garib Murshudov, Alexei Vagin, Fei Long, Randy Read, Airlie McCoy, Harry Powell, Gwyndaf Evans, Phil Evans, Eleanor Dodson, Nick Furnham, Steve Ness
  BBSRC for their funding
  And many others…