Crystallomics Core Overview

1 Crystallomics Core Overview
Scott Lesley, JCSG Annual Meeting, 4/6/06. SERIES OF FIRSTS

2 The JCSG pipeline and automation
Pipeline stages pictured: cloning robotics; parallel fermentation; baculovirus expression; parallel purification; crystal plate setup; fine screen setup; plate imager; beamline robotics

3 Tiered Strategy Driving Factors
Success rate for individual targets is low
Time from target to structure is long
Cost and resource limitations
Diversity of proteins versus need for a common approach
Strategy:
Provide sufficient throughput to drive needed output
Develop necessary technology
Focus on generalizable approaches
Use experimentally-determined behavior to organize targets
Prioritize approach based on effort and resources
Notes: It used to be that you had a set of targets and you divided them up, and beat people with a stick until they were done. The concept here is to utilize like behaviors to categorize and
Early on we recognized the need to treat proteins as individuals while retaining parallel processing.

4 Evolution of our Tiered Strategy
Tier 1 - determines target behavior
JCSG1: 1000's of targets evaluated at scale through pipeline; lack of correlation of small scale with large-scale success; the major predictor of success; results for pipeline optimization
JCSG2: more selective targets; small-scale expression; behavior signatures recognizable and predictable

5 Evolution of our Tiered Strategy
Tier 2 - output (structure) focused
JCSG1: Soluble expressers beaten into submission; SeMet dropout rate high; Standardized secondary purification
JCSG2: Bioanalytical-based target decisions; QC "holding-pen"; New purification approaches

6 Evolution of our Tiered Strategy
Tier 3 - success rate focused
JCSG1: Salvage efforts applied only after significant time on target; Emphasis on fine screening and purification; Exploration of alternate salvage methods
JCSG2: Emphasis on target optimization through molecular biology; Earlier and broader application of salvage approaches; Combination of small-scale screening and bioanalytical to identify signatures versus success-based screening of all possibilities

7 [Timeline figure, 7/05 to 6/06: target sets entering the pipeline]
NEW PFAM (BIG): ?? new targets
PFAM1 replacement: 629 new targets
CML Target List: 628 new targets
PFAM1: 705 new targets
FG 1: 264 new targets
FG2: 261 new targets
PFAM1 transmembrane deletion: 19 salvage targets
"CitySoluble": 96 salvage targets
2208 clones
PFAM1: ?? salvage targets
"SDC/CC Hotlist": 60 salvage targets

8 It takes lots of clones and proteins to define what works at the level of protein expression and crystallization prior to shipping a crystal.

9 Lots of crystals are shipped and screened per target.
How many are necessary? In half the targets, the "golden nugget" leading to a structure is found in the first 10 crystals screened. Suggests that crystals on a construct should be enough to know if it is going to diffract. Don't fine screen targets to death. Use molecular biology and other salvage efforts instead.

10 The economics of HT structural biology in our Production Center
Simple math: Current Rates (125 structures/year)
JCSG Budget: $10.1 M = $80,800 / structure average costs
CC costs for target to delivered diffraction quality crystal = $23,667 / structure
CC breakdown:
                        % Budget   Units per year (CID, PID, Setup)   Cost per Unit
Cloning                 15%        8186                               $54
Expr/Pur                25%        25663                              $28* ($336/SeMet)
Crystallization                    5594                               $132
Target Mgmt./Analysis              4496                               $99
Process Impr./Tech.     20%
These numbers are for discussion purposes only. Any use of this material without the express written consent of the author is expressly forbidden. Results may vary. May cause unexpected side effects including vomiting and diarrhea. May cause drowsiness. Do not use when driving or operating heavy equipment. All other caveats apply ;-)
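The "simple math" on this slide can be reproduced with a few lines of Python. This is only a sketch: the variable names are made up for illustration, and the implied budget shares rest on the assumption that a category's share is (units per year x cost per unit) divided by the annual CC total (CC cost per structure x structures per year). The slide does not state that formula.

```python
# Figures quoted on the slide.
JCSG_BUDGET = 10_100_000        # total JCSG budget, USD
STRUCTURES_PER_YEAR = 125       # current rate
CC_COST_PER_STRUCTURE = 23_667  # CC cost per structure, USD

# (units per year, rounded cost per unit in USD) from the CC breakdown.
UNITS = {
    "Cloning": (8186, 54),
    "Expr/Pur": (25663, 28),
    "Crystallization": (5594, 132),
    "Target Mgmt./Analysis": (4496, 99),
}

# Average cost per structure across the whole JCSG budget.
cost_per_structure = JCSG_BUDGET / STRUCTURES_PER_YEAR

# Assumption: annual CC spend is the per-structure CC cost times the output.
cc_total = CC_COST_PER_STRUCTURE * STRUCTURES_PER_YEAR

# Assumption: share of CC budget = units * unit cost / annual CC spend.
implied_share = {
    name: units * unit_cost / cc_total
    for name, (units, unit_cost) in UNITS.items()
}

print(f"Average cost per structure: ${cost_per_structure:,.0f}")
for name, share in implied_share.items():
    print(f"{name}: about {share:.0%} of CC budget")
```

Under these assumptions the Cloning and Expr/Pur shares land close to the 15% and 25% printed on the slide, which suggests the unit costs were derived this way, but the unit costs are rounded so the match is only approximate.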

11 The economics of HT structural biology in our Production Center
The cost to succeed: Using current per unit cost estimates, the average successful structure costs CC $4280
The cost of salvaging a difficult target:
Project                Clones   PIDs   Plates Setup   Xtals Shipped   CC Costs
TM0771 Thermotoga      11       108    118            210             $19,317
PC02663D Pfam1         1        13     4              60              $958
The cost to fail: Subtracting out fixed costs of Target Mgmt and Process Improvement and Technology Development, 62% of budget goes to failed targets ($700/failed target).
Fail faster and cheaper. Succeed a higher percentage of the time.
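The salvage costs in the table appear to follow from the per-unit costs on the previous slide. The sketch below assumes that clones are charged at the Cloning unit cost ($54), PIDs at the Expr/Pur unit cost ($28), and plate setups at the Crystallization unit cost ($132). That mapping, and all names in the code, are assumptions for illustration, not something the slides spell out.

```python
# Rounded per-unit costs from the CC breakdown on the previous slide (USD).
UNIT_COST = {"clones": 54, "pids": 28, "plates": 132}

# Counts and reported CC costs from the salvage table.
PROJECTS = {
    "TM0771": {"clones": 11, "pids": 108, "plates": 118, "reported": 19_317},
    "PC02663D": {"clones": 1, "pids": 13, "plates": 4, "reported": 958},
}


def estimate_cc_cost(counts: dict) -> int:
    """Sum units times unit cost over cloning, expression and plate setup."""
    return sum(UNIT_COST[kind] * counts[kind] for kind in UNIT_COST)


for name, counts in PROJECTS.items():
    estimate = estimate_cc_cost(counts)
    gap = abs(estimate - counts["reported"]) / counts["reported"]
    print(f"{name}: estimated ${estimate:,} vs reported ${counts['reported']:,} "
          f"({gap:.1%} apart)")
```

The estimates come out within about 2% of the reported figures. The remaining gap is consistent with the unit costs being rounded, though other charges (for example SeMet preparations) cannot be ruled out.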

12 Affecting the denominator – cost and process efficiencies
Materials:
Resin blocks
TEV and in-house materials generation
Genomic cloning versus gene synthesis
Process efficiencies:
Focused secondary purification
Smart and flexible protein processing
PIPE cloning
Microexpression - reduction to overall purification throughput (failed targets cost ~$700, find failures faster and cheaper)
Process optimization through data mining and careful selection of test target sets
[Figure panels: Resin Blocks; PIPE Cloning]

13 Genome Expansion Project
JCSG1 - Pipeline development targets
In the beginning… T. maritima and C. elegans, with additional clones provided through the GNF mouse collection. Addition of the ortholog targets was made possible by accessing commercially available gDNAs.
JCSG2 - PFAM targets
Genomic database sequence ≠ genomic clone
Gene synthesis ~$900/target materials, genomic clone ~$20/target materials
In anticipation of a need for expanded genome coverage to effectively access PFAM targets, CC actively solicited gDNAs from sequencing centers, collaborators and microbiologist friends. Because of this effort we have access to a much broader set of targets, so we have more opportunities per PFAM to solve an assigned family.
Impact (gDNAs available / available ORF target pool): July ,813; March (more pending) 350,451
Since February, 75% of structures have come from Genome Expansion targets!

14 Learning from the past (finally): optimizing targets for success
JCSG1 targets:
T. maritima genome - all potential targets
Other targets of <30% identity to PDB - defined by structural novelty
Little consideration of practical issues, as pipeline needs not fully defined
JCSG2 targets:
Biomedical theme
Central Machinery of Life (CML) - defined by evolutionary structural conservation
NIH target selection (PFAM) - defined by structure/function conservation and novelty
Need to apply lessons learned and pipeline needs to target selection process
CML targets provide a controlled set of proteins to validate practical considerations for target selection, which can be applied when selecting representative PFAM proteins and other future targets

15 These targets represent a reasonable spectrum of protein properties
The "City Plates"
The CML targets were selected for structural conservation from bacteria to mammals without much practical consideration of pipeline preferences.
Can we use our pipeline experience to predict winners from losers?
Can we use this algorithm to be more successful when selecting individual representatives from PFAM targets?

16 Application to our first CML target set "City" Designations
The "City Plates" scoring factors:
Molecular weight - size distribution of past successes used in weighted score
pI - preference for acidic pIs
Relative methionine content - necessary for SeMet phasing
Cysteine content - preference for low numbers to avoid disulfide problems
Trp/Tyr content - preference for inclusion for detection by UV in purification
Annotation - bias against hypothetical, desire for predicted ligands
Predicted disorder - weighted score based on SEG predictions
Transmembrane helices - already accounted for in target selection
gDNA availability - have to be able to clone
"City" designations:
Vegas and Reno - house odds on these targets
Cleveland and Philadelphia - blue collar targets
Dubuque and Kabul - depressing and dangerous

17

18

19 All "cities" can be solved (e.g. T. maritima targets)
% of total solved, by city:
Score    City           Solved TM   Solved CML   Solved PFAM
58-67    Vegas          23.8%       50.0%        69.2%
55-57    Reno           22.0%       38.9%        7.7%
52-54    Cleveland      12.8%       0.0%         23.1%
45-51    Philadelphia   22.6%
40-44    Dubuque        13.4%       11.1%
22-39    Kabul          5.5%
% of targets solved, by city:
Score    City           Solved TM   Solved CML   Solved PFAM
58-67    Vegas          16.3%       8.7%         2.3%
55-57    Reno           16.4%       7.7%         0.7%
52-54    Cleveland      13.0%       0.0%         4.2%
45-51    Philadelphia   8.3%
40-44    Dubuque        7.0%        2.7%
22-39    Kabul          1.9%
TM targets have been in progress for 6 years
CML targets have been in progress for 6 months
PFAMs are assigned, but selection of members within should be biased towards better "cities"
What is the nature of these differences and how do we address them? - MK presentation
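The score ranges on this slide define a simple lookup from a target's weighted score to its "city". A minimal sketch of that lookup follows. The function and constant names are illustrative, and the scoring weights themselves are not given in the slides, so only the binning step is shown.

```python
# Score ranges (inclusive) and city names as listed on the slide.
CITY_BINS = [
    (58, 67, "Vegas"),
    (55, 57, "Reno"),
    (52, 54, "Cleveland"),
    (45, 51, "Philadelphia"),
    (40, 44, "Dubuque"),
    (22, 39, "Kabul"),
]


def city_for_score(score: int) -> str:
    """Return the city designation for an integer target score."""
    for low, high, city in CITY_BINS:
        if low <= score <= high:
            return city
    # Scores outside 22-67 are not covered by the slide's table.
    raise ValueError(f"score {score} is outside the listed ranges")


for example in (60, 53, 41):
    print(example, city_for_score(example))
```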

20 Affecting the numerator – salvage and success rates
DXMS studies
SER mutations
Redmet
Partial proteolysis
Bioanalytic process feedback
[Figure panels: Best targets; Partial proteolysis of CML targets; Reductive methylation improvement to crystallization]

21 Proteins which crystallize well lead to structures (duh)
Coarse screen hit rates are an indication that the target is ok
Historically, targets need >1% hit rate to lead to structure
If target is not crystallizing, target needs optimizing: deletions; mutations; modifications (reductive methylation)
Analysis of Tier 1 coarse screen crystallization trials shows that structures come from well-behaved proteins (>1% hit rate).
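The >1% rule of thumb is easy to apply to a single coarse screen. The sketch below assumes the hit rate is simply crystal hits divided by wells screened. The function names are illustrative, and the 2/384 example is the full-length coarse hit rate quoted on slide 25.

```python
HIT_RATE_THRESHOLD = 0.01  # ">1% hit rate" from the slide


def coarse_hit_rate(hits: int, wells: int) -> float:
    """Fraction of coarse-screen wells that produced a crystal hit."""
    if wells <= 0:
        raise ValueError("wells must be positive")
    return hits / wells


def needs_optimization(hits: int, wells: int) -> bool:
    """True when the hit rate does not exceed the 1% threshold."""
    return coarse_hit_rate(hits, wells) <= HIT_RATE_THRESHOLD


# Example: 2 hits in 384 wells, the full-length result from slide 25.
rate = coarse_hit_rate(2, 384)
print(f"hit rate {rate:.2%}, needs optimization: {needs_optimization(2, 384)}")
```

By this rule a target with 2 hits in 384 wells falls below the threshold and would be a candidate for deletions, mutations or reductive methylation rather than further fine screening.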

22 Tier 3 Salvage Pathway: Deuterium exchange mapping (DXMS)
Virgil Woods, UCSD

23 Coarse Harvestable Rate
DXMS data from 148 unique targets in crystal trials
Exchange maps, by disorder score:
Disorder Score: 1, 2, 3, 4, 5
% Targets: 18%, 36%, 28%, 14%, 3%
Solved Targets, Full-Length: 44%, 29%, 12%, 5%, 0%
Solved Targets, Based on DXMS: n.d., 8%, 10%, 36%
Coarse screen results, by DXMS score:
DXMS Score   Coarse Setups   Wells    Coarse Hit Rate   Coarse Harvestable Rate
1            149             14304    2.6%              0.7%
2            339             32544    1.8%              0.6%
3            332             31872    0.4%              0.0%
4            215             20640    0.1%
5            8               768
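The setup and well counts in the coarse screen table are consistent with 96 wells per coarse setup, which the short check below confirms. The 96-well figure is an inference from the numbers, not something stated on the slide, and the hit counts are approximate because the published rates are rounded.

```python
# (coarse setups, wells) per DXMS score, from the table.
ROWS = {
    1: (149, 14304),
    2: (339, 32544),
    3: (332, 31872),
    4: (215, 20640),
    5: (8, 768),
}

# Coarse hit rates where the table gives them.
HIT_RATE = {1: 0.026, 2: 0.018, 3: 0.004, 4: 0.001}

WELLS_PER_SETUP = 96  # assumption inferred from wells / setups

for score, (setups, wells) in ROWS.items():
    consistent = setups * WELLS_PER_SETUP == wells
    line = f"DXMS score {score}: {setups} setups, {wells} wells, 96-well consistent: {consistent}"
    if score in HIT_RATE:
        # Approximate only: the slide's rates are rounded to one decimal place.
        line += f", roughly {wells * HIT_RATE[score]:.0f} hits"
    print(line)

total_setups = sum(setups for setups, _ in ROWS.values())
total_wells = sum(wells for _, wells in ROWS.values())
print(f"Total: {total_setups} setups, {total_wells} wells")
```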

24 Salvaging a problematic target
NP_ (Reno)
Soluble expression but highly aggregated
ANSEC score = 1 (poor)
NMR score = B (good)
No disorder predicted
DXMS analysis shows localized N-terminal disorder
N- and C-terminal truncation series performed
Multiple soluble deletions but only few showing monodispersity

25 ANSEC score = 4 (very good) NMR score = D (very poor)
(Philadelphia)
Poor coarse hit rate (2/384) for FL
Some predicted C-terminal disorder (GlobPlot)
DXMS indicates localized disorder at C-term
Fine truncation series defines boundaries
[Figure label: predicted disorder by GlobPlot]

26 Regions of disorder appear to be good targets for surface mutagenesis
DXMS results Keith Dunker, Molecular Kinetics

27 Deletion DXMS Results UCLA Surface Entropy Reduction Prediction Server
E41A E42A E88A D90A K100A E101A R72A K74A

28 Membrane Protein Structure Collaborations
Expression and detergent screening of membrane protein expression (L. Columbus/K. Wüthrich TSRI/JCSG; S. Eshaghi GNF/Karolinska Inst.)
XSAS and NMR screening of protein/detergent complexes (L. Columbus/K. Wüthrich TSRI/JCSG; S. Doniach Stanford)
Future membrane protein expression screening and optimization with robotics platform (GNF)
Unnatural amino acid incorporation (P. Schultz/TSRI)
TM0561 CorA Mg++ transporter, integral membrane protein:
Initial expression/detergent screen run at GNF 2002
Initial crystal hits 2003
INS interlude
Transfer to Eshaghi/Nordlund 2005
Detergent optimization
2.8Å crystal structure
Said Eshaghi

29 Membrane Protein Structure Collaborations
TM1514 Linda Columbus

30 Scientific Advisory Board
GNF & TSRI Crystallomics Core: Scott Lesley, Mark Knuth, Dennis Carlton, Marc Deller, Thomas Clayton, Michael DiDonato, Glen Spraggon, Andreas Kreusch, Daniel McMullan, Heath Klock, Polat Abdubek, Eileen Ambing, Joanna C. Hale, Eric Hampton, Eric Koesema, Edward Nigoghossian, Aprilfawn White, Sanjay Agarwalla, Christina Trout, Ylva Elias, Hope Johnson, Jessica Paulsen, Linda Okach, Bernhard Geierstanger, Julie Feuerhelm, Jessica Canseco
Stanford/SSRL Structure Determination Core: Keith Hodgson, Ashley Deacon, Mitchell Miller, Herbert Axelrod, Hsiu-Ju (Jessica) Chiu, Kevin Jin, Christopher Rife, Qingping Xu, Silvya Oommachen, Henry van den Bedem, Scott Talafuse, Ronald Reyes, Abhinav Kumar, Jonathan Caruthers, Chloe Zabieta, Amanda Prado
UCSD & Burnham Bioinformatics Core: John Wooley, Adam Godzik, Slawomir Grzechnik, Lukasz Jaroszewski, Sri Krishna Subramanian, Andrew Morse, Tamara Astakhova, Lian Duan, Piotr Kozbial, Naomi Cotton, Dana Weekes, Lukasz Slabinski, Josie Alaoen
Scientific Advisory Board: Sir Tom Blundell, Univ. Cambridge; Homme Hellinga, Duke University Medical Center; James Naismith, The Scottish Structural Proteomics facility, Univ. St. Andrews; James Paulson, Consortium for Functional Glycomics, The Scripps Research Institute; Robert Stroud, Center for Structure of Membrane Proteins, Membrane Protein Expression Center, UC San Francisco; Todd Yeates, UCLA-DOE, Inst. for Genomics and Proteomics; Soichi Wakatsuki, Photon Factory, KEK, Japan; James Wells
TSRI NMR Core: Kurt Wüthrich, Reto Horst, Maggie Johnson, Marcius Almeida, Michael Gerault, Wojtek Augustyniak, Pedro Serrano, Bill Pedrini
TSRI Administrative Core: Ian Wilson, Marc Elsliger, Jason Kay, Gye Won Han, David Marciano
The JCSG is supported by the NIH Protein Structure Initiative grant U54 GM from the National Institute of General Medical Sciences (

