Karl Clauser, Proteomics and Biomarker Discovery: Sample Experiment Summary

Presentation transcript:

Karl Clauser, Proteomics and Biomarker Discovery
Sample Experiment Summary
Spider: 1 µg venom; reduce/alkylate; de-salt; LC-MS/MS, 300 ng; C18 300 Å Microsorb 3 µm; Orbitrap Elite CID/HCD/ETD; Spectrum Mill
Cys modifications: Cys iodoacetamide, Cys MeAziridine
Digest: Lys-C
Database: venom gland mRNA library; Ion Torrent sequencing, bp reads; transcriptome assembly (Trinity); translation (extractORFS.pl)
Database: venom gland mRNA library; 454 sequencing, bp reads; transcriptome assembly (MIRA); translation (extractORFS.pl)
Centipede: 1 µg venom; reduce/alkylate, Cys-ethanolyl-gp; digest 500 ng & 500 ng, Trypsin, Glu-C; de-salt; LC-MS/MS, 150 ng; C18 300 Å Microsorb 3 µm, C18 120 Å Reprosil 3 µm; Orbitrap Elite CID/HCD/ETD; Spectrum Mill
Scorpion: 1 µg venom; reduce/alkylate; de-salt; LC-MS/MS, 300, 600 ng; C18 300 Å Microsorb 3 µm; Orbitrap Elite CID/HCD/ETD; Spectrum Mill
Cys modifications: Cys-ethanolyl-sol, Cys-iodoacetamide, Cys-MeAziridine
Column: C18 200 Å Magic 3 µm
Digest: Lys-C
Database: venom gland mRNA library; 454 sequencing, bp reads; transcriptome assembly (Trinity); translation (extractORFS.pl)
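The translation step in the workflow above turns assembled transcripts into a protein database. The slide only names the script (extractORFS.pl); its behavior is not described, so the following is a minimal Python sketch of one common approach, six-frame translation split at stop codons, and not a reimplementation of that script. The function names and the `min_len` parameter are assumptions made for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: no
# alternative codon table is needed for these organisms' nuclear transcripts).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(seq):
    """Translate a nucleotide string in frame 0; unknown codons become 'X'."""
    seq = seq.upper()
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def six_frame_orfs(seq, min_len=10):
    """Return stop-to-stop open reading frames of at least min_len residues
    from all six frames (hypothetical helper, not extractORFS.pl)."""
    orfs = []
    for strand in (seq, revcomp(seq)):
        for frame in range(3):
            for piece in translate(strand[frame:]).split("*"):
                if len(piece) >= min_len:
                    orfs.append(piece)
    return orfs


# Toy contig, not taken from the presentation.
print(six_frame_orfs("ATGTGCGCCAAATAA", min_len=4))
```

Stop-to-stop segments are kept rather than Met-to-stop ones because assembled venom transcripts may be truncated upstream of the start codon, a concern the Next Steps slide also raises.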

Spider Protein Group.subgroup List
Matches to Arachnoserver but not RNA-Seq: suggests need to improve paralog inclusion in RNA-Seq transcript assembly.
Samples are from different animals: Nov 2012, Feb 2012
MeAzCys, IAAcys; Lys-C, undigested

Group 2: omega-hexatoxin Hi2, 6 Cys
Sub Group
MTPSCTLGICAPSVGGLVGGLLG reads: 4382
MTPSCTLGICAPSVGGIVGGLLG reads: 0251
MTPSCTLGICAPNVGGLVGGLLG reads: 0201
MTPSCTMGLCVPNVGGLVGGILG reads: 0000
MTPSCTMGICVPNVGGLVGGILG reads: 0000
MTPSCTMGICVPNVGGLVGGLLG reads: 0000
CAPNVGG reads: 0442
CVPNVGG reads: 0000
Evidence exists in spider.q17 reads for paralogs missing from the assembly.
ERGVLDCVVNTLGC reads: 635
ERGVVDCVLNTLGC reads: 106
ERGLVDCVLNTLGC reads: 553
ERGLADCVLNTLGC reads: 21
ER_QLDCVLNTLGC reads: 26
ERGVVGCVLNTLGC reads: 8
VLDCVVNTLGCSSDKDCCGMTPSCTLGICAPSVGGL reads: 99
VVDCVLNTLGCSSDKDCCGMTPSCTLGICAPSVGGL reads: 33
LVDCVLNTLGCSSDKDCCGMTPSCTLGICAPSVGGL reads: 72
LADCVLNTLGCSSDKDCCGMTPSCTLGICAPSVGGL reads: 4
VLDCVVNTLGCSSDKDCCGMTPSCTLGICAPNVGGL reads: 2
QLDCVLNTLGCSSDKDCCGMTPSCTLGICAPSVGGL reads: 1
QLDCVLNTLGCSSDKDCCGMTPSCTLGICAPNVGGL reads: 2
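The read counts on this slide come from looking for each variant peptide in the raw reads. The presentation does not say how that search was done, so the snippet below is only a sketch of the idea: count how many translated reads contain each variant as an exact substring. The function name and the toy reads are hypothetical; only the variant strings CAPNVGG and CVPNVGG are taken from the slide.

```python
def count_supporting_reads(peptides, translated_reads):
    """Return {peptide: number of translated reads that contain it}."""
    return {pep: sum(pep in read for read in translated_reads)
            for pep in peptides}


# Hypothetical translated reads (toy data, not from the presentation).
toy_reads = [
    "DCCGMTPSCTLGICAPSVGGLVGG",
    "SCTLGICAPNVGGLVGGLLG",
    "TPSCTLGICAPSVGGL",
]
variants = ["CAPSVGG", "CAPNVGG", "CVPNVGG"]

support = count_supporting_reads(variants, toy_reads)
for pep, n in support.items():
    print(f"{pep} reads: {n}")
```

A variant with zero supporting reads, like CVPNVGG here, is one whose only evidence would be the MS/MS spectrum itself.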

Only 1 spectrum supports LCVPN
(G)L C V P n\V\G\G\L\V\G\G\I\L(G)
MTPSCTLGICAPSVGGLVGGLLG reads: 4382
MTPSCTLGICAPSVGGIVGGLLG reads: 0251
MTPSCTLGICAPNVGGLVGGLLG reads: 0201
MTPSCTMGLCVPNVGGLVGGILG reads: 0000
MTPSCTMGICVPNVGGLVGGILG reads: 0000
MTPSCTMGICVPNVGGLVGGLLG reads: 0000
CAPNVGG reads: 0442
CVPNVGG reads: 0000

Group 23: Fusion Toxin or Transcript Misassembly
Group 23 looks like a fusion toxin, but could it be a transcript misassembly?
12 Cys, 2 pairs of adjacent CC’s
Maybe 11 Cys in mature form, active dimer?
Perhaps 2 concatenated 6 Cys toxins
Present in 2 libraries
>as:pi-theraphotoxin-Pc1a|sp:P Toxin from venom of the spider Psalmopoeus cambridgei that inhibits ASIC1a channels
Score = 64.7 bits (156), Expect = 1e-13
Identities = 26/39 (66%), Positives = 28/39 (71%)
Query: 52 QECIAKWKSCAGRKLDCCEGLECWKRRWGHEVCVPITQK 90
          ++CI KWK C  R  DCCEGLECWKRR   EVCVP T K
Sbjct: 1  EDCIPKWKGCVNRHGDCCEGLECWKRRRSFEVCVPKTPK 39
Top BLAST hit does not span the fusion junction (pro/mature).
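The identity figure in the BLAST report above can be checked directly from the two aligned strings, since this alignment has no gaps. The sketch below counts matching positions for the Query and Sbjct segments copied from the slide. The function name is an assumption, and the Positives value is not reproduced because it depends on the scoring matrix, which the slide does not state.

```python
def count_identities(query, subject):
    """Count identical positions in an ungapped, equal-length alignment."""
    if len(query) != len(subject):
        raise ValueError("aligned segments must have equal length")
    return sum(q == s for q, s in zip(query, subject))


# Aligned segments as shown on the slide (Query 52-90, Sbjct 1-39).
query = "QECIAKWKSCAGRKLDCCEGLECWKRRWGHEVCVPITQK"
sbjct = "EDCIPKWKGCVNRHGDCCEGLECWKRRRSFEVCVPKTPK"

identities = count_identities(query, sbjct)
aligned_len = len(query)
# BLAST reports the percentage truncated to an integer.
percent = 100 * identities // aligned_len
print(f"Identities = {identities}/{aligned_len} ({percent}%)")
```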

Alignment of CCC 8 Cys containing groups 7,14,16,30
WCAKNEDCCCPMKCIGAWYN reads: 312
WCGKEGDCCCPWKCIGQWYN reads: 25
RCNANSDCCCPLKCVIRLVG reads: 3
YCEKDKDCCCPMRCVKSYWK reads: 10
Group  Proposed name
16.1  delta-hexatoxin-Hi3
7.1   delta-hexatoxin-Hi1
14.1  delta-hexatoxin-Hi2
30.1  delta-hexatoxin-Hi4
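The slides sort toxins by cysteine count and by motifs such as CCC or CC, CXC, CXC. As a rough illustration of that grouping, the sketch below counts Cys residues and collapses every run of non-Cys residues to a single X, so the cysteine framework becomes visible. The helper names are hypothetical and this is not the grouping procedure used in the presentation; the four sequences are the partial ones listed on the slide, so they do not contain all 8 Cys.

```python
import re


def cys_count(seq):
    """Number of cysteine residues in a sequence."""
    return seq.count("C")


def cys_framework(seq):
    """Collapse each run of non-Cys residues to 'X', keeping every Cys."""
    return re.sub(r"[^C]+", "X", seq)


# Partial sequences from the CCC 8 Cys alignment slide.
fragments = [
    "WCAKNEDCCCPMKCIGAWYN",
    "WCGKEGDCCCPWKCIGQWYN",
    "RCNANSDCCCPLKCVIRLVG",
    "YCEKDKDCCCPMRCVKSYWK",
]

for frag in fragments:
    print(frag, cys_count(frag), cys_framework(frag), "CCC" in frag)
```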

Coverage of CCC 8 Cys containing groups 7,14,16,30

Alignment of noCCC 8 Cys containing groups ,5,8,12,13,15,20,21
CC, CXC, CXC: groups 5,13,15,20,21

Group 1
Top BLAST hit not in Arachnoserver
gi| |gb|ADF |putative mature sequence toxin-like ACSKQ [Pelinobius muticus]
EQIAAEENQLVEDLVQYAGTRLTQKRATRCSKKLGEKCNYHCECCGATVACSTVYVGGKETNFCSDKTSNNGALNTVGQGLNVVSNGLSAFQCWG
+ A+E ++L+E L + + Q+ A CSK++GEKC + C+CCGATV C T+YVGG C KTSNN LNT+G G+N V N ++ CWG
RKTASETSKLLEKL-GVSREAIPQEMARACSKQIGEKCEHDCQCCGATVVCGTIYVGGNAVEQCMSKTSNNAVLNTMGHGMNAVQNAFTSVMCWG
8 Cys
CSDKTSNNGALNTVGQGLNVVSNGLSAFQC reads: 1023
CSKKLGEKCNYHCECCGATVACS reads: 371
CSKKLGEKCDYHCECCGATVACD reads: 49
CSTVYVGGKETNFC reads: 5251
TVGQGLNVVSNGLSAFQC reads: 4739
CSDKTSNNGALNTVGQGLN reads: 1887
GGKETNFCSDKTSNN reads: 3771
GGRETNFCSDKTSNN reads: 10
CSKKLGEKCNYHC reads: 769
CSKKLGEKCDYHC reads: 111
Evidence exists in spider.q17 reads for paralogs missing from the assembly.
N to D supported in MS/MS, matched as N to n deamidation.
NVVVNGFSAFQC reads: 25

Alignment of CECCG 8 Cys containing groups 1,11,29
Groups 1,11,29 are highly related.
Lower level variant reads not assembled with group 1 may be the missing N-term for groups 11,29.

Alignment of CECCG 8 Cys containing groups 1,11,12
Groups 1 and 11 are highly related.
Lower level variant reads not assembled with group 1 may be the missing N-term for group 11.

Alignment of CC, CXC, CXC 8 Cys containing groups 5,13,15,20,21,

Coverage of CC, CXC, CXC 8 Cys containing groups 5,20

Coverage of CC, CXC, CXC 8 Cys containing groups 13,15,21,26

Coverage of CC, CYIC 8 Cys containing group 8,

Alignment of 10 Cys containing protein groups 3,4,10,19
Groups 3,4,10 are too divergent to be paralogs.
Examine group 19 to determine spectral coverage of variant AAs.
Group  Proposed name
4.1   mu1-hexatoxin-Hi1a
19.1  mu1-hexatoxin-Hi1b
3.1   mu1-hexatoxin-Hi2a
3.2   mu1-hexatoxin-Hi2b
10.1  mu1-hexatoxin-Hi3a

Coverage of 10 Cys containing groups 3,4,10,19

PSM Overlap for Group

Coverage of 6 Cys containing groups 6, 9

Coverage of 6 Cys groups 17, 22, 25, 27,

Group Cys - only intact MS/MS z
(R)M F C K P L D\Q|Q\C\N K|D L H\C C|K P L/K C/R|R/S/N/N/G/R/K Y C/K P(-)
(R)M F\C\K P L D\Q|Q\C N K\D L H\C\C|K P/L/K C/R/R/S/N/N/G R/K Y C/K P(-)
(R)M F\C\K P L D\Q|Q\C N K\D L H\C C K P L K C|R/R/S/N/N/G/R/K/Y C/K P(-)
(R)M F C\K P L D\Q\Q\C N K\D/L H\C\C|K P L K\C/R/R S/N/N/G/R/K/Y C K P(-)
(R)M/F C/K P L\D\Q|Q\C N K D L H C C K P L K\C\R\R/S/N/N/G/R/K Y C K P(-)
z9 z8 z7 z6 z5
(R)M/F/C\K/P L D\Q Q C N K D|L H/C C K P L K C R R S N N G R K\Y C K\P(-)
z6 z5
(R)M/F/C|K/P L D\Q Q/C N/K/D|L H C C K P L K C R R S N N G R K\Y C K\P(-)
(R)M F C|K/P/L/D/Q/Q/C/N/K/D/L H/C C K P L K C R R S N N G R K\Y C K P(-)
z5
ETD, CID, HCD
Cys Iodoacetamide; 56 spectra, 2 peptides

Group Cys - only intact MS/MS z
Cys Iodoacetamide; 56 spectra, 2 peptides
MFCKPLDQQCN reads: 6
CRRSNNGRKYCKP reads: 12
CNKDLHCCKPLKC reads: 1
CCKPLKCRR reads: 11
CRKPLKKRL reads: 66

Group Cys - only intact MS/MS z8-9, ETD
(R)M F C K P L D\Q|Q\C\N K|D L H\C C|K P L/K C/R|R/S/N/N/G/R/K Y C/K P(-)
(R)M F\C\K P L D\Q|Q\C N K\D L H\C\C|K P/L/K C/R/R/S/N/N/G R/K Y C/K P(-)

Group Cys - only intact MS/MS z6-7, ETD
(R)M F\C\K P L D\Q|Q\C N K\D L H\C C K P L K C|R/R/S/N/N/G/R/K/Y C/K P(-)
(R)M F C\K P L D\Q\Q\C N K\D/L H\C\C|K P L K\C/R/R S/N/N/G/R/K/Y C K P(-)

Group Cys - only intact MS/MS z5
(R)M/F C/K P L\D\Q|Q\C N K D L H C C K P L K\C\R\R/S/N/N/G/R/K Y C K P(-)
(R)M F C|K/P/L/D/Q/Q/C/N/K/D/L H/C C K P L K C R R S N N G R K\Y C K P(-)
ETD, HCD

BLAST2GO – Functional Annotation
Export FASTA of valid hits from SM
Run BLAST step
Run GO mapping step
Run annotation step
Run InterProScan, SIGNALP
Run GO-Slim
Export results to SM categories file

Venom Toxin Nomenclature
King GF, Gentz MC, Escoubas P, Nicholson GM. A rational nomenclature for naming peptide toxins from spiders and other venomous animals. Toxicon 52 (2008) 264–276.

Next Steps
Improve paralog inclusion in RNA-Seq transcript assembly.
Call full length gene, predict signal & propeptide.
10 Cys toxins lack propeptide?
CCC 8 Cys toxins: propeptide so long that the assembly doesn’t extend upstream enough to cover signal peptide?
Group 23 looks like a fusion toxin, misassembled? 12 Cys, 2 pairs of adjacent CC’s; perhaps 2 concatenated 6 Cys toxins; present in 2 libraries.
Run SM homology searches.
Improve SM monoisotopic m/z assignment for z>4.
Obtain ETD MS/MS of intact toxins after MeAziridine Cys mod.
Assemble spectra de novo.
Name novel toxins.

Adding Charge to Cys for better ETD
Fig. 1. Overview of the de novo sequencing strategy. (I) UV trace of HPLC separation of crude venom extract from C. textile. (II) MALDI TOF MS of fraction i after no treatment, reduction, and alkylation. (III) On-line LC ESI-MS/MS using CAD and ETD on reduced and alkylated aliquots of fraction i. The final step shows the conversion of Cys residues to dimethylated Lys analogs followed by ETD fragmentation; MS/MS is shown for the (M+5H)5+ ion of the 1, Da species in II. c ions and z ions are marked in the figure. Shell image Copyright 2005, Richard Ling.
Ueberheide BM, Fenyö D, Alewood PF, and Chait BT. PNAS , 6910–
Methylaziridine

Lys-C Cleaves at MeAziridine Cys mod
Cys cleavages observed only in most abundant proteins.
Kinetics: K >> C
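Because the MeAziridine-modified Cys behaves as a Lys analog, Lys-C can cleave after it as well as after Lys. The following sketch shows what that means for an in silico digest: the same sequence is cut after K only, then after both K and C. The `digest` function and its `cleave_after` parameter are illustrative assumptions. It models complete cleavage with no missed sites, whereas the slide notes that cleavage at Cys is much slower than at Lys, so real digests would contain many peptides with uncleaved Cys. The example sequence is the peptide WCAKNEDCCCPMK from one of the ETD spectra in this presentation, with the fragment-ion marks removed.

```python
def digest(seq, cleave_after="K"):
    """Split seq C-terminal to every residue listed in cleave_after."""
    peptides, start = [], 0
    for i, aa in enumerate(seq):
        if aa in cleave_after:
            peptides.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        # Keep the C-terminal remainder that ends without a cleavage site.
        peptides.append(seq[start:])
    return peptides


seq = "WCAKNEDCCCPMK"

# Conventional Lys-C specificity: cleave after K only.
lysc_only = digest(seq, cleave_after="K")

# Assumed limiting case: every modified Cys is also cleaved.
lysc_plus_cys = digest(seq, cleave_after="KC")

print(lysc_only)
print(lysc_plus_cys)
```

Comparing the two lists shows why a Cys-rich toxin can be over-digested into very short peptides if Cys cleavage goes to completion, and why a database search in this setting has to allow for cleavage at both residues with missed cleavages.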

MeAziridine Cys mod Yields Great ETD Spectra
(K)S/Y\W|K/G|H/G\V\C\S A\S|L|F|E|R|L|K\G\C(-)
(K)C/I|G|Q|W|Y/N|G|Q|A S C|Q|S/T\F|m|G|L\F\K(S)

MeAziridine Cys mod Yields Great ETD Spectra
(K)C/N|Y|H\C|E\C\C|G/A T V\A\C|S/T|V\Y/V|G G\K(E)
(K)C Y\C/D|Y|G|L|F|G|N\C|N\C|Y\K(R)

MeAziridine Cys mod Yields Great ETD Spectra
(Q)T C/G|G P|D D\C G|E|G/S C\C V|G|S|F|S|R\K(C)
(N)W C|A|K/N\E|D\C\C\C P M\K(C)

Centipedes, Spiders, Scorpions
S. morsitans, A. xerolimniorum, L. weigensis, H. jugulans, C. westwoodi, E. rubripes, U. manicatus, L. variatus, L. buchari, C. squama, I. vescus, Thereuopoda sp., S. foelschei, C. Tropix, H. infensa
7 scorpions, 4 centipedes, 1 spider