Identify proteins. Proteomic workflow Trypsin A typical sample We add a solution of 50 mM NH 4 HCO 3 (pH 7.8) containing trypsin (0.01-5 µg/µl). Volume.

Presentation transcript:

Identify proteins

Proteomic workflow

Trypsin. A typical sample:
- We add a solution of 50 mM NH4HCO3 (pH 7.8) containing trypsin (0.01-5 µg/µl). The volume depends on the dimensions of the sample.
- Incubation overnight at 37°C.
- Supernatants are filtered at 0.22 µm.
- Depending on the volume, samples are concentrated or vacuum dried.
- Peptide mixtures are then analysed by mass spectrometry.

Trypsin

Tryptic peptides of the protein (table columns: MH+, peptide sequence):
- HIQK
- LHSMK
- VNELSK
- TTMPLW
- EDVPSER
- EGIHAQQK
- YLGYLEQLLR
- FFVAPFPEVFGK
- VPQLEIVPNSAEER
- LLILTCLVAVALARPK
- HQGLPQEVLNENLLR
- DIGSESTEDQAMEDIK
- EPMIGVNQELAYFYPELFR
- QMEAESISSSEEIVPNSVEQK
This set of masses constitutes a fingerprint of the protein. An MS analysis can allow identification of this protein.
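As an illustration of how such a fingerprint can be computed, here is a minimal Python sketch that derives the monoisotopic MH+ of each peptide listed on the slide. The residue masses are standard monoisotopic values; the function names `mh_plus` and `fingerprint` are hypothetical, and no modifications are considered (an assumption, since the MH+ column of the slide did not survive in the transcript).

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # H2O added to the summed residues of a free peptide
PROTON = 1.00728   # charge carrier for the singly protonated ion

# Peptides exactly as listed on the slide.
PEPTIDES = [
    "HIQK", "LHSMK", "VNELSK", "TTMPLW", "EDVPSER", "EGIHAQQK",
    "YLGYLEQLLR", "FFVAPFPEVFGK", "VPQLEIVPNSAEER", "LLILTCLVAVALARPK",
    "HQGLPQEVLNENLLR", "DIGSESTEDQAMEDIK", "EPMIGVNQELAYFYPELFR",
    "QMEAESISSSEEIVPNSVEQK",
]


def mh_plus(peptide: str) -> float:
    """Monoisotopic mass of the singly protonated, unmodified peptide."""
    return sum(MONO[aa] for aa in peptide) + WATER + PROTON


def fingerprint(peptides):
    """Return (MH+, sequence) pairs sorted by increasing mass."""
    return sorted((round(mh_plus(p), 3), p) for p in peptides)


if __name__ == "__main__":
    for mass, sequence in fingerprint(PEPTIDES):
        print(f"{mass:10.3f}  {sequence}")
```

The sorted list of masses is what a peptide mass fingerprinting search would compare against the signals of the MALDI-TOF spectrum.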

Identification by search in protein sequences databases

MALDI-TOF spectrum of the trypsin digestion of the pictorial sample containing milk. Signals corresponding to peptides of bovine α-S1 casein.

Database search with MS data only: MH+ data

Database search results. All the signals in the spectrum are inserted in the search box. Several signals remain unassigned.

Effect of multiple peptide masses on protein identification by mass fingerprinting. [Table: search m/z, mass tolerance (Da), number of hits.] Moreover, the number of detected peptides that belong to the same protein strongly influences the identification.

Effect of mass accuracy and mass tolerance on peptide mass fingerprinting search results. [Table: search m/z, mass tolerance (Da), number of hits.]

Identification by MS data alone (generally MALDI-TOF) suffers from:
- the complex mixtures of proteins in organic materials
- the uneven relative quantities of the different components

MS analysis. What can give us more information? The sequence of two or more peptides.

Despite their quite similar names, casein α-S1 and casein α-S2 are quite different proteins.

Peptide sequencing by MS

MS/MS of peptide mixtures: LC, MS, MS/MS

CHIP LC-MSMS Q-Tof

What is MS/MS? Peptide mixture: MS of the single peptides. Single peptides are selected for fragmentation: MS/MS fragmentation spectra (tandem mass spectra).

Interpretation of an MS/MS spectrum to derive structural information is analogous to solving a puzzle: use the fragment ion masses as specific pieces of the puzzle to put the intact molecule back together.

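Cleavages observed in MS/MS of peptides. [Diagram of the peptide backbone -HN-CH-CO-NH-CH-CO-NH- with its side chains, showing the N-terminal fragment ions a_i, b_i, c_i, the C-terminal fragment ions x_(n-i), y_(n-i), z_(n-i), and the side-chain fragment ions d_(i+1), v_(n-i), w_(n-i); cleavages are labelled as low energy or high energy.]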

Simple fragmentation rules: ions of the “b” series and ions of the “y” series

Fragmentation rules

[Spectrum annotation: doubly charged precursor ion, with its m/z and MH+; C-terminal Arg.]

Tryptic peptides of the protein (table columns: MH+, peptide sequence):
- HIQK
- LHSMK
- VNELSK
- TTMPLW (C-terminus of the protein)
- EDVPSER
- EGIHAQQK
- YLGYLEQLLR
- FFVAPFPEVFGK
- VPQLEIVPNSAEER
- LLILTCLVAVALARPK
- HQGLPQEVLNENLLR
- DIGSESTEDQAMEDIK
- EPMIGVNQELAYFYPELFR
- QMEAESISSSEEIVPNSVEQK
Since the proteolytic enzyme is trypsin, all the peptides (apart from the one at the C-terminus of the protein) end with either Arg (R) or Lys (K). The y1 ion will therefore always be either 147 (K) or 175 (R).
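The two y1 values quoted above follow from simple arithmetic: residue mass plus water plus one proton. A small Python sketch (the function name `y1_mz` is hypothetical, and the masses are standard monoisotopic values rather than numbers taken from the slides):

```python
# Monoisotopic residue masses (Da) of the two tryptic C-terminal residues.
RESIDUE = {"K": 128.09496, "R": 156.10111}
WATER = 18.01056   # the y fragment keeps the C-terminal OH plus an extra H
PROTON = 1.00728   # singly charged fragment


def y1_mz(c_terminal_residue: str) -> float:
    """m/z of the singly charged y1 ion for a peptide ending in K or R."""
    return RESIDUE[c_terminal_residue] + WATER + PROTON


if __name__ == "__main__":
    for aa in "KR":
        print(aa, round(y1_mz(aa), 3))
```

Rounded to nominal mass these give 147 for Lys and 175 for Arg, which is why one of these two signals is the usual starting point when reading the y series of a tryptic peptide.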

Reading the y series of the precursor ion, starting from the C-terminal Arg (y1 = 175):
- y2: 288 - 175 = 113 = Leu (L)
- y3: 401 - 288 = 113 = Leu (L)
- y4: 529 - 401 = 128 = Gln (Q)
- y5: the next difference corresponds to Glu (E)
- y6 (771): 113 = Leu (L)
- y7 (934): 163 = Tyr (Y)
- y8 (991): 57 = Gly (G)
- y9 (1104): 113 = Leu (L)
- MH+ - y9 = 163 = Tyr (Y)
Read from the C-terminus, the residues are R, L, L, Q, E, L, Y, G, L, Y, so the peptide is YLGYLEQLLR. [Annotated spectrum: y1 to y9 and b2 to b7 are labelled.]
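The ladder reading shown on these slides can be sketched in a few lines of Python. This is only an illustration at nominal mass: the y-ion values 175, 288, 401, 529, 771, 934, 991 and 1104 come from the slides, whereas 658 (529 + 129) and the MH+ of 1267 (1104 + 163) are assumptions inferred from the arithmetic. The function name `read_y_ladder` is hypothetical.

```python
# Nominal residue masses. Leu/Ile are isobaric, and Lys/Gln share nominal mass 128,
# so the sketch reports both candidates instead of choosing one.
NOMINAL = {
    57: "G", 71: "A", 87: "S", 97: "P", 99: "V", 101: "T", 103: "C",
    113: "L/I", 114: "N", 115: "D", 128: "K/Q", 129: "E", 131: "M",
    137: "H", 147: "F", 156: "R", 163: "Y", 186: "W",
}
Y1 = {147: "K", 175: "R"}  # y1 of a tryptic peptide


def read_y_ladder(y_ions, mh_plus):
    """Turn a complete nominal-mass y ladder into residues, N- to C-terminus."""
    ladder = sorted(y_ions) + [mh_plus]
    residues = [Y1.get(ladder[0], "?")]          # C-terminal residue from y1
    for lower, upper in zip(ladder, ladder[1:]):
        residues.append(NOMINAL.get(upper - lower, "?"))
    return residues[::-1]                        # ladder runs C- to N-terminus


# 658 and 1267 are inferred values (assumptions), the rest are from the slides.
Y_IONS = [175, 288, 401, 529, 658, 771, 934, 991, 1104]
MH_PLUS = 1267

if __name__ == "__main__":
    print(" ".join(read_y_ladder(Y_IONS, MH_PLUS)))
```

The result agrees with YLGYLEQLLR, with the L/I and K/Q positions left open. Resolving them needs information beyond nominal mass differences, for example accurate mass for K versus Q, or the database sequence.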

MS/MS peptide fragmentation. [Diagram: fragmentation signals b1 to b5 and y1 to y5 along the peptide Ala-Gly-His-Leu-….Phe-Glu-Cys-Tyr.]

Should we manually interpret each fragmentation spectrum?

Peptide sequencing. This set of numbers identifies the fragmentation spectrum of this peptide.

Signals in the fragmentation spectra can be predicted. [Two tables of predicted b-series and y-series ions, one row per residue.]
- Peptide YLGYLEQLLR: b1 Y y10, b2 L y9, b3 G y8, b4 Y y7, b5 L y6, b6 E y5, b7 Q y4, b8 L y3, b9 L y2, b10 R y1
- Peptide YLGYWEQLLR: b1 Y y10, b2 L y9, b3 G y8, b4 Y y7, b5 W y6, b6 E y5, b7 Q y4, b8 L y3, b9 L y2, b10 R y1
A single change in the sequence changes the profile of expected signals.
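To make the point about predicted signals concrete, the following Python sketch computes singly charged b and y ions for the two sequences on the slide and lists which ions differ. It uses standard monoisotopic residue masses (only the residues needed here), restricts itself to b1..b9 and y1..y9, and uses hypothetical function names; it is an illustration, not the calculation used to build the original tables.

```python
# Monoisotopic residue masses (Da); only the residues needed for this example.
MONO = {
    "G": 57.02146, "L": 113.08406, "Q": 128.05858, "E": 129.04259,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def fragment_ions(peptide: str):
    """Return (b_ions, y_ions) for singly charged b1..b(n-1) and y1..y(n-1)."""
    masses = [MONO[aa] for aa in peptide]
    n = len(masses)
    # b_i: first i residues plus a proton.
    b_ions = [sum(masses[:i]) + PROTON for i in range(1, n)]
    # y_i: last i residues plus water plus a proton.
    y_ions = [sum(masses[-i:]) + WATER + PROTON for i in range(1, n)]
    return b_ions, y_ions


def changed_ions(peptide_a: str, peptide_b: str, tol: float = 0.001):
    """Names of the b/y ions whose m/z differs between two equal-length peptides."""
    b_a, y_a = fragment_ions(peptide_a)
    b_b, y_b = fragment_ions(peptide_b)
    changed = [f"b{i}" for i, (p, q) in enumerate(zip(b_a, b_b), 1) if abs(p - q) > tol]
    changed += [f"y{i}" for i, (p, q) in enumerate(zip(y_a, y_b), 1) if abs(p - q) > tol]
    return changed


if __name__ == "__main__":
    print(changed_ions("YLGYLEQLLR", "YLGYWEQLLR"))
```

Replacing Leu with Trp at position 5 shifts every fragment that contains that residue (b5 to b9 and y6 to y9), while b1 to b4 and y1 to y5 stay where they were.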

[Figure: the real spectrum is cross-correlated with the theoretical spectra of database peptides such as IIGHFYDDWCPLK, SPAFDSIMAETLK, AFDSLPDDIHEK and GGILAQSPFLIIK; both axes of comparison are in m/z.] Database searches compare the experimental MS/MS spectra with the virtual spectra of the peptides generated by the in silico digestion of the proteins in the database.
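The in silico digestion step can be sketched as follows. This Python example is a simplified illustration only: it cleaves after Lys or Arg unless the next residue is Pro (a commonly used rule that is not stated in the slides), and the toy protein is a hypothetical concatenation of three peptides from the earlier list, not a real database entry. The function name `tryptic_digest` and the `missed_cleavages` parameter are also hypothetical.

```python
import re


def tryptic_digest(protein: str, missed_cleavages: int = 0):
    """In silico trypsin digestion.

    Cleaves C-terminal to K or R, except when the next residue is P,
    and optionally joins neighbouring fragments to model missed cleavages.
    """
    # Zero-width split points: just after K/R and not in front of P (Python 3.7+).
    fragments = [f for f in re.split(r"(?<=[KR])(?!P)", protein) if f]
    peptides = []
    for start in range(len(fragments)):
        last = min(start + missed_cleavages, len(fragments) - 1)
        for end in range(start, last + 1):
            peptides.append("".join(fragments[start:end + 1]))
    return peptides


# Hypothetical toy sequence built from peptides listed on the slides.
TOY_PROTEIN = "HIQK" + "LLILTCLVAVALARPK" + "TTMPLW"

if __name__ == "__main__":
    print(tryptic_digest(TOY_PROTEIN))
    print(tryptic_digest(TOY_PROTEIN, missed_cleavages=1))
```

With no missed cleavages the toy sequence yields HIQK, LLILTCLVAVALARPK and TTMPLW; the Arg-Pro bond inside the second peptide is left intact. A search engine would then predict a fragmentation spectrum for each such peptide and score it against the experimental spectra.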

NanoLC-MS/MS: 100 fmol BSA injected on column. [Figure: base peak chromatogram (BPC) with the MS and MS/MS traces plotted against time in minutes, and a typical MS/MS spectrum (right inset) with labelled b and y ions.]

Database search

Database search with raw data from LC-MS/MS: MH+ data. Fragmentation spectra are automatically uploaded.

Why is trypsin preferred? Because MS sees only ions. Upon fragmentation, the presence of a positively charged residue (Lys or Arg) will ensure the presence of a charge on fragments on the C-terminal side, generating the y series. The amino group at the N-terminus should ensure the presence of a charge on fragments on the N-terminal side, generating the b series.

For this kind of sample we do not use any fixed modification

Variable Modifications take into account modifications induced by sample treatment and/or sample deterioration

This information depends on the instrument

A typical output of the results

- Proteins are ranked by decreasing score.
- Proteins are grouped into families using a novel hierarchical clustering algorithm. If the family contains multiple members, the accessions, scores and descriptions are aligned with a dendrogram, which illustrates the degree of similarity between members.
- A short description of the hit is given, with the source organism.

The Report Builder tab allows you to build a customised table of protein hits, which is particularly useful if you need a minimal list of proteins for a publication.
- Number of peptide matches
- Number of significant peptide matches (above the significance threshold)
- Number of independent sequences
- Number of significant independent sequences (above the significance threshold)

A hit
- Experimental m/z value.
- Experimental m/z transformed to a relative molecular mass.
- Relative molecular mass calculated from the matched peptide sequence.
- Difference (error) between the experimental and calculated masses.
- Ions score. If there are duplicate matches to the same peptide, then the lower scoring matches are shown in brackets.
- Expectation value for the peptide match (the number of times we would expect to obtain an equal or higher score purely by chance). The lower this value, the more significant the result.
- A letter U if the peptide sequence is unique to the protein family.
- Sequence of the peptide in 1-letter code. If the peptide sequence is modified, each affected residue is underlined.
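The first four items of a hit are linked by simple arithmetic, sketched here in Python. The function names are hypothetical, and the observed m/z of 634.36 is an invented illustrative value for the doubly charged YLGYLEQLLR ion, not a number from the slides; the calculated mass is the monoisotopic mass of that unmodified peptide.

```python
PROTON = 1.00728  # mass of the charge carrier (Da)


def mr_from_mz(mz: float, charge: int) -> float:
    """Convert an experimental m/z into a neutral relative molecular mass."""
    return (mz - PROTON) * charge


def mass_error(mr_experimental: float, mr_calculated: float):
    """Return the error in Da and in ppm (experimental minus calculated)."""
    delta = mr_experimental - mr_calculated
    return delta, delta / mr_calculated * 1e6


if __name__ == "__main__":
    observed_mz, charge = 634.36, 2   # illustrative values (assumption)
    mr_calc = 1266.6972               # monoisotopic Mr of YLGYLEQLLR
    mr_expt = mr_from_mz(observed_mz, charge)
    delta_da, delta_ppm = mass_error(mr_expt, mr_calc)
    print(f"Mr(expt) = {mr_expt:.4f}  delta = {delta_da:.4f} Da ({delta_ppm:.1f} ppm)")
```

Reporting the error both in Da and in ppm is convenient because the mass tolerance of a search can be expressed in either unit.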

Clicking the number: these are other possible interpretations of the same fragmentation spectrum. Highlighting the number: it is the spectrum number in the query.

The matched fragment ions are shown in tabular format below the spectrum. The ion series are those specified by the INSTRUMENT search parameter. If you choose to label the matches used for scoring, bold italic red means the series contributed to the score. Bold red means that the number of matches in the ion series is greater than would be expected by chance, indicating that the ion series is present. Non-bold red means that the number of matches in the ion series is no greater than would be expected by chance, so that the matches themselves may be by chance.

Advantages:
- Unambiguous identifications
- Multiple identifications
- Few peptides are sufficient
- Proteins in mixtures can be distinguished
- Reduced relevance of protein contamination
- Deductive results: no hypothesis is required
- Organisms can be recognized and differentiated