Goals in Proteomics 1.Identify and quantify proteins in complex mixtures/complexes 2.Identify global protein-protein interactions 3.Define protein localizations.

Presentation transcript:

Goals in Proteomics
1. Identify and quantify proteins in complex mixtures/complexes
2. Identify global protein-protein interactions
3. Define protein localizations within cells
4. Measure and characterize post-translational modifications
5. Measure and characterize activity (e.g. substrate specificity)

Goals in Proteomics
1. Identify and quantify proteins in complex mixtures/complexes: MS and MS/MS
2. Identify global protein-protein interactions: MS and MS/MS, Y2H (yeast two-hybrid)
3. Define protein localizations within cells: high-throughput microscopy, organelle pull-down
4. Measure and characterize post-translational modifications: MS techniques
5. Measure and characterize activity (e.g. substrate specificity): protein arrays

Basic overview of tandem mass spectrometry (MS/MS). Source: Coon et al.

Intro to Mass Spec (MS)
Separate and identify peptide fragments by their mass and charge (m/z ratio).
Diagram labels: mass spec (ion source, mass analyzer, detector) and the resulting MS spectrum.
Basic principles:
1. Ionize (i.e. charge) peptide fragments
2. Separate ions by mass/charge (m/z) ratio
3. Detect ions of different m/z ratio
4. Compare to a database of predicted m/z fragments for each genome

Intro to Mass Spec (MS)
Separate and identify peptide fragments by their mass and charge (m/z ratio).
1. Ionization
Goal: ionize (i.e. charge) peptide fragments without destroying the molecule.
- Positive ionization (protonates amine groups): especially useful for trypsinized proteins (cleaved after R and K)
- Negative ionization (deprotonates carboxylic acids and alcohols)

Liquid chromatography + electrospray ionization
Diagram label: electric field
* Commonly used with liquid solutions, more sensitive to contaminants, used for complex mixtures

Liquid chromatography + electrospray ionization vs. MALDI
Diagram label: electric field
Liquid chromatography + electrospray ionization:
* Commonly used with liquid solutions, more sensitive to contaminants, used for complex mixtures
MALDI:
* Less sensitive to contaminants, more common for less complex mixtures

Intro to Mass Spec (MS)
Separate and identify peptide fragments by their mass and charge (m/z ratio).
2. Separation of ions based on m/z ratio (mass m divided by charge z)
Multiple flavors of mass analyzers use different technology:
* TOF (‘time of flight’): separates ions by velocity; for a given charge, lighter ions reach the detector sooner
* Triple quadrupole: separates ions using oscillating electric fields
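To make the m/z idea concrete, here is a minimal sketch of how a neutral peptide mass maps to an observed m/z in positive-ion mode, assuming each charge comes from one added proton. The function names `mz` and `neutral_mass_from_mz` are illustrative and do not come from the slides.

```python
PROTON_MASS = 1.007276  # mass of a proton in daltons


def mz(neutral_mass: float, charge: int) -> float:
    """m/z of a peptide carrying `charge` extra protons (positive mode)."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


def neutral_mass_from_mz(mz_value: float, charge: int) -> float:
    """Invert the relation above to recover the neutral mass."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return charge * (mz_value - PROTON_MASS)


if __name__ == "__main__":
    # The same 1000 Da peptide appears at different m/z depending on charge.
    for z in (1, 2, 3):
        print(z, round(mz(1000.0, z), 4))
```

The same peptide therefore produces several peaks if it is present in more than one charge state, which is one reason an incorrect z estimate leads to an incorrect mass.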

Multiple flavors of mass analyzers
Single MS (peptide fingerprinting):
- Identifies m/z of the peptide only
- Peptide id’d by comparison to a database of predicted m/z of trypsinized proteins
Tandem MS/MS (peptide sequencing):
- Pulls each peptide from the first MS
- Breaks up peptide bonds (in the collision cell)
- Identifies each fragment based on m/z

Multiple flavors of mass analyzers … can be hooked together in multiple configurations. Figure panel g: Orbitrap

Multiple flavors of mass analyzers
Single MS (peptide fingerprinting):
- Identifies m/z of the peptide only
- Peptide id’d by comparison to a database of predicted m/z of trypsinized proteins
Tandem MS/MS (peptide sequencing):
- Pulls each peptide from the first MS
- Breaks up peptide bonds (in the collision cell)
- Identifies each fragment based on m/z
Now multiple types of collision cells:
- CID: collision-induced dissociation
- ETD: electron transfer dissociation
- HCD: higher-energy collisional dissociation

Peptide can fragment along 3 possible bonds … the charge stays on either the ‘left’ (N-terminal: a, b, or c) or the ‘right’ (C-terminal: x, y, or z) side of the cleavage.
Fragmentation happens in a fairly defined way along the peptide backbone.
Cleavage along the CO-NH bond is most common, generating ‘b’ and ‘y’ ions.
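As an illustration of the b/y series, the sketch below computes singly charged b- and y-ion m/z values for an unmodified peptide. It assumes standard monoisotopic residue masses and no modifications; the helper name `b_y_ions` and the example peptide are hypothetical.

```python
# Monoisotopic residue masses in daltons (I and L have identical mass).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "I": 113.08406,
    "L": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056    # H2O, retained by the C-terminal (y) fragment
PROTON = 1.007276   # charge carrier for a 1+ ion


def b_y_ions(peptide: str):
    """Return (b_ions, y_ions) as lists of 1+ m/z values.

    b_ions[i-1] is b_i (first i residues); y_ions[i-1] is y_i (last i residues).
    """
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    n = len(masses)
    b_ions = [sum(masses[:i]) + PROTON for i in range(1, n)]
    y_ions = [sum(masses[-i:]) + WATER + PROTON for i in range(1, n)]
    return b_ions, y_ions


if __name__ == "__main__":
    b, y = b_y_ions("PEPTIDE")
    for i, (bi, yi) in enumerate(zip(b, y), start=1):
        print(f"b{i} = {bi:.4f}   y{i} = {yi:.4f}")
```

Each b_i and its complementary y_(n-i) come from the same backbone cleavage, so their m/z values sum to the neutral peptide mass plus two protons.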

MS spectrum (i.e. peptide ions)
- Each peak is a different peptide, separated based on m/z
- A single peptide is selected by the instrument for the second MS
- Each peak is often surrounded by smaller peaks of similar m/z
- Sensitivity of the instrument determines resolution
(Mann, Nat Reviews MBC 5:699-711)

Second MS identifies y (or b) ions to read out the amino-acid sequence (Mann, Nat Reviews MBC 5:699-711)
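Reading a sequence from a fragment ladder amounts to matching the mass difference between consecutive ions to a residue mass. The following is a simplified sketch that assumes a clean, complete series of singly charged ions and an arbitrary tolerance of 0.01 Da; real spectra have missing and extra peaks. The name `read_ladder` is hypothetical.

```python
# Monoisotopic residue masses in daltons (I and L cannot be distinguished).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "I": 113.08406,
    "L": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def read_ladder(peaks, tol=0.01):
    """Label each gap between consecutive peaks with the matching residue(s).

    Returns one entry per gap: a residue letter, several letters joined by
    '/' when masses are ambiguous, or '?' when nothing matches within `tol`.
    """
    ordered = sorted(peaks)
    calls = []
    for low, high in zip(ordered, ordered[1:]):
        diff = high - low
        hits = [aa for aa, m in RESIDUE_MASS.items() if abs(m - diff) <= tol]
        calls.append("/".join(hits) if hits else "?")
    return calls


if __name__ == "__main__":
    # y1..y4 of the hypothetical peptide PEPTIDE (1+ ions).
    y_series = [148.06043, 263.08737, 376.17143, 477.21911]
    print(read_ladder(y_series))
```

For a y series the residues come out from the C-terminus inward, so the calls must be reversed to give the sequence in the usual N-to-C direction.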

Trypsin is often used to digest proteins (cleaves after Arg and Lys). WHY?
Because of challenges distinguishing spectra, simplified mixtures are typically injected into the MS:
- either excised proteins
- purified complexes
- or fractionated pools of complex mixtures

2-dimensional gel separation (largely outdated)
The first dimension (separation by isoelectric focusing):
- gel with an immobilised pH gradient
- electric current causes charged proteins to move until they reach their isoelectric point (the pH at which their net charge is 0)
The second dimension (separation by mass):
- the pH gel strip is loaded onto an SDS gel
- SDS denatures the protein (to make movement depend solely on mass, not shape) and masks its intrinsic charge with a uniform negative charge
(Ahna Skop)

2D SDS-PAGE gel (figure: Ahna Skop)

TAP-tag: Tandem Affinity Purification (for IP’ing individual proteins and proteins bound to them)

Ion exchange chromatography
- Anion exchange: column is positively charged (can bind negatively charged proteins).
- Cation exchange: column is negatively charged (can bind positively charged proteins).
Exploit the isoelectric point of a protein to separate it from other macromolecules.
(Ahna Skop)

Size exclusion chromatography
- Porous beads made of different but controlled sizes.
- Smaller proteins go in and out of beads and will be retained in the resin.
- Large proteins will only go into large beads and will be retained less.
- Very large proteins will not go into any of the beads (exclusion limit).
Can be used as a preparative method or to determine the molecular weight of a protein in solution.
(Ahna Skop)

Affinity chromatography
- A ligand with high affinity for the protein is attached to a matrix.
- The protein of interest binds to the ligand and is retained by the resin.
- Everything else flows through.
- Can use an excess of the soluble ligand to elute the protein.
(Ahna Skop)

Mann Nat Reviews MBC. 5:699: How does each spectrum translate to amino acid sequence?

1. De novo sequencing: very difficult and not widely used (but being developed) for large-scale datasets
2. Matching observed spectra to a database of theoretical spectra

Theoretical spectra:
- in silico digestion of a known protein database
- a limited set of theoretical spectra based on enzyme, instrument sensitivity, and other factors
- this reduces the search space
- can miss some peptides
- comparisons based on several different scores (e.g. correlation between observed and theoretical profiles)
(Mann, Nat Reviews MBC 5:699-711)
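The in silico digestion step can be sketched as follows for trypsin, which cleaves after K and R. The optional rule that skips cleavage before proline is a common convention rather than something stated in the slides, so it is exposed as a flag. The function name `tryptic_digest` and the example sequence are hypothetical, and missed cleavages are not modeled.

```python
def tryptic_digest(protein: str, min_len: int = 1, proline_rule: bool = True):
    """Split a protein sequence after every K or R.

    proline_rule: if True, do not cleave when the next residue is P
                  (an assumed convention, not taken from the slides).
    min_len:      drop peptides shorter than this many residues.
    """
    peptides = []
    start = 0
    for i, aa in enumerate(protein):
        is_site = aa in "KR"
        next_is_pro = i + 1 < len(protein) and protein[i + 1] == "P"
        if is_site and not (proline_rule and next_is_pro):
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        # Trailing C-terminal peptide that does not end in K or R.
        peptides.append(protein[start:])
    return [p for p in peptides if len(p) >= min_len]


if __name__ == "__main__":
    seq = "MKWVTFRPLLK"
    print(tryptic_digest(seq))
    print(tryptic_digest(seq, proline_rule=False))
```

Restricting the database to peptides produced this way is what shrinks the search space, and it is also why peptides from unexpected cleavages can be missed.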

How does each spectrum translate to amino acid sequence?
1. De novo sequencing: very difficult and not widely used (but being developed) for large-scale datasets
2. Matching observed spectra to a database of theoretical spectra
3. Matching observed spectra to a spectral database of previously seen spectra

Spectral matching is supposedly more accurate but …
- limited to the number of peptides whose spectra have been observed before
With either approach, observed spectra are processed to: group redundant spectra, remove bad spectra, recognize co-fragmentation, and improve z estimates.
Many good spectra will not match a known sequence due to: absence of a target in the DB, a PTM that modifies the spectrum, a constrained DB search, or an incorrect m or z estimate.
(Nesvizhskii (2010) J. Proteomics, 73:)

Result: peptide-spectrum match (PSM)
A major problem in proteomics is bad PSM calls … therefore statistical measures are critical.
Methods of estimating significance of PSMs:
p- (or E-) value: compare the score S of the best PSM against the distribution of all S for all spectra to all theoretical peptides
FDR correction methods:
1. Benjamini-Hochberg (B&H) FDR
2. Estimate the null distribution of RANDOM PSMs:
- match all spectra to the real (‘target’) DB and to a fake (‘decoy’) DB
- often the decoy DB is the same peptides in the library but with reversed sequences
- one measure of FDR: 2 * (# decoy hits) / (# decoy hits + # target hits)
3. Use #2 above to calculate posterior probabilities for EACH PSM
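The target-decoy estimate can be sketched directly from the formula on this slide, 2 * (# decoy hits) / (# decoy hits + # target hits), applied to PSMs at or above a score threshold. This is a minimal illustration that assumes higher scores are better; the function names and the toy scores are hypothetical.

```python
def decoy_fdr(target_scores, decoy_scores, threshold):
    """FDR estimate at a score threshold: 2 * decoys / (decoys + targets)."""
    targets = sum(1 for s in target_scores if s >= threshold)
    decoys = sum(1 for s in decoy_scores if s >= threshold)
    if targets + decoys == 0:
        return 0.0  # nothing passes, so there are no false discoveries
    return min(1.0, 2.0 * decoys / (decoys + targets))


def threshold_for_fdr(target_scores, decoy_scores, max_fdr=0.01):
    """Lowest observed score whose estimated FDR is at most max_fdr.

    Returns None if no observed score satisfies the requirement.
    """
    for thr in sorted(set(target_scores) | set(decoy_scores)):
        if decoy_fdr(target_scores, decoy_scores, thr) <= max_fdr:
            return thr
    return None


if __name__ == "__main__":
    targets = [10, 9, 8, 7, 3, 2]   # best score per spectrum vs. target DB
    decoys = [4, 3, 1]              # best score per spectrum vs. decoy DB
    print(decoy_fdr(targets, decoys, 3))
    print(threshold_for_fdr(targets, decoys, max_fdr=0.01))
```

The factor of 2 reflects the assumption that every decoy hit stands in for one equally likely false hit among the targets.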

3. Use #2 above to calculate posterior probabilities for EACH PSM
- mixture model approach: take the distribution of ALL scores S
- this is a mixture of ‘correct’ PSMs and ‘incorrect’ PSMs
- but we don’t know which are correct or incorrect
- scores from the decoy comparison are included, which can provide some idea of the distribution of ‘incorrect’ scores
- EM or Bayesian approaches can then estimate the proportion of correct vs. incorrect PSMs … based on each PSM score, a posterior probability is calculated
FDR can be done at the level of PSM identification … but is often done at the level of protein identification
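A minimal sketch of the mixture-model idea, assuming (purely for illustration) that correct and incorrect PSM scores each follow a Gaussian distribution and that higher scores are better. Published tools may use other distributions and may use decoy scores to anchor the ‘incorrect’ component; neither is done here. All names are hypothetical.

```python
import math


def normal_pdf(x, mu, sd):
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def fit_two_component_em(scores, n_iter=100, min_sd=1e-3):
    """Fit a two-Gaussian mixture by EM; returns a dict of parameters."""
    n = len(scores)
    mean = sum(scores) / n
    sd0 = max(min_sd, math.sqrt(sum((s - mean) ** 2 for s in scores) / n))
    # Start the 'incorrect' component at the lowest score, 'correct' at the highest.
    p = {"mu_bad": min(scores), "sd_bad": sd0,
         "mu_good": max(scores), "sd_good": sd0, "p_good": 0.5}
    for _ in range(n_iter):
        # E-step: posterior probability that each PSM is correct.
        post = [posterior_correct(s, p) for s in scores]
        w_good = sum(post)
        w_bad = n - w_good
        if w_good < 1e-9 or w_bad < 1e-9:
            break  # one component has vanished; keep the last estimates
        # M-step: re-estimate weighted means, standard deviations, proportion.
        p["mu_good"] = sum(w * s for w, s in zip(post, scores)) / w_good
        p["mu_bad"] = sum((1 - w) * s for w, s in zip(post, scores)) / w_bad
        p["sd_good"] = max(min_sd, math.sqrt(
            sum(w * (s - p["mu_good"]) ** 2 for w, s in zip(post, scores)) / w_good))
        p["sd_bad"] = max(min_sd, math.sqrt(
            sum((1 - w) * (s - p["mu_bad"]) ** 2 for w, s in zip(post, scores)) / w_bad))
        p["p_good"] = w_good / n
    return p


def posterior_correct(score, p):
    """P(correct | score) under the fitted mixture."""
    good = p["p_good"] * normal_pdf(score, p["mu_good"], p["sd_good"])
    bad = (1 - p["p_good"]) * normal_pdf(score, p["mu_bad"], p["sd_bad"])
    return good / (good + bad) if good + bad > 0 else 0.5


if __name__ == "__main__":
    scores = [1.0, 1.2, 0.8, 1.1, 0.9, 5.0, 5.2, 4.8, 5.1, 4.9]
    params = fit_two_component_em(scores)
    for s in (1.0, 3.0, 5.0):
        print(s, round(posterior_correct(s, params), 4))
```

The fitted `p_good` is the estimated proportion of correct PSMs, and `posterior_correct` gives the per-PSM probability that the slide refers to.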

Error in PSM identification can amplify FDR in protein identification.
Often focus on proteins identified by at least 2 different PSMs (or proteins with single PSMs of very high posterior probability).
Some methods combine PSM FDR to get a protein FDR.
(Nesvizhskii (2010) J. Proteomics, 73:)
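The "at least 2 different PSMs, or one very confident PSM" rule of thumb can be sketched as a simple filter. This example interprets "different PSMs" as distinct peptide sequences and uses 0.99 as the cutoff for a very high posterior probability; both are assumptions, as are the function name and the toy data.

```python
from collections import defaultdict


def confident_proteins(psms, min_peptides=2, single_hit_prob=0.99):
    """Keep proteins supported by enough distinct peptides or one strong PSM.

    psms: iterable of (protein_id, peptide_sequence, posterior_probability).
    """
    # For each protein, remember the best probability seen for each peptide.
    best = defaultdict(dict)
    for protein, peptide, prob in psms:
        previous = best[protein].get(peptide, 0.0)
        best[protein][peptide] = max(previous, prob)

    kept = []
    for protein, peptides in best.items():
        enough_peptides = len(peptides) >= min_peptides
        strong_single = max(peptides.values()) >= single_hit_prob
        if enough_peptides or strong_single:
            kept.append(protein)
    return sorted(kept)


if __name__ == "__main__":
    psms = [
        ("P1", "AAK", 0.90), ("P1", "CCR", 0.80),   # two distinct peptides
        ("P2", "DDK", 0.95), ("P2", "DDK", 0.97),   # same peptide seen twice
        ("P3", "EEK", 0.995),                       # single, very confident
    ]
    print(confident_proteins(psms))
```

Repeated observations of the same peptide count once here; whether they should add confidence is a design choice that would need to be settled for a real analysis.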

Some practical guidelines for analyzing proteomics results
1. Know that abundant proteins are much easier to identify
2. # of peptides per protein is an important consideration
- proteins ID’d with >1 peptide are more reliable
- proteins ID’d with 1 peptide observed repeatedly are more reliable
- note that longer proteins are more likely to have false PSMs
3. Think carefully about the p-value/FDR and know how it was calculated
4. Know that proteomics is nowhere near saturating … many proteins will be missed