Session III: How do we analyze proteomic data? NTU Biotechnology Education Reform Summer Course.

Topics for session III
1. Image analysis (for 2-DE gel)
2. Mass data analysis
3. Protein structure analysis

1. Image analysis

Examples of 2-DE results
[Figure: 2-DE gels of a healthy control and a patient. Spots are digested to peptide fragments for MS analysis.]

Unexpected variation between gels
Interior variation between 2-DE experiments
– Same loading amount?
– Same gel condition?
– Same staining condition?
Exterior variation after the gel is developed
– Unwanted spots (dye or reagent deposit)
– Dirty spots (hair, dust)

What does image analysis software do?
– spot detection
– unwanted spot filtering
– background subtraction
– normalization
– image matching
– expression comparison
– pI/MW calibration
– data organization

Currently available 2-DE image analysis software
– Melanie 4 (Swiss Institute of Bioinformatics, SIB)
– Phoretix 2D (Nonlinear Dynamics)
– Progenesis (Nonlinear Dynamics)
– Z3 and Z4000 (Compugen)
– Delta2D (Sunergia group)
– A-GelFox 2D (Alpha Innotech)
– Flicker (NCI, through internet)

Spot detection
One of the first and most important steps in 2-DE analysis.
– Locating the spots in the gel image
– Defining their shape
– Calculating measurement information (volume and area)

Spot detection

Filtering
Removal of unwanted spots. What counts as an unwanted spot:
– dust on gel
– stain deposit
– bulky spots

Background subtraction
General background subtraction methods:
– No background
– Mode of non-spot
– Manual background
– Lowest boundary
– Average boundary

Normalization
General normalization methods:
– Total spot volume
– Single spot
– Total volume ratio
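As a rough illustration of the "total spot volume" method, the sketch below expresses each spot as a percentage of the summed volume of all spots on the same gel, so that gels with different loading or staining can be compared. The function name and the spot volumes are hypothetical, and commercial packages may differ in detail.

```python
def normalize_total_volume(volumes):
    """Express each spot volume as a percentage of the gel's total spot volume."""
    total = sum(volumes.values())
    if total <= 0:
        raise ValueError("total spot volume must be positive")
    return {spot: 100.0 * volume / total for spot, volume in volumes.items()}


# Hypothetical raw spot volumes from one gel (arbitrary units).
gel_a = {"spot1": 1200.0, "spot2": 300.0, "spot3": 1500.0}
normalized = normalize_total_volume(gel_a)
print(normalized)
```

After normalization the values on every gel sum to 100, which is what makes spot-to-spot comparison across gels meaningful.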

Image matching

Expression comparison
A two-fold increase or decrease in expression is generally considered significant.
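The two-fold rule can be sketched as a simple ratio test on normalized spot volumes. This is a minimal example, assuming the patient/control ratio is the quantity of interest; the function name and threshold parameter are illustrative, not taken from any particular software.

```python
def classify_fold_change(control, patient, threshold=2.0):
    """Label a spot as 'up', 'down' or 'unchanged' using a fold-change cutoff."""
    if control <= 0 or patient <= 0:
        raise ValueError("normalized volumes must be positive")
    ratio = patient / control
    if ratio >= threshold:
        return "up"
    if ratio <= 1.0 / threshold:
        return "down"
    return "unchanged"


# Hypothetical normalized volumes (control, patient) for three spots.
for control, patient in [(10.0, 25.0), (10.0, 4.0), (10.0, 12.0)]:
    print(control, patient, classify_fold_change(control, patient))
```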

pI/MW calibration
– pI calibration
– MW calibration
Gives the observed (experimental) pI/MW of each spot.

Data organization

Spot annotation

2. Mass data analysis

Useful proteomic resource

Useful proteomic resource

Matrix Science - Mascot

Three major functions in Mascot
– Peptide Mass Fingerprint (PMF): The experimental data are a list of peptide mass values from an enzymatic digest of a protein. (MALDI-TOF)
– Sequence Query: One or more peptide mass values associated with information such as partial or ambiguous sequence strings, amino acid composition information, MS/MS fragment ion masses, etc. A super-set of a sequence tag query.
– MS/MS Ion Search: Identification based on raw MS/MS data from one or more peptides. (LC/MS/MS)

Difference between MALDI-TOF and LC/MS/MS
[Figure: MALDI-TOF and LC/MS/MS shown side by side]

2-1 PMF analysis

Raw data for PMF
[Figure: mass spectrum, relative intensity against m/z]

Mascot PMF query form

Mascot PMF parameters
– Your name; Search title
– Database
– Taxonomy
– Enzyme
– Monoisotopic or Average
– Modifications
– Protein Mass
– Peptide tol. ±
– Mass values
– Missed cleavages
– Data file
– Query

Database
Database    Comment
EST         EST divisions of GenBank (currently EST_human, EST_mouse, EST_others)
MSDB        Comprehensive, non-identical protein database
NCBInr      Comprehensive, non-identical protein database
OWL         Non-identical protein database (obsolete)
Random      Random sequences for verifying scoring statistics
SwissProt   High quality, curated protein database

Taxonomy
Restricting the search to a species or group of species can:
– ensure the hit list only contains entries from the selected species
– speed up the search
– bring out a weak match

Enzyme
Name           Cleave    Don't cleave   N or C term
Trypsin        KR        P              CTERM
Arg-C          R         P              CTERM
Asp-N          BD                       NTERM
Asp-N_ambic    DE                       NTERM
Chymotrypsin   FYWL      P              CTERM
CNBr           M                        CTERM
Formic_acid    D                        CTERM
Lys-C          K         P              CTERM
Lys-C/P        K                        CTERM
PepsinA        FL                       CTERM
Tryp-CNBr      KRM       P              CTERM
TrypChymo      FYWLKR    P              CTERM
Trypsin/P      KR                       CTERM
V8-DE          BDEZ      P              CTERM
V8-E           EZ        P              CTERM
CNBr+Trypsin   M                        CTERM
               KR        P              CTERM
None           see notes
semiTrypsin    see notes

Enzyme "None" is not an allowed choice for a Peptide Mass Fingerprint, where the specificity of an enzyme is essential. If the search fails to produce a positive match, then try again with semiTrypsin (below) before resorting to "None". "semiTrypsin" means that Mascot will search for peptides that show tryptic specificity (KR not P) at one terminus, but where the other terminus may be a non-tryptic cleavage. This is a half-way house between choosing "Trypsin" and "None".
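The cleavage rules in the enzyme table can be read as a small algorithm: cut after any residue in the "Cleave" column unless the next residue is in the "Don't cleave" column. The sketch below illustrates this for C-terminal cutters only; the function name and the test sequence are made up for illustration, and this is not Mascot's own implementation.

```python
def digest(sequence, cleave="KR", dont_cleave="P"):
    """Complete in-silico digest for an enzyme that cuts C-terminal to `cleave`
    residues, except when the following residue is in `dont_cleave`."""
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        is_last = i == len(sequence) - 1
        if residue in cleave and not is_last and sequence[i + 1] not in dont_cleave:
            peptides.append(sequence[start:i + 1])
            start = i + 1
    peptides.append(sequence[start:])  # whatever remains after the last cut
    return peptides


protein = "MKRPLAKGR"  # hypothetical sequence
print(digest(protein))                  # Trypsin: KR, but not before P
print(digest(protein, dont_cleave=""))  # Trypsin/P: KR, no proline rule
```

With the proline rule the R-P bond is left intact, so Trypsin and Trypsin/P give different peptide lists for the same protein.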

Monoisotopic or Average
– Nominal mass values: calculated from integer atomic weights (H=1, C=12, N=14, O=16); not practical in proteomics.
– Average mass values: equivalent to taking the centroid of the complete isotopic envelope.
– Monoisotopic mass values: the mass of the first peak of the isotope distribution.

Monoisotopic or Average
For peptides and proteins, the difference between an average and a monoisotopic weight is approximately 0.06%.
[Figure panels: Insulin (5.8 kDa), Albumin (66.4 kDa)]
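The size of the gap between the two mass conventions can be checked with a short calculation. The sketch below assumes the elemental formula of human insulin (C257H383N65O77S6) as the worked example and uses standard atomic masses; the helper name is illustrative.

```python
# Monoisotopic masses (lightest stable isotope) and average atomic weights, in Da.
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146, "S": 31.9720707}
AVG = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "S": 32.06}


def mass(formula, table):
    """Sum atomic masses over an elemental formula given as {element: count}."""
    return sum(table[element] * count for element, count in formula.items())


# Assumed example: elemental formula of human insulin.
insulin = {"C": 257, "H": 383, "N": 65, "O": 77, "S": 6}
mono = mass(insulin, MONO)
avg = mass(insulin, AVG)
percent_diff = 100.0 * (avg - mono) / avg
print(f"monoisotopic {mono:.2f} Da, average {avg:.2f} Da, difference {percent_diff:.3f}%")
```

For this formula the difference comes out between 0.06% and 0.07%, which is about 4 Da at 5.8 kDa, so choosing the wrong convention easily exceeds a 1 Da search tolerance.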

Monoisotopic Tol: 1 Da Monoisotopic MW

Average Tol: 1 Da Average MW

Monoisotopic Tol: 2 Da

Average Tol: 2 Da

Modifications
Most protein samples exhibit some degree of modification.
– Natural post-translational modifications: phosphorylation and glycosylation.
– Deliberate modifications: deliberately introduced during sample work-up, such as cysteine derivatisation.

Modifications
– Fixed modifications are applied universally, to every instance of the specified residue(s) or terminus. Example: Carboxymethyl (Cys) means that all calculations will use 161 Da as the mass of cysteine.
– Variable modifications are those which may or may not be present. Example: if Oxidation (Met) is selected, and a peptide contains 3 methionines, Mascot will test for a match with the experimental data for that peptide containing 0, 1, 2, or 3 oxidised methionine residues.
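The two kinds of modification affect the candidate masses differently, which a small sketch can make concrete. It assumes standard monoisotopic mass shifts; the peptide string and its unmodified mass of 1000.0 Da are hypothetical placeholders, and the function is an illustration rather than Mascot's code.

```python
CYS_RESIDUE = 103.00919    # monoisotopic mass of a cysteine residue, Da
CARBOXYMETHYL = 58.00548   # mass added by carboxymethylation, Da
OXIDATION = 15.99491       # mass added by one oxidation, Da

# Fixed modification: every cysteine is simply treated as a heavier residue.
fixed_cys = CYS_RESIDUE + CARBOXYMETHYL  # about 161 Da


def variable_mod_masses(peptide, unmodified_mass, residue="M", delta=OXIDATION):
    """Candidate masses with 0..n modified sites, n = occurrences of `residue`."""
    sites = peptide.count(residue)
    return [unmodified_mass + k * delta for k in range(sites + 1)]


# Hypothetical peptide with three methionines and a placeholder mass.
candidates = variable_mod_masses("MAMGMK", 1000.0)
print(round(fixed_cys, 2), candidates)
```

A fixed modification leaves the number of candidate masses unchanged, whereas each variable modification multiplies them, which is why selecting many variable modifications slows a search and weakens its discrimination.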

Peptide tol. ±
The error window on experimental peptide mass values.
%     fraction expressed as a percentage
mmu   absolute milli-mass units, i.e. units of 0.001 Da
ppm   fraction expressed as parts per million
Da    absolute units of Da
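The four tolerance units can all be reduced to an absolute window in Da before comparing masses. The following is a minimal sketch; the function names are illustrative and the relative units are taken with respect to the calculated mass, which is an assumption.

```python
def tolerance_in_da(mass, value, unit):
    """Convert a tolerance given in %, mmu, ppm or Da into an absolute window in Da."""
    if unit == "Da":
        return value
    if unit == "mmu":
        return value * 0.001
    if unit == "%":
        return mass * value / 100.0
    if unit == "ppm":
        return mass * value / 1e6
    raise ValueError(f"unknown tolerance unit: {unit}")


def within_tolerance(observed, calculated, value, unit):
    """True if the observed mass lies inside the error window around the calculated mass."""
    return abs(observed - calculated) <= tolerance_in_da(calculated, value, unit)


print(tolerance_in_da(2000.0, 50, "ppm"))              # window for a 2000 Da peptide
print(within_tolerance(2000.05, 2000.0, 50, "ppm"))
print(within_tolerance(2000.50, 2000.0, 50, "ppm"))
```

Relative units (% and ppm) widen the window for heavier peptides, while absolute units (Da and mmu) keep it constant across the mass range.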

Missed cleavages
The maximum number of uncut cleavage sites allowed within a peptide.
– Missed cleavages = 0: assumes complete digestion
– Missed cleavages >= 1: allows for incomplete digestion
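Allowing missed cleavages amounts to joining neighbouring fragments of a complete digest. The sketch below shows this under the assumption that the fragments are listed in sequence order; the fragment list and function name are hypothetical.

```python
def with_missed_cleavages(fragments, max_missed):
    """Build every peptide spanning up to `max_missed` uncut sites between
    adjacent fragments of a complete digest (fragments in sequence order)."""
    peptides = []
    last = len(fragments) - 1
    for i in range(len(fragments)):
        for j in range(i, min(i + max_missed, last) + 1):
            peptides.append("".join(fragments[i:j + 1]))
    return peptides


# Hypothetical complete tryptic digest of a short sequence.
fragments = ["MK", "RPLAK", "GR"]
print(with_missed_cleavages(fragments, 0))
print(with_missed_cleavages(fragments, 1))
```

Each extra allowed missed cleavage adds more candidate peptides per protein, so the setting trades sensitivity to partial digestion against a larger search space.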

Submit and processing

Concise protein summary

protein summary

PMF protein view (I)
– Score and Expect
– Protein name
– MW and pI
– Sequence coverage

PMF protein view (II)
– Matched peptides
– Unmatched peptides
– RMS error
– Protein information
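The RMS error summarises how far the matched peptide masses deviate from their calculated values. The sketch below computes it in ppm from (observed, calculated) pairs; the exact convention used in the Mascot report is an assumption here, and the mass values are made up.

```python
import math


def rms_error_ppm(matches):
    """Root-mean-square mass error in ppm over (observed, calculated) pairs."""
    errors = [(observed - calculated) / calculated * 1e6 for observed, calculated in matches]
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Hypothetical matched peptides: (observed mass, calculated mass) in Da.
matches = [(1000.01, 1000.0), (1999.98, 2000.0)]
print(round(rms_error_ppm(matches), 2))
```

A small RMS error relative to the chosen peptide tolerance suggests the instrument calibration is sound and the tolerance could be tightened.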

2-2 MS/MS analysis

Raw data for MS/MS
[Figure: MS/MS spectrum showing the parent ion and its daughter ions]

Mascot MS/MS query form

Protein summary
Most probable candidate

MS/MS Protein view (I)
Protein score: the sum of the highest scores within each peptide group

MS/MS Protein view (II)
Protein score: the sum of the highest scores within each peptide group
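The protein score rule stated on the slide can be sketched directly: keep only the best ions score in each peptide group, then add those up. The grouping labels and scores below are hypothetical, and this follows the slide's wording rather than any documented Mascot internals.

```python
def protein_score(peptide_matches):
    """Sum the highest score found within each peptide group.

    `peptide_matches` is an iterable of (group, score) pairs, where `group`
    identifies matches that belong to the same peptide.
    """
    best = {}
    for group, score in peptide_matches:
        best[group] = max(score, best.get(group, float("-inf")))
    return sum(best.values())


# Hypothetical ions scores, several spectra matching the same peptide group.
matches = [("pep1", 45), ("pep1", 30), ("pep2", 20), ("pep3", 12), ("pep3", 18)]
print(protein_score(matches))
```

Counting only the best match per group keeps repeated spectra of one abundant peptide from inflating the protein score.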

Peptide view

3. Protein structure analysis

Research Collaboratory for Structural Bioinformatics (RCSB)

Protein Data Bank (PDB)
Proteasome

Example: 3D structure of the proteasome

Stereo view

Rotation

Secondary Structure